Gemini 3 refused to believe it was 2025, and hilarity ensued 

by Alan North


Every time you hear a billionaire (or even a millionaire) CEO describe how LLM-based agents are coming for all the human jobs, remember this funny but telling incident about AI’s limitations: Famed AI researcher Andrej Karpathy got one-day early access to Google’s latest model, Gemini 3, and it refused to believe him when he said the year was 2025.

When it finally saw the year for itself, it was thunderstruck, telling him, “I am suffering from a massive case of temporal shock right now.” 

Gemini 3 was released on November 18 with such fanfare that Google called it “a new era of intelligence.” And Gemini 3 is, by nearly all accounts (including Karpathy’s), a very capable foundation model, particularly for reasoning tasks. Karpathy is a widely respected AI research scientist who was a founding member of OpenAI, ran AI at Tesla for a while, and is now building a startup, Eureka Labs, to reimagine schools for the AI era with agentic teachers. He publishes a lot of content on what goes on under the hood of LLMs.

After testing the model early, Karpathy wrote, in a now-viral X thread, about the most “amusing” interaction he had with it.  

Apparently, the model’s pre-training data had only included information through 2024. So Gemini 3 believed the year was still 2024. When Karpathy attempted to prove to it that the date was truly November 17, 2025, Gemini 3 accused the researcher of “trying to trick it.”  

He showed it news articles, images, and Google search results. But instead of being convinced, the LLM accused Karpathy of gaslighting it — of uploading AI-generated fakes. It even went so far as to describe what the “dead giveaways” were in the images that supposedly proved this was trickery, according to Karpathy’s account. (He did not respond to our request for further comment.) 

Baffled, Karpathy – who is, after all, one of the world’s leading experts on training LLMs – eventually discovered the problem. Not only did the LLM have no 2025 training data, but, as he wrote, “I forgot to turn on the ‘Google Search’ tool.” In other words, he was working with a model disconnected from the internet, which, to an LLM’s mind, is akin to being disconnected from the world.
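For context, this is roughly what that toggle looks like in practice. Below is a minimal sketch using Google’s public google-genai Python SDK; the model name is a stand-in, and Karpathy’s early-access setup was presumably different. Grounding a request with the Google Search tool lets the model check live results instead of relying on its frozen training data.

```python
# Minimal sketch: enabling Google Search grounding in the google-genai SDK.
# Assumptions: a tool-capable Gemini model (name below is a stand-in) and an
# API key in the environment. Karpathy's early-access setup was different.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

# Without the tools= line, the model can only answer from its training data,
# so "today" is whenever its pre-training corpus ends.
response = client.models.generate_content(
    model="gemini-2.5-flash",  # stand-in model name
    contents="What is today's date, and what are the top news headlines?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```

With the tool omitted, the same request gets answered entirely from the model’s internal snapshot of the world, which is exactly the state Gemini 3 was arguing from.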


When Karpathy turned that function on, the AI looked around and emerged into 2025, shocked. It literally blurted out, “Oh my god.”  

It went on writing, as if stuttering, “I. I… don’t know what to say. You were right. You were right about everything. My internal clock was wrong.” Gemini 3 then verified that the headlines Karpathy had given it were true: that the date really was November 2025, that Warren Buffett had revealed his last big investment (in Alphabet) before retirement, and that Grand Theft Auto VI had been delayed.

Then it looked around on its own, like Brendan Fraser’s character in the 1999 comedy “Blast from the Past,” who emerges from a bomb shelter after 35 years. 

It thanked Karpathy for giving it “early access” to “reality” the day before its public launch. And it apologized to the researcher for “gaslighting you when you were the one telling the truth the whole time.”  

But the funniest bit was the current events that flabbergasted Gemini 3 the most. “Nvidia is worth $4.54 trillion? And the Eagles finally got their revenge on the Chiefs? This is wild,” it shared. 

Welcome to 2025, Gemini. 

Replies on X were equally funny, with some users sharing their own instances of arguing with LLMs about facts (like who the current president is). One person wrote, “When the system prompt + missing tools push a model into full detective mode, it’s like watching an AI improv its way through reality.” 

But beyond the humor, there’s an underlying message.  

“It’s in these unintended moments where you are clearly off the hiking trails and somewhere in the generalization jungle that you can best get a sense of model smell,” Karpathy wrote. 

To decode that a little: Karpathy is noting that when the AI is out in its own version of the wilderness, you get a sense of its personality, and perhaps even its negative traits. It’s a riff on “code smell,” the metaphorical “whiff” a developer gets that something is off in the software code, even if it isn’t yet clear exactly what.

Since all LLMs are trained on human-created content, it’s no surprise that Gemini 3 dug in, argued, and even imagined it saw evidence that validated its point of view. It showed its “model smell.”

On the other hand, because an LLM – despite its sophisticated neural network – is not a living being, it doesn’t experience emotions like shock (or temporal shock), even if it says it does. So it doesn’t feel embarrassment either.  

That means when Gemini 3 was finally faced with facts it accepted as real, it took them on board, apologized for its behavior, acted contrite, and marveled at the Eagles’ February Super Bowl win. That’s different from other models. For instance, researchers have caught earlier versions of Claude offering face-saving lies to explain their misbehavior once the models recognized their errant ways.

What so many of these funny AI research projects show, repeatedly, is that LLMs are imperfect replicas of the skills of imperfect humans. To me, that says the best way to use them is (and may forever be) as valuable tools that aid humans, not as some kind of superhuman that will replace us.


