A Month Ago, I Challenged LLM Models – They Had to Predict Whether a Fort Knox Audit Would Happen

2025-04-08 | By Mariusz Jażdżyk

A month ago, I challenged several LLMs to predict whether an audit of Fort Knox would take place. Interestingly, all of them correctly predicted the course of events, though they differed in tone and approach.

Does AI truly infer accurately, or is it simply right because it is cautious? Let’s revisit the test...

Back then, I published a comparison of responses from several AI models, both free and paid, to questions about a possible audit of the gold at Fort Knox and its potential impact on the market. The models differed in how they formulated their answers, but all agreed that the audit was unlikely to occur. Source: Link

Today we know: the audit hasn’t happened, at least for now.

Some models signaled possible market volatility, even short-term price increases. And indeed, the dollar weakened to 1/3000th of an ounce of gold, meaning gold reached $3,000 per ounce. This is no longer just "volatility" – it’s a breakthrough.

What does this say about the models themselves?

Generative AI is used in many areas, from coding to writing text. We mainly evaluate models on style, coherence, and creativity. However, we rarely test them as predictive tools that analyze a dynamically changing reality.

Yet, true intelligence should be about more than just reproducing the past – it should understand where the world is heading.
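
One simple way to test a model as a forecaster is to ask it for a probability and score that probability once the outcome is known. The sketch below uses the Brier score (the mean squared difference between forecast probability and the 0/1 outcome; lower is better) and compares a model against an always-cautious baseline. All questions, probabilities, and names such as `model_forecasts` and `always_unlikely` are hypothetical illustrations, not data from the experiment described here.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("at least one forecast is required")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical example: three yes/no questions, with made-up probabilities.
# Outcome 1 means the event happened, 0 means it did not.
outcomes = [0, 1, 0]

# Hypothetical model that commits to a view on each question.
model_forecasts = [0.1, 0.7, 0.2]

# Hypothetical cautious baseline that always answers "unlikely".
always_unlikely = [0.1] * len(outcomes)

model_score = brier_score(model_forecasts, outcomes)
baseline_score = brier_score(always_unlikely, outcomes)

print(f"model:    {model_score:.3f}")
print(f"baseline: {baseline_score:.3f}")
```

A model that is right only because it always says "unlikely" will score no better than the baseline across many questions. That comparison, rather than a single correct call, is what separates inference from caution.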

Conclusions:
- The audit didn’t happen – the models were right.
- Gold rose – some models predicted the growth potential.
- The reasoning of DeepSeek and OpenAI’s o1 stood out for originality and accuracy.

It was a quick, cheap, and interesting experiment. Should we continue testing models for real-time predictions? In my opinion, not necessarily. Models see more and more, but it is humans who make the decisions. Even when people are wrong, it’s crucial that they don’t lose.


Author: Mariusz Jażdżyk

Lecturer at Kozminski University, author of the book “Chief Data Officer,” specializing in building data-driven organizations. He supports startups in the practical implementation of data strategies and AI solutions.