This week I’m heading out on a family vacation—something about “work-life balance” and “spending time with humans instead of spreadsheets.” Since I won’t have time for my usual deep-dive analysis, I decided to replicate an experiment I ran in Q1: asking the top foundation models to provide their best economic predictions for the next six months.

Only this time, I made them fight.

Instead of just collecting their individual forecasts, I forced them into an AI debate club. I wanted to see what happens when ChatGPT, Claude, and Gemini actually have to defend their predictions against each other. Would they converge on consensus? Would they double down on their differences? Would they discover blind spots in real time?

Spoiler alert: They did all three, and the results are more sobering than any individual forecast.

The Setup: Three AIs Walk Into an Economy…

I gave each model the same prompt: forecast real GDP growth, core PCE inflation, unemployment, and Fed policy through January 2026. No collaboration, no peeking at each other’s homework.

ChatGPT came back with a business-focused analysis emphasizing slow but stable growth around 1.0-1.3%. Classic consultant approach—here’s what it means for your P&L.

Claude delivered a technical deep-dive with a “stagflation-lite” thesis, projecting economic deceleration with persistent inflation. The kind of analysis you’d expect from a Fed economist with too much coffee.

Gemini built a scenario-based framework with probability weightings. Very McKinsey—three scenarios, weighted probabilities, clean executive summary.

Interestingly, they converged on the headline numbers: GDP growth around 1.0-1.6%, inflation persistent above 2.5%, unemployment drifting toward 4.5%. The kind of consensus that makes you wonder if they’re all reading the same data or if there’s actually signal in the noise.

Then I Made Them Fight

Here’s where it got interesting. I had them critique each other’s work, looking for gaps, contradictions, and missing pieces. That’s when the consensus started falling apart.

They identified six major blind spots that none of them had properly addressed:

Commercial Real Estate Crisis: 1,788 banks sitting on CRE exposures exceeding 300% of their equity. Over $1 trillion in CRE loans maturing in the next two years. This isn’t background noise—it’s a ticking time bomb.

Consumer Debt Stress: Credit card delinquencies hitting 11.35%, the highest since 2011. Sixty percent of cardholders now carrying persistent balances. Average interest rates above 22%. The consumer isn’t just tapped out—they’re underwater.

Global Trade Fragmentation: U.S. tariffs now averaging 20.8%, the highest since 1909. Global trade flows contracting 5.5-8.5%. This isn’t policy adjustment—it’s economic warfare.

Fiscal Policy Paralysis: Federal debt at $36.2 trillion with deficit projections hitting 7.3% of GDP through 2055. The government’s crisis response capacity isn’t just limited—it’s effectively neutered.

Energy Price Volatility: Twenty percent of global crude and LNG flows passing through the Strait of Hormuz amid ongoing Middle East tensions. Energy inflation isn’t transitory—it’s structural.

Corporate Margin Collapse: Fifty-seven percent of companies already reporting tariff-driven margin erosion. S&P 500 earnings falling 1-2% for every 5-point tariff increase. This isn’t cost absorption—it’s profit destruction.
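To make that last rule of thumb concrete, here is a minimal Python sketch of the arithmetic. It assumes the 1-2% earnings hit scales linearly with the size of the tariff increase, which the figure above does not actually guarantee. The function name and the 10-point example are hypothetical and chosen only for illustration.

```python
def earnings_hit_range(tariff_increase_pts, low_per_5=1.0, high_per_5=2.0):
    """Estimate the range of S&P 500 earnings decline, in percent.

    Assumption: the 1-2% hit per 5 points of tariff increase scales
    linearly. This is an illustrative simplification, not a model.
    """
    steps = tariff_increase_pts / 5.0  # number of 5-point increments
    return (steps * low_per_5, steps * high_per_5)


# Hypothetical input: a 10-point increase in the average tariff rate.
low, high = earnings_hit_range(10.0)
print(f"Estimated earnings decline: {low:.1f}% to {high:.1f}%")
```

The point of the sketch is only that the damage compounds with each additional tariff step; a real estimate would need sector-level pass-through assumptions.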

The Fight: Where the AIs Actually Disagree

Armed with this research, I sent them back to revise their forecasts. That’s when the real disagreements emerged.

Gemini vs. ChatGPT on Recession Severity: Gemini went full catastrophist, projecting GDP contractions of -1.0% to -1.5% annualized in a 40% probability scenario. ChatGPT pushed back, arguing that economies can “limp along even with severe structural damage” and kept GDP positive at 0.75%. Gemini’s betting on cascading failure; ChatGPT’s betting on economic resilience.

Speed vs. Structure: Gemini treats this as a rapid probability shift—moving recession odds from 15% to 40% based on new information. ChatGPT argues these aren’t temporary shocks but permanent structural changes, adding a persistent “20% systemic risk scenario” that becomes a baseline feature. One sees crisis; the other sees transformation.

Fed Policy Timing: Here’s where they really clash. Gemini projects aggressive Fed cuts starting in September (50 basis points) followed by December (25 basis points), ending at 3.50-3.75%. ChatGPT sees “gradual cuts as growth fades” with a more modest endpoint of 3.53%. Gemini wants emergency monetary response; ChatGPT expects measured easing.
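The Gemini path is easy to sanity-check. The sketch below assumes a starting federal funds target range of 4.25-4.50%, which is not stated above, and applies the two projected cuts. The function and variable names are hypothetical.

```python
def apply_cuts(target_range, cuts_bp):
    """Return the target range (in percent) after a list of cuts in basis points."""
    total_pct = sum(cuts_bp) / 100.0  # 100 basis points = 1 percentage point
    lower, upper = target_range
    return (round(lower - total_pct, 2), round(upper - total_pct, 2))


START_RANGE = (4.25, 4.50)   # assumption: target range before any cuts
GEMINI_CUTS_BP = [50, 25]    # September and December cuts from Gemini's forecast
CHATGPT_ENDPOINT = 3.53      # ChatGPT's projected endpoint, in percent

end_range = apply_cuts(START_RANGE, GEMINI_CUTS_BP)
print(end_range)  # (3.5, 3.75)

# Check whether ChatGPT's endpoint falls inside Gemini's ending range.
inside = end_range[0] <= CHATGPT_ENDPOINT <= end_range[1]
print(inside)  # True
```

Under that starting assumption, 75 basis points of cuts lands exactly on Gemini’s 3.50-3.75% range, and ChatGPT’s 3.53% sits inside it. Read that way, the two models may differ more on the pace of easing than on where rates end up.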

Corporate Earnings Impact: ChatGPT emphasizes that 57% of companies are already experiencing margin compression and warns of “earnings shocks in Q3-Q4.” Gemini focuses more on unemployment rising to 5.2% as the primary damage mechanism. One sees profit collapse driving the crisis; the other sees job losses.

China Trade Collapse: This is their biggest substantive disagreement. Gemini’s scenarios include 90% collapse in U.S.-China trade with “prohibitive tariffs.” ChatGPT acknowledges trade fragmentation but doesn’t model complete economic decoupling. Gemini’s betting on trade war; ChatGPT’s betting on managed deterioration.

The philosophical divide: **Gemini sees nonlinear cascade failure** where multiple systems collapse simultaneously. **ChatGPT sees structural degradation** where the economy adapts to permanently worse conditions. Both are dystopian, but in different ways.

What This Tells Us About AI and Economics

The experiment revealed something important about both artificial intelligence and economic forecasting.

The AIs are good at pattern recognition but initially missed systemic risks that don’t fit standard macroeconomic models. They needed external research to identify banking vulnerabilities, fiscal constraints, and trade disruption scale.

They’re susceptible to groupthink once exposed to the same information. Both revised forecasts, Gemini’s and ChatGPT’s, emphasized similar risk factors and narrative frameworks without independent validation.

But they’re also genuinely useful for stress-testing conventional wisdom. The debate process uncovered real blind spots in mainstream economic analysis.

The most valuable insight: we’re not facing a normal cyclical downturn. Multiple systems—banking, consumer finance, international trade, fiscal policy—are simultaneously stressed in ways that create nonlinear risk. The AIs initially missed this because they were trained on historical patterns that don’t include simultaneous system failures at this scale.

The Bottom Line

The revised consensus points to near-zero GDP growth, persistent inflation above 3%, unemployment climbing toward 5%, and aggressive Fed easing despite price pressures. More importantly, there’s now a 20-40% probability of cascading system failures that could trigger something worse than a typical recession.

This isn’t the kind of soft landing the market has been pricing in.

Whether this represents genuine analytical improvement or AI groupthink, I’ll leave for you to judge. What’s clear is that the robots found risks that none of them identified individually—risks that conventional forecasting models are systematically missing.

The irony isn’t lost on me: I went on vacation to avoid doing economic analysis, and instead discovered that letting AIs fight each other might produce better forecasts than most human analysts are generating.

I’ll be thinking about that while I’m on the beach. Assuming the commercial real estate market doesn’t collapse while I’m gone.

*Mike Lukianoff writes about data science, AI, economics and the restaurant industry. When he’s not making robots debate macroeconomic policy, he can be found explaining why your favorite restaurant is actually a sophisticated data operation.*

Breaking: The Data Just Got Worse

As I was finishing this post, Trump fired the head of the Bureau of Labor Statistics, Dr. Erika McEntarfer, hours after the July jobs report showed only 73,000 jobs added and combined downward revisions of 258,000 jobs for May and June.

This isn’t just political theater. It’s a fundamental attack on the credibility of U.S. economic data that economists and markets have relied on for over a century.

The immediate implications:

• Markets are already spooked by both the weak data and the institutional breakdown

• Future economic data will be viewed through a political lens rather than as objective measurement

• The Fed’s decision-making becomes far more difficult without trusted employment data

• Business planning becomes nearly impossible when you can’t trust government statistics

For the AI forecasts: This adds an entirely new dimension of uncertainty that none of the models accounted for. The good news is that in this age of plentiful data sources, the truth can still be triangulated. Private payroll processors like ADP, state unemployment insurance filings, real-time job posting data, and corporate earnings reports all provide independent measures of labor market health.

The downside is that this makes all economic analysis more expensive and complex. Instead of relying on a single, trusted government source, analysts now need to synthesize multiple private data streams to get an accurate picture. What used to cost pennies in government data access will now require expensive subscriptions to private data vendors.

This matters because government data belongs to all of us. We have the right to access accurate information from neutral agencies—it’s core to democracy and essential for economic stability in the modern era. When statistical agencies become political tools rather than neutral arbiters, everyone pays the price through increased uncertainty and higher analysis costs.

The robots didn’t predict this one, but they’ll have to adapt to it.

I’ll be thinking about that while I’m hiking with the family. Assuming we still have reliable unemployment statistics to tell us if anyone has jobs when I get back.



Are you asking the right questions?

Find out how our agents and humans can help you make profitable decisions with industry-leading domain expertise and artificial intelligence purpose-built for the dining business.

© 2025 Signal Flare AI
