AI That Can Prove It’s Right: Verification as the Missing Layer in AI — Carina Hong

The MAD Podcast with Matt Turck · Advanced · 🛡️ AI Safety & Ethics · 2mo ago


What if AI didn’t just sound right, but could prove it? In this episode of the MAD Podcast, Matt Turck sits down with Carina Hong, a 24-year-old former math olympiad competitor and Rhodes Scholar, and the founder/CEO of Axiom Math, to unpack how AxiomProver earned a perfect 12/12 on the Putnam 2025 and why formal verification (via Lean) may be the missing layer for reliable reasoning. Carina argues we’re entering a “math renaissance” where verified reasoning systems can tackle problems that currently take researchers months, and potentially push beyond math into verified code, hardware, and high-stakes software. They go inside the “generation + verification” loop, what it means to build AI that can be trusted, and what this approach could unlock on the road to superintelligent reasoning.

Carina Hong
LinkedIn - https://www.linkedin.com/in/carina-hong/
X/Twitter - https://x.com/CarinaLHong

Axiom Math
Website - https://axiommath.ai
X/Twitter - https://x.com/axiommathai

Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

FirstMark
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Listen on:
Spotify - https://open.spotify.com/show/7yLATDSaFvgJG80ACcRJtq
Apple - https://podcasts.apple.com/us/podcast/the-mad-podcast-with-matt-turck/id1686238724

00:00 Intro
01:25 Why the World Needs an AI Mathematician
02:57 Scoring 12/12 on the World's Hardest Math Test (Putnam)
04:05 The First AI to Solve Open Research Conjectures
06:59 Does AI Solve Math in "Alien" Ways? (The Move 37 Effect)
08:59 "Lean": The Programming Language of Proofs Explained
10:51 How Axiom's Approach Differs from DeepMind & OpenAI
16:06 Formal vs. Informal Reasoning (And Auto-Formalization)
17:37 The AI "Reward Hacking" Problem
20:18 Building an AI That is 100% Correct, 100% of the Time
23:23 Beyond Math: Verified Code & Hardware Verification
25:12 The Brutal Reality of Competitive M


