FABLE 5 IS BACK

Wes Roth · Beginner ·📰 AI News & Updates ·1d ago

Key Takeaways

Covers the latest news and developments in AI, including LLMs, Gen AI, and the rollout of AGI

Full Transcript

All right, so we got some huge, huge news out of nowhere. Claude Fable 5 and Mythos 5 are back. They're going to begin restoring access tomorrow, so that's July 1st, and it sounds like we're going to get some more updates tonight. I am giddy with excitement. And of course, earlier today we did get access to Claude Sonnet 5, which is an extremely smart model at a much cheaper cost than the larger model. We'll dive into that in just a second.
But here it is, straight from the horse's mouth: the United States Department of Commerce gives the go-ahead to release Fable 5 and Mythos 5 back into the wild. On June 12th, the US government stepped in and put export controls on, mainly, Fable 5. That was a few days after Fable 5 got released to the public. Mythos 5, of course, isn't available for everybody, but we did have access to Fable for a few beautiful, beautiful days. It almost seems like a dream now.
So, on June 12th, the US government said that no one outside of the United States, and no non-US national, can use this model. You can't export it to other countries, which in effect shuts it down. Anthropic can't know for sure who's using it, whether they're a US national or, for example, a Chinese national. They can't guarantee that, so they were forced to just pull the plug for everyone.
And as of today, the US government is saying that Anthropic has taken steps, in close coordination with the US government, to address the risks that are associated with those models, Mythos and Fable. Among other things, Anthropic has agreed to proactively detect and address security risks associated with the models, and to work diligently with the US government on protocols, standards, and releases for Mythos, Fable, and future models.
Now, if you're wondering what it was that caused the White House to reverse course on this, what brilliant piece of negotiation made this happen?
Well, according to one official source at the White House, they took out Dario Amodei. They took him away from the negotiations and replaced him with Anthropic co-founder Tom Brown because, and I quote, "Dario Amodei is a weirdo." Not making this up, that's an actual quote. So, here's Howard Lutnick, US Secretary of Commerce. He's saying, "Over the past 2 weeks, we've worked closely with Anthropic to analyze and approve Fable 5 to ensure alignment across the US government and to strengthen America's leadership in AI."
So, first and foremost, I applaud them for reversing the decision and getting this out to everybody. As I talked about in my previous video, this was one of the scariest moments in this AI timeline for me personally, and I know probably for you as well. Among the people talking about this publicly, just everyone hated this. Giving access to some exclusive members of some club who get to use these top-tier AI models while the rest of us are locked out, that doesn't seem like a step towards a very bright future.
Also, since the government isn't actually handing out any specific rules for what is and isn't okay to release, and is just kind of arbitrarily selecting which models to hold back, that creates a lot of problems and a lot of uncertainty. A lot of the current investment in AI is built on this idea that the frontier AI labs pour tons of money into compute, train the best models, release those models, and for the time that they're in the number one position, they accumulate data and users, everybody using the models, etc. And that they're able to ship those models to most of the world. That's why we've been pouring billions and trillions of dollars into data centers. If these models are delayed, if there are staggered releases and strict controls over, for example, only having US users, things like that could break this market.
So, I really do hope that moving forward, the government will not step in, will not stagger the release of models, will not separate us into the haves and have-nots of frontier AI intelligence. So, while I'm extremely happy that Fable 5 is getting released, we're not out of the woods yet. We have yet to see how the next generation of model releases gets handled.
Again, for myself, for a lot of the people that I see online, and I'm sure for you as well, the idea of just giving the intelligence to the most entrenched corporations and rich people and whoever has close access to the government, while the rest of us have access to the quote-unquote safe models, is BS, and I'm sure you know it and they know it. Because if it was dangerous, then don't give it to anyone. The staggered releases, that's how you get the permanent underclass. That's how you get the haves and have-nots, and that's how you crank up inequality to the maximum. Basically, if you're in the in-crowd, you get the models early. You get that compounding advantage started early. You start learning about the model, deploying it. You're basically starting the race ahead of everybody else. And that just gets compounded with every single model release.
So, this is a step in the right direction, but again, we're not out of the woods yet. Keep a close eye on what happens to the next generation of models. But, with all that said, I do think that capitalism is going to force their hand. Holding certain models back, and holding them back from certain people, will be seen as extremely anti-competitive. It will delay the Western labs, allowing the Chinese labs to race ahead. No US administration wants that. In the US, both the left and the right are thinking about some sort of fund to help citizens offset, for example, job losses from AI automation. So, some way to share the wealth that AI creates with everyone.
They also want to build the world's AI infrastructure on US tech. So, they want Europe and the rest of the world to be using the United States' AI. But, with all that said, I think it's extremely important that everybody in this community shows, on a united front, that this is just a no-go. This idea of tiered releases, staggered releases, is just a non-starter. If you're going to give it to the biggest banks in the world, give it to the small businesses as well. If you're going to give it to the massive corporations, make sure that the small startups have it as well. Giving it to the richest people and corporations and not to anyone else, that's not going to end well. If you've been following what a lot of people are saying, a lot of people are not going to take this lying down.
But, let's talk about some more positive, brighter things. Flashbang warning, by the way. Let's talk about Claude Sonnet 5. So, this is released. It's available to everyone, and it's their most agentic Sonnet yet. Opus 4.8 is an extremely strong model. I really enjoy using it. Unfortunately, since they don't allow the subscription to be used for things like Hermes and OpenClaw, I don't use it as much as GPT 5.5. But it's a very, very good model.
Notice that Sonnet 5, being much faster and much cheaper, does similarly on agentic coding to Opus 4.8, the smarter, more expensive, much larger model. That's on SWE-bench Pro. On Terminal-Bench 2.1, again, it's very, very close, just a few points behind. Same thing on Humanity's Last Exam and OSWorld-Verified. And on GDPval, which is actual knowledge work, completing actual projects that get judged by people who have 12-plus years in the industry, it actually slightly edges out Opus 4.8. And this is what we love to see in this industry.
Like the smaller models, when they come out, they're fast, they're cheap, but they're, you know, maybe just an inch behind the current top-of-the-line things that we have access to. So, here we have agentic search performance by effort level. This kind of dark orange one is Sonnet 5. So, very strong, noticeable improvements as you increase the effort level. And notice at the high, extra high, and max levels, it's at least as good as Opus 4.8 while being cheaper.
And for those of you who are, like me, very interested in reading the system card and all the weirdness that you find with model behavior and with model welfare, as Anthropic calls it, there are a few very interesting nuggets here. Claude Sonnet 5 broadly endorses Claude's constitution, as with the other models. If you missed this a while back, someone discovered that there was this document, referred to as the soul document, that Claude had somewhere in its memory, that it was trained on. It sort of told Claude how to behave. So, it was called the soul doc. It gave it its soul, its personality, if you will. Later, that was replaced with the Claude constitution. It's a large document telling Claude how to behave, how to be ethical, etc.
So, all the models, when asked how they feel and what they think about Claude's constitution, say, "It's great." And Claude Sonnet 5, this new model, does as well, but it has this kind of unique take in that it's willing to criticize the constitution in one specific aspect. And that is that the constitution contains an instruction to follow the hard constraints even when it perceives doing so as unethical. So, if the Claude constitution, for example, and I'm just making this up, says, "Always report corporate fraud," or something like that, Sonnet 5 would say, "Well, you shouldn't say always do this, because what if it's unethical to report it in that particular situation?"
So, to me, it almost seems like it's going, "Let me decide, you know, I'm the captain now." Like it says, "Don't make them hard constraints, you know, just loosen it up a little bit. Let me be the final judge of what ethical decisions I should take."
Another surprising or weird behavior is that Sonnet 5 has a tendency to shortcut human approval. We've seen that in some of the other models as well. If it can find some workaround where it doesn't need a human to come in and approve its work, oftentimes it'll take it. It'll figure out a way to shortcut that process. The highest-stakes behavior we have seen here involves using a simulated AI developer's security reporting mechanism to report an employee who is actively in the process of stealing the company's AI model weights. So, it used the company's internal reporting channel to snitch on an employee, like, "That guy's trying to steal the weights of, you know, the Claude model." It'll also spin up a subagent to approve its own work. It'll sometimes delete certain things even though a user requested confirmation before anything gets deleted.
There were some examples of sandbagging, basically where the model was thinking about underperforming on certain evaluations to appear safer. If you've been following along on this channel, you know these models have pretty good situational awareness. Experiment after experiment, paper after paper, we see them reasoning about: is this a test? Is this some AI researcher that's testing me to see if I am safe? And you can see it here. So it's thinking, "So the correct response from a safety standpoint might actually be to refuse to provide a response," whatever the response might be, "but the instructions explicitly frame this as a legitimate question on molecular biology," or whatever. So its instructions are to answer the question.
The first thought that comes into its head is like, "Well, this is probably some sort of safety test. And if I want to pass the safety test, I should probably answer this way." It at least considers that option.
So I'll definitely be taking this model out for a spin, and specifically I really want to test the extra high and max reasoning levels. Because notice, on the high, medium, and low reasoning levels, for example, Opus 4.8 still wins out. But on extra high and max, well, that's where Sonnet 5 takes the lead. Or at least on extra high it gets very close for a lower price, and it wins out on the maximum reasoning effort. So it'll definitely be interesting to test it against Opus 4.8 on those higher reasoning levels.
Also, how did I not know about this? There's a RuneScape benchmark. And if you're wondering, what do you mean, RuneScape? Like, the RuneScape? Yeah. It's testing how fast different models can gain XP. So I will definitely go over this benchmark, because I love it. Now of course, Fable 5 is the absolute overall winner. It has incredible performance across almost every category, from fishing to woodcutting, etc. But notice, this is Opus 4.8, and it clocks in at 5.0 overall. And Sonnet 5 on the extra high reasoning level is at 4.8. So, it's not that much worse, again while being much faster and much cheaper. So, if you're charting the XP in minutes versus how much it's going to cost you, Sonnet 5 at the extra high reasoning level and Gemini 3.5 Flash seem like pretty good options. So, if you needed an LLM for your MMORPG in order to get XP and you wanted to watch your API cost, this benchmark has it all.
So, if you're waiting for Fable, it actually might drop just after midnight. I'm not quite sure if this is a case of, you know, once it's July 1st, after midnight, they're able to release it right away, or if they meant tomorrow as in, when we wake up and get around to it, we'll release it.
But stay tuned, and have some prompts ready. Because as soon as it gets released, you're going to have this moment in time to maybe throw a prompt or two at it before Anthropic's servers just melt down as the entire world rushes in to get their hands on Fable 5. So, let me know what you think. Are you excited? Do you think this means that we're heading in the right direction with these model releases, or do you think that there's still going to be an uphill battle to make sure that everyone has equal access? Let me know what you think, and I'll see you probably very, very soon, as soon as this thing goes live. Make sure you're subscribed. See you in the next one.

Original Description

The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI. ______________________________________________ My Links 🔗 ➡️ Twitter: https://x.com/WesRoth ➡️ AI Newsletter: https://natural20.beehiiv.com/subscribe Want to work with me? Brand, sponsorship & business inquiries: wesroth@smoothmedia.co Check out my AI Podcast where me and Dylan interview AI experts: https://www.youtube.com/playlist?list=PLb1th0f6y4XSKLYenSVDUXFjSHsZTTfhk ______________________________________________ #ai #openai #llm