Nvidia Waves and Moats | Stratechery by Ben Thompson

Stratechery · Intermediate ·📰 AI News & Updates ·2y ago

Key Takeaways

The video discusses Nvidia's strategic position in the tech industry, focusing on the concepts of waves and moats as described by Ben Thompson on Stratechery.

Full Transcript

"Nvidia Waves and Moats" was published on Tuesday, March 19th, 2024.

From The Wall Street Journal:

The Nvidia frenzy over artificial intelligence has come to this: Chief Executive Jensen Huang unveiled his company's latest chips on Monday in a sports arena at an event one analyst dubbed the "AI Woodstock." Customers, partners, and fans of the chip company descended on the SAP Center, the home of the National Hockey League's San Jose Sharks, for Huang's keynote speech at an annual Nvidia conference that, this year, has a seating capacity of about 11,000. Professional wrestling's WWE Monday Night Raw event took place there in February. Justin Timberlake is scheduled to play the arena in May. Even Apple's much-watched launch events for the iPhone and iPad didn't fill a venue this large. At the center of the tech world's attention is Huang, who has gone from a semiconductor CEO with a devoted following among video game enthusiasts to an AI impresario with broad enough appeal to draw thousands to a corporate event.

Or, as Nvidia research manager Jim Fan put it on X: "Jensen Huang is the new Taylor Swift."

I'm disappointed that the Wall Street Journal used this lede for their article about the event, but not because I thought they should have talked about the actual announcements; rather, they and I had the exact same idea: it was the spectacle, even more than the announcements, that was the most striking takeaway of Huang's keynote.

I do think, contra the Wall Street Journal, that iPhone announcements are a relevant analogy: Apple could have, particularly in the early days of the iPhone, easily filled an 11,000-seat arena. Perhaps an even better analogy, though, was the release of Windows 95. Lance Ulanoff wrote a retrospective on Medium in 2021:

It's hard to imagine an operating system by itself garnering the kind of near-global attention the Windows 95 launch attracted in 1995. Journalists arrived from around the world on August 24, 1995, settling on the lush green, and still relatively small, Microsoft campus in Redmond, Washington. There were tickets (I still have mine) featuring the original Windows Start button ("Start" was a major theme for the entire event) granting admission to the invite-only, carnival-like event. It was a relatively happy and innocent time in technology, perhaps the last major launch before the internet dominated everything, when a software platform, and not a blog post or a piece of hardware, could change the world.

One can envision an article in 2040 looking back on the "relatively happy and innocent time in technology" as we witnessed "perhaps the last major launch before AI dominated everything," when a chip "could change the world"; perhaps retrospectives of the before-times will be the last refuge of human authors like myself.

GTCs of Old

What is interesting to a once-and-future old fogey like myself, who has watched multiple Huang keynotes, is how relatively focused this event was: yes, Huang talked about things like weather and robotics and Omniverse and cars, but this was first and foremost a chip launch, the Blackwell B200 generation of GPUs, with a huge chunk of the keynote talking about its various features and permutations, performance, partnerships, etc.

I thought this stood in marked contrast to GTC 2022, when Huang announced the Hopper H100 generation of GPUs: that had a much shorter section on the chip/system architecture, accompanied by a lot of talk about potential use cases and a list of all the various libraries Nvidia was developing for CUDA. This was normal for GTC, as I explained a year earlier:

"This was, frankly, a pretty overwhelming keynote; Liberty thinks this is cool: robots and digital twins and games and machine learning accelerators and data-center-scale computing and cybersecurity and self-driving cars and computational biology and quantum computing and metaverse-building tools and trillion-parameter AI models. Yes, please. Something Huang emphasized in the introduction to the keynote, though, is that there is a rhyme and reason to this volume."

I then went on an extended explainer of CUDA and why it was essential to understanding Nvidia's long-term opportunity, and concluded:

"This is a useful way to think about Nvidia's stack: writing shaders is like writing assembly, as in it's really hard and very few people can do it well. CUDA abstracted that away into a universal API that was much more generalized and approachable; it's the operating system in this analogy. Just like with operating systems, though, it is useful to have libraries that reduce duplicative work amongst programmers, freeing them to focus on their own programs. So it is with CUDA and all those SDKs that Huang referenced: those are libraries that make it much simpler to implement programs that run on Nvidia GPUs.

This is how it is that a single keynote can cover 'robots and digital twins and games and machine learning accelerators and data-center-scale computing and cybersecurity and self-driving cars and computational biology and quantum computing and metaverse-building tools and trillion-parameter AI models.' Most of those are new or updated libraries on top of CUDA, and the more that Nvidia makes, the more they can make.

This isn't the only part of the Nvidia stack: the company has also invested in networking and infrastructure, both on the hardware and software level, that allow applications to scale across an entire data center, running on top of thousands of chips. This too requires a distinct software plane, which reinforces that the most important thing to understand about Nvidia is that it is not a hardware company and not a software company: it is a company that integrates both."

Those GTCs were, in retrospect, put on by a company before it had achieved astronomical product-market fit. Sure, Huang and Nvidia knew about transformers and GPT models (Huang referenced his hand-delivery of the first DGX supercomputer to OpenAI in 2016 in his opening remarks), but notice how his hand-drawn slide of computing history seems to exclude a lot of the stuff that used to be at GTC. Suddenly all that matters in those intervening years was transformers.

I am not, to be clear, short-changing Huang or Nvidia in any way; quite the opposite. What is absolutely correct is that Nvidia had on their hands a new way of computing, and the point of those previous GTCs was to experiment and push the world to find use cases for it. Today, in this post-ChatGPT world, the largest use case, generative AI, is abundantly clear, and the most important message for Huang to deliver is why Nvidia will continue to dominate that use case for the foreseeable future.

Blackwell

So, about Blackwell itself. From Bloomberg:

Nvidia Corp. unveiled its most powerful chip architecture at the annual GPU Technology Conference, dubbed "Woodstock for AI" by some analysts. Chief Executive Officer Jensen Huang took the stage to show off the new Blackwell computing platform, headlined by the B200 chip, a 208-billion-transistor powerhouse that exceeds the performance of Nvidia's already class-leading AI accelerator. The chip promises to extend Nvidia's lead on rivals at a time when major businesses and even nations are making AI development a priority. After riding Blackwell's predecessor, Hopper, to surpass a valuation of more than $2 trillion, Nvidia is setting high expectations with its latest product.

The first thing to know about Blackwell is that it is actually two dies fused into one chip, with what the company says is full coherence. What this means in practice is that a big portion of Blackwell's gains relative to Hopper is that it is simply much bigger. Here is Huang holding a Hopper and Blackwell chip up for comparison. The "Blackwell is bigger" theme holds for the systems Nvidia is building around it: the fully integrated GB200 platform has two Blackwell chips with one Grace CPU chip, as opposed to Hopper's one-to-one architecture. Huang also unveiled the GB200 NVL72, a liquid-cooled, rack-sized system that includes 72 GPUs interconnected with a new generation of NVLink, which the company claims provides a 30x performance increase over the same number of H100 GPUs for LLM inference (thanks in part to dedicated hardware for transformer-based inference), with a 25x reduction in cost and energy consumption.

One set of numbers I found notable were on these slides: on Hopper, to train a GPT-4-level model with 1.8 trillion parameters, it takes 90 days, 8,000 GPUs, and 15 megawatts; on Blackwell, to train the same-sized model, it also takes 90 days, but only 2,000 GPUs and 4 megawatts.

What is interesting to note is that both training runs take the same amount of time: 90 days. This is because the actual calculation speed is basically the same; this makes sense because Blackwell is, like Hopper, fabbed on TSMC's 4-nanometer process, and the actual calculations are fairly serial in nature and thus primarily governed by the underlying speed of the chip. Accelerated computing, though, isn't about serial speed but rather parallelism, and every new generation of chips, combined with new networking, enables ever greater amounts of efficient parallelism that keeps those GPUs full. That's why the big improvement is in the number of GPUs necessary, and thus the overall amount of power drawn.

That, by extension, means that a Hopper-sized fleet of Blackwell GPUs would be capable of building that much larger of a model, and given that there appears to be a linear relationship between scale and model capability, the path to GPT-6 and beyond remains clear (GPT-5 was presumably trained on Hopper GPUs; GPT-4 was trained on Ampere A100s).

What is interesting to note is that there are reports that while the B100 costs twice as much as the H100 to manufacture, Nvidia is increasing the price much less than expected; this explains the somewhat lower margins the company is expecting going forward. The report, which has since disappeared from the internet (perhaps because it was published before the keynote), speculated that Nvidia is concerned about preserving its market share in the face of AMD being aggressive in price and its biggest customers trying to build their own chips. There are, needless to say, tremendous incentives to find alternatives, particularly for inference.

Nvidia Inference Microservices (NIM)

I think this provides useful context for another GTC announcement. From the Nvidia developer blog:

The rise in generative AI adoption has been remarkable. Catalyzed by the launch of OpenAI's ChatGPT in 2022, the new technology amassed over 100 million users within months and drove a surge of development activities across almost every industry. By 2023, developers began POCs (proofs of concept) using APIs and open-source community models from Meta, Mistral, Stability, and more.

Entering 2024, organizations are shifting their focus to full-scale production deployments, which involve connecting AI models to existing enterprise infrastructure, optimizing system latency and throughput, logging, monitoring, and security, among others. This path to production is complex and time-consuming: it requires specialized skills, platforms, and processes, especially at scale.

Nvidia NIM, part of Nvidia AI Enterprise, provides a streamlined path for developing AI-powered enterprise applications and deploying AI models in production. NIM is a set of optimized cloud-native microservices designed to shorten time-to-market and simplify deployment of generative AI models anywhere, across cloud, data center, and GPU-accelerated workstations. It expands the developer pool by abstracting away the complexities of AI model development and packaging for production using industry-standard APIs.

NIMs are pre-built containers that contain everything an organization needs to get started with model deployment, and they are addressing a real need not just today, but in the future. Huang laid out a compelling scenario where companies use multiple NIMs in an agent-type framework to accomplish complex tasks:

Think about what an AI API is: an AI API is an interface that you just talk to. And so this is a piece of software in the future that has a really simple API, and that API is called human. And these packages, incredible bodies of software, will be optimized and packaged, and we'll put it on a website, and you can download it, you could take it with you, you could run it in any cloud, you can run it in your own data center, you can run it in workstations if it fit, and all you have to do is come to ai.nvidia.com. We call it Nvidia Inference Microservice, but inside the company we all call it NIMs.

Just imagine, you know, someday there's going to be one of these chatbots, and these chatbots is going to just be in a NIM, and you'll assemble a whole bunch of chatbots, and that's the way software is going to be built someday. How do we build software in the future? It is unlikely that you'll write it from scratch or write a whole bunch of Python code or anything like that. It is very likely that you assemble a team of AIs.

There's probably going to be a super AI that you use that takes the mission that you give it and breaks it down into an execution plan. Some of that execution plan could be handed off to another NIM. That NIM would maybe understand SAP (the language of SAP is ABAP). It might understand ServiceNow, and it would go retrieve some information from their platforms. It might then hand that result to another NIM, who goes off and does some calculation on it. Maybe it's an optimization software, a combinatorial optimization algorithm; maybe it's just some basic calculator; maybe it's pandas to do some numerical analysis on it. And then it comes back with its answer, and it gets combined with everybody else's, and because it's been presented with "this is what the right answer should look like," it knows what right answers to produce, and it presents it to you. We can get a report every single day at, you know, top of the hour, that has something to do with a build plan, or some forecast, or some customer alert, or some bugs database, or whatever it happens to be, and we could assemble it using all these NIMs. And because these NIMs have been packaged up and ready to work on your system, so long as you have Nvidia GPUs in your data center, in the cloud, these NIMs will work together as a team and do amazing things.

Did you notice the catch? NIMs, which Nvidia is going to both create itself and also spur the broader ecosystem to create, with the goal of making them freely available, will only run on Nvidia GPUs.

This takes this article full circle: in the before-times, i.e. before the release of ChatGPT, Nvidia was building quite the (free) software moat around its GPUs; the challenge is that it wasn't entirely clear who was going to use all that software. Today, meanwhile, the use cases for those GPUs are very clear, and those use cases are happening at a much higher level than CUDA frameworks (i.e. on top of models); that, combined with the massive incentives towards finding cheaper alternatives to Nvidia, means both the pressure to and the possibility of escaping CUDA is higher than it has ever been, even if it is still distant for lower-level work, particularly when it comes to training.

Nvidia has already started responding: I think that one way to understand DGX Cloud is that it is Nvidia's attempt to capture the same market that is still buying Intel server chips in a world where AMD chips are better (because they already standardized on them); NIMs are another attempt to build lock-in.

In the meantime, though, it remains noteworthy that Nvidia appears to not be taking as much margin with Blackwell as many may have expected; the question as to whether they will have to give back more in future generations will depend on not just their chips' performance, but also on re-digging a software moat increasingly threatened by the very wave that made GTC such a spectacle.

For more analysis like this, please like and subscribe, visit stratechery.com, and listen to the Sharp Tech podcast. Also check out the Asianometry channel on YouTube to learn more about the technology changing our world.
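
As a rough check on the Hopper and Blackwell training figures quoted in the transcript (90 days; 8,000 GPUs at 15 megawatts versus 2,000 GPUs at 4 megawatts), the sketch below works out the implied ratios. It assumes constant power draw for the full 90 days and treats the quoted megawatts as total system power; neither assumption comes from the keynote, and the function and variable names are illustrative only.

```python
# Figures as quoted in the transcript; everything derived from them is
# back-of-the-envelope arithmetic, not an Nvidia-published number.
DAYS = 90
HOURS = DAYS * 24

runs = {
    "Hopper": {"gpus": 8_000, "megawatts": 15},
    "Blackwell": {"gpus": 2_000, "megawatts": 4},
}


def energy_mwh(megawatts: float, hours: int = HOURS) -> float:
    """Total energy in MWh, assuming constant power draw (an assumption)."""
    return megawatts * hours


def summarize(all_runs: dict) -> dict:
    """Derive total energy and per-GPU power for each training run."""
    summary = {}
    for name, run in all_runs.items():
        summary[name] = {
            "energy_mwh": energy_mwh(run["megawatts"]),
            # Quoted megawatts divided evenly across GPUs (an assumption).
            "kw_per_gpu": run["megawatts"] * 1000 / run["gpus"],
        }
    return summary


summary = summarize(runs)
gpu_ratio = runs["Hopper"]["gpus"] / runs["Blackwell"]["gpus"]
power_ratio = runs["Hopper"]["megawatts"] / runs["Blackwell"]["megawatts"]

for name, stats in summary.items():
    print(f"{name}: {stats['energy_mwh']:,.0f} MWh, "
          f"{stats['kw_per_gpu']:.3f} kW per GPU")
print(f"GPU count ratio: {gpu_ratio:.2f}x, power ratio: {power_ratio:.2f}x")
```

Under these assumptions the Blackwell run uses a quarter of the GPUs and roughly 27% of the energy (8,640 MWh versus 32,400 MWh), while the implied power per GPU is about the same or slightly higher. That is consistent with the transcript's point that the gain comes from needing fewer GPUs rather than from faster or more frugal individual chips.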

Original Description

Read the Article: https://stratechery.com/2024/nvidia-waves-and-moats/
Links:
Stratechery: https://stratechery.com
Sign up for Stratechery Plus: https://stratechery.com/stratechery-plus
Sharp Tech website: https://sharptech.fm

The video analyzes Nvidia's position in the tech industry using the waves and moats framework, providing insights into the company's strategic advantages and challenges. This lesson is useful for understanding market trends and strategic positioning in the tech industry. By watching this video, viewers can gain a deeper understanding of the tech industry and improve their analysis skills.

Key Takeaways
  1. Read the article on Stratechery
  2. Understand the concept of waves and moats
  3. Analyze Nvidia's strategic position
  4. Identify market trends and challenges
💡 Nvidia's strategic position in the tech industry can be understood through the framework of waves and moats, which highlights the company's advantages and challenges.
