Google's AI stack for developers

Google for Developers · Intermediate · 🧠 Large Language Models · 1y ago

Skills: LLM Foundations 80% · AI Workflow Automation 60%

Key Takeaways

Builds immersive experiences with Google's AI stack, including JAX, TPUs, Gemini APIs, and AI Studio

Full Transcript

Hello everyone. My name is Joana Carrasqueira and I lead developer relations at Google DeepMind. Hi everyone, I'm Josh, and we're very excited to welcome you to our session, Google's AI stack for developers. Who's at I/O for the first time? Can I see some hands up? Oh, okay. Welcome to Google I/O. It's a pleasure to have you with us today.

We'll start by giving you an overview of Google's end-to-end AI ecosystem. As you know, we've been leading the way in AI for decades, from when we open sourced TensorFlow in 2015, to when we published our field-defining research on Transformers in 2017, to Gemini. And we are now in the Gemini era. We've been releasing a lot, relentlessly, as it's been called today. We've been shipping many features and many new products, and in our talk we're going to give you an overview of everything that's new for developers throughout the AI stack. Our mission is to empower every developer and organization to harness the power of AI. Google's stack is so good and flexible because it combines very robust infrastructure with state-of-the-art research, and all of this enables real-world applications to come to life that change entire fields, industries, and companies.

We'll start by discussing foundation models, touching upon Gemini, Gemma, and some of our domain-specific models. After foundation models, we'll take a look at the AI frameworks that we use to build them. So we'll talk about JAX, which is really great for researchers. We'll talk about Keras, which is really amazing for applied AI. Later on, we'll even talk a little bit about the work we're doing with PyTorch. We'll also touch upon some developer tools for all levels of experience, from beginners to advanced. Then we'll talk a little bit about infrastructure. This talk is about software, not hardware. Our hardware infrastructure is TPUs, which you've probably heard a lot about.
But in this talk, I'll briefly talk about XLA, which is a machine learning compiler, and about some of the work we're doing for inference: making it possible to serve models at scale super efficiently, with really cool new things with XLA for JAX and PyTorch. Okay. And then one more thing to mention. I went too fast, so sorry about that. A lot of this talk is about these huge foundation models. Towards the end of the talk, I'll talk about Google AI Edge, and we'll talk about deploying small models on device, which is also super important for many reasons.

Awesome. Okay, let's start by exploring the core intelligence within our stack. We'll start with our Gemini models, which are our most capable and versatile model family. Our core philosophy here at Google is to provide developers with state-of-the-art models and tools that you can use to build powerful applications. Our Gemini models are known for being multimodal, having a long context window, and having very powerful reasoning. But we've built a variety of models for different use cases, so depending on what you're trying to build, Google will have a model that is tailored for your use case.

I would like to give you a quick walkthrough of these models. I know you've heard it during the keynote, but very quickly: Gemini 2.5 Pro is our most advanced model yet, especially for highly complex tasks that benefit from deep reasoning. It's really good at coding and at more complex prompts. It leads coding benchmarks, including the WebDev Arena leaderboard, and it's really our most powerful model. Gemini 2.5 Flash is the one developers love because of its efficiency and speed, and it's now even better on almost every single dimension. We improved the benchmarks across reasoning, coding, multimodality, and long context.
Then we have Gemini 2.0 Flash, which is fast and cheap and works fine, and Gemini Nano, which is optimized for on-device tasks.

As you've heard, we've been shipping relentlessly, and I would like to give you a quick highlight of everything that we've been shipping in AI Studio and the Gemini API. There's a talk tomorrow that I would like to invite you to attend, by Shrestha Basu Mallick, a group product manager on the Gemini API, and Luciano Martins, our technical lead for the Gemini API from DevRel. They're going to do a deep dive into everything that's new within the Gemini API, so you definitely don't want to miss that session tomorrow.

But for now, just a glimpse to get you excited about what's new in AI Studio. We've built a new tab called Build that instantly generates web apps, and it's really cool because it enables developers and builders alike to prototype very quickly with natural language. We have a new generative media experience in AI Studio as well, and I'm going to demo all of this so you can see how it actually works. We are always listening to the community. We listen to your feedback and we always build with developers in mind, and that's why some of these features were actually requested by the community. That's what happened with the new built-in dashboard: you requested it, we built it. We also have new native audio and TTS support in AI Studio.

On the Gemini API side of things, we also have new capabilities for text to speech, allowing you to control emotion and style for more expressive and dynamic audio. It's available both on the Live API and on the generate content API for generating audio. Some of the use cases that we had in mind when we built this were more dynamic audiobooks, more engaging podcasts, or, for those of you in customer support, bringing more natural voices into your workflows.
We also have enhanced tooling, which is really cool because now you can use grounding with Google Search together with code execution in just one API call. There's also URL context, which you heard about during the keynote, which provides the model with in-depth content from web pages, and since you can chain it with other tools, it's really powerful for building search agents. So really cool stuff. Lastly, just to call out that we now offer Gemini SDK support for MCP, which reduces a lot of developer friction and simplifies building agentic capabilities. So you don't want to miss tomorrow's talk to learn more about this.

And Google AI Studio: who here uses Google AI Studio? Can I see some hands? Okay, awesome, awesome. We have a lot to chat about after this session then. Google AI Studio is the perfect place for anyone to start developing with AI. It's the simplest way to test the latest Gemini models. We typically bring them to Google AI Studio so you can start prototyping and playing with the models, and you don't need Google Cloud knowledge in order to set up your environment. It's free of charge, and you can create, test, and save your prompts. There are also starter apps that will inspire you, and that's exactly what I'm going to demo here today.

I'm going to show you some of the work that the team has been doing. If we go to Google AI Studio here, we go into Build, and Mumble Jumble is something that we've literally just created, and it's one of my favorite apps. Mumble Jumble is one of those applications that, with just natural language, allows you to interact with the model so you can build more dynamic and interactive audio experiences. So let's have a look. First of all, it uses the 2.5 preview native audio dialog model. You can choose whether it's easy to interrupt or hard to interrupt. Then you can pick your voice, high pitch. So there are multiple things that you can play with. And then you can customize. I really like the fox.
So let's go ahead and pick the fox. Let's get the cowboy. Happy, because we're at Google I/O and we're all super happy to be here. And let's get a microphone, because I have one.

Howdy, partner. Finley Flicker Fox here, ready to charm the boots off you. Hello. What's the weather like today in Mountain View? Well, now, it's a fine day in Mountain View. Sunny and quite pleasant. Perfect for moseying on out and about. I agree. Are there any hiking trails near the Google campus? Why, sure as shooting. There are plenty of trails near the Google campus. Perfect for a grand adventure and some fresh air. Well, now, the best way to get to those trails is to head towards the foothills, following the marked paths. Y'all find them right along the way. You betcha. There are a few cozy cafes nearby. Perfect for a spell of rest and a good cup of coffee after your hike. Thank you so much. It was my pleasure, partner. Happy trails and enjoy your coffee. Thank you.

So, as you could see, there are some really cool experiences that we're bringing into AI Studio. Audio is getting better, with more natural voice experiences. And in case you didn't notice, I even changed the language in how I interacted with the model. I spoke in Portuguese, my mother tongue, and it actually replied with very good information. Josh is going to show you exactly what's happening on the API side of things in just one second. I have one prompt that I just want to very quickly show you. Sorry. "Roll a dice twice. What's the probability of the result being seven?" Okay, let's just run this very quickly, because I just want to show you one thing before I hand it over to Josh. As you can see, with thought summaries the model is actually showing how it thinks, and you can see the summaries here. We have the result, and basically what is available in the UI in AI Studio is also available in the API, and Josh is going to show you that right now. Yes. Okay. Great.
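The dice prompt from the demo has an exact answer, which makes it a handy sanity check for whatever the model returns. The sketch below enumerates every outcome, assuming "the result" means the sum of the two rolls of a fair six-sided die; the function name `probability_of_sum` is just an illustrative choice.

```python
from fractions import Fraction
from itertools import product


def probability_of_sum(target: int, sides: int = 6, rolls: int = 2) -> Fraction:
    """Exact probability that `rolls` fair dice with `sides` faces sum to `target`."""
    # Enumerate every equally likely outcome, e.g. (1, 1), (1, 2), ..., (6, 6).
    outcomes = list(product(range(1, sides + 1), repeat=rolls))
    favorable = sum(1 for outcome in outcomes if sum(outcome) == target)
    return Fraction(favorable, len(outcomes))


# Six of the 36 outcomes sum to seven: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1).
p_seven = probability_of_sum(7)
print(p_seven)  # 1/6
```

Seven is the most likely sum of two dice, which is why it is a common choice for this kind of reasoning prompt.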
So very briefly, we have something called the Gemini Developer API, which is really great. It's the easiest possible way to develop with Google's foundation models. The best place to get started is ai.google.dev. There are a whole lot of capabilities in the API. It's got code execution. It's got function calling. I remember sitting down with the team to build this from a blank piece of paper. Starting about two years ago, you could basically prompt it with text, and now we have image understanding and video understanding, and we can also generate images and videos, as Joana will show you later.

Very, very briefly, ai.google.dev has all of our developer documentation. There are lots of really great guides. There's information about the models, everything you need to get started. We also have the Gemini API cookbook. We have a link to this at the end; it's basically goo.gle/cookbook. This will take you to a whole slew of notebooks that the team has put together. All these notebooks are end-to-end examples that show you one thing that you might be interested in, like what's the best way to do code execution, or what's the best way to do function calling. You'll find that in the cookbook.

I also very, very quickly want to show you how easy it is to get started with the API. In Google AI Studio, you don't need a credit card or anything like that. In about a minute, you can just click Get API key and create your key. Now, if you're doing this for the first time, behind the scenes this will automatically create a Cloud project for you, but that detail is not important. Basically, now I have an API key and I'm ready to install the SDK and call the model. If you open up any of the notebooks in the cookbook, let's just say we opened up this one, which is in the quickstarts directory.
And this shows you exactly what Joana showed: how to get the thinking summaries. You can add your API key in Google Colab. If you zoom in, you can hit add new secret. In this particular notebook it's called Google API key, but you could call it whatever you like. So you would add Google API key there, and you would paste your key there. And now you're ready to run this. If you do Runtime and Run all, you're calling the API and you're running all the examples. We also have this thing where you can grab an API key straight inside Google Colab, so it's really quick and easy to do.

Okay, we can go back to the slides. So very, very quickly, as a recap: the Gemini Developer API is the easiest way to get started. It's super lightweight, it's fast to install, and you can get up and running, honestly, in about a minute. Okay, I will use the clicker. This is the flow to get started with Google AI Studio. Go to Google AI Studio, get your key, and try one of the code examples on ai.google.dev or in the cookbook. I see people taking pictures. That makes me happy. Please try this. We spent so much time on making this easy, and I hope it works for you. If not, please file an issue and we'll get on it.

This is the Gen AI SDK for the Gemini API. It's our latest SDK, and we've been rolling it out gradually over the course of the last six months or so. It's super user-friendly and really easy to use. Really, the only point I want to make here, because I don't want to read the code examples or the documentation to you, is that you can call the API in a few lines of code. Basically, add your key, select a model, write a prompt, and you can go ahead and call it. You can also get access to advanced functionality in about one line of code. So if you'd like to get the thinking summaries that Joana showed you, you can just add a thinking config and say include the thoughts.
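As a rough sketch of that flow in Python, assuming the `google-genai` package is installed and the key is stored in a `GOOGLE_API_KEY` environment variable: the model id and the helper names (`build_config`, `split_parts`, `ask`) are illustrative choices rather than anything shown on the slides, and the slide code may look different. The bottom of the block uses stand-in objects so the parsing logic can be tried without calling the API.

```python
import os
from types import SimpleNamespace

MODEL = "gemini-2.5-flash"  # assumption: any thinking-capable Gemini model id


def build_config(include_thoughts: bool = True) -> dict:
    """Generation config (dict form) that asks for thought summaries."""
    return {"thinking_config": {"include_thoughts": include_thoughts}}


def split_parts(parts) -> tuple[str, str]:
    """Separate thought-summary text from answer text.

    Each part is expected to expose `.text` and a `.thought` flag.
    """
    thoughts, answer = [], []
    for part in parts:
        text = getattr(part, "text", None)
        if not text:
            continue
        (thoughts if getattr(part, "thought", False) else answer).append(text)
    return "\n".join(thoughts), "\n".join(answer)


def ask(prompt: str) -> tuple[str, str]:
    """Call the Gemini API; requires network access and GOOGLE_API_KEY."""
    from google import genai  # imported lazily so the helpers above work offline

    client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
    response = client.models.generate_content(
        model=MODEL, contents=prompt, config=build_config()
    )
    return split_parts(response.candidates[0].content.parts)


# Offline illustration with stand-in parts (no API call is made here).
demo_parts = [
    SimpleNamespace(text="Count the pairs of rolls that sum to 7.", thought=True),
    SimpleNamespace(text="The probability is 1/6.", thought=False),
]
demo_thoughts, demo_answer = split_parts(demo_parts)
```

With a real key in place, `ask("Roll a dice twice. What's the probability of the result being seven?")` would return the thought summary and the answer as two strings.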
Now you've got the thinking summaries. A good use case for this could be anytime you need to explain the model's reasoning. You can imagine, if you're building an education app or a tutoring app, you can get the thinking summaries.

In addition to the really cool things that you can do with a single line of code, there's some more advanced stuff that you can do with the SDK as well. I know there's a lot of code on this slide, but we've talked a lot about building agents and agentic experiences. In this example, you could imagine that you have a Python function on your laptop called something like weather function, and maybe that calls your own weather server to get the weather. What you can do is pass the definition of that function to the Gemini API in JSON, including the function name and the parameters that it takes. Then you write a prompt. Here the prompt happens to be "What's the temperature in London?" When you send the prompt and the function to the model, the model will assess whether it makes sense to call that function based on your prompt. If so, it won't actually call it. But you can see in the function_call.name and the function_call.args that it returns the name of the function and the arguments to pass to it. So if you want, you're ready to call this function on your laptop, and we have code that you can copy and paste to do that. What's really cool, too, is that this works with multiple functions at the same time. So you can imagine you have a function like schedule a meeting or something like that, and, well, with some work, you can build an agent to actually do that. So function calling is super important, and it works extremely well. So now Joana's going to talk about gen media.

Awesome. So as you could see, what you can build in the UI within AI Studio is also available in the API, and it all builds on the capabilities of our foundation models.
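The function-calling round trip Josh describes can be sketched without touching the network. Everything named here is hypothetical: `get_weather` stands in for the "weather function" on your laptop, `WEATHER_DECLARATION` is one plausible JSON-style definition, and `simulated_call` imitates the name and arguments the model would hand back for the London prompt. The point is the last step, where your own code, not the model, runs the function.

```python
from typing import Any, Callable


def get_weather(city: str) -> dict:
    """Hypothetical local function; a real one would query your own weather server."""
    return {"city": city, "temperature_c": 15}


# JSON-style declaration sent to the model alongside the prompt:
# the function name, a description, and the parameters it takes.
WEATHER_DECLARATION = {
    "name": "get_weather",
    "description": "Get the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}

# Local registry so several functions can be offered at the same time.
REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(name: str, args: dict) -> Any:
    """Run the local function the model asked for, with the arguments it chose."""
    if name not in REGISTRY:
        raise KeyError(f"Model requested an unknown function: {name}")
    return REGISTRY[name](**args)


# Stand-in for the model's reply to "What's the temperature in London?".
# The model only returns the name and arguments; it never executes anything.
simulated_call = {"name": "get_weather", "args": {"city": "London"}}
result = dispatch(simulated_call["name"], simulated_call["args"])
print(result)  # {'city': 'London', 'temperature_c': 15}
```

Adding a second entry to `REGISTRY` (for example a meeting scheduler) is all it takes to let the same dispatch step serve multiple functions.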
Our core intelligence also encompasses a powerful suite of generative media models, and they are designed to transform creative experiences and content generation across different modalities like images, video, and audio. I would also like to demo one of the new apps that we have in AI Studio. So I'm going back to the laptop, and I'm going to show you something that the team also just created. AI Studio also got a facelift and has some new features. The chat interface is still the same, but you've seen talk to Gemini live during the keynote; we have the new generative media console, which allows you to create and interact with our most creative models; and then we have Build, which is where all these new apps are coming to. So, I just wanted to show you very quickly. There we go. And then we can basically choose here the sounds that we want. This is all powered by Lyria, our music generation model. In the interest of time, I'm not going to keep playing it, but you can see some of the capabilities of these models that we're bringing to AI Studio.

Let's continue with the slides. In the console, as you could see, you have access to our image generation, video generation, and music generation models, with applets to get you started. So that's a really cool thing for you to play with after this session. Some of our video and image models produce very realistic output, with a really good understanding of real-world physics and dynamics, improved quality, and more and more capabilities coming to these models. And this is the example that I just showed. We've made Lyria RealTime, our interactive music generation model, which powers MusicFX DJ, available in the API and AI Studio, and you can check our API documentation for more information. This allows everyone to interact with, create, and perform generative music in real time.
It's really cool. You might remember seeing this console in the show before the first keynote. That's exactly why I wanted to show you this particular app in this session, but there's a lot more that you can try afterwards.

And shifting gears towards Gemma. Earlier this year we released Gemma 3, which is our most advanced open model. It comes in four sizes, 1, 4, 12, and 27B, and offers developers the flexibility to optimize performance for diverse applications, from efficient on-device inference to scalable cloud deployment. In particular, the 4, 12, and 27B sizes are multimodal and multilingual and have a long context window of up to 128,000 tokens. And the fact that it is available in more than 140 languages is really cool, because 80% of our users are actually outside the United States. You heard during the keynote as well that MedGemma is our most capable collection of open models for multimodal medical text and image comprehension. It's a really good starting point for building medical applications, and it's available in 4B and 27B. You can download the model and adapt it to your use case via prompting, fine-tuning, or agentic workflows. We also announced Gemma 3n. It's optimized for on-device operation on phones, tablets, and laptops. And as you can see, the Gemmaverse is booming, with all these new variants coming and being developed all the time. ShieldGemma, DolphinGemma, now MedGemma, SignGemma: so many different capabilities and options that it's truly exciting to see.

One last thing that we are really excited about is that we brought to AI Studio the possibility to deploy the Gemma models directly from AI Studio into Cloud Run with one click. You can use the Gen AI SDK to call it, and it just requires a two-line change: change the API key, change the base URL, and you're set. That's the easiest deployment. And now Josh is going to tell you all about frameworks. Thanks.
Okay, so we've talked a lot about foundation models, Gemini and Gemma. Now I'll talk a little bit about the frameworks that Google and the community use to build them. So, a lot of cool stuff to cover. Let's start with the easiest possible way to fine-tune a model. In the developer keynote, Gus showed a version of Gemma that speaks emoji, and this is a language that he came up with with his daughter. One way to do that is you could just prompt the model to speak emoji, and in a lot of cases you can get away with the prompt. But if you have a very large amount of data, or maybe you're building a really serious application, like something in healthcare or medicine, what you can do is fine-tune the model to work even better with your data. A really great thing about this is that it sounds complicated, but in practice it's not. All you really need is a two-column CSV file. Here what you're looking at is something with a prompt and a response. And if you've got a couple thousand rows, using our framework, Keras, you can train the model to do something really useful. Keras is my favorite way by far of doing applied AI, which means using AI in practice. You can tell both of us care a lot about healthcare and medicine, and there are more opportunities than you could ever count to do good in the world in those fields using technologies like this.

So we have a really great tutorial about this. It's honestly about five key lines of code. You import a Gemma model from KerasHub. This model is already instruction tuned. You can prompt it in a line of code, and you can also do LoRA fine-tuning in about a line of code, which also sounds fancy, but it's not. So Keras is great for applied AI. If you're doing research, we have a really wonderful framework called JAX. JAX is a Python machine learning library. And I guess I have two things to say about it.
One is that at the highest scales, JAX is the best place to go. It scales really easily to tens of thousands of accelerators. It's super powerful. We use it to build Gemini and Gemma, and the community uses it to build a bunch of really large, awesome foundation models as well. But there's another thing I like about JAX, because I'm operating at a much simpler level. At its core, JAX is a Python machine learning library with a NumPy API. When a new model or a new paper comes out, it takes me a long time to understand it. What I like to do is basically implement it line by line in NumPy, and very carefully understand the input, the output, and the shapes, and debug it just in NumPy. What's really wonderful is that if you use JAX, you can do that with the NumPy API. There are transforms that you can read about. You can add a line of code like grad to get the gradients. You can add a line of code, jit, to JIT-compile your model. And now, without changing anything else, you can run it on GPUs and TPUs. So JAX's core gives you this really good way to think very carefully through different techniques in machine learning, and then when you're ready, you can scale them up without really changing your code. And that's really, really awesome.

On top of JAX, and this is out of scope for this talk, there's a huge ecosystem of libraries. There are great libraries from Google and the community for things like optimizers, checkpoints, and implementing neural networks. You don't have to do that from scratch if you don't want to, but when I'm learning things, doing it totally from scratch once really helps me get my head around it, even though it takes a little while. If you want to skip that part and go straight to "just show me a super optimized large language model implemented in JAX that's ready to scale to hundreds or even thousands of accelerators," then there are two really cool GitHub libraries that I'd point you to.
MaxText, as you might guess, has reference implementations of large language models, and MaxDiffusion has, as you might guess, reference implementations of models that you can use to generate beautiful images and stuff like that. Those can take some work to scale, and we're working on making them super user-friendly, but they're great.

Using JAX, this just came out yesterday: I wanted to point you to some new, really amazing work from the community. We've been talking about Google's foundation models. This is a new foundation model that Stanford University just released, called Marin. It happens to be built with JAX and TPUs, which is great. But what's really special about it is that Marin is a fully open model. In addition to sharing the weights and the architecture, they've shared the datasets that they used to train it, the code they used to filter the datasets, the experiments that worked, and the experiments that didn't work. So this is a really great foundation for open science and for building these really cool models in the open. They trained this model using Google's TPU Research Cloud. This is a collection of TPUs that, if you're a researcher, you can apply for access to. It's basically a free-of-charge cluster of TPUs that you can use to do really cool research like this.

Very briefly, we talked about doing LoRA post-training in Keras. Now I'll show you a little bit about what we're working on for tuning in JAX. We're working on a new library called Tunix. It's very, very early stage, and we're building it with the community, so we're working with researchers from these great universities.
And the vision is to make it a really easy-to-use library for developers, but also a really good framework for researchers to implement the latest post-training algorithms in JAX. We're working on a bunch now. I think it's going to be really good, so stay tuned. So that's Tunix.

In addition to the libraries, I just want to talk very briefly about infrastructure. So, TPUs: hardware is out of scope. But there's a really cool software package that I wanted to briefly mention, called XLA. XLA is basically a compiler for your machine learning code. The way this works is that when you use a library like JAX or Keras or TensorFlow or even PyTorch, you're writing code in Python, and then somehow it gets compiled, optimized, and run on GPUs and TPUs. XLA is the compiler that we use at Google to do that. It powers our entire production stack, and it's used by some of the largest large language model builders in the world. It takes your Python code, does a whole bunch of optimizations, and gets it ready to run on accelerators. One thing that's really cool about XLA is that it's portable. If you run on XLA, you're never locked into TPUs. You can use your exact same code to run on GPUs and other types of accelerators. So it's really great for that, and we like it a lot.

The important thing here is that PyTorch now also works with XLA. PyTorch has a wonderful ecosystem and really great libraries, so if you're a PyTorch developer, you can use PyTorch/XLA to train your models on TPUs and get all the really good price-performance benefits that come with that. In addition to training models, we've done great work with the vLLM community. So now you can also serve your PyTorch models using vLLM on TPUs. vLLM is a super popular inference engine, and we've added TPU support, so that's available to PyTorch developers now as well. And we're also working on adding JAX support to vLLM.
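To make the JAX workflow from earlier in the talk concrete (write NumPy-style code, add `grad` for gradients, add `jit` to compile through XLA), here is a minimal sketch. It assumes `jax` is installed; the tiny linear model and its data are made up for illustration and have nothing to do with how Gemini or Gemma are built.

```python
import jax
import jax.numpy as jnp


def predict(params, x):
    """Linear model written exactly as it would be in NumPy."""
    w, b = params
    return jnp.dot(x, w) + b


def loss(params, x, y):
    """Mean squared error between predictions and targets."""
    return jnp.mean((predict(params, x) - y) ** 2)


# One line for gradients, one line to JIT-compile the result with XLA.
grad_loss = jax.jit(jax.grad(loss))

x = jnp.array([[1.0, 2.0], [3.0, 4.0]])
y = jnp.array([1.0, 2.0])
params = (jnp.zeros(2), jnp.array(0.0))

# Gradients come back with the same structure as `params`: (dL/dw, dL/db).
grads = grad_loss(params, x, y)
print(grads)
```

The same script runs unchanged on CPU, GPU, or TPU; JAX picks whichever backend is available, and XLA generates the device-specific code.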
Here's some more really great work that's happening with the community. This is a new partnership between Red Hat, Nvidia, and Google, working on a project called llm-d. This is for distributed serving. The vision here is to bring the very best of serving into open source, make it available to everybody, and have it work with both JAX and PyTorch. So, a really cool new project. There's some more sophisticated stuff which you can check out, and stay tuned for this. It's going to be really good.

Okay, so at warp speed, we've talked about the foundation models Google has, the different frameworks that we use to train them, and the different ways that you can serve them on the cloud. Now, let's briefly look at how you can deploy them on mobile devices. The way that you would do this is using Google AI Edge, which is basically a framework for deploying machine learning models on things like Android and iOS, getting them running in the browser, and also on embedded devices. I know it's Google I/O, and a lot of you are mobile developers, so a lot of this is probably intuitive to you. But if you're coming from where I am, a Python machine learning developer who works in the back end, these are all really awesome points. There are many good reasons why you might want to deploy on mobile. One is latency. You can imagine that if you're doing something like sign language recognition, and the user is holding up their hand and signing, you don't want to drop frames. If you're sending those frames to a server on the cloud, unless you happen to have the world's fastest internet connection, you're probably going to drop frames. But if you have that gesture recognition model running locally, you're not going to. So that's one huge advantage. Others, of course, are privacy, since data doesn't need to leave the device, and working offline.
I know this is obvious to mobile folks, but if you're working on an airplane, you know, maybe you want to run your machine learning model there. Cost savings is a really important one, too. If you're serving a model to lots of users on the cloud, you might be paying for the compute that you need to serve it. But of course, if it's running on the phone, the compute's happening locally, so you don't need to bother with serving infrastructure.

There's a lot of really cool new stuff in Google AI Edge. On our side, we've added support for things like the latest Gemma models. And by the way, this is for both classical machine learning, well, deep learning, which is now suddenly becoming classical, things like gesture recognition, which were state-of-the-art about four years ago and are now classical ML because we're talking about large language models and generative AI. But you can also run small large language models on device. We have a new, really awesome community with Hugging Face, and there are a lot of really smart people putting together models that are pre-optimized and ready to run on device. And we have a private preview, coming soon, for AI Edge Portal, which is basically a testing service. You submit your model to a cloud service and it runs it on a fleet of real devices of different sizes, just to verify that it works really well. So if you're interested in mobile development, check out Google AI Edge. It's really cool. And with that, I'll hand it over to Joana to talk about what's next.

Awesome. Thank you, Josh. You've heard it in the keynotes and in a previous session with Demis and Sergey: we're pushing the boundaries of what's possible to build with AI here at Google and Google DeepMind. We're really excited to bring all this innovation and put it in the hands of developers, in the hands of the community, and there has never been a better time to build and co-create together.
So, we really believe in a future where AI is changing various fields across scientific discovery, healthcare, and so many more. We're going to achieve this radical abundance in a safe and responsible way, and we want to get there with you, with the community. So let's have a look at some of the domains that we believe have huge potential for developers and for humanity at scale.

AlphaEvolve is a Gemini-powered, self-improving coding agent for designing advanced algorithms. We all know that large language models can summarize documents. They can generate code. You can even brainstorm with them. But with AlphaEvolve, we're really expanding these capabilities, and we are targeting fundamental and highly complex problems in mathematics and coding. AlphaEvolve leverages Gemini Flash and Pro, and it's one of the big promises for the future.

Another one that I'm really excited about is AI co-scientist. It's another scientific breakthrough that we're seeing, especially in the medicine and research fields, and our goal is to accelerate the speed of discovery and drug development. With AI co-scientist, a scientist can literally give a research goal to the agent in natural language, and then the AI co-scientist is designed to give you an overview, a hypothesis, and a methodology. In order to do so, it uses a coalition of different agents that work together: the generation, review, ranking, evolution, proximity, and meta-review agents, which are all inspired by and driven from the scientific method itself. So it's another huge breakthrough and another domain that we'll continue to see evolving here at Google DeepMind.
And lastly, an area where we're seeing tremendous progress, and where we expect more breakthroughs in the future, is domain-specific models. Gemini Robotics models, which are currently in private early access, are advanced vision-language-action models with the addition of physical actions as a new output modality, specifically for controlling robots. These models are robot agnostic and use multi-embodiment, a technique that means they can be used on anything from humanoids to large-scale industrial machinery. So this is really, really exciting. Gemini Robotics has been fine-tuned to be dexterous, and that's why you can see so many different cool use cases and applications here on stage, from folding an origami, which is something a bit more complex, to just holding a sandwich bag. So many new innovations are coming to life, and we'll continue pushing the boundaries of what's possible across all these different domains. And now, if you want to learn more, there are many ways that you can keep engaging with us and keep giving us feedback. We're also active on social media, and we have a developer forum where you can interact directly with Googlers. So in order to learn more, Josh, what do our developers have to do?
We have just a few links for you. So, no problem. We talked about a lot of different tools in the stack. I don't want to read the slide, but let me just point you to a couple of highlights. ai.google.dev is the best place to go to get started with Gemini and Gemma. We have a cookbook for Gemini. We have a cookbook for Gemma. Google AI Studio is at aistudio.google.com. If you're interested in JAX and Keras, there are the links. If you happen to be interested in XLA, please check out openxla.org. Google AI Edge is at the very bottom, if you're a mobile developer and you're interested in mobile deployment. And just to be clear, there are so many amazing things in the Google AI stack we didn't have time to talk about today. Vertex has really amazing tools for enterprise developers. But please start here, have fun, and yeah, we're around after the talk.
Yes, absolutely. And the developer relations team is just outside. We have some really cool demo stations that you can experience. Engage with the team. Check out the sessions tomorrow, especially on the Gemini API, Gemmaverse, and robotics. We have a lot of cool stuff that we want to put in the hands of developers, and many early access programs as well. Stay in touch, stay engaged, and let's co-create the future of AI together. Thank you so much.
Thanks a lot. Thanks. [Applause]

Original Description

Explore core technologies, from the high-performance flexibility of JAX and TPUs to the power of Gemini APIs and the intuitive ease of AI Studio. Witness the stunning creative potential of image and video generation with models like Imagen and Veo, and learn to use these tools to build immersive experiences. Beyond tech, we'll celebrate the vibrant and inclusive AI community, demonstrating how developers like you are driving innovation and shaping the future of AI with Google DeepMind.

Speakers: Joana Carrasqueira, Josh Gordon

Check out the AI session track from Google I/O 2025 → https://goo.gle/io25-ai-yt

Check out all the keynote sessions from Google I/O 2025 → https://goo.gle/IO25-Keynotes

Check out all of the sessions from Google I/O 2025 → https://goo.gle/io25-sessions-yt

Subscribe to Google for Developers → https://goo.gle/developers

Event: Google I/O 2025

Products Mentioned: AI/Machine Learning
