Learn best practices for multimodal prompting using Google's Gemini model family!

DeepLearningAI · Intermediate ·✍️ Prompt Engineering ·1y ago

Skills: Multimodal LLMs90%Prompting Basics70%LLM Engineering60%

Enroll now: https://bit.ly/3YPKAUa Introducing Large Multimodal Model Prompting with Gemini, a new short course built in collaboration with Google Cloud, and taught by Erwin Huizenga, Developer Advocate for Generative AI at Google Cloud. Large Multimodal Models (LMMs) represent a significant evolution from language models by integrating different data modalities, allowing for more comprehensive outputs based on varied input types such as text, images, and video. For LMMs, prompt structure becomes even more important. For example, placing text inputs, such as a patient’s medical history, before image inputs like an X-ray, can improve the model’s interpretation. Conversely, for tasks like image captioning, leading with the image may yield better results. In this course, you'll explore best practices for multimodal prompting, and learn how to properly set parameters for optimized results. Additionally, you’ll learn how to integrate Gemini with external APIs and databases using function calling, with the ability to infuse your applications with real-time data and dynamic content. In detail, you’ll explore: - Introduction to Gemini Models: Learn the differences and use cases for Gemini Nano, Pro, Flash, and Ultra. Understand how to select optimal models based on capability, latency, and cost. - Multimodal Prompting and Parameter Control: Learn techniques for structuring effective text-image-video prompts. Fine-tune key parameters like temperature, top_p, top_k to control model creativity vs determinism. - Best Practices for Multimodal Prompting: Get hands-on experience with prompt engineering for Gemini multimodal models, and role assignment, task decomposition, and formatting. - Creating Use Cases with Images: Build engaging multimodal applications like interior design assistants and receipt itemization tools. - Developing Use Cases with Videos: Implement "needle in the haystack" semantic video search powered by Gemini's large context window. - Integrating

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from DeepLearningAI · DeepLearningAI · 0 of 60

← Previous Next →

Forward and Backward Propagation (C1W4L06)

Forward and Backward Propagation (C1W4L06)

deeplearning.ai's Heroes of Deep Learning: Yuanqing Lin

deeplearning.ai's Heroes of Deep Learning: Yuanqing Lin

deeplearning.ai's Heroes of Deep Learning: Ruslan Salakhutdinov

deeplearning.ai's Heroes of Deep Learning: Ruslan Salakhutdinov

deeplearning.ai's Heroes of Deep Learning: Yoshua Bengio

deeplearning.ai's Heroes of Deep Learning: Yoshua Bengio

deeplearning.ai's Heroes of Deep Learning: Pieter Abbeel

deeplearning.ai's Heroes of Deep Learning: Pieter Abbeel

deeplearning.ai's Heroes of Deep Learning: Ian Goodfellow

deeplearning.ai's Heroes of Deep Learning: Ian Goodfellow

deeplearning.ai's Heroes of Deep Learning: Andrej Karpathy

deeplearning.ai's Heroes of Deep Learning: Andrej Karpathy

Using an Appropriate Scale (C2W3L02)

Using an Appropriate Scale (C2W3L02)

Gradient Checking (C2W1L13)

Gradient Checking (C2W1L13)

Gradient Checking Implementation Notes (C2W1L14)

Gradient Checking Implementation Notes (C2W1L14)

Learning Rate Decay (C2W2L09)

Learning Rate Decay (C2W2L09)

Understanding Mini-Batch Gradient Dexcent (C2W2L02)

Understanding Mini-Batch Gradient Dexcent (C2W2L02)

Mini Batch Gradient Descent (C2W2L01)

Mini Batch Gradient Descent (C2W2L01)

The Problem of Local Optima (C2W3L10)

The Problem of Local Optima (C2W3L10)

Exponentially Weighted Averages (C2W2L03)

Exponentially Weighted Averages (C2W2L03)

Tuning Process (C2W3L01)

Tuning Process (C2W3L01)

Understanding Exponentially Weighted Averages (C2W2L04)

Understanding Exponentially Weighted Averages (C2W2L04)

Bias Correction of Exponentially Weighted Averages (C2W2L05)

Bias Correction of Exponentially Weighted Averages (C2W2L05)

Gradient Descent With Momentum (C2W2L06)

Gradient Descent With Momentum (C2W2L06)

Normalizing Activations in a Network (C2W3L04)

Normalizing Activations in a Network (C2W3L04)

Hyperparameter Tuning in Practice (C2W3L03)

Hyperparameter Tuning in Practice (C2W3L03)

Adam Optimization Algorithm (C2W2L08)

Adam Optimization Algorithm (C2W2L08)

RMSProp (C2W2L07)

RMSProp (C2W2L07)

Fitting Batch Norm Into Neural Networks (C2W3L05)

Fitting Batch Norm Into Neural Networks (C2W3L05)

Why Does Batch Norm Work? (C2W3L06)

Why Does Batch Norm Work? (C2W3L06)

Batch Norm At Test Time (C2W3L07)

Batch Norm At Test Time (C2W3L07)

Softmax Regression (C2W3L08)

Softmax Regression (C2W3L08)

Deep Learning Frameworks (C2W3L10)

Deep Learning Frameworks (C2W3L10)

Neural Network Overview (C1W3L01)

Neural Network Overview (C1W3L01)

Training Softmax Classifier (C2W3L09)

Training Softmax Classifier (C2W3L09)

Why Deep Representations? (C1W4L04)

Why Deep Representations? (C1W4L04)

Gradient Descent For Neural Networks (C1W3L09)

Gradient Descent For Neural Networks (C1W3L09)

Neural Network Representations (C1W3L02)

Neural Network Representations (C1W3L02)

TensorFlow (C2W3L11)

TensorFlow (C2W3L11)

Activation Functions (C1W3L06)

Activation Functions (C1W3L06)

Explanation For Vectorized Implementation (C1W3L05)

Explanation For Vectorized Implementation (C1W3L05)

Getting Matrix Dimensions Right (C1W4L03)

Getting Matrix Dimensions Right (C1W4L03)

Understanding Dropout (C2W1L07)

Understanding Dropout (C2W1L07)

Building Blocks of a Deep Neural Network (C1W4L05)

Building Blocks of a Deep Neural Network (C1W4L05)

Why Non-linear Activation Functions (C1W3L07)

Why Non-linear Activation Functions (C1W3L07)

Computing Neural Network Output (C1W3L03)

Computing Neural Network Output (C1W3L03)

Backpropagation Intuition (C1W3L10)

Backpropagation Intuition (C1W3L10)

Train/Dev/Test Sets (C2W1L01)

Train/Dev/Test Sets (C2W1L01)

Deep L-Layer Neural Network (C1W4L01)

Deep L-Layer Neural Network (C1W4L01)

Random Initialization (C1W3L11)

Random Initialization (C1W3L11)

Other Regularization Methods (C2W1L08)

Other Regularization Methods (C2W1L08)

Normalizing Inputs (C2W1L09)

Normalizing Inputs (C2W1L09)

Derivatives Of Activation Functions (C1W3L08)

Derivatives Of Activation Functions (C1W3L08)

Parameters vs Hyperparameters (C1W4L07)

Parameters vs Hyperparameters (C1W4L07)

Vectorizing Across Multiple Examples (C1W3L04)

Vectorizing Across Multiple Examples (C1W3L04)

What does this have to do with the brain? (C1W4L08)

What does this have to do with the brain? (C1W4L08)

Dropout Regularization (C2W1L06)

Dropout Regularization (C2W1L06)

Vanishing/Exploding Gradients (C2W1L10)

Vanishing/Exploding Gradients (C2W1L10)

Basic Recipe for Machine Learning (C2W1L03)

Basic Recipe for Machine Learning (C2W1L03)

Bias/Variance (C2W1L02)

Bias/Variance (C2W1L02)

Forward Propagation in a Deep Network (C1W4L02)

Forward Propagation in a Deep Network (C1W4L02)

Weight Initialization in a Deep Network (C2W1L11)

Weight Initialization in a Deep Network (C2W1L11)

Numerical Approximations of Gradients (C2W1L12)

Numerical Approximations of Gradients (C2W1L12)

Regularization (C2W1L04)

Regularization (C2W1L04)

Why Regularization Reduces Overfitting (C2W1L05)

Why Regularization Reduces Overfitting (C2W1L05)

More on: Multimodal LLMs

View skill →

INSTALL NEW UNCENSORED FaceGen Ai WebUI LOCALLY in 1 CLICK!

INSTALL NEW UNCENSORED FaceGen Ai WebUI LOCALLY in 1 CLICK!

Google Veo 3 Tutorial: How to create AI Videos in Flow, Gemini or Google Vids?

Google Veo 3 Tutorial: How to create AI Videos in Flow, Gemini or Google Vids?

AI Tool Journey

NVIDIA Clara Guardian Virtual Patient Assistant

NVIDIA Clara Guardian Virtual Patient Assistant

NVIDIA Developer

Building Multimodal Search and RAG

Building Multimodal Search and RAG

Midjourney Trick: Consistent Character in Different Images

Midjourney Trick: Consistent Character in Different Images

Ollama Multimodal: EASILY setup Llava locally & Integrate API

Ollama Multimodal: EASILY setup Llava locally & Integrate API

Related AI Lessons

The missing layer in prompt engineering: thinking quality

Learn to improve prompt engineering by focusing on thinking quality, a crucial missing layer in current practices

Dev.to · Julien Avezou

The Complete Guide to Prompt Engineering: Unlock the Full Potential of AI

Learn to craft effective prompts to unlock the full potential of AI and achieve game-changing outputs

Medium · ChatGPT

Structuring Prompt Guide: Reusable Templates That Actually Work

Learn to structure effective prompts for AI using reusable templates and improve your prompt engineering skills

Medium · JavaScript

Prompt Engineering Room Walkthrough Notes | TryHackMe

Learn the basics of prompt engineering and its techniques to improve your AI interactions

Medium · Cybersecurity

Why AI keeps lying to you