Faster LLM Function Calling — Dynamic Routes

James Briggs · Intermediate ·🧠 Large Language Models ·2y ago

Key Takeaways

This video demonstrates how to use Semantic Router's dynamic routes to speed up LLM function calling, particularly for AI agents, using OpenAI's GPT-3.5 Turbo as an example, but also supporting Cohere and Llama.cpp for local deployment.

Full Transcript

Today we're going to take a look at dynamic routes in the Semantic Router library. Dynamic routes expand what we can do with this library by quite a lot. Unlike a static route, a dynamic route is able to dynamically generate parameters based on a particular input, which can then be passed into whatever you want to do with those parameters. So the main use case here is function calling. The good thing with dynamic routes is that they can generate this output but they're still very fast, just like our static routes. They are fundamentally the same object, and I think what would be best is to just take a look at how they differ, which is not by a huge amount.

Okay, so we're going to start in the docs of the Semantic Router library. I'm going to go over to dynamic routes and I'm just going to open that notebook in Colab. It's now on version 0.050. This is actually no longer necessary, so I need to remove that. I'm going to install the library first, then I'm going to come down to here and initialize a static route. These are just basic static routes, and the reason we're loading those first is because we want to see what the difference is between these and a dynamic route. So we initialize those, and then we're going to initialize our route layer. The initialization of a route layer, whether you have dynamic or static routes or both, is exactly the same. It doesn't change. Again, we can use Cohere, we can use OpenAI, and there's also now a new FastEmbed encoder if you want to run the embedding part locally. I'm going to use OpenAI because we will also want to use the OpenAI LLM. So, API key, enter this, and there we go. We do also support the Cohere LLM, and soon enough you will also have local LLMs, but for now I'm just going to use OpenAI. It's the easiest.

Okay, so we can test that it's working, and this is purely static routes. Let's see how we might create a dynamic route. Here is how we would set up our dynamic route. You don't need to do it like this directly, you can actually... sorry, this is the actual definition of our dynamic route. What I'm doing before here is creating the function schema that is required for our dynamic route. I can just show you what that looks like quickly. If I run this, it looks like this: this is our function schema.

What I'm doing with the get_schema function here is taking an existing function and formatting it in a particular way. We're using the Sphinx docstring format here, and we're adding a lot of description as to how exactly this function should actually be used. So, "finds the current time in a specific time zone": that's the description of what this function actually does. We need this for our dynamic route to understand what the function does and how it should be used. Then we specify our time zone: the type of our time zone is a string, and the description for it is this, the time zone to find the current time in. It should be a valid time zone from the IANA Time Zone Database. Then we give some examples, and we specify: do not put the place name, like Rome or New York, you must provide this particular IANA format. So we provide this format and it's going to give us the time in that particular place.

Now we take that function, get_time, that we just created, and put it through our get_schema function. We get our function schema output, and this is what defines the difference between a static route and a dynamic route: we simply pass this function schema in to our route definition. If I remove that, this is now a static route. If I add that back in, it's a dynamic route. And that's all there is to it.

So we have our new dynamic route. I'm going to add it to our route layer, and then I'm going to ask a time-related question. That should trigger the get_time dynamic route, and it should hopefully get the right inputs for that route. Okay, and we see that it does. We have the function call, and we have these inputs for our function. So then I can connect this up to the function that we created. I say out equals this, and I want to call get_time with out, and it's the function call. Let's see what we get. Okay, 6:16.

Basically, you can expect this to work with any function call that you'd expect an LLM to normally be able to handle, because we're using an LLM here. What we're really doing is setting up that kind of agentic workflow, where an agent will decide what to do and then generate the input for whatever it decides to do. We're taking away the decision part, on which route to take or which tool to use, and we're using semantics to make that decision. But we still rely on the LLM to generate the function call itself, which we've seen it does, and it does pretty quickly.

That's it for this video. I hope this has been useful and interesting. Thank you very much for watching, and I will see you again in the next one. Bye.
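The function and schema step from the transcript can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the body of get_time is an assumption based on what the transcript describes, and build_schema is a hypothetical stand-in written with Python's inspect module. It is not Semantic Router's own get_schema, whose output format may differ.

```python
import inspect
from datetime import datetime
from zoneinfo import ZoneInfo


def get_time(timezone: str) -> str:
    """Finds the current time in a specific timezone.

    :param timezone: The timezone to find the current time in. Should be
        a valid timezone from the IANA Time Zone Database, like
        "America/New_York" or "Europe/London". Do NOT put the place
        name itself like "rome" or "new york"; you must provide the
        IANA format.
    :type timezone: str
    :return: The current time in the specified timezone.
    """
    # Assumed implementation: look up the zone and format hours:minutes.
    now = datetime.now(ZoneInfo(timezone))
    return now.strftime("%H:%M")


def build_schema(func) -> dict:
    """Hypothetical helper: gather what an LLM needs to know about func."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        # The Sphinx-style docstring doubles as the usage description.
        "description": inspect.getdoc(func),
        "signature": str(signature),
        "output": str(signature.return_annotation),
    }


schema = build_schema(get_time)
print(schema["name"], schema["signature"])
```

Once a route hands back generated arguments as a dictionary, for example {"timezone": "America/New_York"}, they can be unpacked straight into the function with get_time(**arguments), which is the "connect this up" step at the end of the demo.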

Original Description

LLM function calling can be slow, particularly for AI agents. Using Semantic Router's dynamic routes, we can make this much faster and scale to thousands of tools and functions. Here we see how to use it with OpenAI's GPT-3.5 Turbo, but the library also supports Cohere and Llama.cpp for local deployments.

In Semantic Router there are two types of routes that can be chosen. Both belong to the Route object; the only difference between them is that static routes return a Route.name when chosen, whereas dynamic routes use an LLM call to produce parameter input values.

For example, a static route will tell us if a query is talking about mathematics by returning the route name (which could be "math", for example). A dynamic route can generate additional values, so it may decide a query is talking about maths, but it can also generate Python code that we can later execute to answer the user's query. This output may look like "math", "import math; output = math.sqrt(64)".

⭐ GitHub Repo: https://github.com/aurelio-labs/semantic-router/
📌 Code: https://github.com/aurelio-labs/semantic-router/blob/main/docs/02-dynamic-routes.ipynb
🔥 Semantic Router Course: https://www.aurelio.ai/course/semantic-router
👋🏼 AI Consulting: https://aurelio.ai
👾 Discord: https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/

00:00 Fast LLM Function Calling
00:56 Semantic Router Setup for LLMs
02:20 Function Calling Schema
04:04 Dynamic Routes for Function Calling
05:51 How we can use Faster Agents
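The description's "math" example can be made concrete with a small sketch of what the two kinds of result might look like to the calling code. RouteChoice and handle are hypothetical names invented for this illustration and are not the library's types. Executing generated code with exec is shown only because the description mentions it; it should be sandboxed or avoided for untrusted input.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouteChoice:
    """Hypothetical result object for a routed query."""
    name: Optional[str]
    # None for a static route; generated values for a dynamic route.
    function_call: Optional[dict] = None


# A static route only reports which route matched.
static_result = RouteChoice(name="math")

# A dynamic route reports the same name plus generated values.
dynamic_result = RouteChoice(
    name="math",
    function_call={"code": "import math; output = math.sqrt(64)"},
)


def handle(choice: RouteChoice):
    """Return the route name, or run the generated code if there is any."""
    if choice.function_call is None:
        return choice.name
    namespace: dict = {}
    # WARNING: exec runs arbitrary code. Only acceptable here because the
    # string is a fixed literal in this sketch.
    exec(choice.function_call["code"], namespace)
    return namespace["output"]


print(handle(static_result))
print(handle(dynamic_result))
```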

This video shows how to use Semantic Router's dynamic routes to speed up LLM function calling, using OpenAI's GPT-3.5 Turbo as an example, and demonstrates how to create function schemas and deploy LLMs with dynamic routes.

Key Takeaways
  1. Install Semantic Router Library
  2. Initialize Static Routes
  3. Create Function Schema using Sphinx docstring format
  4. Define Dynamic Route using Function Schema
  5. Add Dynamic Route to Route Layer
  6. Test Dynamic Route with Time-related Question
  7. Connect Dynamic Route to Function Call
💡 Dynamic routes speed up LLM function calling by choosing which function to use semantically, without an LLM decision step, and calling the LLM only to generate that function's parameters from the input.
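The steps above can be sketched end to end with a toy router that uses only the standard library. Everything here is an assumption made for illustration: ToyRoute, route, embed, and fake_llm_extract are hypothetical names, the bag-of-words "embedding" stands in for a real encoder, and fake_llm_extract stands in for the LLM call that a real dynamic route would make with the function schema. It is not how Semantic Router is implemented; it only shows the split between the semantic decision and argument generation.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real setup would use an encoder model."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ToyRoute:
    name: str
    utterances: List[str]
    # Set only on dynamic routes: turns the query into function arguments.
    extract_args: Optional[Callable[[str], dict]] = None


def route(query: str, routes: List[ToyRoute], threshold: float = 0.3):
    """Pick the best route by similarity, then generate args if dynamic."""
    q = embed(query)
    best, best_score = None, 0.0
    for candidate in routes:
        score = max(cosine(q, embed(u)) for u in candidate.utterances)
        if score > best_score:
            best, best_score = candidate, score
    if best is None or best_score < threshold:
        return None, None
    args = best.extract_args(query) if best.extract_args else None
    return best.name, args


def fake_llm_extract(query: str) -> dict:
    """Stand-in for the LLM call that fills in the function arguments."""
    known = {"new york": "America/New_York", "rome": "Europe/Rome"}
    for place, zone in known.items():
        if place in query.lower():
            return {"timezone": zone}
    return {}


routes = [
    ToyRoute("chitchat", ["how is the weather today",
                          "lovely weather we are having"]),
    ToyRoute("get_time", ["what is the time in new york",
                          "tell me the time in rome"],
             extract_args=fake_llm_extract),
]

print(route("what is the time in new york city?", routes))
```

The design point is that the route choice costs only a similarity comparison, so the model is called once, for argument generation, and only when a dynamic route matches.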
