Introducing text and code embeddings

📰 OpenAI News

OpenAI introduces text and code embeddings, a new API endpoint for natural language and code tasks like semantic search and classification

Intermediate | Published 25 Jan 2022

Action Steps

  1. Read the documentation on the OpenAI API embeddings endpoint
  2. Explore the examples of the embeddings API in action
  3. Use the embeddings endpoint to perform tasks like semantic search, clustering, and classification

Who Needs to Know This

Developers and data scientists can use this new endpoint to supply embeddings to machine learning models and algorithms such as clustering or search.

Key Insight

💡 Embeddings are numerical representations of concepts that can be readily consumed and compared by other machine learning models and algorithms

Full Article

January 25, 2022

# Introducing text and code embeddings

We are introducing embeddings, a new endpoint in the OpenAI API that makes it easy to perform natural language and code tasks like semantic search, clustering, topic modeling, and classification.

[Read documentation](https://beta.openai.com/docs/guides/embeddings) | [Read paper](https://arxiv.org/abs/2201.10005)

![Image 1: An abstract landscape painting featuring a vivid blue sky with textured clouds in orange and white above rolling hills in shades of purple, green, and gold.](https://images.ctfassets.net/kftzwdyauwt9/4xj3R5liKGPVwyuNJ0ghjP/69a47222ceb6325af88c8f5102b70005/introducing-text-and-code-embeddings.jpg?w=3840&q=90&fm=webp)

Embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts. Our embeddings outperform top models in 3 standard benchmarks, including a 20% relative improvement in code search.

Embeddings are useful for working with natural language and code, because they can be readily consumed and compared by other machine learning models and algorithms like clustering or search.
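To illustrate how a downstream algorithm might consume embeddings, here is a minimal sketch of nearest-centroid classification. It is not taken from the article: the three-dimensional vectors, the label names, and the helper functions `centroid`, `squared_distance`, and `classify` are all hypothetical, and real embeddings returned by the endpoint would have far more dimensions.

```python
def centroid(vectors):
    """Average a list of equal-length vectors component by component."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(embedding, labeled_examples):
    """Return the label whose centroid lies closest to the given embedding."""
    centroids = {
        label: centroid(vectors) for label, vectors in labeled_examples.items()
    }
    return min(
        centroids,
        key=lambda label: squared_distance(embedding, centroids[label]),
    )


# Hypothetical toy embeddings standing in for real API output.
labeled_examples = {
    "positive": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "negative": [[0.1, 0.9, 0.0], [0.0, 0.8, 0.2]],
}

predicted = classify([0.7, 0.3, 0.1], labeled_examples)
print(predicted)
```

The same distance comparison could serve as the basis for clustering or search; only the way the reference vectors are chosen would change.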

![Image 2](https://cdn.openai.com/embeddings/draft-20220124e/vectors-mobile-1.svg)![Image 3](https://cdn.openai.com/embeddings/draft-20220124e/vectors-1.svg)![Image 4](https://cdn.openai.com/embeddings/draft-20220124e/vectors-mobile-2.svg)![Image 5](https://cdn.openai.com/embeddings/draft-20220124e/vectors-2.svg)![Image 6](https://cdn.openai.com/embeddings/draft-20220124e/vectors-mobile-3.svg)![Image 7](https://cdn.openai.com/embeddings/draft-20220124e/vectors-3.svg)

Embeddings that are numerically similar are also semantically similar. For example, the embedding vector of “canine companions say” will be more similar to the embedding vector of “woof” than that of “meow.”
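One common way to measure numerical similarity between two vectors is cosine similarity. The sketch below assumes that measure; the article excerpt does not say which one the endpoint's users should apply. The four-dimensional vectors are made-up stand-ins for the real embeddings of the three phrases, chosen only to show the comparison.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, not actual model output.
canine_companions_say = [0.8, 0.6, 0.1, 0.0]
woof = [0.7, 0.7, 0.0, 0.1]
meow = [0.1, 0.2, 0.9, 0.3]

sim_woof = cosine_similarity(canine_companions_say, woof)
sim_meow = cosine_similarity(canine_companions_say, meow)

# With these toy values the "woof" vector scores higher than the "meow" vector.
print(sim_woof > sim_meow)
```

Cosine similarity ignores vector length and compares direction only, which is why it is a frequent choice for comparing embeddings.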

![Image 8: Graph Of Similar Embeddings](https://images.ctfassets.net/kftzwdyauwt9/6feca3be-2b6b-4a99-fc14ed78f1ee/3373feb41e1f9f49ba2c0f1ce3332b8b/Graphofsimilarembeddings.svg?w=3840&q=90)

The new endpoint uses neural network models, which are descendants of GPT‑3, to map text and code to a vector representation—“embedding” them in a high-dimensional space. Each dimension captures some aspect of the input.

The new [/embeddings](https://beta.openai.com/docs/api-reference/embeddings) endpoint in the [OpenAI API](https://beta.openai.com/) provides text and code embeddings with a few lines of code:


```python
import openai

# The snippet is cut off in the source after `engine=`; the engine name is not shown.
response = openai.Embedding.create(
    input="canine companions say",
    engine=
```