Efficient training of language models to fill in the middle
📰 OpenAI News
OpenAI introduces a method for efficiently training language models to fill in the middle of text, improving infilling ability without harming left-to-right generative capability
Action Steps
- Apply a straightforward transformation to the dataset by moving a span of text from the middle of a document to its end
- Train autoregressive language models on a dataset in which a large fraction of documents has been transformed this way
- Evaluate model performance using perplexity and sampling evaluations
- Run ablations on key hyperparameters to prescribe strong default settings and best practices
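The transformation in the first step can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the sentinel strings `<PRE>`, `<SUF>`, and `<MID>` are placeholder names (the actual training setup adds dedicated sentinel tokens to the vocabulary), and splitting at random character offsets is one simple choice among several.

```python
import random

# Placeholder sentinel strings; the real setup uses special vocabulary
# tokens, and these names are illustrative assumptions.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"

def fim_transform(doc: str, rng: random.Random) -> str:
    """Split a document at two random points into (prefix, middle, suffix)
    and rearrange it as prefix + suffix + middle, so an autoregressive
    model learns to generate the middle conditioned on both sides."""
    a, b = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:a], doc[a:b], doc[b:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"

rng = random.Random(0)
print(fim_transform("def add(x, y):\n    return x + y\n", rng))
```

At inference time the same format lets the model infill: given a prefix and suffix, it is prompted with `<PRE>prefix<SUF>suffix<MID>` and samples the missing middle.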
Who Needs to Know This
NLP engineers and researchers can apply this technique to add infilling capability to their language models, while product managers can weigh its applications in AI-powered products such as code completion
Key Insight
💡 Training models with the fill-in-the-middle technique does not harm the original left-to-right generative capability
Share This
🚀 Efficient training of language models to fill in the middle! 🤖
DeepCamp AI