Retrieval-augmented generation (RAG) is a technique used to “ground” large language models (LLMs) with specific data sources, often sources that weren’t included in the models’ original training. RAG has three steps: retrieval from a specified source, augmentation of the prompt with the context retrieved from that source, and generation by the model from the augmented prompt.
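The three steps can be sketched in a few lines of Python. This is a minimal toy illustration, not a production system: the document store, the word-overlap retrieval scoring, and the `generate` stand-in (which a real application would replace with an actual LLM API call) are all assumptions made for the example.

```python
import re

# Toy document store (assumption: a real system would use a vector database).
documents = [
    "RAG grounds LLM outputs in external data sources.",
    "Transformers use self-attention over token sequences.",
    "Retrieval-augmented generation has three steps: retrieve, augment, generate.",
]

def retrieve(query, docs, k=2):
    """Step 1: retrieval. Score documents by word overlap with the query
    (a stand-in for embedding similarity search)."""
    q = set(re.findall(r"\w+", query.lower()))
    scored = sorted(
        docs,
        key=lambda d: len(q & set(re.findall(r"\w+", d.lower()))),
        reverse=True,
    )
    return scored[:k]

def augment(query, context):
    """Step 2: augmentation. Prepend the retrieved context to the prompt."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

def generate(prompt):
    """Step 3: generation. Stand-in for a call to an LLM;
    here it just reports that a prompt was received."""
    return f"[LLM response to a {len(prompt)}-character prompt]"

query = "What is retrieval-augmented generation?"
answer = generate(augment(query, retrieve(query, documents)))
```

Swapping the overlap scorer for an embedding-based similarity search, and `generate` for a real model call, turns this skeleton into the pattern the article describes.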
In an exercise in dogfooding, I asked the GPT-4 large language model “What is retrieval-augmented generation?” using its Browse plug-in, which is one implementation of retrieval-augmented generation.