The basics of Retrieval Augmented Generation


By José Ignacio Orlando

January 31, 2024

As humans, we constantly seek information. Before computers, we avidly searched through books and dictionaries, flipping pages in pursuit of knowledge to answer our questions. With the internet, searching became much easier thanks to engines like Google or Bing, which sit at the core of some of the richest technology companies in the world.

For decades, we’ve interacted with nearly identical interfaces: a box with a blinking cursor. We type a few words, hit Enter, and wait for an engine to scour millions of websites and return a list of related content. It has always been our responsibility to sift through those results, discern which ones are pertinent, and craft the answers we were after. So artisanal.

With the sudden introduction of ChatGPT into our lives a year ago, we began experimenting with an alternative approach. Yes, the text box remains, but now we have a chatbot capable of conversing in a human-like manner, furnishing answers without us having to construct them ourselves. This has been huge, to the extent that we devoted an entire episode of our podcast to it.

And yet, this tool has its own limitations: what if we seek answers about documents ChatGPT never encountered during its training? Without internet access, the answers can be simply wrong, rife with hallucinations and false claims that may mislead uncritical readers and be taken as facts.

That is when RAG comes into play.

RAG stands for Retrieval Augmented Generation, a technique that is reshaping knowledge retrieval. Given a set of documents, a RAG system can answer questions about them just like ChatGPT: you type a question in a box, and the tool answers in plain, natural English, much as a human would, and even provides references to the passages it used to build that answer.

The way it works is simply fantastic (at least to me). Although we will delve into these details in an upcoming article, any RAG system is built on top of at least two AI components: an embedding model and a generative AI model. The embedding model functions like a dictionary index, helping a search engine pinpoint the relevant portions of the documents in the knowledge base. The generative model, in turn, takes your question and those excerpts as inputs and distills them into the final answer.
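To make those two components a bit more concrete, here is a minimal Python sketch of the retrieve-then-generate flow. It is only an illustration under loud assumptions: the `embed` function is a toy word-count vector standing in for a real embedding model, the `llm` argument is a hypothetical callable standing in for whatever generative model you would plug in, and all function names and sample sentences are made up for this example.

```python
import math
import re
from collections import Counter
from typing import Callable

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. ASSUMPTION: a real system would call
    an embedding model here and get back a dense vector instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, excerpts: list[str]) -> str:
    """Combine the question and the retrieved excerpts into one prompt.
    The numbered excerpts are what lets the answer cite its sources."""
    context = "\n".join(f"[{i}] {e}" for i, e in enumerate(excerpts, start=1))
    return (
        "Answer the question using only the excerpts below, "
        "and cite them by number.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}"
    )

def answer(question: str, chunks: list[str], llm: Callable[[str], str], k: int = 2) -> str:
    """Retrieve relevant excerpts, then hand them to the generative model.
    ASSUMPTION: `llm` is any function that maps a prompt to text."""
    return llm(build_prompt(question, retrieve(question, chunks, k)))

# Hypothetical knowledge base, split into small chunks.
KNOWLEDGE_BASE = [
    "The deployment guide says releases go out every Tuesday.",
    "Vacation requests must be approved by the team lead.",
    "The staging server is rebuilt nightly from the main branch.",
]

if __name__ == "__main__":
    # Stand-in "model" that just echoes the prompt, so the flow is visible.
    print(answer("When do releases go out?", KNOWLEDGE_BASE, llm=lambda p: p, k=1))
```

Swapping `embed` for a real embedding model and `llm` for a real generative model would leave the overall shape unchanged: first find the relevant excerpts, then let the model write an answer grounded in them.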

Just think about the implications of systems like this. You can use them to empower managers to make more informed decisions without having to manually digest thousands of pages. You can connect them internally to Slack or Teams to ease access to information, without having to browse the information source yourself. From delving into extensive corporate archives to unraveling the complexities of financial data, they can be used for almost any task that can be mapped to something as simple as asking questions about pieces of text. The applications are nearly limitless.

At Arionkoder, our AI Labs have applied this technology in several projects, for example building a RAG-powered chatbot that fully integrates Slack and Confluence so our teams can get instant insights from the documentation of existing projects.

Do you envision any other use cases? Want to give it a try within your company? Reach out to us at hello@arionkoder.com for a free consultation. And stay tuned for more RAG content that we’ll release very soon!