From RAGs to vectors: How businesses are customizing AI models

ILLUSTRATION: THOMAS R. LECHLEITER/THE WALL STREET JOURNAL

Summary

Here’s a guide to popular tools and techniques businesses rely on to take generative AI to the next level.

Large language models, the AI algorithms that power chatbots like ChatGPT, are powerful because they are trained on enormous amounts of publicly available data from the internet. While they are capable of summarizing, creating, predicting, translating and synthesizing text and other content, they can draw only on the data they were trained on, which is frozen at a specific point in time.

That’s why businesses are looking to methods like retrieval-augmented generation, or RAG, and fine-tuning to bridge the gap between the general knowledge these LLMs have and the up-to-date and specific knowledge that makes them useful for enterprises. Here’s what to know about these techniques, and how they work:

Vector database: A database designed to store a massive amount of data as “vectors,” which are numerical representations of the raw data. Depending on the amount and type of data (from images and text to tables), each vector can contain tens to thousands of dimensions, and vectors that represent similar content sit close together.

This format, which differs from a traditional database with columns and rows, allows AI models to quickly search for contextually similar vectors and identify the context of a user’s question. Traditional databases require specific text-matching searches, whereas vector databases can search for similarities in the data based on a user’s query, finding instances of “coffee” that likely relate to “beverage,” for example.
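A similarity search of this kind can be sketched in a few lines of Python. This is a minimal illustration, not how any particular vector database is built: the three-dimensional vectors are hand-picked placeholders (real embeddings come from a model and have far more dimensions), and the names `store` and `nearest` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-picked vectors for illustration only.
store = {
    "coffee": [0.9, 0.8, 0.1],
    "tea": [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.2, 0.9],
}

def nearest(query_vector, store, k=2):
    """Return the k stored items whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, store[item]),
        reverse=True,
    )
    return ranked[:k]

# Assumed vector for a "beverage"-like query; it lies near the drink vectors.
beverage_query = [0.9, 0.7, 0.1]
print(nearest(beverage_query, store))
```

The query never mentions "coffee" by name. It matches because its vector points in nearly the same direction, which is the difference from a text-matching search.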

Retrieval-augmented generation (RAG): A method used by developers to connect large language models with external data sources, such as a business’s private information, so that they can provide more personalized, accurate and relevant responses. The term originated from a 2020 paper by Meta Platforms AI researchers.

The RAG technique enables an AI model to reference any data stored in a vector database, which can include a company’s emails, documents and PDFs, spreadsheets and databases, images and audio files.

For instance, developers at Workday used RAG to link the cloud software provider’s corporate employee guidelines with an underlying AI model. When employees ask the AI a question like, “What’s our expense policy when I travel to New York City?” the system will return an answer with a list of its sources and suggestions for further research, said Jim Stratton, the company’s chief technology officer.
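The retrieve-then-ask pattern behind RAG can be sketched as follows. This is a simplified illustration under stated assumptions, not Workday's system: the two documents are invented placeholders, word overlap stands in for a real vector search, and the final call to a language model is left out, so the sketch stops at the prompt that would be sent.

```python
import re

# Invented placeholder documents; a real system would index actual company files.
documents = {
    "travel-policy": "Expense policy for travel: hotel stays in New York City "
                     "are reimbursed up to a nightly cap.",
    "it-security": "Laptops must use full disk encryption and a password manager.",
}

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def retrieve(question, documents, k=1):
    """Rank documents by shared words with the question (stand-in for vector search)."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc_id: len(question_words & tokenize(documents[doc_id])),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Combine the retrieved text and the question into one prompt for a model."""
    sources = retrieve(question, documents, k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in sources)
    return (
        "Answer using only the sources below and cite them by id.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )

question = "What's our expense policy when I travel to New York City?"
print(build_prompt(question, documents))
```

Because the source ids travel with the retrieved text, the model's answer can point back to them, which is how a system of this kind can list its sources.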

Fine-tuning: A method of customizing a large language model with private or company data, so that it becomes better at specific tasks like analyzing legal contracts and writing marketing documents. An LLM can also be fine-tuned for industries like healthcare and finance, which have specialized terminology and strict requirements around data handling.

Fine-tuning involves further training a general, pretrained AI model like OpenAI’s GPT-3.5 on data related to the task. For the task of drafting marketing documents, that data might include information about a company’s products, its competitive positioning and brand voice.
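The core idea, starting from already-trained parameters and continuing to train on task data, can be shown with a toy model. This is only an analogy under loud assumptions: a two-parameter line stands in for an LLM with billions of parameters, and the "pretrained" values and task data are made up for illustration.

```python
def predict(w, b, x):
    """A tiny stand-in model: a straight line."""
    return w * x + b

def mean_squared_error(w, b, data):
    """Average squared gap between predictions and targets."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(w, b, data, learning_rate=0.05, epochs=500):
    """Continue training from the given parameters using gradient descent."""
    for _ in range(epochs):
        for x, y in data:
            error = predict(w, b, x) - y
            w -= learning_rate * error * x  # nudge the slope
            b -= learning_rate * error      # nudge the intercept
    return w, b

# Made-up "pretrained" parameters, standing in for a general-purpose model.
pretrained_w, pretrained_b = 1.0, 0.0

# Made-up task-specific examples that follow y = 2x + 1.
task_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

tuned_w, tuned_b = fine_tune(pretrained_w, pretrained_b, task_data)
print("error before:", mean_squared_error(pretrained_w, pretrained_b, task_data))
print("error after: ", mean_squared_error(tuned_w, tuned_b, task_data))
```

Unlike RAG, which leaves the model untouched and changes what it is shown, this process changes the model's own parameters.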

Fine-tuning is generally more costly and time-consuming than RAG because it requires computing power and advanced AI expertise to change the underlying model. Roughly 80% of enterprises are using RAG, compared with 20% using fine-tuning, according to market research and consulting firm Gartner.

Prompt engineering: The process of asking questions and giving instructions to a large language model so it responds with more accurate, specific and relevant outputs.

Prompt engineering has emerged over the past year as an essential new skill for employees, so that they can generate better text summaries, data analyses and email drafts from AI chatbots and other applications. It is also used as a way to provide general large language models with specific company information, so that they provide more tailored responses.
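The difference between a vague request and an engineered one can be illustrated with a small template. This is a sketch of one common approach, not a standard: the function `build_prompt`, its field names and the sample company details are all hypothetical.

```python
def build_prompt(task, context="", audience="", output_format=""):
    """Assemble a prompt from a task plus optional supporting details."""
    parts = [f"Task: {task}"]
    if context:
        parts.append(f"Context: {context}")
    if audience:
        parts.append(f"Audience: {audience}")
    if output_format:
        parts.append(f"Format: {output_format}")
    return "\n".join(parts)

# A vague prompt leaves the model to guess what matters.
vague = build_prompt("Summarize this report.")

# A more specific prompt supplies company details and constraints (all invented here).
specific = build_prompt(
    task="Summarize the attached quarterly sales report.",
    context="We sell accounting software to small businesses.",
    audience="Executives who have two minutes to read it.",
    output_format="Three bullet points, each under 20 words.",
)

print(vague)
print()
print(specific)
```

The second prompt carries the company information and the desired shape of the answer, which is the kind of instruction the technique relies on.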

Write to Belle Lin at belle.lin@wsj.com
