Fine-Tuning Your Embedding Model to Maximize Relevance Retrieval in RAG Pipeline
NVIDIA SEC 10-K filing analysis before and after fine-tuning embeddings

Let’s continue from our previous article, Fine-Tuning the GPT-3.5 RAG Pipeline with GPT-4 Training Data. This time, let’s dive into fine-tuning the other end of the spectrum of our RAG (Retrieval Augmented Generation) pipeline — the embedding model.
By fine-tuning our embedding model, we enhance our system’s ability to retrieve the most relevant documents, ensuring that our RAG pipeline performs at its best.
We have been using OpenAI’s embedding model text-embedding-ada-002 for most of the RAG pipelines in our LlamaIndex blog series. However, OpenAI does not offer fine-tuning for text-embedding-ada-002, so in this article let’s explore fine-tuning an open-source embedding model instead.
BAAI/bge-small-en
The current number 1 embedding model on Hugging Face’s MTEB (Massive Text Embedding Benchmark) Leaderboard is bge-large-en, developed by the Beijing Academy of Artificial Intelligence (BAAI). It is a pretrained transformer model that can be used for various natural language processing tasks, such as text classification, question answering, and semantic search. The model is trained on a massive dataset of text and code, and its performance is evaluated on the MTEB benchmark.
For this article, we are going to use one of bge-large-en’s siblings, bge-small-en, a 384-dimensional small-scale model with competitive performance, perfect for running in Google Colab.
Fine-Tune Embedding Model vs. Fine-Tune LLM
From our last article on fine-tuning gpt-3.5-turbo, we gained a solid understanding of the steps involved in fine-tuning an LLM. Compared with LLM fine-tuning, the implementation of fine-tuning bge-small-en has some similarities and some differences.
Similarities
- Both types of fine-tuning follow the same overall approach: generate datasets for training and evaluation, fine-tune the model, and finally compare the performance of the base and fine-tuned models (a minimal sketch of this shared workflow for the embedding case follows below).
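To make that shared workflow concrete for the embedding side, here is a minimal sketch using LlamaIndex’s finetuning module (assuming a ~0.8.x version of llama_index; the input file names and output path are placeholders, and dataset generation calls an LLM, OpenAI by default, to write questions for each node):

```python
from llama_index import SimpleDirectoryReader
from llama_index.node_parser import SimpleNodeParser
from llama_index.finetuning import (
    generate_qa_embedding_pairs,
    SentenceTransformersFinetuneEngine,
)

# 1. Build training and validation corpora and generate synthetic (question, context) pairs.
parser = SimpleNodeParser.from_defaults()
train_nodes = parser.get_nodes_from_documents(
    SimpleDirectoryReader(input_files=["train_docs.pdf"]).load_data()
)
val_nodes = parser.get_nodes_from_documents(
    SimpleDirectoryReader(input_files=["val_docs.pdf"]).load_data()
)
train_dataset = generate_qa_embedding_pairs(train_nodes)
val_dataset = generate_qa_embedding_pairs(val_nodes)

# 2. Fine-tune bge-small-en on the generated pairs.
finetune_engine = SentenceTransformersFinetuneEngine(
    train_dataset,
    model_id="BAAI/bge-small-en",
    model_output_path="bge-small-en-finetuned",
    val_dataset=val_dataset,
)
finetune_engine.finetune()

# 3. Retrieve the fine-tuned model so it can be evaluated against the base model.
embed_model = finetune_engine.get_finetuned_model()
```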