Refreshing Private Data Sources with LlamaIndex Document Management
Mastering an essential and practical feature to manage your private data sources

Over the last few weeks, our articles have explored several LlamaIndex features. We have a few POC apps up and running, but we are still missing one essential feature: document management. In this article, we will dive into the detailed implementation of document management using LlamaIndex.
Before discussing what document management is, let’s take a look at our private data sources to understand why we need document management.
The Ever-Changing Nature of Our Private Data Sources
After the initial excitement of getting our POC chatbot to query our private data sources, we start to explore and expand our use cases. Questions such as the following start to pop up:
- How do we get our chatbot to answer questions from newly added documents?
- How do we handle existing documents with revised content?
- How do we ensure our chatbot doesn’t provide outdated answers from documents already deleted from our private data source?
Private data sources change constantly, and that is a fundamental characteristic we have to design for. If our chatbot can only handle static content from static documents, it will soon become outdated and less useful.
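The three questions above boil down to spotting which documents were added, updated, or deleted between two snapshots of a data source. Here is a minimal sketch of that idea in plain Python, assuming each document has a stable ID and that comparing content hashes is good enough to detect a revision. It does not use LlamaIndex's API; `fingerprint` and `diff_snapshots` are hypothetical helper names chosen for illustration.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify document IDs as added, updated, or deleted.

    `old` and `new` map a document ID to its full text (hypothetical layout).
    """
    old_hashes = {doc_id: fingerprint(text) for doc_id, text in old.items()}
    new_hashes = {doc_id: fingerprint(text) for doc_id, text in new.items()}

    added = sorted(new_hashes.keys() - old_hashes.keys())
    deleted = sorted(old_hashes.keys() - new_hashes.keys())
    # Present in both snapshots, but the content hash differs.
    updated = sorted(
        doc_id
        for doc_id in new_hashes.keys() & old_hashes.keys()
        if new_hashes[doc_id] != old_hashes[doc_id]
    )
    return {"added": added, "updated": updated, "deleted": deleted}


yesterday = {"a.txt": "alpha", "b.txt": "beta", "c.txt": "gamma"}
today = {"a.txt": "alpha", "b.txt": "beta, revised", "d.txt": "delta"}
changes = diff_snapshots(yesterday, today)
print(changes)
```

In this sketch, only the documents listed under `added` and `updated` would need to be re-embedded, and the ones under `deleted` would need to be dropped from the index. Unchanged documents such as `a.txt` are left alone.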
The Cost Effect
Even though our POC documents don’t use up many tokens, a large private data source can consume a lot of tokens and cost a significant amount of money. Without document management, every change to the data source, whether a document is added, updated, or deleted, forces you to reload, re-embed, and re-index all of your data, and the costs can quickly add up.
Yikes! There has to be a better way!
This is where LlamaIndex document management comes into play. Let’s break it down.
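Before we do, here is a back-of-the-envelope sketch of why this matters: it compares the tokens sent for embedding in a full re-index against re-embedding only the document that changed. The document names, token counts, and price per 1,000 tokens are all made-up placeholders, not real measurements or real pricing, so plug in your own numbers.

```python
# All numbers below are hypothetical placeholders, not real measurements or pricing.
TOKENS_PER_DOC = {
    "handbook.pdf": 120_000,
    "faq.md": 80_000,
    "release_notes.md": 50_000,
}
PRICE_PER_1K_TOKENS = 0.0001  # assumed embedding price in dollars; check your provider


def tokens_to_embed(doc_ids) -> int:
    """Total tokens that must be embedded for the given document IDs."""
    return sum(TOKENS_PER_DOC[doc_id] for doc_id in doc_ids)


def cost_in_dollars(tokens: int) -> float:
    """Embedding cost for a token count at the assumed price."""
    return tokens / 1000 * PRICE_PER_1K_TOKENS


# Full re-index: every document is embedded again, even if only one changed.
full_reindex_tokens = tokens_to_embed(TOKENS_PER_DOC)
# Incremental refresh: only the changed document is embedded again.
incremental_tokens = tokens_to_embed(["faq.md"])

print(f"full re-index: {full_reindex_tokens} tokens, ${cost_in_dollars(full_reindex_tokens):.4f}")
print(f"incremental:   {incremental_tokens} tokens, ${cost_in_dollars(incremental_tokens):.4f}")
```

With these placeholder figures, a single edit to one document triggers embedding of 250,000 tokens under a full re-index versus 80,000 when only the changed document is processed. The gap grows with the size of the data source and with how often it changes.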
What Is a Document in LlamaIndex
The document is one of the main building blocks of LlamaIndex during the data ingestion and indexing phase. See the diagram below for our…