What is grounding?
LLMs generate answers based on their input text. They are general purpose and can answer most “normal” questions. However, an LLM does not magically know your specific information, and more importantly it has no sense of “not knowing”.
To solve this we give the model the relevant information, and tell it to use only that information when answering a given question. This is grounding, and it is one of the most important concepts when building “domain-specific LLMs”.
How is it normally solved?
To ground an LLM we take the information we want it to be grounded in, and add it as part of the context. E.g. if a support agent bot gets asked about something, the text we send to the LLM to generate an answer would look like this:
<Document with specific knowledge>
Using the information given to you above, answer the question below.
<Question from the user>
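The prompt assembly above can be sketched in a few lines. This is a minimal illustration, not a specific product's implementation; the function name and the extra "say you do not know" instruction are our own additions:

```python
def build_grounded_prompt(document: str, question: str) -> str:
    """Assemble a grounded prompt: knowledge first, then the instruction,
    then the user's question."""
    return (
        f"{document}\n\n"
        "Using the information given to you above, answer the question below. "
        "If the answer is not in the information, say that you do not know.\n\n"
        f"{question}"
    )

prompt = build_grounded_prompt(
    "Our refund window is 30 days from the date of purchase.",
    "How long do I have to return a product?",
)
```

The explicit fallback instruction is what gives the model a sense of “not knowing”: without it, the model will happily answer from its general training data.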
What if the knowledge base is too large?
If we have too many documents, or too much text, it is not possible to send the whole knowledge base together with the question. In these cases we need to decide which parts of the knowledge base to send. This is Retrieval-Augmented Generation, known as RAG.
RAG
In RAG we take our documents and divide them into smaller chunks. Whenever a question comes in we do a search to find the most relevant chunks, and send those together with the question to the LLM.
If chunks no. 6, 19, and 120 are the most relevant, this is what is sent to the LLM:
<Chunk no. 6>
<Chunk no. 19>
<Chunk no. 120>
Using the information given to you above, answer the question below.
<Question from the user>
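The retrieve-then-prompt loop can be sketched as follows. This is a toy illustration under a big simplifying assumption: relevance is scored by word overlap, whereas real RAG systems use vector embeddings and a vector index. All function names here are ours:

```python
def score(chunk: str, question: str) -> float:
    # Toy relevance score: fraction of question words that appear in the
    # chunk. Production systems use embedding similarity instead.
    chunk_words = set(chunk.lower().split())
    question_words = set(question.lower().split())
    return len(chunk_words & question_words) / len(question_words)

def retrieve(chunks: list[str], question: str, k: int = 3) -> list[str]:
    """Return the k chunks most relevant to the question."""
    return sorted(chunks, key=lambda c: score(c, question), reverse=True)[:k]

def build_rag_prompt(chunks: list[str], question: str) -> str:
    """Concatenate the retrieved chunks, then ask the question against them."""
    context = "\n\n".join(chunks)
    return (
        f"{context}\n\n"
        "Using the information given to you above, answer the question below.\n\n"
        f"{question}"
    )
```

Only the top-k chunks reach the LLM, so the context stays small no matter how large the knowledge base grows.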
The Problem
Chunking a document as described above discards its structure, and the RAG system has no way of knowing where in the document a given chunk sits. This makes it impossible to analyse a document as a whole, or to compare two documents. It also hurts accuracy: the more chunks you have, the less likely the search is to fetch the correct ones.
The Solution
To solve this we propose a two-part grounding system:
Semantic Chunking
First comes chunking, but crucially semantic chunking, which preserves the semantic elements of the document. If there is a paragraph, we make sure its chunk contains the full paragraph; if a text example relates to a description, we keep them together in one chunk; and so on. This ensures we never separate the important parts of a document.
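A minimal sketch of the idea, assuming paragraphs are separated by blank lines: chunks are built by packing whole paragraphs up to a size limit, so no paragraph is ever split mid-way. This is our own simplified illustration, not the product's actual chunker, which would also handle headings, examples, tables, and other semantic units:

```python
def semantic_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Chunk text on paragraph boundaries so no paragraph is split."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Contrast this with naive fixed-size chunking, which cuts every N characters regardless of where a sentence or paragraph ends.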
Document Graph
The second part of our grounding system is the ability to build a document graph. When parsing a document, we store all metadata relating to a given chunk, along with its relations to the other chunks. In doing so we build an abstraction layer on top of the chunks themselves, a 3D table of contents. This means that whenever a chunk is returned by a search, we can also examine its relevance within the document as a whole, and then decide whether to fetch more information than just that specific chunk. In essence, we engineer a system that enables an agent to easily traverse the totality of a document.
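The structure described above can be sketched as chunk nodes carrying metadata, linked by edges to related chunks. This is an illustrative data model of our own devising (the class names, fields, and `expand` helper are all hypothetical), not the product's internal representation:

```python
from dataclasses import dataclass, field

@dataclass
class ChunkNode:
    chunk_id: int
    text: str
    section: str                 # heading the chunk sits under
    neighbors: list[int] = field(default_factory=list)  # related chunk ids

class DocumentGraph:
    """Abstraction layer over chunks: metadata plus inter-chunk relations."""

    def __init__(self) -> None:
        self.nodes: dict[int, ChunkNode] = {}

    def add(self, node: ChunkNode) -> None:
        self.nodes[node.chunk_id] = node

    def expand(self, chunk_id: int) -> list[ChunkNode]:
        """Given a retrieved chunk, also fetch its structural neighbors,
        letting the system pull in surrounding context when needed."""
        hit = self.nodes[chunk_id]
        return [hit] + [self.nodes[n] for n in hit.neighbors if n in self.nodes]
```

With this layer in place, a search hit is a starting point rather than a dead end: an agent can walk from the hit to its section, its siblings, or a related chunk elsewhere in the document.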
The product
Grouse is “Grounding as a Service”, giving any user access to a state-of-the-art grounding service that can handle a knowledge base of any size. To use Grouse you first upload your documents, and Grouse then generates a knowledge base as described above. Once that is done you can query Grouse with any question, and it will return the relevant parts of your knowledge base.
It is integrated as a first-party service available through European cloud vendors, and is one of the few EU-native options for building high-performing domain-specific AIs.