Let's build a tool that understands human language, translates it into MongoDB queries, executes them, and returns a human-readable response. This text-to-Mongo agent makes it easy to interact with your database without learning a complex query language.
Understanding the Challenge
While text-to-SQL tools are common, text-to-Mongo presents unique challenges. MongoDB's flexible schema and powerful query language, while advantageous, can complicate natural language translation.
We will tackle these challenges using the PyMongo library together with the power of LLMs and prompt engineering.
1. Understanding the Schema
The first step involves connecting to the database and understanding the schema of the target collection.
import random
import string

from pymongo import MongoClient
# JsonOutputParser is used by the agent's methods below
from langchain_core.output_parsers import JsonOutputParser

class MongoAgent:
    def __init__(self):
        self.client = MongoClient(<MONGODB_URI>)
        self.db_name = <DB_NAME>
        self.db = self.client[self.db_name]
        self.collection = self.db[<COLLECTION_NAME>]
        self.llm = <YOUR_LLM_CONNECTION>
        self.sample_record = self.collection.find_one()
        self.sample_document_schema = self.randomize_sample_record(self.sample_record)

    def randomize_sample_record(self, data):
        if isinstance(data, dict):
            for key, value in data.items():
                data[key] = self.randomize_sample_record(value)
        elif isinstance(data, list):
            for i in range(len(data)):
                data[i] = self.randomize_sample_record(data[i])
        else:
            # Replace leaf values with random data of the same type
            if isinstance(data, str):
                return ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(10))
            elif isinstance(data, int):
                return random.randint(1, 100)
            elif isinstance(data, float):
                return random.uniform(0, 100)
            else:
                return data
        return data
We connect to the collection, fetch a document, and randomize its values to protect PII. We only need the JSON keys as a reference to the collection schema. This approach is suitable for uniformly structured collections.
For non-uniform collections, you can iterate through documents and store unique keys.
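That iteration can be sketched as a small helper that walks each document and records every dotted key path it sees (collect_key_paths is a hypothetical name, not part of the agent above):

```python
def collect_key_paths(documents):
    """Collect the set of dotted key paths that appear across a list of documents."""
    paths = set()

    def walk(value, prefix):
        if isinstance(value, dict):
            for key, child in value.items():
                path = f"{prefix}.{key}" if prefix else key
                paths.add(path)
                walk(child, path)

    for doc in documents:
        walk(doc, "")
    return paths

docs = [
    {"name": "A", "address": {"city": "X"}},
    {"name": "B", "age": 30},
]
print(sorted(collect_key_paths(docs)))  # ['address', 'address.city', 'age', 'name']
```

The union of paths from a sample of documents can then replace the single randomized document as the schema reference passed to the LLM.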
2. Understanding the Context
With the schema reference, we can pass it to the LLM to identify relevant fields for the user's query.
def parse_question(self, question):
    sample_document = str(self.sample_document_schema)
    self.question = question
    prompt = f"""
    The sample document in the collection is as follows; refer to it to understand the schema of the collection:
    {sample_document}
    The user has asked the following question:
    {question}
    Your response should be in the following JSON format:
    {{
        "relavent_fields": [
            <field1>, <field2>, ...
        ]
    }}
    Fields can be nested; if a relevant field is nested, return the full path to the field.
    """
    response = self.llm.invoke(prompt)
    self.parsed_question = JsonOutputParser().parse(response.content)
    return self.parsed_question

After this step, we have the relevant fields for the asked question.
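If you are not using LangChain's JsonOutputParser, a minimal stand-in can strip the markdown fences that LLMs often wrap around JSON and parse what remains. This is a sketch; real responses may need more robust handling:

```python
import json

def parse_llm_json(text):
    """Extract a JSON object from an LLM response, tolerating markdown code fences."""
    text = text.strip()
    if text.startswith("```"):
        # Drop the opening fence (with optional language tag) and the closing fence
        text = text.split("\n", 1)[1]
        text = text.rsplit("```", 1)[0]
    return json.loads(text)

raw = '```json\n{"relavent_fields": ["price", "seller.rating"]}\n```'
print(parse_llm_json(raw))  # {'relavent_fields': ['price', 'seller.rating']}
```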
3. Generating PyMongo Code
With the relevant fields, we can generate PyMongo code to fetch results.
def generate_mongo_query(self):
    sample_document = str(self.sample_document_schema)
    relevant_fields = self.parsed_question["relavent_fields"]
    prompt = f"""
    You are a MongoDB data analyst; help users generate queries to answer questions about the database.
    Follow these rules:
    1. Don't import pymongo or connect to the database; the relevant collection is already connected and available in the 'collection' variable.
    The sample document in the collection is as follows; refer to it to understand the schema of the collection:
    {sample_document}
    The user has asked the following question:
    {self.question}
    Relevant fields for the question are:
    {relevant_fields}
    Generate the code to answer the user's question. Use the variable name 'collection' to refer to the collection, and save the result in a variable named 'result'.
    Generate the query in the following JSON format:
    {{
        "code": "<code>"
    }}
    """
    response = self.llm.invoke(prompt)
    self.generated_code = JsonOutputParser().parse(response.content)
    return self.generated_code

We ask the LLM to generate code using the specified variable names, which we will execute in an exec block. You can customize the rules for your use case.
4. Executing and Formatting Results
Let's execute the generated code, fetch the results, and (optionally) format them into a human-readable response.
def execute_pymongo_query(self):
    try:
        local_namespace = {
            "collection": self.collection
        }
        exec(self.generated_code["code"], globals(), local_namespace)
        self.result = local_namespace["result"]
        return self.result
    except Exception as e:
        # Generated code can fail; surface the error instead of crashing
        self.result = None
        return f"Query execution failed: {e}"
def format_result(self):
    result = self.result
    question = self.question
    prompt = f"""
    You are an AI assistant that formats database query results into a human-readable response. Give a conclusion to the user's question based on the query results.
    The result of the query is as follows:
    {result}
    The user had asked the following question:
    {question}
    """
    response = self.llm.invoke(prompt)
    self.formatted_result = response.content
    return self.formatted_result

This will execute the generated code and store the results, which can then be converted into a human-readable response and passed to other pipelines.
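To see how the exec step fits together without a live database or LLM, here is a sketch using a hypothetical in-memory stand-in for the collection (FakeCollection and the generated code are illustrative, not output from a real model):

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a PyMongo collection (count_documents only)."""
    def __init__(self, docs):
        self.docs = docs

    def count_documents(self, flt):
        # Count documents matching every key/value pair in the filter
        return sum(
            all(doc.get(k) == v for k, v in flt.items())
            for doc in self.docs
        )

# Pretend this came back from generate_mongo_query()
generated_code = {"code": "result = collection.count_documents({'status': 'active'})"}

local_namespace = {
    "collection": FakeCollection([
        {"status": "active"}, {"status": "inactive"}, {"status": "active"},
    ])
}
exec(generated_code["code"], {}, local_namespace)
print(local_namespace["result"])  # 2
```

The same namespace pattern is what execute_pymongo_query uses: the generated code only sees the names we explicitly bind.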
CAUTION: The above code is not production-ready; it needs exception handling for several cases, and you should define limits on prompts that contain query results, which can be large and drive up LLM costs. Running LLM-generated code through exec also carries security risks, so execute it in a sandboxed environment.
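One simple way to cap that cost is to truncate the stringified result before building the formatting prompt. truncate_result is a hypothetical helper, and max_chars should be tuned to your model's context window:

```python
def truncate_result(result, max_chars=2000):
    """Cap the stringified query result before it goes into an LLM prompt."""
    text = str(result)
    if len(text) <= max_chars:
        return text
    # Keep a prefix and note how much was dropped, so the LLM knows it is partial
    return text[:max_chars] + f"... [truncated, {len(text) - max_chars} chars omitted]"

print(truncate_result(list(range(5))))           # short results pass through unchanged
print(truncate_result("x" * 3000, max_chars=10))  # long results are clipped with a marker
```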
By following these steps and leveraging the power of LLMs and prompt engineering, you can build a powerful text-to-Mongo agent that simplifies database interactions and empowers users to extract valuable insights. Remember, while this guide provides a solid foundation, continuous experimentation and refinement are key to optimizing your agent's performance and accuracy.