Generative AI for Analytics: Performing Natural Language Queries on Amazon RDS using SageMaker, LangChain, and LLMs
Learn to use LangChain’s SQL Database Chain and Agent with large language models to perform natural language queries (NLQ) of an Amazon RDS for PostgreSQL database
To paraphrase analytics workflow product vendor YellowFin, “Natural language query (NLQ), also known as natural language search, is a self-service business intelligence (BI) reporting capability that enables analytics users to ask questions of their data. It parses for keywords and generates relevant answers sourced from related databases, with results typically delivered as a report, chart or textual explanation that attempt to answer the query, and provide depth of understanding.”
Using LangChain’s SQL Database Chain and SQL Database Agent, we can leverage large language models (LLMs) to ask questions of an Amazon RDS for PostgreSQL database using natural language. Questions will be converted into SQL queries and executed against the database. Assuming the generated SQL query is well-formed, the query results will be converted into a textual explanation. For example, we can ask questions like, “How many customers have purchased in the last 12 months?” or “What were the total sales in May?” These will be converted into SQL SELECT statements, like:
SELECT sum(amount) AS sales FROM purchases WHERE EXTRACT(MONTH FROM purchase_date) = 5 AND EXTRACT(YEAR FROM purchase_date) = 2023;
The query result is then composed into a textual explanation. For the first question, for example, the answer might read, “A total of 384 customers made purchases in the last 12 months.”
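The question-to-SQL-to-answer flow described above can be sketched in a few lines. This is only an illustration of the pipeline's shape, not the post's implementation. It makes several assumptions: an in-memory SQLite database stands in for Amazon RDS for PostgreSQL, so the date filtering uses SQLite's strftime. The function fake_llm_generate_sql is a hypothetical stub that returns a canned query where a real LLM would generate one. The purchases rows are made-up sample data.

```python
import sqlite3


def fake_llm_generate_sql(question: str) -> str:
    """Hypothetical stand-in for the LLM step: map a question to a SQL string.

    A real setup would send the question (plus schema details) to an LLM;
    here the query is canned so the example runs offline.
    """
    canned = {
        "What were the total sales in May?": (
            "SELECT sum(amount) AS sales FROM purchases "
            "WHERE strftime('%m', purchase_date) = '05' "
            "AND strftime('%Y', purchase_date) = '2023';"
        ),
    }
    return canned[question]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with made-up sample purchases."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (purchase_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?)",
        [
            ("2023-05-03", 100.0),
            ("2023-05-20", 50.5),
            ("2023-06-01", 75.0),   # wrong month, excluded
            ("2022-05-10", 999.0),  # wrong year, excluded
        ],
    )
    return conn


def answer_question(conn: sqlite3.Connection, question: str) -> tuple[str, str]:
    """Run the three steps: generate SQL, execute it, phrase the result."""
    sql = fake_llm_generate_sql(question)
    total = conn.execute(sql).fetchone()[0]
    # A real chain would ask the LLM to phrase the answer; a template
    # string is used here instead.
    return sql, f"Total sales in May were {total:.2f}."


conn = build_demo_db()
sql, answer = answer_question(conn, "What were the total sales in May?")
print(sql)
print(answer)
```

The stub makes the dependency on well-formed SQL visible: if the generated string were invalid, conn.execute would raise an error and no textual answer could be produced.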
In this post, we will learn to use LangChain’s SQL Database Chain and SQL Database Agent with OpenAI’s text-davinci-003, an LLM in OpenAI’s GPT-3 series, to perform NLQ of an Amazon RDS for PostgreSQL database. We will also learn about the importance of using LangChain’s Prompt Template, Query Checker, few-shot prompting, and retrieval-augmented generation (RAG) to improve our results.
All the source code for this post’s demonstration is open source and available on GitHub.