Introduction
In today’s dynamic landscape of generative artificial intelligence, large language models (LLMs) have radically transformed how we interact with technology. These models have demonstrated exceptional capabilities in tasks such as text generation, sentiment analysis, and contextual understanding. However, when we face scenarios that require absolute precision and deterministic results, we encounter inherent limitations that need to be addressed in innovative ways.
The Challenge of Non-Deterministic Models
How LLMs Work Under the Hood
Large language models operate through a sophisticated probabilistic system. At their core, these models:
- Contextual Prediction: They analyze prior context to predict the most probable next word or sequence.
- Probability Distribution: They generate a probability distribution across different response options.
- Temperature and Randomness: They use parameters like temperature to control creativity vs. determinism in their responses.
This probabilistic nature is precisely what makes LLMs so versatile for creative and analytical tasks, but it’s also what makes them less reliable for queries that demand numerical exactness or absolute precision.
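To make the role of temperature concrete, here is a small illustrative sketch (the three "tokens" and their scores are invented for the example; real models apply this over vocabulary-sized logit vectors) showing how dividing logits by a temperature reshapes the next-token probability distribution:

```java
import java.util.Arrays;

// Illustrative only: how the temperature parameter reshapes a
// next-token probability distribution. The logits are made up.
public class TemperatureDemo {
    // Softmax with temperature: p_i = exp(logit_i / T) / sum_j exp(logit_j / T)
    static double[] softmax(double[] logits, double temperature) {
        double[] scaled = Arrays.stream(logits).map(l -> l / temperature).toArray();
        double max = Arrays.stream(scaled).max().orElse(0); // for numerical stability
        double[] exps = Arrays.stream(scaled).map(s -> Math.exp(s - max)).toArray();
        double sum = Arrays.stream(exps).sum();
        return Arrays.stream(exps).map(e -> e / sum).toArray();
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5};        // hypothetical scores for 3 tokens
        double[] creative = softmax(logits, 1.5); // higher T: flatter, more varied
        double[] precise  = softmax(logits, 0.2); // lower T: sharper, near-deterministic
        System.out.printf("T=1.5: %s%n", Arrays.toString(creative));
        System.out.printf("T=0.2: %s%n", Arrays.toString(precise));
    }
}
```

Lowering the temperature concentrates probability mass on the top candidate, which is why low temperatures are preferred for structured outputs like SQL, but even at low temperatures sampling remains probabilistic rather than exact.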
Context: From a Failed POC to an Innovative Solution
Over the past few months, while giving multiple talks on Generative AI, one particular conversation caught my attention. A development team shared their frustration with a proof of concept (POC) they considered a failure. The problem: their generative AI implementation for support ticket analysis was producing inconsistent results.
As I dug deeper into the case, an interesting pattern emerged:
What Worked Well:
- “Analyze support ticket X”
- “What’s the summary of case Y?”
- “What does this incident report suggest?”
These questions, which required contextual understanding and qualitative analysis, received precise and useful answers.
What Consistently Failed:
- “Which department has the most open tickets?”
- “How many tickets were handled last month?”
- “What’s the average resolution time?”
Questions that required numerical precision and exact calculations never provided reliable results.
The Key Revelation
The reason for the failure became evident once we understood the fundamental nature of LLMs: they are inherently non-deterministic. Their strength lies in natural language processing and probability-based content generation, not in performing precise calculations or exact queries on structured data.
This insight led me to reformulate the key question:
How can we answer deterministic questions when an LLM, by its very nature, isn’t designed to do so?
The answer emerged when I recognized that we didn’t need to force the LLM to do something it wasn’t designed for. Instead, we could:
- Use the LLM for what it does best: understanding the intent of the question.
- Translate that intent into structured queries when necessary.
- Use specialized tools for precise calculations.
- Present the results in a coherent and natural way.
Bridging the Gap Between Precision and Probability: Implementing the Solution
Once I identified the core of the problem, I developed a proposal that first determines the nature of the query and then applies the appropriate processing.
Query Classification
Deterministic Queries:
Characteristics:
- Require exact, reproducible counts.
- Involve aggregations on specific ticket fields.
- Operate on the schema defined in Athena.
Real-World Examples:
- “Which department has the most open tickets?”
Generated SQL:
SELECT departamento, COUNT(*) as total
FROM tickets
WHERE estado != 'CLOSED'
GROUP BY departamento
ORDER BY total DESC
- “What is the leading cause of registered incidents?”
Generated SQL:
SELECT causante, COUNT(*) as total_incidentes
FROM tickets
WHERE solicitudes = 'Incidentes'
GROUP BY causante
ORDER BY total_incidentes DESC
LIMIT 1
Non-Deterministic Queries:
Characteristics:
- Require contextual analysis of ticket content.
- Benefit from natural language processing.
- Are handled by the Bedrock Knowledge Base.
Examples:
- Analysis of specific ticket content.
- Case summaries.
- Pattern interpretation in reports.
Processing Flow
The flow I decided to follow to tackle the challenge is divided into three simple steps.
1. Initial Evaluation
- Uses the defined prompt to determine whether the query is deterministic. In this step, as we’ll see later, I use an LLM to figure out whether what the user is asking is deterministic by nature or not.
- When it is deterministic, the LLM generates the appropriate SQL inside <SQL> tags, based on an Athena table and a data dictionary.
2. Processing
- Deterministic queries: executed through Athena by sending the SQL query the LLM generated to satisfy the user’s question.
- Non-deterministic queries: processed through an Amazon Bedrock Knowledge Base, which contains the same CSV file we use in Athena.
3. Response Formatting
- Athena results are limited to 25 records (because we don’t want a single question to be able to return the entire database).
- The LLM is used to convert results into natural language responses.
- The response is kept in the same language as the original question.
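The flow above can be sketched as a single routing function. This is a simplified, hypothetical skeleton, not the production code: the two `Function` parameters stand in for the real Athena and Bedrock Knowledge Base calls shown later in the article.

```java
import java.util.function.Function;

// Hypothetical routing skeleton for the three-step flow. The classifier
// reply follows the convention from step 1: "DETERMINISTIC" followed by
// a SQL statement inside <SQL> tags; anything else goes to the KB path.
public class QueryRouter {
    static final String MARKER = "DETERMINISTIC";

    public static String route(String classifierReply,
                               Function<String, String> runAthena,
                               Function<String, String> queryKnowledgeBase,
                               String userQuestion) {
        if (classifierReply.startsWith(MARKER)
                && classifierReply.contains("<SQL>")
                && classifierReply.contains("</SQL>")) {
            int start = classifierReply.indexOf("<SQL>") + "<SQL>".length();
            int end = classifierReply.indexOf("</SQL>");
            String sql = classifierReply.substring(start, end).trim();
            return runAthena.apply(sql);               // exact answer via Athena
        }
        return queryKnowledgeBase.apply(userQuestion); // contextual answer via KB
    }
}
```

In the real implementation the classifier reply comes from the Bedrock converse call, and the two functions wrap the Athena and Knowledge Base clients respectively.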
Solution Architecture
The implemented architecture solves the deterministic query challenge through a strategic combination of AWS services and LLM processing. Let’s analyze each component and its detailed implementation.
1. Data Storage and Preparation Layer
1.1 Base Data Structure
The system operates on a CSV file hosted in S3 that contains ticket records. The preparation of this data is crucial and requires:
CREATE EXTERNAL TABLE IF NOT EXISTS `default`.`tickets` (
`fechaResolucion` string,
`asignado` string,
`solicitudes` string,
`producto` string,
`departamento` string,
-- [remaining fields]
)
COMMENT 'Example tickets table'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('field.delim' = ';')
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://MiBucket/'
TBLPROPERTIES ('classification' = 'csv');
This DDL is fundamental because:
- It defines the exact structure that Athena will use for queries.
- It specifies the semicolon (;) delimiter for correct CSV interpretation.
- It establishes the S3 location where the data resides.
- It configures the input/output format to optimize processing.
1.2 Data Dictionary
Along with the structure, we maintain a detailed data dictionary that the LLM will use to understand the context of each field. For example:
fechaResolucion: Field indicating the ticket resolution date and time.
Format: month/day/year hour:minute
causante: Categorical field indicating whether the ticket was raised by A or B
departamento: Calculated descriptive field of the department that handled it
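One way to keep this dictionary maintainable is to store it as structured data and render it into the prompt text. This is a hypothetical helper, not code from the original solution; the descriptions are the ones listed above:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: render a field -> description data dictionary
// into the plain-text block that gets appended to the LLM prompt.
public class DataDictionary {
    public static String render(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("Data dictionary:\n");
        fields.forEach((name, desc) ->
                sb.append("- ").append(name).append(": ").append(desc).append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in declaration order in the prompt
        Map<String, String> dict = new LinkedHashMap<>();
        dict.put("fechaResolucion", "Ticket resolution date and time (month/day/year hour:minute)");
        dict.put("causante", "Categorical field: ticket raised by A or B");
        dict.put("departamento", "Calculated descriptive field of the department that handled it");
        System.out.print(render(dict));
    }
}
```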
2. Query Classification System
2.1 Classification Prompt
The first crucial step is determining whether a query is deterministic. We implement this through a specific prompt:
StringBuilder prompt = new StringBuilder(
"You are an expert in ticket analysis, I need you to analyze " +
"the question I provide and if that question cannot be answered " +
"by an LLM (because it is deterministic) respond only with the phrase " +
"'DETERMINISTIC' followed by a SQL inside a <SQL> tag that complies " +
"with the definition of the following Athena table and its glossary..."
);
This prompt is critical because:
- It defines the specific role for the model.
- It establishes the exact expected response format.
- It includes the schema context and data dictionary.
- It forces a structured, processable response.
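The structured format is what makes the reply machine-parseable. Here is a sketch of the parsing step, assuming the exact "DETERMINISTIC" plus `<SQL>` tag convention defined by the prompt above:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Parses the classifier reply defined by the prompt:
// "DETERMINISTIC" followed by a SQL statement inside <SQL> tags.
// Returns the SQL when present, empty otherwise (non-deterministic path).
public class ClassifierReplyParser {
    private static final Pattern SQL_TAG =
            Pattern.compile("DETERMINISTIC.*?<SQL>(.*?)</SQL>", Pattern.DOTALL);

    public static Optional<String> extractSql(String reply) {
        Matcher m = SQL_TAG.matcher(reply);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }
}
```

The `DOTALL` flag matters because the generated SQL usually spans multiple lines; an empty `Optional` signals that the query should be routed to the Knowledge Base instead.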
2.2 SQL Generation via LLM
Once the system has identified that the query is deterministic, it returns the SQL to be sent to Athena for execution. This is achieved because we included the table definition and data dictionary in the previous prompt.
In a previous article about using Bedrock with RDS, I explained how an LLM can be used to generate SQL — and that prior experience is part of this solution.
2.2.1 Model Configuration and Invocation
var message = Message.builder()
.content(ContentBlock.fromText(prompt.toString()))
.role(ConversationRole.USER)
.build();
try {
var client = BedrockRuntimeClient.builder()
.credentialsProvider(DefaultCredentialsProvider.create())
.region(Region.US_EAST_1)
.build();
// Send the message with a basic inference configuration.
ConverseResponse response = client.converse(request -> request
.modelId(FOUNDATIONAL_MODEL)
.messages(message)
.inferenceConfig(config -> config
.maxTokens(512) // Enough for complex SQL queries
.temperature(0.5F) // Low for higher precision
.topP(0.9F))); // High coherence in structure
// Retrieve the generated text from Bedrock's response object.
var responseText = response.output().message().content().get(0).text();
client.close();
return responseText;
} catch (SdkClientException e) {
System.err.printf("ERROR: Can't invoke '%s'. Reason: %s", FOUNDATIONAL_MODEL, e.getMessage());
return "Unable to answer that question";
}
2.2.2 Complete Flow Example
To illustrate the process, let’s consider the question: “Which department has the most open tickets?”
- Input Processed by the Model:
[All previous context + schema + dictionary]
Question: Which department has the most open tickets?
- Generated SQL:
SELECT
departamento,
COUNT(*) as total_tickets
FROM tickets
WHERE fechaResolucion IS NULL
GROUP BY departamento
ORDER BY total_tickets DESC
LIMIT 25
The generated SQL is sent directly to Athena for execution, leveraging the fact that the model already knows the exact table structure and the meaning of each field thanks to the provided context.
The key to this approach’s success lies in the precision of the context provided to the model and the consistency of the requested response format, enabling reliable generation of SQL queries that match our schema exactly.
3. Deterministic Query Processing
3.1 Athena Query Execution
Once a deterministic query is identified, the system executes the generated SQL:
public String executeAthenaQuery(String query, String database) {
try (AthenaClient athenaClient = AthenaClient.builder()
.region(Region.US_EAST_1) // Adjust region according to your configuration
.credentialsProvider(DefaultCredentialsProvider.create())
.build()) {
// Configure the query request
StartQueryExecutionRequest startQueryExecutionRequest = StartQueryExecutionRequest.builder()
.queryString(query)
.queryExecutionContext(QueryExecutionContext.builder()
.database(database)
.build())
.resultConfiguration(ResultConfiguration.builder()
.build())
.build();
// Start the query
StartQueryExecutionResponse startQueryExecutionResponse = athenaClient.startQueryExecution(startQueryExecutionRequest);
String queryExecutionId = startQueryExecutionResponse.queryExecutionId();
// Wait for the query to complete
waitForQueryToComplete(athenaClient, queryExecutionId);
// Get the query results
return getQueryResults(athenaClient, queryExecutionId);
} catch (Exception e) {
e.printStackTrace();
throw new RuntimeException("Error executing Athena query", e);
}
}
This code:
- Establishes a secure connection with Athena.
- Executes the query asynchronously.
- Manages the execution ID for tracking.
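The `waitForQueryToComplete` helper referenced above is not shown; at its core it is a polling loop over the query's execution state. Below is a generic sketch decoupled from `AthenaClient` for illustration: the `Supplier` stands in for fetching the state via `getQueryExecution`, and the status strings match Athena's execution states (SUCCEEDED, FAILED, CANCELLED, QUEUED, RUNNING).

```java
import java.util.function.Supplier;

// Generic sketch of the polling logic behind waitForQueryToComplete.
// The Supplier stands in for reading the query execution state from Athena.
public class QueryPoller {
    public static String waitForCompletion(Supplier<String> fetchState,
                                           long pollMillis,
                                           int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String state = fetchState.get();
            switch (state) {
                case "SUCCEEDED":
                    return state;
                case "FAILED":
                case "CANCELLED":
                    throw new RuntimeException("Query ended in state " + state);
                default: // QUEUED or RUNNING: wait and poll again
                    try {
                        Thread.sleep(pollMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new RuntimeException("Interrupted while polling", e);
                    }
            }
        }
        throw new RuntimeException("Query did not complete within " + maxAttempts + " attempts");
    }
}
```

Capping the number of attempts (and the poll interval) keeps a stuck query from blocking the request thread indefinitely.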
4. Response Formatting
The final step involves transforming technical results into comprehensible responses:
StringBuilder prompt = new StringBuilder(
"You are an expert in answering queries, you must respond " +
"in a professional, concise, and clear manner. The question asked was " +
preguntaUsuario + " and the database response is: " +
respuestaBD);
This formatting:
- Maintains the context of the original question.
- Structures the response naturally.
- Preserves the precision of the obtained data.
5. Handling Non-Deterministic Queries
When the system identifies a query as non-deterministic, it means the query requires contextual or interpretive analysis that cannot be resolved through a direct SQL query. In this case, the system sends the query to the Anthropic model through the Bedrock Knowledge Base.
5.1 Identification and Processing
The identification happens in the first step of the process, when the model does not return the word “DETERMINISTIC” followed by SQL. In this case, the system proceeds to process the query using the Bedrock model directly.
5.2 Model Configuration
For these queries, we use the base configuration of the Anthropic Claude 3.5 Sonnet v2 model:
RetrieveAndGenerateInput input = RetrieveAndGenerateInput.builder()
.text(prompt)
.build();
KnowledgeBaseRetrieveAndGenerateConfiguration knowledgeConfig = KnowledgeBaseRetrieveAndGenerateConfiguration
.builder()
.knowledgeBaseId(KNOWLEDGE_BASE_ID)
.modelArn(MODEL_ARN)
.build();
RetrieveAndGenerateConfiguration retrieveConfig = RetrieveAndGenerateConfiguration.builder()
.knowledgeBaseConfiguration(knowledgeConfig)
.type("KNOWLEDGE_BASE")
.build();
RetrieveAndGenerateRequest request1 = RetrieveAndGenerateRequest.builder()
.retrieveAndGenerateConfiguration(retrieveConfig)
.input(input)
.build();
RetrieveAndGenerateResponse response1 = bedrockAgentRuntimeClient.retrieveAndGenerate(request1);
5.3 Non-Deterministic Query Examples
The following queries are typical examples that the system processes interpretively:
- Content Analysis:
Question: "What are the common patterns in connection error tickets?"
- Case Interpretation:
Question: "How was a similar case resolved last time?"
- Contextual Summaries:
Question: "Summarize the main issue of ticket #12345"
In these cases, the system:
- Does not attempt to generate SQL.
- Processes the query directly through the model.
- Provides a response based on context and available information.
- Maintains a format and tone consistent with the original question.
The response is delivered directly to the user, maintaining the conversational nature and context of the original question.
Conclusions and Next Steps
The implementation of this hybrid system, which combines the precision of SQL queries with the interpretive capability of language models, represents just the beginning of what’s possible with Generative Artificial Intelligence in enterprise data analysis.
Key Takeaways
- The automatic distinction between deterministic and non-deterministic queries allows us to leverage the best of both worlds: the exactness of relational databases and the contextual understanding of LLMs.
- The implemented architecture demonstrates that it’s possible to maintain the precision required in enterprise environments while significantly improving the user experience.
- Using modern services like Amazon Bedrock allows us to implement advanced AI solutions without needing to manage complex infrastructure, while having access to state-of-the-art LLMs.
I encourage you to take this example as a starting point for your own explorations. Whether you’re looking to improve your ticket analysis systems or want to apply these concepts to entirely different domains, the possibilities are enormous.
GenAI is transforming the way we interact with data, and I’m excited to be part of this transformation. Are you ready to be part of it too?

