I’m building a production chatbot that uses an LLM + vector store + REST API calls. The issue is: whenever the API returns incomplete data or the vector search returns low similarity, the bot starts hallucinating answers.
What’s the best practice to reliably force the bot to say “No data available” instead of generating something?
Is this a prompt issue, retrieval design issue, or something related to scoring/similarity thresholds?
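To make the intent concrete, here’s a minimal sketch of the gating behavior I’m after: refuse before the LLM is even called when retrieval confidence is too low or the API payload is incomplete. The threshold value, field names, and `llm_fn` are all placeholders, not my real code:

```python
SIMILARITY_THRESHOLD = 0.75  # assumed cosine-similarity cutoff; would need tuning per corpus
REQUIRED_FIELDS = {"id", "content"}  # hypothetical schema for the REST payload

def answer(query, retrieved, api_payload, llm_fn):
    """Return an LLM answer only when both evidence sources check out.

    retrieved: list of (document_text, similarity_score) pairs from the vector store.
    api_payload: dict returned by the REST API.
    llm_fn: callable that takes a prompt string and returns the model's answer.
    """
    # Gate 1: vector search must return at least one sufficiently similar hit.
    good_hits = [doc for doc, score in retrieved if score >= SIMILARITY_THRESHOLD]
    if not good_hits:
        return "No data available"
    # Gate 2: the REST payload must contain every required field.
    if not REQUIRED_FIELDS.issubset(api_payload):
        return "No data available"
    # Only now let the model generate, with the retrieved evidence in context.
    context = "\n".join(good_hits)
    return llm_fn(f"Answer strictly from this context:\n{context}\n\nQ: {query}")

# Example with a stubbed LLM:
stub_llm = lambda prompt: "stub answer"
print(answer("q", [("doc A", 0.82)], {"id": 1, "content": "x"}, stub_llm))  # stub answer
print(answer("q", [("doc A", 0.42)], {"id": 1, "content": "x"}, stub_llm))  # No data available
print(answer("q", [("doc A", 0.82)], {"id": 1}, stub_llm))                  # No data available
```

Is this pre-generation gating the right shape, or is it better handled inside the prompt itself?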
