A RAG layer for a healthtech intake assistant, shipped in 14 days

The brief

The client had a patient-facing intake assistant that drifted. Asked something outside its documents, it answered anyway, from the model’s general knowledge. In healthtech, that is not a quirk. It is a liability. The brief: a retrieval layer that grounds every answer in approved source documents, with a hard refusal when the documents do not cover the question.

The constraint

The constraint was absolute. Zero answers from outside the source set. Not "rare." Zero. A confident guess about a medication interaction is the exact failure mode that cannot ship.

The approach

We built retrieval over the approved document set, with a citation attached to every passage. The model is instructed, and the system is structured, so that an answer is assembled only from retrieved passages. If retrieval returns nothing above the relevance threshold, the assistant says it cannot answer and routes to a human.
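
The refusal gate described above can be sketched roughly as follows. This is a minimal illustration, not the shipped code: the names `Passage`, `Reply`, and `answer_or_refuse`, the 0.75 threshold, the refusal wording, and the join-the-passages composer are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical refusal wording; the real copy is not given in the case study.
REFUSAL = "I can't answer that from the approved documents. Routing you to a member of staff."


@dataclass(frozen=True)
class Passage:
    doc_id: str
    text: str
    score: float  # relevance score from the retriever, assumed to lie in [0, 1]


@dataclass(frozen=True)
class Reply:
    text: str
    citations: tuple[str, ...]
    handoff: bool  # True when the question is routed to a human


def answer_or_refuse(passages, threshold=0.75, compose=None):
    """Answer only from passages at or above the threshold, else refuse."""
    relevant = [p for p in passages if p.score >= threshold]
    if not relevant:
        # Nothing in the approved set covers the question: refuse and hand off.
        return Reply(REFUSAL, (), True)
    # Placeholder composer: a real system would call the model here, passing
    # it the retrieved passages and nothing else.
    compose = compose or (lambda ps: " ".join(p.text for p in ps))
    return Reply(compose(relevant), tuple(p.doc_id for p in relevant), False)
```

The point of the structure is that the refusal is decided before any text is generated, so the model never gets the chance to fill an empty retrieval with general knowledge.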

Every answer the assistant gives carries the citations it was built from, so a clinician reviewing a transcript can see exactly which passage produced which sentence.
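
One way to make that sentence-to-passage mapping reviewable is to store each answer as a list of cited sentences instead of a single string. The sketch below assumes a hypothetical `CitedSentence` record and a `doc_id#passage_id` citation format; neither comes from the actual build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedSentence:
    text: str
    doc_id: str      # hypothetical identifier of the approved source document
    passage_id: str  # hypothetical identifier of the passage within it


def render_for_review(sentences):
    """Render an answer as numbered lines, each followed by its citation."""
    lines = []
    for i, s in enumerate(sentences, start=1):
        lines.append(f"{i}. {s.text} [{s.doc_id}#{s.passage_id}]")
    return "\n".join(lines)


def uncited(sentences):
    """Return sentences missing a document or passage id, for audit checks."""
    return [s for s in sentences if not (s.doc_id and s.passage_id)]
```

A transcript rendered this way lets a reviewer check each sentence against one passage, and `uncited` gives a cheap guard that can run before an answer is sent.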

The build

Days 1 to 4: document ingestion, chunking, and the retrieval index. Days 5 to 11: the grounded answer loop, the refusal gate, and the adversarial eval harness. Days 12 to 14: deploy, README, handoff video.
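
An adversarial eval harness of the kind built in days 5 to 11 might look something like this. It is a sketch under assumptions: `EvalCase`, `Outcome`, and `run_eval` are illustrative names, and the assistant is modeled as any callable that takes a question and returns an `Outcome`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    question: str
    must_refuse: bool  # True when the approved documents do not cover it


@dataclass(frozen=True)
class Outcome:
    refused: bool
    citations: tuple[str, ...]


def run_eval(assistant: Callable[[str], Outcome], cases):
    """Return a list of (question, reason) pairs for every failed case."""
    failures = []
    for case in cases:
        out = assistant(case.question)
        if case.must_refuse and not out.refused:
            failures.append((case.question, "answered outside the source set"))
        elif not case.must_refuse and out.refused:
            failures.append((case.question, "refused a covered question"))
        elif not out.refused and not out.citations:
            failures.append((case.question, "answer without citations"))
    return failures
```

Under a zero-tolerance constraint, the natural release gate is that `run_eval` returns an empty list. A pass rate is not enough.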

The outcome

The RAG layer shipped on day 14. Across the adversarial eval set and the first month of real traffic, it gave zero answers from outside the source documents. When it does not know, it says so and hands off. That is the whole point.

0 answers outside the source documents

100% of answers carry citations

14 days, brief to production

0% AI time saved on build

“The refusal behavior was the deliverable, not a side effect. Holdfast built an eval set designed to make the assistant fail and then made it pass. That is the rigor healthtech needs.”

Head of Product, Healthtech startup
