You are not going to fact-check every word of every AI response. That would defeat the purpose of using the tool in the first place.
But you do not need to. You need a proportionate approach. Check the things that matter most, in the way that costs the least time, with a focus on the errors that could cause real harm.
That is what this lesson is about. Five practical techniques for catching AI mistakes, a framework for deciding how much checking each task needs, and a red flags checklist you can keep in your head.
Technique 1: Check the core clinical facts
This is the single most important technique, and it takes less time than you think.
If the AI mentions a drug dose, check it in the BNF. Thirty seconds. If it cites a NICE guideline recommendation, look it up on the NICE website. Thirty seconds. If it states a diagnostic threshold, verify it against Clinical Knowledge Summaries. Thirty seconds.
You are not reading the entire guideline. You are spot-checking the specific facts the AI has given you. Did it get the dose right? Did it get the threshold right? Did it get the referral criteria right?
Let me give you a real example. I asked an AI about the indications for starting a statin in primary prevention. It told me the QRISK threshold was 10%, which matches current NICE guidance. Correct so far. But it then said the recommended first-line statin was simvastatin 40mg. NICE actually recommends atorvastatin 20mg: a different drug at a different dose.
That check took me less than a minute. And it caught an error that, if embedded in a practice protocol, could have led to incorrect prescribing for every patient going through your cardiovascular risk pathway.
Technique 2: Watch for excessive confidence
Real clinical knowledge comes with nuance. Guidelines say “consider” rather than “always.” Evidence is graded as “moderate” or “limited” rather than “definitive.” Recommendations come with caveats, exceptions, and patient-group variations.
If an AI response presents everything as certain and straightforward, that itself is a warning sign.
Here is what this looks like. You ask about managing hypertension in pregnancy. A good clinical source would say: consider labetalol as first-line but note the contraindication in asthma, discuss alternatives including nifedipine, and refer to the NICE guideline on hypertension in pregnancy for the full treatment algorithm.
An AI response might say: “Prescribe labetalol 200mg twice daily.” Clean. Simple. Confident. And missing all the nuance that makes the recommendation safe.
When AI sounds too certain, slow down. Medicine is rarely that clean. If the response does not mention exceptions, contraindications, or clinical judgement, it is probably oversimplifying.
Technique 3: Check the context
Is this response UK-appropriate? This is a quick mental scan that should become automatic.
Check the drug names. Are they British generic names or American brand names? If you see acetaminophen, Tylenol, albuterol, or epinephrine, the response is almost certainly drawing on American sources.
Check the units. Are blood glucose values in mmol/L or mg/dL? Is cholesterol in mmol/L or mg/dL? Is HbA1c in mmol/mol or percent?
Check the guidelines. Does it reference NICE, or the American College of Physicians? Does it mention GPs, or “primary care physicians”? Does it reference A&E, or the “emergency room”?
If anything looks American, treat the entire response with extra caution. If the context is wrong, the specific recommendations are likely wrong for your patients too.
Technique 4: Ask for the reasoning
If an AI gives you a recommendation that surprises you, do not just accept or reject it. Ask why.
Type: “Why did you recommend that?” or “What is the evidence for that recommendation?” or “Explain the reasoning behind that suggestion.”
The AI will generate an explanation. That explanation may itself contain errors. But it gives you more text to evaluate and more opportunities to spot where the reasoning breaks down.
If the reasoning does not make clinical sense to you, trust your instinct. You have years of clinical training and experience. An AI has patterns learned from text. When those two things conflict, investigate further before changing your practice.
Technique 5: Trust your clinical instinct
This is the simplest technique and, honestly, one of the most effective.
If something surprises you, verify it. If the AI tells you something you did not know, something that contradicts your clinical experience, or something that seems too neat — check it.
Your clinical instinct is built on years of training, examinations, clinical experience, and patient encounters. It is a powerful error-detection tool. When an AI response makes you think “really?” — that is your brain telling you to verify.
Sometimes you will discover the AI taught you something new and correct. Sometimes you will discover the AI is wrong and you were right. Either way, checking was the right decision.
Proportionate verification
Here is how to decide how much checking a task needs.
For a practice newsletter or a staff communication, a quick read-through for anything obviously wrong is probably sufficient. The stakes are low. An error in a newsletter is embarrassing but not harmful.
For a clinical protocol, every clinical recommendation needs to be verified against the relevant guideline. This is a document that other clinicians will follow. It needs to be right.
For a patient information leaflet, the clinical facts need checking and the language needs reviewing for clarity. Patients will rely on this information.
For anything that will directly influence a prescribing decision or a clinical pathway, every clinical fact must be independently confirmed against the BNF, NICE, or Clinical Knowledge Summaries. There is no shortcut here.
Think of it like checking blood results. You do not rerun every normal full blood count. But an unexpected result gets repeated. An abnormal result gets investigated. You apply proportionate scrutiny based on the clinical significance. The same principle applies to AI output.
The red flags checklist
Let me leave you with five red flags. If you see any of these in an AI response, verify before you use it.
1. Unusually specific statistics without a source. If the AI says “exactly 37% of patients respond to this treatment,” ask yourself where that number came from.
2. Perfect fluency on a topic you know to be complex. Real clinical topics have grey areas and competing evidence. If the AI makes it sound simple, something has probably been oversimplified or invented.
3. References you have never heard of. Not every unfamiliar reference is fake, but unfamiliar references deserve verification.
4. Recommendations that conflict with your training or experience. Check before you change your practice.
5. American terminology anywhere in the response. If you see one Americanism, the entire response may be based on American sources and guidelines.
What you have learned in Module 2
Let me take a moment to look back at what we have covered across these eight lessons.
You learned where your data goes when you type it into an AI tool, and why that matters for patient confidentiality. You learned the practical rules — the four-question checklist and what your practice can approve. You saw the green zone in action, with four worked examples you can use today. You explored the amber and red zones, with clinical scenarios that show where the boundaries are.
You learned how to write effective prompts using role, task, context, and constraints. You practised with three complete examples and learned to build a prompt library. You discovered the five types of AI error, with clinical examples of each. And in this lesson, you learned how to catch those errors proportionately and practically.
You now have the skills to use AI safely and effectively in your daily work. You know the rules, the boundaries, the techniques, and the pitfalls.
In Module 3, we will take these skills into the consultation room — how AI can help with clinical documentation, patient communication, and managing the complexity of modern general practice. The practical applications that save you time and protect your patients.
Key Takeaway
Apply proportionate verification: spot-check clinical facts against the BNF and NICE, watch for excessive confidence, check for American context, ask for reasoning, and trust your clinical instinct. The five red flags — unsourced statistics, false simplicity, unknown references, conflicting recommendations, and American terminology — tell you when to look closer.
Reflect on Your Learning
These questions are designed for your CPD appraisal portfolio. Use them to reflect on what you have learned in this module and how it applies to your practice. You can copy or screenshot your answers as evidence of self-certified CPD.
- Think of a task you did this week. Would it fall in the green, amber, or red zone?
- How will you verify AI output before using it in your practice?
- What conversation do you need to have with your ICB digital team about AI governance?
Approximate CPD time for Module 2: 2 hours (including listening, reading, and reflection).