A practice told me they saved two hundred hours in the first month of using AI scribing. When I asked how they calculated that, they said the supplier had given them the figure based on the number of consultations processed. That is not measurement. That is marketing.
If you cannot measure the impact of AI in your practice, you cannot know whether it is helping. You certainly cannot justify the investment of time, money, and change management effort. And you cannot identify problems before they become serious.
But measurement in general practice needs to be practical. You are not running a research trial. You are trying to answer a simple question: is this working?
The three dimensions of impact
AI documentation affects three things in your practice, and you need to measure all three.
Time. This is what everyone focuses on, and it matters. But measure it properly. Do not rely on supplier estimates. Do not use "I feel like I am finishing earlier" as data. Time your documentation before and after. Use real numbers.
Quality. Are the clinical notes as good as, or better than, what clinicians were producing manually? Quality is harder to measure than time, but it is more important. A tool that saves twenty minutes but produces notes that miss safety-netting advice is not an improvement.
Experience. How do clinicians feel about using the tool? How do patients feel about it? Staff satisfaction and patient satisfaction are leading indicators. If clinicians find the tool frustrating or patients are uncomfortable, those problems will eventually show up in quality and safety metrics.
Measuring time
Here is a practical method for measuring the time impact of AI documentation.
Baseline. Before the AI tool is introduced, ask three or four clinicians to record how long they spend on documentation for one week. This means timing from the end of each consultation to the point where the note is complete and filed. Include any after-hours documentation time.
During use. After two weeks of using the AI tool, repeat the same timing exercise. Same clinicians, same types of sessions. Compare like with like — a Tuesday morning surgery against a Tuesday morning surgery, not a routine clinic against a duty session.
The honest calculation. The time saving is not just "consultation note time minus AI note time." Include the time spent reviewing and correcting AI notes. Include the time spent explaining the tool to patients. Include the time spent handling the occasional patient who declines. The net saving is what matters.
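If it helps to see that arithmetic laid out, here is a minimal sketch of the calculation with made-up figures (the numbers and variable names are illustrative, not benchmarks); swap in your own timed data.

```python
# Hypothetical figures for one surgery session of 15 consultations.
# All values are minutes; substitute your own timed data.

consultations = 15
declined = 1                 # patients who opt out: their notes are written manually as before

manual_note = 4.0            # baseline: writing and filing a note yourself
ai_note_review = 2.5         # reviewing, correcting and filing an AI-drafted note
explaining_total = 5.0       # explaining the tool to patients, across the whole session

before = manual_note * consultations
after = (ai_note_review * (consultations - declined)
         + manual_note * declined
         + explaining_total)

net_saving = before - after
print(f"Documentation time before: {before:.0f} min, after: {after:.0f} min")
print(f"Net saving: {net_saving:.0f} min per session")
# With these example figures: 60 min before, 44 min after, a net saving of 16 min.
```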
In my experience, the typical net time saving is 8 to 15 minutes per surgery session for an experienced user. New users often see no time saving or even a slight increase in the first two weeks while they learn the review workflow. Tell your team this upfront — it prevents frustration.
Measuring quality
Quality measurement requires a structured approach. Here is one that works in general practice.
Monthly note audit. Select five AI-assisted consultation notes per clinician per month. Compare them against the same criteria you would use for any documentation audit: Are the clinical details accurate? Is safety-netting documented? Is the management plan clear? Is the note something you would be comfortable defending at a significant event review?
Error tracking. Keep a simple log of AI errors found during note review. Categorise them: omission (something left out), commission (something added that did not happen), hallucinated negative (a symptom or finding recorded as absent or denied when it was actually present or was never discussed), formatting error (wrong template or structure). Track the rate per consultation. A good benchmark: fewer than one clinically significant error per twenty consultations.
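For practices that want to keep that log in a small script rather than on paper, a sketch like the one below (the entries and field names are invented for illustration) computes the rate per consultation and flags a breach of the one-in-twenty benchmark.

```python
# Minimal error-log sketch. Each entry: (category, clinically_significant).
# Categories mirror the ones above: omission, commission,
# hallucinated_negative, formatting. The entries here are made up.

from collections import Counter

consultations_reviewed = 100

error_log = [
    ("omission", True),
    ("formatting", False),
    ("commission", True),
    ("omission", False),
    ("hallucinated_negative", True),
]

by_category = Counter(category for category, _ in error_log)
significant = sum(1 for _, is_significant in error_log if is_significant)
rate = significant / consultations_reviewed

print("Errors by category:", dict(by_category))
print(f"Clinically significant errors: {significant} in "
      f"{consultations_reviewed} consultations ({rate:.1%})")

# Benchmark from the text: fewer than 1 significant error per 20 consultations (5%).
if rate >= 1 / 20:
    print("Above benchmark: review training and raise at clinical governance")
```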
Before and after comparison. If you have the capacity, compare AI-assisted notes with manually written notes from the same clinicians. Are the AI-assisted notes more complete? More structured? Do they capture more safety-netting information? In some practices, AI-assisted notes are actually better than manual notes because the AI captures details the clinician would not have bothered to type.
Quality measurement is not about catching people out. It is about continuous improvement. Frame it that way with your team. The audit exists to improve the process, not to judge individuals.
Measuring experience
Experience is the dimension most practices neglect, and it is the one that predicts long-term success.
Staff survey. At one month and three months, ask clinicians a simple set of questions. Do you find the tool helpful? Do you trust the notes it produces? Does it change how you consult? Would you go back to manual documentation? Use a five-point scale and include a free-text box. The comments are usually more informative than the scores.
Patient feedback. Add a question to your routine patient satisfaction survey: "During your consultation, an AI tool was used to help document your appointment. How did you feel about this?" Options: comfortable, neutral, uncomfortable, unaware. Track the responses over time.
Opt-out rate. Monitor what proportion of patients decline AI documentation when offered the choice. A rising opt-out rate may indicate a communication problem or a genuine patient concern that needs addressing.
In most practices, patient acceptance is high — surveys consistently show 85 to 90 percent of patients are comfortable with AI documentation once it is explained to them. But the 10 to 15 percent who are not comfortable deserve respect and a clear alternative workflow.
Putting it together: your measurement dashboard
You do not need a spreadsheet with fifty metrics. You need a simple dashboard that you review monthly.
Net time saved (hours per week across the practice)
Error rate (clinically significant errors per hundred consultations)
Clinician satisfaction (average score, one to five)
Patient opt-out rate (percentage)
Notes audited this month (number, with summary of findings)
Present this at your monthly clinical governance meeting. Five metrics, one page, five minutes of discussion. This is what sustainable measurement looks like in a busy practice.
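If you prefer to generate that page from the numbers rather than retype them each month, here is a minimal sketch of the five-line summary (metric names and figures are placeholders; fill them from your own audit and survey results).

```python
# One-page monthly dashboard sketch. All figures are placeholders;
# replace them with your own audit and survey results each month.

dashboard = {
    "Month": "2025-01",
    "Net time saved (hours/week, whole practice)": 6.5,
    "Error rate (significant errors per 100 consultations)": 2.0,
    "Clinician satisfaction (mean, 1-5)": 4.1,
    "Patient opt-out rate (%)": 3.0,
    "Notes audited this month (count)": 40,
}

for metric, value in dashboard.items():
    print(f"{metric:<55} {value}")
```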
If the numbers are good, share them with your team — it reinforces adoption. If they are concerning, address them early. A rising error rate at month two is a training problem. A rising error rate at month six might be a tool problem or a complacency problem. The data tells you where to look.
Key Takeaway
Measure three things: time (actual documentation time before and after, including review), quality (monthly note audits and error tracking), and experience (staff surveys, patient feedback, opt-out rates). Present five key metrics monthly at your clinical governance meeting. Good measurement turns opinions into evidence and catches problems before they become serious.