After implementing AI chatbots for our medical scheduling at AZ IV Medics, I learned to focus on **precision for high-stakes queries** - specifically tracking how often the bot correctly identifies when medical concerns are embedded in seemingly simple requests. Our 6,000+ customer interactions taught me this the hard way. The key metric I watch is precision rate for our "medical concern" intent class. When someone asks "where's my IV appointment AND I'm feeling dizzy," our bot needs to flag the medical symptom, not just send a booking confirmation link. We had cases where patients mentioned side effects from previous treatments while asking about scheduling, and the bot was only catching the scheduling part. I set our threshold at 95% precision for any query containing potential medical flags, even if it hurts our deflection rate. In healthcare, misrouting a patient who mentions symptoms to an automated "your appointment is confirmed" response can create liability issues and erode trust. One missed medical concern costs us more than handling fifty routine booking questions manually. Our SpruceHealth integration now escalates any query with medical keywords to human review, even if the primary intent seems administrative. Better to have our nurses handle 20% more tickets than miss a patient who needs immediate attention while asking about their appointment status.