Voice-AI Customer Service Agent for Medical Lab Appointments

Voice AI Systems, Applied AI, LLM Prompt Engineering, Knowledge Base Construction, API Integration, Workflow Automation, Testing & Evaluation

This project developed a production-ready voice AI customer-service agent for Access Medical Labs, a large diagnostic testing facility offering bloodwork, COVID-19 testing, hormone panels, and wellness screening services. The AI agent functioned as a virtual receptionist, handling fully automated telephone interactions including appointment booking, rescheduling, cancellation, frequently asked questions, and call routing.

The agent combined conversational intelligence with strict operational rules, enabling it to manage real appointment workflows and patient questions while maintaining a professional and medically appropriate tone. The system demonstrates end-to-end applied AI development, including knowledge base construction, structured prompt design, workflow orchestration, evaluation of model behavior, and integration of language models with real operational systems.

Background & Problem Statement

Medical laboratories receive a high volume of routine calls related to appointment scheduling, rescheduling, cancellations, and basic questions about tests, hours, and location. These calls are time-consuming for front-desk staff and often occur outside peak staffing windows. At the same time, lab interactions must respect privacy constraints and maintain a clear, professional tone for patients who may already be stressed or anxious about testing.

Problem Statement: How can a voice AI system be designed to reliably automate medical-lab customer service tasks while respecting operational constraints, handling HIPAA-sensitive interactions appropriately, and preserving natural conversational flow, with the objective of reducing staff workload and improving patient access to scheduling and information?

More specifically, the project asked:

Can a voice AI agent handle complex scheduling workflows across APIs (checking appointment availability, booking, rescheduling, and canceling)?
Can LLMs be prompt-engineered to maintain a medically appropriate tone, guardrails, and escalation behavior?
Can callers be given automated, domain-accurate responses drawn from live or curated knowledge bases without exposing sensitive information?

Model Development & System Design

The voice agent was developed using the Retell AI platform, which provides real-time speech recognition, synthesized voice output, and low-latency language model inference for phone calls. The agent’s behavior was governed by a structured system prompt and a multi-part workflow that controlled turn-taking, question sequencing, error recovery, and escalation.

Access Medical Labs website serves as a knowledge base for caller inquiries

Knowledge Base Construction

Curated key pages from the Access Medical Labs website into a structured knowledge base.
Ensured consistent retrieval of available tests, lab location and hours, and appointment preparation information.
Designed the content to avoid disclosure of protected health information, restricting the agent to public operational information only.
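
The restriction to public operational content can be sketched as follows. This is a minimal illustration, not the Retell AI knowledge base mechanism: the `KBEntry` type, the `retrieve` function, the keyword-overlap scoring, and the placeholder entry text are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KBEntry:
    topic: str  # e.g. "hours", "tests", "preparation"
    text: str   # public operational text only, never patient data


# Placeholder content for illustration; not actual lab information.
KNOWLEDGE_BASE = [
    KBEntry("hours", "Placeholder: lab opening hours and location details."),
    KBEntry("tests", "Placeholder: list of available tests and panels."),
    KBEntry("preparation", "Placeholder: fasting and preparation instructions."),
]


def tokenize(text: str) -> set:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {word.strip(".,?!:;").lower() for word in text.split()} - {""}


def retrieve(question: str, min_overlap: int = 1) -> Optional[KBEntry]:
    """Return the entry sharing the most words with the question, or None.

    Returning None signals that the question falls outside the curated
    content, so the agent should decline or escalate instead of guessing.
    """
    question_words = tokenize(question)
    best, best_score = None, 0
    for entry in KNOWLEDGE_BASE:
        entry_words = tokenize(entry.text) | {entry.topic}
        score = len(question_words & entry_words)
        if score > best_score:
            best, best_score = entry, score
    return best if best_score >= min_overlap else None
```

Because the store holds only curated public text, a question about a caller's own results finds no matching entry and returns `None`, which is the point at which a human handoff would apply.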

Prompt engineering for Access Medical Labs voice AI agent in Retell AI interface

Prompt Engineering & Conversation Design

The final system prompt was organized into multiple sections (Role, Skills, Objectives, Guardrails, Rules, and Stepwise Flows) to structure the agent’s behavior.

Maintained a friendly, concise, and medically professional tone while ensuring the agent asked only one question at a time.
Supported HIPAA-conscious logic, such as never revealing test results and escalating sensitive items to human customer service staff.
Enforced deterministic flow control: greet caller, classify intent, verify name and test type, and then follow appointment booking, rescheduling, or cancellation logic.
Imposed clarity rules for reading numbers, lists, explanatory parentheses, and content after colons to improve the intelligibility of synthesized speech.

This prompt architecture combined natural language reasoning with deterministic decision graphs, creating a controlled inference environment that minimized hallucinations and enforced reproducible behavior across callers. The design explicitly prevented over-generation, managed ambiguous utterances, and restricted outputs to approved knowledge sources.
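
The clarity rules for numbers and lists were imposed through the system prompt. To make their intended effect concrete, the sketch below applies two comparable rules as plain text transformations. The function names and the exact wording of the rules are assumptions for illustration; the write-up does not state the precise rules used.

```python
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def spell_digits(number: str) -> str:
    """Read a digit string one digit at a time, skipping separators.

    Assumed rule: phone and confirmation numbers are spoken digit by digit
    so the synthesized voice does not read them as one large quantity.
    """
    return " ".join(DIGIT_WORDS[ch] for ch in number if ch in DIGIT_WORDS)


def speak_list(items: list) -> str:
    """Join items into one spoken sentence fragment with a final 'and'."""
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"
```

For example, `spell_digits("555-0142")` gives "five five five zero one four two", and `speak_list(["bloodwork", "hormone panels", "wellness screening"])` gives "bloodwork, hormone panels, and wellness screening".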

API Integration with n8n & Cal.com

To operationalize scheduling tasks, the agent integrated with n8n workflows and the Cal.com scheduling API.

n8n served as an orchestration layer for call metadata, request validation, and webhook handling.
Cal.com API calls handled appointment availability lookup, booking with dynamic slot selection, and cancellation and rescheduling workflows.
The agent collected and validated required patient information, invoked the appropriate Cal.com endpoint, confirmed success or failure to the caller, and escalated to human staff when a request went beyond what the scripted booking, rescheduling, and cancellation flows could handle.
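
The validation step that precedes any scheduling call might look like the sketch below. It is an illustration under stated assumptions: the field names (`action`, `caller_name`, `test_type`, `start_time`) and the rule that cancellations need only a name are hypothetical, and no Cal.com or n8n call is made, since the write-up does not show the actual request schema.

```python
from datetime import datetime

# Hypothetical field names; the real workflow's schema is not documented here.
REQUIRED_FIELDS = ("caller_name", "test_type", "start_time")
ALLOWED_ACTIONS = {"book", "reschedule", "cancel"}


def validate_request(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    action = payload.get("action")
    if action not in ALLOWED_ACTIONS:
        errors.append(f"unknown action: {action!r}")

    # Assumption: a cancellation only needs the caller's name.
    required = ("caller_name",) if action == "cancel" else REQUIRED_FIELDS
    for field in required:
        if not str(payload.get(field) or "").strip():
            errors.append(f"missing field: {field}")

    start = payload.get("start_time")
    if start:
        try:
            datetime.fromisoformat(start)
        except (TypeError, ValueError):
            errors.append("start_time is not ISO 8601")
    return errors
```

An empty result would let the workflow proceed to the scheduling request; a non-empty one tells the agent which single question to ask next, or that the call should be handed to staff.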

Hybrid Reasoning & Control

From a modeling perspective, the system used the language model for intent recognition, natural language understanding, and response generation, while deterministic policies enforced state transitions for appointment flows, privacy handling, and human escalation. This hybrid design ensured that the agent remained both flexible and predictable across a wide range of caller behaviors.
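
The split between model-driven understanding and rule-driven transitions can be sketched as a small state machine. In this illustration the intent label is assumed to come from the language model, while `next_state` is plain deterministic code. The state names, the intent labels, and the choice to escalate on unrecognized intents are assumptions, not details taken from the deployed agent.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    GREET = "greet"
    CLASSIFY = "classify_intent"
    VERIFY = "verify_name_and_test"
    BOOK = "book"
    RESCHEDULE = "reschedule"
    CANCEL = "cancel"
    ANSWER_FAQ = "answer_faq"
    ESCALATE = "escalate"


# Hypothetical intent labels produced by the language model.
INTENT_TO_STATE = {
    "book": State.BOOK,
    "reschedule": State.RESCHEDULE,
    "cancel": State.CANCEL,
}
SENSITIVE_INTENTS = {"test_results"}  # always handed to a human


def next_state(state: State, intent: Optional[str] = None,
               verified: bool = False) -> State:
    """Deterministic transition; the model never chooses the next state."""
    if intent in SENSITIVE_INTENTS:
        return State.ESCALATE
    if state is State.GREET:
        return State.CLASSIFY
    if state is State.CLASSIFY:
        if intent == "faq":
            return State.ANSWER_FAQ
        if intent in INTENT_TO_STATE:
            return State.VERIFY
        return State.ESCALATE  # assumption: unknown intents go to staff
    if state is State.VERIFY:
        if not verified:
            return State.VERIFY  # keep asking, one question at a time
        return INTENT_TO_STATE.get(intent, State.ESCALATE)
    return state  # terminal states stay where they are
```

Keeping the transitions in code means the same intent and verification status always lead to the same next step, regardless of how the caller phrased the request.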

Testing, Evaluation & Results

The agent was evaluated using Retell AI’s call sandbox and analytics dashboard, with multiple rounds of iterative testing and refinement. Evaluation focused on both qualitative behavior and quantitative metrics.

Post-call sentiment analysis of voice AI conversation in Retell AI dashboard

Testing & Evaluation

Measured caller sentiment across test calls to quantify caller experience and tone.
Monitored latency and token usage to ensure responses remained within acceptable real-time thresholds.
Assessed call classification reliability for booking, rescheduling, cancellations, and general questions.
Evaluated task completion accuracy, intent classification stability, conversational flow, and error recovery under hesitant or ambiguous caller inputs.

Across dozens of test calls, the agent achieved consistent task classification, sub-two-second response latency, and natural conversation with minimal misunderstandings. These metrics guided iterative refinements to the workflow logic and system prompt.
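
Metrics of this kind can be computed from per-call records along the lines of the sketch below. The `CallRecord` fields and the `summarize` function are hypothetical; the project read its figures from the Retell AI dashboard, and this only illustrates how intent accuracy, task completion rate, and a 95th-percentile latency could be derived from exported call data.

```python
import math
from dataclasses import dataclass


@dataclass
class CallRecord:
    expected_intent: str    # label assigned by the tester
    predicted_intent: str   # label the agent acted on
    task_completed: bool    # did the call reach the intended outcome
    latencies_s: list       # per-turn response latencies in seconds


def summarize(calls: list) -> dict:
    """Aggregate accuracy, completion rate, and p95 latency over test calls."""
    if not calls:
        raise ValueError("no calls to summarize")
    latencies = sorted(t for call in calls for t in call.latencies_s)
    if not latencies:
        raise ValueError("no latency samples recorded")
    # Nearest-rank percentile: smallest value covering 95% of samples.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    correct = sum(c.expected_intent == c.predicted_intent for c in calls)
    completed = sum(c.task_completed for c in calls)
    return {
        "intent_accuracy": correct / len(calls),
        "task_completion_rate": completed / len(calls),
        "p95_latency_s": latencies[p95_index],
    }
```

A latency target such as "under two seconds" can then be checked against `p95_latency_s` rather than the mean, so that occasional slow turns are not hidden by many fast ones.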

Results

As a proof of concept, the system demonstrated that a fully automated voice agent can manage medical lab scheduling tasks and routine inquiries with reliability and professionalism. Key outcomes included:

Accurate execution of booking, rescheduling, and cancellation workflows.
Natural conversation with clear, concise responses and low confusion rates.
Robust, compliance-aware behavior guided entirely by the system prompt and approved knowledge base.
Consistent call outcomes confirmed through Retell AI analytics, with a stable distribution of successful task completion events.

From a data science perspective, the project demonstrates practical experience in building real-world voice AI systems that combine language models, retrieval-based knowledge integration, workflow automation, and evaluation of model behavior in real-time settings.

Impact & Actionable Insights

If deployed at scale, this type of voice AI system would provide measurable operational benefits and serve as a reusable pattern for AI-driven customer service in healthcare and beyond.

Operational Efficiency: Reduce staff call volume by automating routine scheduling and FAQ calls, allowing front-desk staff to focus on higher-value interactions.
Extended Availability: Enable patients to book or reschedule appointments outside standard business hours while maintaining consistent quality of interaction.
Consistency & Compliance: Guardrails enforce reproducible, policy-aligned behavior and prevent deviation from approved scripts and knowledge sources.
Scalability: The architecture can be extended to support insurance checks, multi-location routing, multilingual support, and other healthcare workflows.
Foundation for AI-Powered Customer Service: The system provides a reusable blueprint for designing intelligent assistants that combine statistical reasoning, rule-based logic, and generative AI, aligned with the evolving responsibilities of modern data scientists and AI engineers.