Overview
Architected and deployed a fully local, privacy-preserving hospital front-desk voice AI agent on NVIDIA Jetson AGX Orin. Every component of the conversational pipeline — speech recognition, language understanding, response generation, and text-to-speech — runs entirely on-device. No patient data ever reaches the cloud. This proof-of-concept demonstrated the feasibility of state-of-the-art conversational AI at hospital reception and wayfinding kiosks, providing patients with a natural spoken interface while maintaining full healthcare data compliance.
The Challenge
Healthcare environments have strict data privacy requirements that make cloud-dependent AI deeply problematic. Traditional voice AI systems require constant connectivity and transmit patient interactions to remote servers — a HIPAA liability and a practical bottleneck in connectivity-constrained facilities. Deploying state-of-the-art conversational AI entirely on a resource-constrained edge device requires significant optimization work for every component in the pipeline. Speech recognition, LLM inference, and neural TTS each carry substantial compute demands that must be carefully balanced to achieve acceptable real-time latency on embedded hardware without any cloud offload.
Technical Approach
- NVIDIA Jetson AGX Orin as the primary compute platform — chosen for its GPU acceleration, power efficiency, and edge deployment form factor
- Faster-Whisper: ported and optimized for Jetson ARM/GPU architecture, enabling accurate on-device speech recognition with low latency
- Kokoro TTS: built and optimized for Jetson, achieving natural-sounding low-latency neural speech synthesis entirely on-device
- Ollama for local LLM inference — providing language understanding and context-aware response generation without any external API calls
- Livekit for audio capture, session management, and the real-time audio pipeline connecting microphone input to the processing chain
- Full air-gap capability — the complete system operates without any network connectivity, enabling deployment in the most restrictive environments
- Demonstrated use cases for hospital reception assistance and wayfinding kiosks with contextual knowledge of facility layout and services