Overview
Developed a real-time voice assistant for smart hospital rooms using Whisper and BERT, running entirely on edge devices. The system listens ambiently for verbal distress from patients and supports voice-triggered nurse calls, and it keeps audio private because no audio ever leaves the room. This always-on layer extends caregiving capacity without requiring additional staff and gives patients a natural, spoken way to reach their care team immediately.
The Challenge
Hospital rooms need always-on intelligence without constant manual monitoring. Cloud-based voice processing introduces unacceptable latency and privacy risks in healthcare: patient audio sent to remote servers is both a compliance liability and a patient trust issue. The system therefore needed to run entirely locally, detect specific verbal cues accurately in noisy hospital environments (medical equipment hum, ambient hallway noise, and overlapping voices), and integrate with existing nurse call infrastructure. Running continuous, accurate speech recognition on resource-constrained edge hardware, with no cloud fallback, required significant optimization work.
Technical Approach
- Whisper (OpenAI) for continuous on-device speech recognition, optimized for low-latency, always-on processing
- BERT-based intent classifier for keyword and phrase detection, distinguishing meaningful clinical utterances from ambient conversation
- Local inference pipeline with no cloud audio upload — all speech data processed and discarded on-device
- Detection categories including patient distress calls, verbal abuse toward staff, specific patient request phrases, and emergency keywords
- Always-on low-power listening mode using voice activity detection to trigger full inference only when speech is present
- Integration triggers wired into the nurse call system and staff alert notification infrastructure for immediate escalation
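
The control flow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the deployed implementation: the energy-based voice activity check, the threshold values, the `Alert` type, the category names, and the `keyword_classify` stand-in are all hypothetical, and the `transcribe` and `classify` callables mark where Whisper and the BERT classifier would plug in.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional, Sequence

Frame = Sequence[float]  # one short chunk of audio samples in [-1.0, 1.0]

# Hypothetical tuning values, chosen only for illustration.
RMS_THRESHOLD = 0.02   # frames at or above this energy count as speech
HANGOVER_FRAMES = 2    # silent frames tolerated before a segment is closed


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_segments(frames: Iterable[Frame],
                    threshold: float = RMS_THRESHOLD,
                    hangover: int = HANGOVER_FRAMES) -> Iterator[List[Frame]]:
    """Cheap always-on gate: yield runs of speech frames, skipping silence."""
    segment: List[Frame] = []
    silent = 0
    for frame in frames:
        if rms(frame) >= threshold:
            segment.append(frame)
            silent = 0
        elif segment:
            silent += 1
            if silent > hangover:
                yield segment
                segment, silent = [], 0
    if segment:
        yield segment


@dataclass
class Alert:
    category: str    # e.g. "distress"; category names here are hypothetical
    transcript: str


def keyword_classify(text: str) -> Optional[str]:
    """Stand-in for the intent classifier: simple keyword lookup."""
    keywords = {"help": "distress", "nurse": "patient_request"}
    lowered = text.lower()
    for word, category in keywords.items():
        if word in lowered:
            return category
    return None


def process_stream(frames: Iterable[Frame],
                   transcribe: Callable[[List[Frame]], str],
                   classify: Callable[[str], Optional[str]],
                   notify: Callable[[Alert], None]) -> int:
    """Run full inference only on speech segments; return the alert count."""
    alerts = 0
    for segment in speech_segments(frames):
        text = transcribe(segment)   # speech recognition would run here
        del segment                  # audio is dropped once transcribed
        category = classify(text)    # intent classification would run here
        if category is not None:
            notify(Alert(category, text))  # e.g. forward to the nurse call system
            alerts += 1
    return alerts
```

Passing the recognizer, classifier, and notifier in as callables keeps the gating logic testable without loading any model, and makes it explicit that only text and a category, never audio, reach the notification step in this sketch.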