Ashok Sravanam - Machine Learning Engineer

Vocal AI is a real-time feedback experience for singers, created during the UC Berkeley AI Hackathon to deliver practical coaching signals with low response latency. I built the end-to-end prototype flow for demo day: model integration, the wiring of the real-time inference loop, and the product-facing feedback surface.

Problem and Constraints

Singers needed immediate, understandable feedback while practicing, not delayed analysis after recordings.

The hackathon timeline forced fast technical decisions and a stable demo path.
Feedback had to be immediate enough to use during practice, not after a session.
Model output had to be translated into plain coaching cues for non-technical users.

My Approach

Considered: I considered running batch analysis on recorded clips to simplify model execution.

Chose: I chose a near real-time inference loop so users could adjust technique in-session.

Rejected: I rejected post-record analysis because delayed feedback weakens practical training value.
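The difference between post-record analysis and an in-session loop can be sketched as a frame generator that hands each window to the analyzer as soon as it fills, instead of waiting for the whole clip. This is a minimal illustration, not the prototype's code: `frames`, `run_live`, the frame and hop sizes, and the peak-amplitude analyzer are all hypothetical stand-ins.

```python
from typing import Callable, Iterable, Iterator, List


def frames(samples: Iterable[float], frame_size: int, hop: int) -> Iterator[List[float]]:
    """Yield overlapping frames as soon as enough samples have arrived."""
    if not 1 <= hop <= frame_size:
        raise ValueError("hop must be between 1 and frame_size")
    buffer: List[float] = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) == frame_size:
            yield list(buffer)
            # Drop the oldest `hop` samples and keep the rest as overlap.
            del buffer[:hop]


def run_live(
    samples: Iterable[float],
    analyze: Callable[[List[float]], float],
    frame_size: int = 4,
    hop: int = 2,
) -> Iterator[float]:
    """Analyze each frame when it completes, not after the full recording."""
    for frame in frames(samples, frame_size, hop):
        yield analyze(frame)


def peak(frame: List[float]) -> float:
    """Hypothetical analyzer: a real system would call the model here."""
    return max(abs(x) for x in frame)


if __name__ == "__main__":
    stream = [0.0, 0.1, -0.3, 0.2, 0.5, -0.1, 0.0, 0.4]
    for result in run_live(stream, peak):
        print(result)
```

With a smaller hop the analyzer runs more often, which shortens the delay before feedback at the cost of more inference calls per second.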

Considered: I considered exposing raw model scores directly in the UI.

Chose: I chose interpretable coaching cues tied to pitch, timing, and vocal control behaviors.

Rejected: I rejected raw score output because users could not act on it quickly without translation.
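As an illustration of turning a raw number into a cue a singer can act on, the sketch below converts a pitch estimate into a deviation in cents (the standard 1200 · log2 ratio) and then into a short instruction. The tolerance value, the cue wording, and the function names are assumptions made for this example, not the mapping used in the prototype.

```python
import math


def cents_off(sung_hz: float, target_hz: float) -> float:
    """Signed pitch deviation in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(sung_hz / target_hz)


def pitch_cue(sung_hz: float, target_hz: float, tolerance_cents: float = 25.0) -> str:
    """Map a raw pitch estimate to a plain-language coaching cue.

    The 25-cent tolerance and the phrasing are illustrative assumptions.
    """
    delta = cents_off(sung_hz, target_hz)
    if abs(delta) <= tolerance_cents:
        return "On pitch, hold it there."
    if delta > 0:
        return "Sharp, ease the pitch down."
    return "Flat, lift the pitch up."


if __name__ == "__main__":
    # Target is A4 (440 Hz); the sung values are made-up examples.
    for sung in (440.0, 466.16, 415.30):
        print(sung, "->", pitch_cue(sung, 440.0))
```

The singer never sees the cents value; it only decides which cue is shown.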

Considered: I considered spending most of the available time on feature depth.

Chose: I chose architecture stability and demo reliability first, then layered user-facing refinements.

Rejected: I rejected high feature churn because an unstable flow would fail under presentation pressure.

System Design

Audio input is captured, transformed for inference, and mapped into practical coaching signals before it reaches the user interface. This keeps latency low while making technical output understandable in live practice.
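The capture, transform, inference, and cue stages described above can be sketched as a small staged pipeline that also records how long each stage takes per chunk, which is one way to keep an eye on latency. Everything here is a placeholder under stated assumptions: the stage names, the RMS feature, the `stand_in_model` scoring rule, and the thresholds are hypothetical, and the real system would call its trained model in the inference stage.

```python
import math
import time
from typing import Callable, Dict, List, Sequence, Tuple

Stage = Callable[[object], object]


def run_pipeline(chunk: object, stages: Sequence[Tuple[str, Stage]]) -> Tuple[object, Dict[str, float]]:
    """Pass one audio chunk through each stage and time every stage in milliseconds."""
    timings: Dict[str, float] = {}
    value = chunk
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings[name] = (time.perf_counter() - start) * 1000.0
    return value, timings


def to_features(chunk: List[float]) -> Dict[str, float]:
    """Transform stage: reduce raw samples to a feature (RMS energy here)."""
    return {"rms": math.sqrt(sum(x * x for x in chunk) / len(chunk))}


def stand_in_model(features: Dict[str, float]) -> Dict[str, float]:
    """Inference stage placeholder: NOT the real model, just a fixed rule."""
    return {"volume_score": min(1.0, features["rms"] / 0.5)}


def to_cue(scores: Dict[str, float]) -> str:
    """Cue stage: turn a score into wording a singer can act on."""
    return "Sing out more." if scores["volume_score"] < 0.4 else "Good projection."


STAGES = [("transform", to_features), ("infer", stand_in_model), ("cue", to_cue)]

if __name__ == "__main__":
    cue, timings = run_pipeline([0.1, -0.1, 0.1, -0.1], STAGES)
    print(cue, timings)
```

Timing each stage separately shows whether the model call or the surrounding glue code dominates the delay before the cue appears.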

Implementation Highlights

Low-latency analysis flow optimized for near real-time response.
Feedback loop that translates model output into actionable singing cues.
Hackathon-ready deployment path built to keep the demo stable under pressure.
Clear UX framing so technical output stays useful to non-technical users.

Tech Stack

TensorFlow
VAPI API
Docker
AWS

Outcomes

Won 'Most Ambitious VAPI Project' at the UC Berkeley AI Hackathon 2025.
Demonstrated a responsive feedback loop with practical UX direction.
Validated a rapid prototyping approach for real-time AI products.

Delivered a low-latency feedback loop for live practice.
Translated model output into actionable coaching cues.
Integrated real-time inference under hackathon pressure.
Shaped non-technical UX around technical signals.

Retrospective

Problem: Fast model output alone was not enough if users could not act on it quickly.

What I tried: I prioritized low-latency response and clear feedback phrasing so users could adjust their performance immediately.

What I'd do differently: I would instrument richer per-session analytics earlier to measure learning improvement over time.
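One possible shape for that instrumentation is a per-session log that stores each frame's pitch error and reports a summary that can be compared across sessions. This is a sketch under assumptions: `SessionLog`, the choice of mean absolute error in cents as the metric, and the sample values are hypothetical and were not part of the prototype.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class SessionLog:
    """Collects absolute pitch errors (in cents) for one practice session."""

    session_id: str
    abs_cents_errors: List[float] = field(default_factory=list)

    def record(self, cents_error: float) -> None:
        # Store the magnitude so sharp and flat errors do not cancel out.
        self.abs_cents_errors.append(abs(cents_error))

    def mean_abs_error(self) -> float:
        return mean(self.abs_cents_errors) if self.abs_cents_errors else 0.0


def improvement(earlier: SessionLog, later: SessionLog) -> float:
    """Positive result means the later session was closer to pitch on average."""
    return earlier.mean_abs_error() - later.mean_abs_error()


if __name__ == "__main__":
    first, second = SessionLog("session-1"), SessionLog("session-2")
    for err in (40.0, -20.0):
        first.record(err)
    for err in (10.0, -10.0):
        second.record(err)
    print(improvement(first, second))
```

A single averaged metric is crude, but logging it from the first session onward would give a baseline for judging whether the cues lead to measurable improvement.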

Vocal AI - Real-Time Singing Feedback