Ashok Sravanam - Machine Learning Engineer

StoryAware is a domain-aware book intelligence platform built to keep answers grounded in source text with citations, structured retrieval depth, and graph-aware context. I designed and built the core architecture end-to-end: ingestion flow, RAG orchestration, GraphRAG pipeline, auth boundaries, and deployment-ready backend structure.

Problem and Constraints

Generic chat systems were hallucinating details and missing the right retrieval depth for dense philosophical and structured texts.

I had to keep answers source-grounded while supporting both micro-detail and high-level conceptual questions.
The retrieval path had to stay explainable so users could trace every answer back to evidence.
I needed an architecture that could add graph workflows without destabilizing core QA performance.

My Approach

Considered: I evaluated a single-vector retrieval pipeline for all question types.

Chose: I chose hierarchical retrieval with intent routing so detail, structural, and conceptual prompts get the right evidence shape.

Rejected: I rejected one-size-fits-all retrieval because it produced weak grounding for precise verse-style lookups and noisy context for broad synthesis.
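As a rough illustration of intent routing, the sketch below maps a question to one of the four question types and then to the retrieval depths it should draw from. The cue words, the `classify` function, and the `DEPTHS` table are hypothetical and not taken from the StoryAware codebase; a production router could just as well use a trained classifier or a model call.

```python
from enum import Enum


class Intent(Enum):
    DETAIL = "detail"
    STRUCTURAL = "structural"
    CONCEPTUAL = "conceptual"
    MIXED = "mixed"


# Hypothetical cue words per intent (assumption, for illustration only).
CUES = {
    Intent.DETAIL: ("verse", "quote", "exact"),
    Intent.STRUCTURAL: ("chapter", "section", "outline"),
    Intent.CONCEPTUAL: ("theme", "meaning", "why"),
}

# Retrieval depths each intent draws from, most relevant first.
DEPTHS = {
    Intent.DETAIL: ["chunk"],
    Intent.STRUCTURAL: ["chapter", "chunk"],
    Intent.CONCEPTUAL: ["book", "chapter"],
    Intent.MIXED: ["chunk", "chapter", "book"],
}


def classify(question: str) -> Intent:
    """Return the single matching intent, or MIXED when zero or several match."""
    q = question.lower()
    matched = [
        intent for intent, cues in CUES.items()
        if any(cue in q for cue in cues)
    ]
    return matched[0] if len(matched) == 1 else Intent.MIXED


def depths_for(question: str) -> list[str]:
    """Look up which retrieval levels to query for this question."""
    return DEPTHS[classify(question)]
```

Falling back to `MIXED` when the cues are ambiguous means an uncertain question gets the broadest evidence shape rather than a wrongly narrow one.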

Considered: I considered generating answers first and attaching citations afterward.

Chose: I chose evidence assembly before generation to force source-first reasoning.

Rejected: I rejected post-hoc citation attachment because it increases hallucination risk and weakens traceability.
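One way to picture evidence-first generation: the passages are numbered and placed in the prompt before the model is called, and any citation in the answer that does not point at a supplied passage is flagged. This is a minimal sketch under assumptions; the `Evidence` type, the `[n]` citation format, and both function names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str  # human-readable location, e.g. a chapter reference
    text: str    # the retrieved passage itself


def build_prompt(question: str, evidence: list[Evidence]) -> str:
    """Number each passage so the model can only cite what it was given."""
    numbered = [
        f"[{i}] ({ev.source}) {ev.text}"
        for i, ev in enumerate(evidence, start=1)
    ]
    return (
        "Answer using only the numbered passages. Cite them as [n].\n\n"
        + "\n".join(numbered)
        + f"\n\nQuestion: {question}"
    )


def invalid_citations(answer: str, evidence: list[Evidence]) -> set[int]:
    """Return cited numbers that do not correspond to a supplied passage."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return {n for n in cited if not 1 <= n <= len(evidence)}
```

Because the citation numbers are fixed before generation, every `[n]` in the answer either resolves to a known passage or is detectably wrong, which is the traceability property post-hoc attachment cannot guarantee.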

Considered: I considered open graph extraction from free-form model output.

Chose: I chose ontology-constrained GraphRAG extraction with evidence links and refinement loops.

Rejected: I rejected unconstrained extraction because it creates brittle relations that are hard to validate against source text.
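The sketch below shows what ontology-constrained validation could look like: a candidate relation is kept only if its relation name and entity types match the ontology and its evidence span appears verbatim in the source. The ontology entries, the `Triple` fields, and `validate` are hypothetical examples, not the project's actual schema.

```python
from dataclasses import dataclass

# Hypothetical ontology: relation name -> (subject type, object type).
ONTOLOGY = {
    "teaches": ("Person", "Concept"),
    "appears_in": ("Person", "Chapter"),
    "contrasts_with": ("Concept", "Concept"),
}


@dataclass(frozen=True)
class Triple:
    subject: str
    subject_type: str
    relation: str
    obj: str
    object_type: str
    evidence: str  # verbatim span the extractor claims supports the relation


def validate(triple: Triple, source_text: str) -> list[str]:
    """Return a list of problems; an empty list means the triple is accepted."""
    problems = []
    expected = ONTOLOGY.get(triple.relation)
    if expected is None:
        problems.append(f"unknown relation: {triple.relation}")
    elif (triple.subject_type, triple.object_type) != expected:
        problems.append(
            f"{triple.relation} expects {expected}, "
            f"got {(triple.subject_type, triple.object_type)}"
        )
    if not triple.evidence or triple.evidence not in source_text:
        problems.append("evidence span not found in source text")
    return problems
```

In a refinement loop, the returned problem list could be fed back to the extractor as feedback for a retry, while triples with no problems are written to the graph.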

System Design


Each question is routed by intent, matched to chunk/chapter/book retrieval depth, and assembled into evidence before generation. Graph context can be injected only after evidence selection, so outputs stay grounded while still supporting concept-level reasoning.

This project runs on my self-hosted stack: HAProxy, DNS, TLS, and PostgreSQL.

Implementation Highlights

Hierarchical retrieval flow that combines chunk-, chapter-, and book-level evidence.
Question-type routing for detail, structural, conceptual, and mixed queries.
Graph-aware extraction path constrained by ontology and evidence links.
Admin-governed access model with curated corpus and session-aware conversations.
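To make the hierarchical retrieval idea concrete, here is a small in-memory sketch that ranks passages by cosine similarity and takes a separate top-k per level. It is an assumption-laden illustration: the `Passage` type, `retrieve`, and the toy embeddings are hypothetical, and in the real stack the ranking would presumably run inside pgvector rather than in Python.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    level: str                    # "chunk", "chapter", or "book"
    ref: str                      # identifier used for citation
    embedding: tuple[float, ...]  # toy vector; real ones come from a model


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: tuple[float, ...],
    passages: list[Passage],
    per_level: dict[str, int],
) -> list[Passage]:
    """Take the top-k passages for each level, in the order levels are given."""
    selected: list[Passage] = []
    for level, k in per_level.items():
        candidates = [p for p in passages if p.level == level]
        candidates.sort(key=lambda p: cosine(query, p.embedding), reverse=True)
        selected.extend(candidates[:k])
    return selected
```

Keeping a separate budget per level stops long chapter or book summaries from crowding out the precise chunks that detail questions depend on.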

Tech Stack

FastAPI, PostgreSQL, pgvector, Gemini, PyJWT, Docker

Outcomes

Answers are grounded in source-first context, which reduces hallucination risk.
Retrieval quality improves by matching abstraction level to question type.
Architecture supports extending into graph workflows without losing traceability.

Routed questions by intent before generation.
Grounded answers with hierarchical retrieval layers.
Constrained GraphRAG relations with evidence links.
Governed corpus access with session-aware conversations.

Retrospective

Problem: A single-vector-index approach could not reliably serve both micro-fact and big-picture questions.

What I tried: I introduced multi-level retrieval and question routing before generation so evidence shape matches user intent.

What I'd do differently: I would add a dedicated automated evaluation harness earlier to quantify grounding quality continuously.
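A starting point for such a harness could be a single grounding metric: the fraction of answer sentences that cite at least one supplied passage. The sketch assumes answers cite evidence as `[n]` placed before the sentence's closing punctuation; the function name and that citation convention are assumptions, and a real harness would need stronger checks, such as whether the cited passage actually supports the claim.

```python
import re


def grounding_score(answer: str, num_passages: int) -> float:
    """Fraction of sentences citing at least one valid passage number as [n]."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0
    grounded = 0
    for sentence in sentences:
        cited = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        # Only citations that point at a supplied passage count as grounding.
        if any(1 <= n <= num_passages for n in cited):
            grounded += 1
    return grounded / len(sentences)
```

Tracked over a fixed question set on every change, a score like this would show whether retrieval or prompt changes make answers more or less tied to their evidence.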
