In Progress

StoryAware

StoryAware is a domain-aware book intelligence platform that keeps answers grounded in source text through citations, structured retrieval depth, and graph-aware context. I designed and built the core architecture end to end: the ingestion flow, RAG orchestration, GraphRAG pipeline, auth boundaries, and a deployment-ready backend structure.


Problem and Constraints

Generic chat systems hallucinated details and retrieved at the wrong depth for dense philosophical and structured texts.

  • I had to keep answers source-grounded while supporting both micro-detail and high-level conceptual questions.
  • The retrieval path had to stay explainable so users could trace every answer back to evidence.
  • I needed an architecture that could add graph workflows without destabilizing core QA performance.

My Approach

Considered: I evaluated a single-vector retrieval pipeline for all question types.

Chose: I chose hierarchical retrieval with intent routing so detail, structural, and conceptual prompts get the right evidence shape.

Rejected: I rejected one-size-fits-all retrieval because it produced weak grounding for precise verse-style lookups and noisy context for broad synthesis.
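To make the routing idea concrete, here is a minimal sketch of mapping a question's intent to the retrieval levels it should query. The keyword cues, the `Intent` enum, and the `DEPTH_BY_INTENT` mapping are all hypothetical stand-ins for illustration; the actual router may well use a classifier or a model call instead.

```python
from enum import Enum


class Intent(Enum):
    DETAIL = "detail"
    STRUCTURAL = "structural"
    CONCEPTUAL = "conceptual"
    MIXED = "mixed"


# Hypothetical mapping from intent to the retrieval levels to query.
DEPTH_BY_INTENT = {
    Intent.DETAIL: ["chunk"],
    Intent.STRUCTURAL: ["chapter"],
    Intent.CONCEPTUAL: ["chapter", "book"],
    Intent.MIXED: ["chunk", "chapter", "book"],
}

# Toy keyword cues (assumed for this sketch, not taken from the project).
CUES = {
    Intent.DETAIL: ("verse", "quote", "exact", "which line"),
    Intent.STRUCTURAL: ("chapter", "section", "outline", "structure"),
    Intent.CONCEPTUAL: ("why", "theme", "meaning", "compare"),
}


def classify_intent(question: str) -> Intent:
    """Return the single matching intent, or MIXED when zero or several match."""
    q = question.lower()
    matched = [
        intent for intent, cues in CUES.items() if any(cue in q for cue in cues)
    ]
    return matched[0] if len(matched) == 1 else Intent.MIXED


def route(question: str) -> list[str]:
    """Pick the retrieval levels whose evidence shape fits the question."""
    return DEPTH_BY_INTENT[classify_intent(question)]
```

Falling back to `MIXED` when the cues are ambiguous trades some context noise for coverage, which seems the safer default when grounding matters more than brevity.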

Considered: I considered generating answers first and attaching citations afterward.

Chose: I chose evidence assembly before generation to force source-first reasoning.

Rejected: I rejected post-hoc citation attachment because it increases hallucination risk and weakens traceability.
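A rough sketch of what evidence-first assembly can look like: the prompt is built only from numbered passages, generation is refused when no evidence exists, and citations in the answer are checked against the passages that were actually supplied. The `Evidence` type, the `[n]` citation format, and both function names are assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    ref: str   # hypothetical source locator, e.g. "ch2:v47"
    text: str  # passage text retrieved before generation


def build_prompt(question: str, evidence: list[Evidence]) -> str:
    """Assemble the prompt from evidence first; never generate without it."""
    if not evidence:
        raise ValueError("refusing to generate without evidence")
    lines = [f"[{i}] ({e.ref}) {e.text}" for i, e in enumerate(evidence, start=1)]
    return (
        "Answer using only the numbered passages below. "
        "Cite passages as [n] after each claim.\n\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}\n"
    )


def invalid_citations(answer: str, evidence_count: int) -> list[int]:
    """Return cited numbers that do not correspond to any supplied passage."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return sorted(n for n in cited if not 1 <= n <= evidence_count)
```

Because the citation numbers are assigned before generation, any number outside the supplied range is detectable mechanically, which is the traceability that post-hoc attachment gives up.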

Considered: I considered open graph extraction from free-form model output.

Chose: I chose ontology-constrained GraphRAG extraction with evidence links and refinement loops.

Rejected: I rejected unconstrained extraction because it creates brittle relations that are hard to validate against source text.
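As an illustration of the constraint, the sketch below accepts an extracted relation only if its type signature appears in an ontology and its evidence span is found verbatim in the source. The ontology entries, the `Relation` fields, and the `validate` function are hypothetical; the real schema is not described here.

```python
from dataclasses import dataclass

# Hypothetical ontology: allowed (subject_type, predicate, object_type) triples.
ONTOLOGY = {
    ("Character", "TEACHES", "Character"),
    ("Character", "EMBODIES", "Concept"),
    ("Chapter", "INTRODUCES", "Concept"),
}


@dataclass(frozen=True)
class Relation:
    subject: str
    subject_type: str
    predicate: str
    object: str
    object_type: str
    evidence: str  # verbatim span the extractor claims supports the relation


def validate(rel: Relation, source_text: str) -> list[str]:
    """Return the reasons a relation is rejected; an empty list means accepted."""
    problems = []
    if (rel.subject_type, rel.predicate, rel.object_type) not in ONTOLOGY:
        problems.append("relation not allowed by ontology")
    if not rel.evidence.strip() or rel.evidence not in source_text:
        problems.append("evidence span not found in source")
    return problems
```

One way to build a refinement loop on top of this, again as an assumption, is to feed the rejection reasons back to the extractor and retry a bounded number of times before dropping the relation.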

System Design


Each question is routed by intent, matched to chunk/chapter/book retrieval depth, and assembled into evidence before generation. Graph context can be injected only after evidence selection, so outputs stay grounded while still supporting concept-level reasoning.

This project runs on my self-hosted stack: HAProxy, DNS, TLS, and PostgreSQL.

Implementation Highlights

  • Hierarchical retrieval flow that combines chunk-, chapter-, and book-level evidence.
  • Question-type routing for detail, structural, conceptual, and mixed queries.
  • Graph-aware extraction path constrained by ontology and evidence links.
  • Admin-governed access model with curated corpus and session-aware conversations.
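The hierarchical retrieval flow in the first highlight can be sketched as a per-level top-k search whose results are merged finest level first. This is an in-memory toy under stated assumptions: the `Passage` type, the `quota` parameter, and the plain cosine ranking are illustrative, and the project itself stores vectors in pgvector rather than Python lists.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    level: str                    # "chunk", "chapter", or "book"
    ref: str                      # hypothetical locator for citation
    embedding: tuple[float, ...]  # precomputed vector for this passage


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: tuple[float, ...],
    passages: list[Passage],
    quota: dict[str, int],
) -> list[Passage]:
    """Return up to quota[level] best passages per level, finest level first."""
    selected: list[Passage] = []
    for level in ("chunk", "chapter", "book"):
        k = quota.get(level, 0)
        candidates = [p for p in passages if p.level == level]
        candidates.sort(key=lambda p: cosine(query_vec, p.embedding), reverse=True)
        selected.extend(candidates[:k])
    return selected
```

A router can express a question type purely through `quota`: for example `{"chunk": 3}` for a detail lookup, or `{"chapter": 2, "book": 1}` for a conceptual question.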

Tech Stack

FastAPI, PostgreSQL, pgvector, Gemini, PyJWT, Docker

Outcomes

  • Answers are grounded in source-first context, which reduces hallucination risk.
  • Retrieval quality improves by matching abstraction level to question type.
  • Architecture supports extending into graph workflows without losing traceability.
  • Routed questions by intent before generation.
  • Grounded answers with hierarchical retrieval layers.
  • Constrained GraphRAG relations with evidence links.
  • Governed corpus access with session-aware conversations.

Retrospective

Problem: A single-vector-index approach could not reliably serve both micro-fact and big-picture questions.

What I tried: I introduced multi-level retrieval and question routing before generation so evidence shape matches user intent.

What I'd do differently: I would add a dedicated automated evaluation harness earlier to quantify grounding quality continuously.
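One possible starting point for such a harness is a citation-coverage score: the fraction of answer sentences that carry at least one citation, all of which point at supplied evidence. This is a sketch of one crude metric, not the evaluation the project uses; the `[n]` citation format, the sentence splitting, and both function names are assumptions.

```python
import re


def grounding_score(answer: str, evidence_count: int) -> float:
    """Fraction of sentences with at least one citation, all of them valid.

    Assumes citations look like [n] and appear before the sentence-ending
    punctuation, e.g. "Duty matters [1]."
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    if not sentences:
        return 0.0

    def grounded(sentence: str) -> bool:
        ids = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        return bool(ids) and all(1 <= n <= evidence_count for n in ids)

    return sum(grounded(s) for s in sentences) / len(sentences)


def evaluate(cases: list[tuple[str, int]]) -> float:
    """Mean grounding score over (answer, evidence_count) pairs."""
    if not cases:
        return 0.0
    return sum(grounding_score(a, n) for a, n in cases) / len(cases)
```

Coverage alone does not show that a cited passage actually supports the claim, so a fuller harness would likely pair it with a support check against the passage text.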