
📚 Bookshelf

Multi-Agent Inventory Planner for Malaysian Bookstores

Multi-Agent System · Google ADK · A2A Protocol · Gemini 2.5 Flash · FastAPI · pandas

📋 Project Overview & Problem Statement

Challenge: Malaysian SME bookstore owners sit on tens of thousands of sales rows but rarely turn the data into stocking decisions. Pareto analysis, dead-stock detection, seasonal pre-stocking, and aging clearance each require hours of pivot-table work. Most owners run on intuition and miss obsolete syllabi (UPSR was abolished in 2021), dormant religious titles, and pre-school-season restocking windows.

Solution: Bookshelf is a 4-agent decision-support system where the owner asks any business question in plain Malaysian English and receives a focused brief grounded in their actual sales data. A deterministic pandas tool computes exact metrics; a Pydantic-typed Judge gates the research; a Content Builder turns the validated metrics into a 600-word actionable brief.

What the owner can do

🧠 Why a Multi-Agent System Beats a Chat-on-Document Tool

Tools like Google NotebookLM let an owner upload a CSV and ask questions. They are fast, polished, and free. Bookshelf is slower (~140 seconds per question vs sub-10 seconds) and operationally heavier. The trade is intentional: chat-on-document tools hit a wall on numerical aggregation and quality validation. Bookshelf is built for decisions worth thousands of ringgit, where exact numbers and a quality gate matter more than speed.

| Question type | Chat-on-document tool (e.g. NotebookLM) | Bookshelf multi-agent system |
|---|---|---|
| Exact aggregates ("RM impact of dropping these 8 SKUs") | Approximates from chunks, often wrong by 5–20% | pandas computes exactly |
| Ranked lists by computed metric ("top 10 by velocity × margin") | Cannot sort the whole 100k rows in one prompt | data_tool sorts deterministically before LLM sees it |
| Quality validation before recommendations | Single LLM call, no validation | Judge agent gates the brief; loops up to 3× on fail |
| Speed for casual exploration | Sub-10s response | ~140s end-to-end |
| Multimodal output (audio overview, mind map, quizzes) | Built-in | Not in Phase 1 |

Where Bookshelf is structurally better

The real value of building a system isn't beating NotebookLM at chat. It's that a system can do four things NotebookLM cannot:

The Phase 1 build proves the architecture. The wedges above (Phase 4+) are where the system pays back the complexity premium.

🏗️ Architecture & Execution Flow

    USER question
          │
          ▼
    ┌──────────────────────────────────────────────────┐
    │  Web App (FastAPI, port 8000)                    │
    │  Thin HTTP/SSE client; does NOT import ADK       │
    └──────────────────────────────────────────────────┘
          │  POST /run_sse
          ▼
    ┌──────────────────────────────────────────────────┐
    │  Orchestrator agent (port 8004)                  │
    │  SequentialAgent:                                │
    │    LoopAgent(max_iterations=3) [                 │
    │      Researcher → Judge → EscalationChecker      │
    │    ] → Content Builder                           │
    └──────────────────────────────────────────────────┘
          │  RemoteA2aAgent over HTTP (A2A protocol)
          ▼
    ┌─────────────┐  ┌──────────┐  ┌─────────────────┐
    │ Researcher  │  │ Judge    │  │ Content Builder │
    │ (port 8001) │  │ (8002)   │  │ (8003)          │
    │ pandas      │  │ Pydantic │  │ Markdown brief  │
    │ data_tool   │  │ verdict  │  │ + 6-class SKU   │
    └─────────────┘  └──────────┘  └─────────────────┘
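The web app is described as a thin HTTP/SSE client that posts to /run_sse. As a rough illustration of what such a client has to do with the response, here is a minimal sketch of parsing a Server-Sent Events stream with the standard library only. The payload field names (`author`, `text`) and the sample stream are hypothetical; the orchestrator's actual event schema is not shown on this page.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per Server-Sent Event.

    An SSE event is a run of `data:` lines terminated by a blank line.
    """
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # The SSE format drops one optional space after the colon.
            buffer.append(line[5:].removeprefix(" "))
        elif line == "" and buffer:
            yield json.loads("\n".join(buffer))
            buffer = []
        # Anything else (comments starting with ":", other fields) is ignored here.
    if buffer:  # stream ended without a trailing blank line
        yield json.loads("\n".join(buffer))


# Hypothetical stream: the field names below are illustrative only.
sample_stream = [
    'data: {"author": "researcher", "text": "metrics ready"}\n',
    "\n",
    ": keep-alive comment, ignored\n",
    'data: {"author": "judge", "text": "pass"}\n',
    "\n",
]
events = list(iter_sse_events(sample_stream))
print([e["author"] for e in events])
```

Keeping the parser this small is consistent with the stated design: the web app only relays events and never needs to import the agent framework.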

Why split it this way

🖥️ Application Features

🗨️ Single ask-anything bar

One text input. Owner types any inventory question, broad ("what should I do?") or specific ("should I drop Naruto Vol 5?"). No menu navigation, no upload step.

📊 Deterministic data tool

pandas computes per-SKU revenue, margin, velocity, Pareto rank, last-sale date, aging class, seasonal indices, and channel breakdown. The output is JSON-serialised so the Judge can validate it byte-for-byte.
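A minimal, dependency-free sketch of the kind of per-SKU roll-up described above. The project does this with pandas; plain Python is used here only so the example runs anywhere. The column names (`units`, `unit_price`, `unit_cost`), the tie-break rule, and the sample rows are assumptions, not the real dataset schema.

```python
import json
from collections import defaultdict

# Hypothetical sales rows; the real dataset's columns are not shown on this page.
rows = [
    {"sku": "A", "units": 10, "unit_price": 20.0, "unit_cost": 12.0},
    {"sku": "B", "units": 5, "unit_price": 10.0, "unit_cost": 7.0},
    {"sku": "A", "units": 15, "unit_price": 20.0, "unit_cost": 12.0},
    {"sku": "C", "units": 2, "unit_price": 25.0, "unit_cost": 20.0},
]


def sku_metrics(rows: list[dict]) -> list[dict]:
    """Aggregate revenue and margin per SKU, then rank by revenue (Pareto order)."""
    totals: dict[str, dict[str, float]] = defaultdict(
        lambda: {"revenue": 0.0, "cost": 0.0}
    )
    for r in rows:
        totals[r["sku"]]["revenue"] += r["units"] * r["unit_price"]
        totals[r["sku"]]["cost"] += r["units"] * r["unit_cost"]

    grand_total = sum(t["revenue"] for t in totals.values())
    # Sort by revenue descending; break ties on SKU so the order is reproducible.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1]["revenue"], kv[0]))

    out, running = [], 0.0
    for rank, (sku, t) in enumerate(ranked, start=1):
        rev, cost = t["revenue"], t["cost"]
        running += rev
        out.append({
            "sku": sku,
            "revenue": round(rev, 2),
            "margin_pct": round(100 * (rev - cost) / rev, 1) if rev else 0.0,
            "pareto_rank": rank,
            "cumulative_share": round(running / grand_total, 4),
        })
    return out


metrics = sku_metrics(rows)
# sort_keys gives a stable serialisation, so identical inputs yield identical bytes.
payload = json.dumps(metrics, sort_keys=True)
print(payload)
```

Because every number is computed before any model sees it, a validator can compare the serialised payload against a recomputation rather than trusting free-text figures.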

⚖️ Quality-gated loop

Judge runs after Researcher and emits status="pass" or "fail" with structured feedback. EscalationChecker breaks the loop on pass; on fail, Researcher reruns with the feedback. Max 3 iterations.
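The control flow above can be sketched in plain Python, without the agent framework. This is an illustration under assumptions, not the project's code: `Verdict` is a dataclass standing in for the Pydantic verdict model, the `research` and `judge` callables are hypothetical stubs for the LLM-backed agents, and returning the last findings when the cap is reached is a guess about the fall-through behaviour.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Stand-in for the Judge's typed verdict (the project uses a Pydantic model)."""
    status: str          # "pass" or "fail"
    feedback: str = ""


def run_gated_loop(
    research: Callable[[Optional[str]], dict],
    judge: Callable[[dict], Verdict],
    max_iterations: int = 3,
) -> tuple[dict, Verdict, int]:
    """Rerun `research` with the Judge's feedback until it passes or the cap is hit."""
    feedback: Optional[str] = None
    for attempt in range(1, max_iterations + 1):
        findings = research(feedback)
        verdict = judge(findings)
        if verdict.status == "pass":   # the EscalationChecker's role: break on pass
            break
        feedback = verdict.feedback    # fed into the next Researcher run
    return findings, verdict, attempt


# Hypothetical stubs standing in for the LLM-backed agents.
def fake_research(feedback: Optional[str]) -> dict:
    return {"has_margin": feedback is not None}


def fake_judge(findings: dict) -> Verdict:
    if findings["has_margin"]:
        return Verdict("pass")
    return Verdict("fail", "margin column missing")


findings, verdict, attempts = run_gated_loop(fake_research, fake_judge)
print(verdict.status, attempts)
```

In this stubbed run the first attempt fails, the feedback is passed back, and the second attempt passes, so the loop exits before reaching the cap.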

🏷️ 6-class SKU taxonomy

Content Builder classifies SKUs into push / hold / drop / restock-seasonal / discontinue / source-similar, not just "good" vs "bad". Each action maps to a different shop-floor decision.
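One way to keep a closed label set like this honest is to validate every model-emitted label against an enum. The sketch below is illustrative only: the class and function names (`SkuAction`, `parse_action`) are hypothetical, and only the six label strings come from the description above.

```python
from enum import Enum


class SkuAction(str, Enum):
    PUSH = "push"
    HOLD = "hold"
    DROP = "drop"
    RESTOCK_SEASONAL = "restock-seasonal"
    DISCONTINUE = "discontinue"
    SOURCE_SIMILAR = "source-similar"


def parse_action(label: str) -> SkuAction:
    """Normalise a model-emitted label and reject anything outside the six classes."""
    try:
        return SkuAction(label.strip().lower())
    except ValueError:
        raise ValueError(f"unknown SKU action: {label!r}") from None


print(parse_action(" Discontinue ").value)
```

A label such as "good" raises `ValueError` instead of slipping into the brief unnoticed.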

🇲🇾 Malaysian retail context

Built-in awareness of KSSR/KSSM workbook tiers, SPM/UPSR exam syllabi (UPSR abolished 2021), Ramadan dip, January back-to-school spike, school-bulk vs in-store channel mix.

📅 Aging detection

Each SKU gets an aging_class (fresh / slowing / stale / stuck) based on days since last sale. Aging-stale SKUs are preserved in the LLM context even when they're mid-revenue, so dormant titles surface.
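A minimal sketch of bucketing by days since last sale. The four class names come from the description above; the day cut-offs (30 / 90 / 180) are hypothetical placeholders, since the project's real thresholds are not stated here.

```python
from datetime import date

# Hypothetical cut-offs in days; the project's real thresholds are not stated.
AGING_THRESHOLDS = [(30, "fresh"), (90, "slowing"), (180, "stale")]


def aging_class(last_sale: date, today: date) -> str:
    """Bucket a SKU by the number of days since its last sale."""
    days_idle = (today - last_sale).days
    for upper_bound, label in AGING_THRESHOLDS:
        if days_idle < upper_bound:
            return label
    return "stuck"


today = date(2025, 12, 31)
print(aging_class(date(2025, 12, 20), today))  # 11 days idle
print(aging_class(date(2025, 8, 1), today))    # 152 days idle
```

Keeping the cut-offs in one table makes them easy to tune per shop without touching the bucketing logic.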

🧪 Sample Outputs (from real agent runs)

Five questions captured live from the running pipeline against 101,990 rows of Malaysian bookstore sales data (Jan 2024 – Dec 2025). Click Launch Demo to browse them.

1. What's selling best?

Top 6 SKUs are SPM/Form-5 workbooks & the Casio fx-570 calculator: RM 693k revenue, 39.6% margin on the top SKU. Surfaced as push actions.

2. Which to drop?

Caught the entire UPSR-syllabus cluster (4 SKUs, RM 16,980 stuck) as discontinue. UPSR was abolished in Malaysia in 2021.

3. When to stock for school season?

SPM workbooks peak January (index 1.31) and December (1.18). Form 5 KSSM peaks January (1.59) and December (1.38). Pre-stock by November.
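Seasonal indices like these are commonly computed as each month's sales divided by the average month, so 1.31 would read as 31% above an average month. Whether Bookshelf uses exactly this definition is an assumption; the monthly figures and the 1.15 peak cut-off below are made up for illustration.

```python
from statistics import mean

# Hypothetical monthly unit sales for one category (Jan..Dec).
monthly_units = [150, 90, 90, 90, 90, 90, 90, 90, 90, 90, 110, 130]


def seasonal_indices(units_by_month: list[float]) -> list[float]:
    """Index each month against the average month (1.0 = an average month)."""
    baseline = mean(units_by_month)
    return [round(u / baseline, 2) for u in units_by_month]


indices = seasonal_indices(monthly_units)
# Hypothetical rule: flag months at least 15% above average as peaks.
peak_months = [m + 1 for m, idx in enumerate(indices) if idx >= 1.15]
print(indices[0], peak_months)
```

A pre-stocking rule can then work backwards from the flagged months by the supplier lead time.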

4. Aging clearance?

3 stale SKUs surfaced: Tuhan Manusia (152 days dormant), Pakej UPSR Lengkap (140 days), 般若心经 (96 days). RM 8,447 to clear.

5. Best margin to push?

Modul SPM Matematik: 43.7% margin, RM 418k revenue, 546 units/month. Bundle with Add Math & Sciences (all 42–43% margin).

🛠️ Technical Stack & Implementation

Frontend

Vanilla HTML/JS · marked.js · Server-Sent Events · CSS (Inter font)

Backend / Agents

Python 3.11+ · FastAPI 0.124 · uvicorn 0.40 · Google ADK 1.27 · A2A SDK 0.3 · Gemini 2.5 Flash · Pydantic 2.x · pandas 2.x · openpyxl

Deployment (planned)

Google Cloud Run · Docker (per agent) · A2A protocol over HTTP · authenticated_httpx

How the agents talk to each other

📖 Run It Locally

Prerequisites

Quick start (Windows)

    # Clone the repo
    git clone https://github.com/lyven81/ai-project.git
    cd ai-project/projects/bookshelf

    # Install dependencies
    pip install -e .

    # Set your Gemini key
    copy .env.example .env
    # Edit .env and paste GOOGLE_API_KEY=AIza...

    # Start all 5 services (5 cmd windows open)
    start.bat

    # Browser opens at http://localhost:8000

What start.bat does

📊 Real-Run Metrics

101,990 sales rows analysed
385 unique SKUs
RM 18.87M revenue dataset
~140s end-to-end pipeline
4 agents: researcher · judge · checker · builder
5 captured sample briefs

Business value the captured briefs surfaced

🚧 Honest Limitations (Phase 1)

Bookshelf Phase 1 is a portfolio demonstration of the multi-agent architecture, not a production tool. These are the gaps you would close to take it to a paying SME client:

No live data pull

Currently reads a bundled dataset.xlsx. To be useful daily, the Researcher would need a connector to the shop's live POS (Square, SAP-BO BS1, daily Excel export, or AlloyDB).

No web search for sourcing

The brief can flag "this category has a sourcing gap" but cannot recommend specific products to add. Phase 3 would add a Trend Spotter agent with google_search for Malaysian distributor leads.

Slower than a chat tool

~140 seconds per question because the pipeline runs 4 LLM calls + the pandas read. Acceptable for daily/weekly briefs, too slow for casual exploration. NotebookLM wins here.

Pull-only delivery

The owner has to open the app and ask. Phase 4 would push the Monday morning brief to WhatsApp / email automatically; that's the real workflow win.

Single-tenant local-only deploy

No multi-shop tenant model, no auth on the web app, no data isolation. Cloud Run deployment for Phase 1 is planned but not yet in scope.

Mid-tier SKU coverage capped at the trim window

The LLM sees top 30 + bottom 30 + all aging-stale SKUs (~71 of 385). A specific question about a mid-tier non-aging SKU may get a generic "this is a hold" answer. Phase 2 would add a query_sku(name) direct lookup.
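The trim window described above can be sketched as a small selection function. This is a guess at the shape of the logic, not the project's code: the function name `trim_for_llm`, the record fields (`sku`, `revenue`, `aging_class`), and the choice to match only the `"stale"` class are all assumptions.

```python
def trim_for_llm(skus: list[dict], head: int = 30, tail: int = 30) -> list[dict]:
    """Keep the top `head` and bottom `tail` SKUs by revenue, plus every stale one."""
    ranked = sorted(skus, key=lambda s: s["revenue"], reverse=True)
    keep = {s["sku"] for s in ranked[:head]}
    if tail > 0:
        keep |= {s["sku"] for s in ranked[-tail:]}
    # Stale SKUs survive the trim even when they sit in the middle of the ranking.
    keep |= {s["sku"] for s in ranked if s["aging_class"] == "stale"}
    return [s for s in ranked if s["sku"] in keep]


# Hypothetical catalogue of 100 SKUs with strictly decreasing revenue.
catalogue = [
    {"sku": f"SKU{i:03d}", "revenue": 1000 - i, "aging_class": "fresh"}
    for i in range(100)
]
catalogue[50]["aging_class"] = "stale"   # a dormant mid-revenue title

trimmed = trim_for_llm(catalogue)
print(len(trimmed), any(s["sku"] == "SKU050" for s in trimmed))
```

The sketch also makes the limitation concrete: a fresh mid-ranked SKU such as `SKU051` is dropped from the context, which is why a direct lookup is needed for questions about it.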
