Live AI agent + MCP tools, BYO Gemini key, 100% browser-side
Five deterministic tools run directly against the pre-aggregated window.HOOTIM_DATA object: no server, no outbound calls from the tools themselves. Each tool returns pure JSON for Gemini to narrate.
const TOOLS = {
  get_aggregate: (dim) => {
    const map = { district: D.district, channel: D.channel,
                  category: D.category, sku: D.sku.slice(0, 40) };
    return map[dim] || null;
  },
  get_top_n: (dim, n, direction) => {
    const sorted = [...(D[dim] || [])];
    const dir = direction === 'asc' ? 1 : -1;
    const metric = dim === 'sku' ? 'quantity' : 'revenue';
    sorted.sort((a, b) => dir * (a[metric] - b[metric]));
    return sorted.slice(0, n);
  },
  get_cross_tab: (rowDim, colDim) => { /* district × channel, category × channel */ },
  get_delivery_performance: (groupBy) => { /* on-time / late % */ },
  get_time_series: (period) => { /* daily trend or campaign window */ },
};
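To make the sorting contract concrete, here is a self-contained sketch of get_top_n run against a tiny mock of window.HOOTIM_DATA; the three SKU rows are invented for illustration, not from the real dataset.

```javascript
// Mock of the pre-aggregated data object (illustrative values only)
const D = {
  sku: [
    { sku: 'SKU-001', quantity: 120 },
    { sku: 'SKU-002', quantity: 35 },
    { sku: 'SKU-003', quantity: 410 },
  ],
};

// Same logic as TOOLS.get_top_n above: pick the metric by dimension,
// sort ascending or descending, return the first n rows
const get_top_n = (dim, n, direction) => {
  const sorted = [...(D[dim] || [])];
  const dir = direction === 'asc' ? 1 : -1;
  const metric = dim === 'sku' ? 'quantity' : 'revenue';
  sorted.sort((a, b) => dir * (a[metric] - b[metric]));
  return sorted.slice(0, n);
};

console.log(get_top_n('sku', 2, 'asc').map(r => r.sku)); // → ['SKU-002', 'SKU-001']
```

Asking for 'asc' surfaces the slowest movers first, which is exactly what the dispatcher below exploits for "slow mover" questions.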
Rather than round-trip function calling, a lightweight regex matcher picks relevant tools from the user's question. This keeps latency low and cost predictable: one Gemini call per question.
function dispatchTools(question) {
  const q = question.toLowerCase();
  const calls = [];
  if (/\bsku\b|slow mov|bottom.*sku/.test(q)) {
    const direction = /slow|bottom/.test(q) ? 'asc' : 'desc';
    calls.push({ tool: 'get_top_n', args: { dim: 'sku', n: 10, direction },
                 output: TOOLS.get_top_n('sku', 10, direction) });
  }
  if (/district.*channel|channel.*district/.test(q)) {
    calls.push({ tool: 'get_cross_tab',
                 args: { row: 'district', col: 'channel' },
                 output: D.district_channel_matrix });
  }
  if (/deliver|on[- ]?time|late/.test(q)) {
    const groupBy = /channel/.test(q) ? 'channel' : 'district';
    calls.push({ tool: 'get_delivery_performance',
                 args: { group_by: groupBy },
                 output: TOOLS.get_delivery_performance(groupBy) });
  }
  // ...9 more dispatch rules
  return calls;
}
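The pattern-to-tool mapping can be sketched as a small rule table. RULES and matchTools below are illustrative names, not from the repo, and only the three rules from the excerpt above are included:

```javascript
// Each rule pairs a regex with the tool it triggers; one question
// can fire several rules, mirroring dispatchTools above
const RULES = [
  { pattern: /\bsku\b|slow mov|bottom.*sku/, tool: 'get_top_n' },
  { pattern: /district.*channel|channel.*district/, tool: 'get_cross_tab' },
  { pattern: /deliver|on[- ]?time|late/, tool: 'get_delivery_performance' },
];

// Return the names of every tool a question triggers
const matchTools = (question) => {
  const q = question.toLowerCase();
  return RULES.filter(r => r.pattern.test(q)).map(r => r.tool);
};

console.log(matchTools('Which SKUs are slow movers?')); // → ['get_top_n']
```

A data-driven rule table like this also makes the "9 more dispatch rules" cheap to add: each is one entry, not another if-block.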
The raw dataset is 6,074 rows × 14 columns (~950 KB); shipping it to the browser would be wasteful. precompute.py rolls it into 11 aggregated views totalling 129 KB, enough to answer every preset question.
# precompute.py excerpt
by_district = defaultdict(lambda: {"revenue": 0, "deliveries": defaultdict(int), ...})
for r in rows:
    by_district[r["district"]]["revenue"] += r["revenue"]
    by_district[r["district"]]["deliveries"][classify_delivery(r)] += 1

# total_d = total delivery count for district d
district_table = [{
    "district": d,
    "revenue": round(v["revenue"], 2),
    "on_time_pct": round(v["deliveries"]["on_time"] / total_d * 100, 1),
    "slightly_late_pct": round(v["deliveries"]["slightly_late"] / total_d * 100, 1),
    "very_late_pct": round(v["deliveries"]["very_late"] / total_d * 100, 1),
    ...
} for d, v in by_district.items()]
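For orientation, this is roughly the shape the browser receives after precompute.py runs. The view names mirror the code above, but every number here is invented for illustration:

```javascript
// Hypothetical slice of the precomputed payload (values are made up)
const HOOTIM_DATA = {
  district: [
    { district: 'Petaling', revenue: 184230.5,
      on_time_pct: 78.4, slightly_late_pct: 14.2, very_late_pct: 7.4 },
    { district: 'Klang', revenue: 92110.0,
      on_time_pct: 81.1, slightly_late_pct: 12.6, very_late_pct: 6.3 },
  ],
  // ...10 more views: channel, category, sku, district_channel_matrix, etc.
};

// Sanity check mirroring precompute.py: the three delivery buckets
// should cover every delivery, so the percentages sum to ~100
const sums = HOOTIM_DATA.district.map(
  d => d.on_time_pct + d.slightly_late_pct + d.very_late_pct
);
console.log(sums); // each entry should be ~100 (up to rounding)
```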
The system prompt pins the agent to the tool output. Gemini 2.0 Flash is cheap, fast, and obedient at temperature 0.2. The MCP tool outputs are inlined directly into the prompt, so the model has no excuse to hallucinate numbers.
const systemPrompt = `You are the Hoo Tim Analytics Agent...
RESPONSE RULES:
- Ground every number in the tool output below. Do not invent figures.
- Use markdown with a clean table, then a short "Insight" paragraph
and a "Recommendation" bullet.
- Keep answers under 250 words unless more detail is demanded.
- Use "RM" prefix for money, comma thousands, round to nearest RM.
- Audience: Malaysian SME business owner; use practical language.
MCP TOOL OUTPUTS:
${toolContext}`;
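How toolContext is assembled is not shown in the repo excerpt. A minimal sketch, assuming each dispatch result is serialized as a labelled JSON block (buildToolContext is a hypothetical name):

```javascript
// Turn dispatchTools() results into a plain-text context block for the prompt:
// one heading per call, followed by its pretty-printed JSON output
const buildToolContext = (calls) =>
  calls.map(c =>
    `### ${c.tool}(${JSON.stringify(c.args)})\n` +
    JSON.stringify(c.output, null, 2)
  ).join('\n\n');

// Example with a stubbed call record
const context = buildToolContext([
  { tool: 'get_top_n', args: { dim: 'sku', n: 2 },
    output: [{ sku: 'A' }, { sku: 'B' }] },
]);
console.log(context.startsWith('### get_top_n')); // true
```

Keeping the serialization deterministic matters here: the grounding rule in the prompt only works if every figure the model needs is literally present in this string.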
await fetch('https://generativelanguage.googleapis.com/v1beta/models/' +
            `gemini-2.0-flash:generateContent?key=${apiKey}`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    system_instruction: { parts: [{ text: systemPrompt }] },
    contents: [{ role: 'user', parts: [{ text }] }],
    generationConfig: { temperature: 0.2, maxOutputTokens: 1400 }
  })
});
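The snippet above omits response handling. Assuming the response is awaited and parsed as JSON, the model's answer sits under candidates[0].content.parts in the v1beta response shape; a defensive extractor (extractText is a hypothetical helper, not from the repo) might look like:

```javascript
// Pull the answer text out of a generateContent response body.
// Optional chaining guards against empty candidates (e.g. safety blocks).
function extractText(body) {
  return body?.candidates?.[0]?.content?.parts
    ?.map(p => p.text ?? '')
    .join('') ?? '';
}

// Example with a mock object shaped like the v1beta API reply
const mock = {
  candidates: [{ content: { parts: [{ text: 'Revenue is RM 1,204.' }] } }],
};
console.log(extractText(mock)); // "Revenue is RM 1,204."
```

Returning an empty string on any missing field keeps the chat UI from crashing when the API declines to answer.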
The user's Gemini key is kept in sessionStorage, so it disappears when the tab closes. It is never sent anywhere except directly to generativelanguage.googleapis.com.
const savedKey = sessionStorage.getItem('hootim_gemini_key');
if (savedKey) { apiKeyInput.value = savedKey; apiDot.classList.add('ok'); }

function saveKey() {
  const k = apiKeyInput.value.trim();
  if (!k) return;
  sessionStorage.setItem('hootim_gemini_key', k);
  apiDot.classList.add('ok');
}
Every question renders a dark terminal-style trace line showing which MCP tools were called, their arguments, and the row count returned. Users can see exactly what the agent looked at before writing the answer.
function addTrace(calls) {
  const lines = calls.map(c => {
    // Guard against null outputs (e.g. an unknown dimension)
    const rowCount = Array.isArray(c.output)
      ? c.output.length + ' rows'
      : Object.keys(c.output || {}).length + ' keys';
    return `<span class="trace-line">▸
      <span class="trace-key">${c.tool}</span>
      (${JSON.stringify(c.args)}) →
      <span class="trace-val">${rowCount}</span>
    </span>`;
  }).join('');
  // append to chat…
}
All files live in the portfolio repo:
github.com/lyven81/ai-project/tree/main/projects/hoo-tim-analytics-agent