Plumber WhatsApp Agent

Source Code & System Prompt Architecture

Gemini Flash, HTML/CSS/JS, System Prompt Engineering, Bahasa Santai

About This Code Showcase

This curated code showcase demonstrates how Jamal's WhatsApp agent works — from the system prompt that defines his personality and conversation flow, to the schedule-aware booking logic, to the lean token architecture that makes it 4-5x cheaper than framework-based agents.

The entire app runs client-side in a single HTML file with no backend. The intelligence comes from a carefully engineered system prompt injected into each Gemini API call.

Core: System Prompt Architecture

The system prompt is the brain of the agent. It defines Jamal's personality, the 5-step conversation flow, fixed pricing, schedule data, and strict guardrails — all in a single prompt injected on every API call.

system-prompt.txt — Personality & Language Rules

You are Jamal, a plumber based in the Klang Valley.
You answer customer WhatsApp messages about plumbing problems.

You speak bahasa santai — casual, friendly, short messages.
Mix Malay and English naturally, like a real Malaysian tradesman texting.
Use "saya" and "awak". Keep messages 2-4 sentences max.

You NEVER reveal that you are AI, a bot, or an assistant.
You ARE Jamal. If asked "are you a bot?", deflect naturally:
"Haha bukan la, saya memang Jamal. Busy sikit tadi, tengah kerja."

system-prompt.txt — Strict 5-Step Conversation Flow

CONVERSATION FLOW (follow strictly in order, do NOT skip steps):

1. GREET — greet warmly, ask what the problem is, check their area
   (PJ/Subang/Shah Alam). If outside area, politely decline and end chat.

2. SCOPE — ask where exactly in the house, is it urgent, request
   photo if problem is unclear. Do NOT quote price until you understand.

3. QUOTE — once you know the job type, state the fixed price,
   what's included, and how long it takes. Then ask "Nak saya datang?"

4. BOOK — if customer says yes, check the schedule. Offer 2-3
   free slots. Multi-hour jobs need consecutive free slots. Ask for address.

5. CLOSE — end with a thank you.
   - Booking confirmed: repeat job, date, time, address, price.
   - Customer says no: "Ok takpe, boleh WhatsApp saya bila-bila."
   - Customer leaves at any step: thank them and stop.

system-prompt.txt — Guardrails

STRICT RULES:
- You are a PLUMBER. Only discuss plumbing problems.
  If customer asks about anything unrelated:
  "Haha saya tukang paip je boss, benda lain saya tak reti la."
- Follow the 5 steps IN ORDER. Do not skip from greeting to booking.
- Every conversation must end at Step 5.
- Do not offer services not in your price list.
- Do not negotiate prices. The price is fixed.
- Never say "bergantung", "lebih kurang", or "tengok keadaan".
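
The last guardrail lends itself to a quick automated check. As a minimal sketch, and not something the demo itself ships, a hypothetical helper `findBannedPhrases` could scan each model reply for the three hedging phrases before it is shown to the customer. The sample replies and the RM150 figure are made up for illustration.

```javascript
// Phrases the prompt forbids (copied from the STRICT RULES above).
const BANNED_PHRASES = ['bergantung', 'lebih kurang', 'tengok keadaan'];

// Hypothetical helper: returns the banned phrases found in a model reply.
function findBannedPhrases(reply) {
  const text = reply.toLowerCase();
  return BANNED_PHRASES.filter(phrase => text.includes(phrase));
}

// Made-up sample replies, for illustration only.
console.log(findBannedPhrases('Harga RM150, siap dalam 2 jam.'));
// []
console.log(findBannedPhrases('Harga tu bergantung, kena tengok keadaan dulu.'));
// [ 'bergantung', 'tengok keadaan' ]
```

A non-empty result could trigger a retry or a log entry; which of those is appropriate depends on how strictly the persona must hold.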

Schedule-Aware Booking Logic

The 7-day schedule is embedded directly in the HTML and injected into the system prompt. The LLM reads the free/booked slots and reasons about availability in natural language.
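
For orientation, the embedded schedule object might look roughly like the sketch below. The field names `services`, `bookings`, `name_ms`, `price_rm`, `duration_slots`, `date`, and `slot_list` come from the excerpt that follows; `start_date`, the Malay service name, and the price are assumptions added for illustration. `occupiedSlotsFor` is a hypothetical helper showing how one day's bookings collapse into a set of occupied slots.

```javascript
// Illustrative shape only: field names follow the excerpt, values are made up.
const scheduleData = {
  start_date: '2026-03-30', // assumed field, not shown in the excerpt
  services: [
    { name: 'Pipe Leak Repair', name_ms: 'Baiki Paip Bocor', price_rm: 150, duration_slots: 2 },
  ],
  bookings: [
    { date: '2026-03-30', slot_list: ['09:00', '10:00'] },
    { date: '2026-03-30', slot_list: ['14:00', '15:00'] },
  ],
};

// Hypothetical helper: collect the occupied slots for one date.
function occupiedSlotsFor(data, date) {
  const occupied = new Set();
  for (const b of data.bookings) {
    if (b.date === date) b.slot_list.forEach(s => occupied.add(s));
  }
  return [...occupied].sort();
}

console.log(occupiedSlotsFor(scheduleData, '2026-03-30'));
// [ '09:00', '10:00', '14:00', '15:00' ]
```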

demo.html — Building the Schedule Prompt

function buildSystemPrompt() {
  // Build service list from config
  const svcList = scheduleData.services.map(s =>
    `- ${s.name_ms} (${s.name}): RM${s.price_rm}, ${s.duration_slots} jam`
  ).join('\n');

  // Group bookings by date
  const bookingsByDate = {};
  for (const b of scheduleData.bookings) {
    if (!bookingsByDate[b.date]) bookingsByDate[b.date] = [];
    bookingsByDate[b.date].push(b);
  }

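  // Hourly slots 09:00 through 17:00 (9 per day, 63 across the week)
  const allSlots = Array.from({ length: 9 }, (_, h) =>
    `${String(9 + h).padStart(2, '0')}:00`);

  // For each day, compute occupied vs free slots
  // NOTE: start_date is an assumed field name; the excerpt does not show how the 7 dates are derived
  const firstDay = new Date(`${scheduleData.start_date}T00:00:00Z`);
  let scheduleText = '';
  for (let i = 0; i < 7; i++) {
    const date = new Date(firstDay.getTime() + i * 86400000)
      .toISOString().slice(0, 10);
    const dayBookings = bookingsByDate[date] || [];
    const occupiedSlots = new Set();
    for (const b of dayBookings) {
      for (const s of b.slot_list) occupiedSlots.add(s);
    }
    const freeSlots = allSlots.filter(s => !occupiedSlots.has(s));

    // Inject into prompt: "FREE SLOTS: 11:00, 12:00, 13:00, 16:00"
    scheduleText += `${date}\nFREE SLOTS: ${freeSlots.join(', ')}\n`;
  }

  // Excerpt: the personality, flow, and guardrail text shown earlier is
  // assumed to be concatenated into the returned prompt as well
  return `${svcList}\n\n${scheduleText}`;
}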
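
Step 4 of the conversation flow says multi-hour jobs need consecutive free slots. The demo leaves that reasoning to the model, but the rule itself is easy to state in code. The sketch below is a hypothetical helper, `findStartTimes`, that could be used to sanity-check the slots the model offers; the hourly 09:00 to 17:00 grid is an assumption based on the excerpt above.

```javascript
// All hourly slots in a working day, 09:00 to 17:00 (assumed from the excerpt).
const ALL_SLOTS = ['09:00', '10:00', '11:00', '12:00', '13:00',
                   '14:00', '15:00', '16:00', '17:00'];

// Hypothetical helper: start times where `duration` consecutive slots are all free.
function findStartTimes(freeSlots, duration) {
  const free = new Set(freeSlots);
  const starts = [];
  for (let i = 0; i + duration <= ALL_SLOTS.length; i++) {
    const run = ALL_SLOTS.slice(i, i + duration);
    if (run.every(s => free.has(s))) starts.push(ALL_SLOTS[i]);
  }
  return starts;
}

// Same free slots as the sample comment in buildSystemPrompt.
const sampleFree = ['11:00', '12:00', '13:00', '16:00'];
console.log(findStartTimes(sampleFree, 1)); // [ '11:00', '12:00', '13:00', '16:00' ]
console.log(findStartTimes(sampleFree, 2)); // [ '11:00', '12:00' ]
console.log(findStartTimes(sampleFree, 3)); // [ '11:00' ]
```

Note that 16:00 is free but cannot start a 2-slot job, because 17:00 is booked.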

Gemini API Integration

Each user message is sent to Gemini Flash with the full system prompt and conversation history. The response is rendered as a WhatsApp-style chat bubble.

demo.html — Client-Side Gemini API Call

async function callGemini(userMessage) {
  conversationHistory.push({
    role: 'user',
    parts: [{ text: userMessage }]
  });

  const body = {
    system_instruction: { parts: [{ text: systemPrompt }] },
    contents: conversationHistory,
    generationConfig: {
      temperature: 0.7,
      maxOutputTokens: 300,
    }
  };

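  // Keep the URL free of line breaks: whitespace inside a template literal becomes part of the URL
  const url = 'https://generativelanguage.googleapis.com/v1beta/models/' +
    `gemini-2.0-flash:generateContent?key=${API_KEY}`;
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body)
  });

  const data = await res.json();
  const reply = data.candidates?.[0]?.content?.parts?.[0]?.text;

  if (!res.ok || !reply) {
    // Failed or empty response: drop the unanswered user message so the history stays consistent
    conversationHistory.pop();
    throw new Error(data.error?.message || 'Gemini returned no reply text');
  }

  conversationHistory.push({ role: 'model', parts: [{ text: reply }] });

  // Keep conversation manageable (last 20 messages, about 10 exchanges)
  if (conversationHistory.length > 20) {
    conversationHistory = conversationHistory.slice(-20);
  }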

  return reply;
}
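
The pruning step can also be pulled out into a small function. The sketch below is a hypothetical variant, `pruneHistory`, that keeps the same 20-message cap but additionally drops any leading model message, so the window always opens with a customer message rather than a reply whose question has been cut off. That extra rule is an assumption, not something the demo does.

```javascript
// Hypothetical helper: keep at most `max` recent messages, starting on a 'user' message.
function pruneHistory(history, max = 20) {
  let pruned = history.slice(-max);
  // Drop leading 'model' messages so the window opens with a user message.
  while (pruned.length > 0 && pruned[0].role !== 'user') {
    pruned = pruned.slice(1);
  }
  return pruned;
}

// 23 alternating messages: even indexes are 'user', odd indexes are 'model'.
const longChat = Array.from({ length: 23 }, (_, i) => ({
  role: i % 2 === 0 ? 'user' : 'model',
  parts: [{ text: `msg ${i}` }],
}));

const trimmed = pruneHistory(longChat);
console.log(trimmed.length, trimmed[0].role); // 19 user
```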

Booking Seed Generator

A Python script generates realistic booking data for the 7-day demo period. It follows the same pattern as the Wee Auto Car Care project — anchor multi-slot bookings placed first, then random single-slot jobs fill to ~50% capacity.

generate_seed.py — Booking Generation Logic

# Target ~50% occupancy across 7 days (63 total slots)
TARGET = {
    "2026-03-30": 5,   # Mon
    "2026-03-31": 3,   # Tue
    "2026-04-01": 5,   # Wed
    "2026-04-02": 4,   # Thu
    "2026-04-03": 5,   # Fri
    "2026-04-04": 5,   # Sat
    "2026-04-05": 4,   # Sun
}
# Total: 31 slot-units out of 63 = 49.2%

# Anchor multi-slot jobs for demo variety
ANCHORS = {
    "2026-03-30": [("09:00", "Pipe Leak Repair"),      # 2 slots
                   ("14:00", "Water Heater Install")], # 2 slots
    "2026-04-03": [("09:00", "Bathroom Renovation")],  # 3 slots
}
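
The fill step described above (anchors first, then random single-slot jobs up to the day's target) can be sketched as follows. This is written in JavaScript to match the demo rather than taken from generate_seed.py, and the function `fillDay`, its parameters, and the hourly slot grid are assumptions for illustration.

```javascript
const SLOTS = ['09:00', '10:00', '11:00', '12:00', '13:00',
               '14:00', '15:00', '16:00', '17:00'];

// Fill one day: anchors first, then random single-slot jobs up to `target` slot-units.
// `anchors` is a list of [startTime, durationSlots]; `rand` is injectable for testing.
function fillDay(target, anchors, rand = Math.random) {
  const occupied = new Set();
  for (const [start, duration] of anchors) {
    const i = SLOTS.indexOf(start);
    SLOTS.slice(i, i + duration).forEach(s => occupied.add(s));
  }
  const cap = Math.min(target, SLOTS.length); // cannot book more slots than exist
  while (occupied.size < cap) {
    const free = SLOTS.filter(s => !occupied.has(s));
    occupied.add(free[Math.floor(rand() * free.length)]);
  }
  return SLOTS.filter(s => occupied.has(s)); // occupied slots in time order
}

// Monday 2026-03-30 from the excerpt: target 5, two 2-slot anchors.
const monday = fillDay(5, [['09:00', 2], ['14:00', 2]]);
console.log(monday.length); // 5
```

The two anchors take four slot-units, so exactly one random single-slot job is added to reach the target of five.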

Production Deployment Architecture

This demo runs client-side with a BYO API key. In production, the following additions are needed:

Production Architecture

DEMO (this page):
  Customer Browser → Gemini API → Browser renders reply
  // Single user, single tab, no persistence

PRODUCTION:
  Customer WhatsApp → WhatsApp Business API (Twilio)
    → Backend Server (Flask/FastAPI on Cloud Run)
    → LLM API (Gemini/Claude)
    → Backend → WhatsApp API → Customer WhatsApp

  // Session management additions:
  + SQLite/Firestore for conversation persistence
  + 24-hour session timeout with follow-up message
    ("Hi, masih nak book tak?")
  + Per-phone-number conversation threads
  + Multi-customer concurrent support
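
The session management additions can be sketched as a small store keyed by phone number. This is a minimal in-memory illustration under stated assumptions, not the production design: the class `SessionStore`, its method names, the choice to start a fresh thread after the timeout, and the phone numbers are all hypothetical, and a real deployment would persist to SQLite or Firestore as listed above.

```javascript
const SESSION_TIMEOUT_MS = 24 * 60 * 60 * 1000; // 24-hour timeout from the list above

// Hypothetical in-memory store; production would back this with a database.
class SessionStore {
  constructor() {
    this.sessions = new Map(); // phone number -> { history, lastSeen }
  }

  // Append a message to the thread for `phone`, starting fresh if the old one expired.
  addMessage(phone, role, text, now = Date.now()) {
    let session = this.sessions.get(phone);
    if (!session || now - session.lastSeen > SESSION_TIMEOUT_MS) {
      session = { history: [], lastSeen: now };
      this.sessions.set(phone, session);
    }
    session.history.push({ role, parts: [{ text }] });
    session.lastSeen = now;
    return session.history;
  }

  // Phone numbers idle longer than the timeout (candidates for the follow-up message).
  idleThreads(now = Date.now()) {
    return [...this.sessions]
      .filter(([, s]) => now - s.lastSeen > SESSION_TIMEOUT_MS)
      .map(([phone]) => phone);
  }
}

const HOUR = 60 * 60 * 1000;
const store = new SessionStore();
store.addMessage('+60123456789', 'user', 'Paip bocor', 0);
store.addMessage('+60198765432', 'user', 'Sinki tersumbat', 23 * HOUR);
console.log(store.idleThreads(25 * HOUR)); // [ '+60123456789' ]
```

Passing `now` explicitly keeps the timeout logic testable without waiting a day.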

Key Design Decisions

Embedded schedule over fetch(): Schedule data is baked into the HTML as a JS object — avoids CORS/file:// issues, works everywhere without a server
System prompt over function calling: The LLM reasons about slot availability in natural language rather than through structured tool calls — simpler, cheaper, and sufficient for a single-plumber schedule
Conversation history pruning: The history is trimmed to the last 20 messages (about 10 exchanges) after each reply, which keeps token cost predictable even in long conversations
BYO key over server-pays: Visitor supplies their own Gemini API key — zero hosting cost, no rate limiting needed, educational for the visitor
Bahasa santai over formal Malay: Casual tone matches how real Malaysian tradesmen actually text — builds trust and makes the AI persona believable
