Good Company

Source Code & Architecture

Python, FastAPI, Gemini API, PyMuPDF, Vanilla JS

Project Structure

good-company/
  main.py # FastAPI backend, PDF reading, Gemini API calls
  prompts.py # All 11 analysis prompts (Phase 1, Read More, Phase 2, Final)
  index.html # Full UI with visualization renderers
  demo.html # Standalone demo (paste text, no server needed)
  requirements.txt # Python dependencies
  .env # Gemini API key
  start.bat # Windows launcher
  user-guide.txt # Plain language user guide
  Quarter report/ # Drop PDFs here

Backend: main.py

Key Design Decisions

PDF text cache: Extracted text is cached per filename so multiple prompts reuse the same extraction without re-reading the PDF.

Visual data parsing: Each LLM response is split into narrative text and a JSON block delimited by VISUAL_DATA_START and VISUAL_DATA_END. The frontend renders the JSON as cards, scorecards, and bars.

Date parsing from filenames: Filenames of the form "Company - DD Mon YY.pdf" are parsed with a regex so that reports can be sorted by date, newest first.
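
A minimal sketch of that filename parsing, assuming a one- or two-digit day, a three-letter English month abbreviation, and a two-digit year. The names FILENAME_RE, parse_report_date, and sort_reports are illustrative and are not taken from main.py.

```python
import re
from datetime import date, datetime
from typing import List, Optional

# Assumed pattern: "<Company> - <DD> <Mon> <YY>.pdf", e.g. "Acme - 05 Mar 24.pdf"
FILENAME_RE = re.compile(
    r"^(?P<company>.+?)\s*-\s*(?P<day>\d{1,2})\s+(?P<mon>[A-Za-z]{3})\s+(?P<yy>\d{2})\.pdf$",
    re.IGNORECASE,
)

def parse_report_date(filename: str) -> Optional[date]:
    """Return the date encoded in the filename, or None if it does not match."""
    m = FILENAME_RE.match(filename)
    if not m:
        return None
    stamp = f"{m['day']} {m['mon'].title()} {m['yy']}"
    try:
        return datetime.strptime(stamp, "%d %b %y").date()
    except ValueError:  # impossible date or unknown month abbreviation
        return None

def sort_reports(filenames: List[str]) -> List[str]:
    """Newest first; files without a parseable date sort last."""
    return sorted(
        filenames,
        key=lambda f: parse_report_date(f) or date.min,
        reverse=True,
    )
```

Falling back to date.min keeps files with unexpected names in the list instead of dropping them, which seems the safer default for a folder that users fill by hand.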

Core: PDF Extraction + Gemini Call

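import json
import re
from pathlib import Path

import fitz  # PyMuPDF

REPORT_DIR = Path("Quarter report")  # assumed definition: the folder PDFs are dropped into
_text_cache = {}  # filename -> extracted text

def extract_text(filename: str) -> str:
    # Reuse an earlier extraction so later prompts do not re-read the PDF.
    if filename in _text_cache:
        return _text_cache[filename]
    path = REPORT_DIR / filename
    # The context manager closes the document even if extraction raises.
    with fitz.open(str(path)) as doc:
        text = "\n".join(page.get_text() for page in doc)
    _text_cache[filename] = text
    return text

def parse_visual_data(response_text: str):
    # Returns (narrative, visual_data); visual_data is None when the block
    # is missing or does not contain valid JSON.
    match = re.search(
        r"VISUAL_DATA_START\s*(\{.*?\})\s*VISUAL_DATA_END",
        response_text, re.DOTALL
    )
    if match:
        # Everything before the start marker is the narrative.
        text = response_text[:match.start()].strip()
        try:
            return text, json.loads(match.group(1))
        except json.JSONDecodeError:
            return text, None
    return response_text.strip(), None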
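
To illustrate the response format this parser expects, here is a small standalone sketch. It repeats the regex so that it runs on its own. The sample response and the keys inside its JSON block ("verdict", "scores") are invented for illustration and are not the schema the real prompts request.

```python
import json
import re

VISUAL_RE = re.compile(
    r"VISUAL_DATA_START\s*(\{.*?\})\s*VISUAL_DATA_END", re.DOTALL
)

# Hypothetical model output: narrative first, then the delimited JSON block.
sample = """Revenue grew while margins held steady.

VISUAL_DATA_START
{"verdict": "hold", "scores": {"business": 7, "stock": 5}}
VISUAL_DATA_END"""

match = VISUAL_RE.search(sample)
narrative = sample[:match.start()].strip()
visual_data = json.loads(match.group(1))

print(narrative)
print(visual_data["scores"]["business"])
```

The lazy `.*?` still captures nested objects because the match only succeeds once a closing brace is followed by VISUAL_DATA_END, so the pattern keeps extending past inner braces until it reaches the last one.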

API Endpoint

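from fastapi import HTTPException

@app.post("/analyze")
async def analyze(req: AnalyzeRequest):
    # Look the prompt key up across all four groups.
    all_prompts = {**PHASE_1, **READ_MORE, **PHASE_2, **FINAL}
    prompt_data = all_prompts.get(req.prompt)
    if prompt_data is None:
        # Without this check an unknown key fails below with a TypeError.
        raise HTTPException(status_code=404, detail=f"Unknown prompt: {req.prompt}")
    text = extract_text(req.report)
    template = prompt_data["instruction"]
    # Substitute the extracted report text into the prompt template.
    full_prompt = template.replace("{report_text}", text)

    response = model.generate_content(full_prompt)
    narrative, visual_data = parse_visual_data(response.text)

    return {
        "result": narrative,
        "visual_data": visual_data,
        "visual_type": prompt_data.get("visual_type", "none"),
    }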
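
One property of the template.replace call is worth spelling out: it touches only the exact "{report_text}" token, whereas str.format would try to interpret every brace in the instruction, including any literal JSON example. The sketch below shows the difference with a made-up template; the template and report text are hypothetical.

```python
# Hypothetical instruction with both the placeholder and literal JSON braces.
template = (
    "Summarize the report.\n"
    'End with VISUAL_DATA_START {"score": 0} VISUAL_DATA_END\n'
    "REPORT:\n{report_text}"
)
report = "Q3 revenue was {unaudited} 4.2bn."

# str.replace substitutes only the exact placeholder token.
full_prompt = template.replace("{report_text}", report)

# str.format treats the JSON example as a replacement field and fails.
try:
    template.format(report_text=report)
    format_ok = True
except (KeyError, ValueError, IndexError):
    format_ok = False

print(full_prompt.endswith(report))
print(format_ok)
```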

Prompts: prompts.py

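11 prompts organized into 4 groups. Each prompt is a dictionary with an "instruction" template, which contains a {report_text} placeholder for the extracted PDF text, and an optional "visual_type" that tells the frontend which renderer to use.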

Prompt Groups

# Prompt bodies are elided here; "..." stands in for each entry.
PHASE_1 = {
    "cheat_sheet": ...,        # Step 1, visual: metric cards
    "forensic_check": ...,     # Step 2, visual: green/yellow/red scorecard
    "business_vs_stock": ...,  # Step 3, visual: verdict badge + score bars
}

READ_MORE = {
    "deep_dive": ...,           # text only
    "earnings_breakdown": ...,  # text only
    "stress_test": ...,         # visual: 3 scenario cards
    "company_comparison": ...,  # visual: comparison table
}

PHASE_2 = {
    "valuation": ...,     # Step 4, visual: cheap/fair/expensive range
    "bull_vs_bear": ...,  # Step 5, visual: bull/bear tally bar
    "risk_check": ...,    # Step 6, text only (personalized)
}

FINAL = {
    "final_scorecard": ...,  # Step 7, visual: 7-dimension scorecard + PASS/FAIL
}
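
Judging from how the /analyze endpoint reads these dictionaries, each entry presumably carries an "instruction" string and an optional "visual_type". The sketch below shows that assumed shape, plus a guard against a pitfall of merging groups with {**a, **b}: a key repeated in two groups is silently overwritten. The instruction text, the "metrics" value, and the merge_groups helper are illustrative assumptions, not code from prompts.py.

```python
PHASE_1 = {
    "cheat_sheet": {
        "instruction": "List the key metrics.\n\nREPORT:\n{report_text}",
        "visual_type": "metrics",  # assumed value
    },
}
READ_MORE = {
    "deep_dive": {
        # No visual_type here: the endpoint falls back to "none".
        "instruction": "Explain the quarter in depth.\n\nREPORT:\n{report_text}",
    },
}

def merge_groups(*groups: dict) -> dict:
    """Merge prompt groups, raising if two groups define the same key."""
    merged: dict = {}
    for group in groups:
        clash = merged.keys() & group.keys()
        if clash:
            raise ValueError(f"duplicate prompt keys: {sorted(clash)}")
        merged.update(group)
    return merged

all_prompts = merge_groups(PHASE_1, READ_MORE)
print(sorted(all_prompts))
print(all_prompts["deep_dive"].get("visual_type", "none"))
```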

Frontend: index.html

Single-file frontend with 8 visualization renderers, one per visual type. Each renderer takes the VISUAL_DATA JSON and produces HTML:

Visualization Types

function renderVisual(type, data) {
    switch (type) {
        case "metrics":         return renderMetrics(data);
        case "scorecard":       return renderScorecard(data);
        case "verdict":         return renderVerdict(data);
        case "valuation":       return renderValuation(data);
        case "bullbear":        return renderBullBear(data);
        case "scenarios":       return renderScenarios(data);
        case "comparison":      return renderComparison(data);
        case "scorecard_final": return renderFinalScorecard(data);
        default:                return "";  // "none" or an unknown type: render no visual
    }
}

Copy Report Function

The raw markdown of each section is stored in a reportTexts object. On copy, all opened sections are assembled into a single formatted text document with headers and separators.