Bright Path Tuition Centre

Core Source Code & NL-to-SQL Implementation

Gemini 2.5 Flash FastAPI SQLite Python GitHub Pages

About This Code Showcase

This curated code showcase demonstrates how Bright Path Tuition Centre translates natural-language management questions into safe SQL queries against a 7-table database of 185 children, 22 specialist tutors, and 63 classes.

The snippets below highlight the config-driven schema, seeded demo data, prompt template, and SQL safety validation. Deployment scripts and API keys are omitted for clarity.

Source Files

config.yaml — Schema & Business Rules

The config file defines the full 7-table schema, business rules, and quick actions. The AI agent receives this as context in every query. Here are key tables:

config.yaml — Table Definitions (excerpt)
schema: tables: students: # 185 children across Preschool to P6 columns: - student_id: INTEGER PRIMARY KEY - name: TEXT - year_group: TEXT # Preschool, P1, P2, ... P6 - date_of_birth: DATE - enrolment_date: DATE - parent_contact: TEXT - status: TEXT # active, withdrawn, graduated tutors: # 22 specialist tutors with subject expertise columns: - tutor_id: INTEGER PRIMARY KEY - name: TEXT - subject_expertise: TEXT # Maths, English, BM, Science - hire_date: DATE - status: TEXT # active, on_leave, terminated assessments: # Score records per child per subject columns: - assessment_id: INTEGER PRIMARY KEY - student_id: INTEGER REFERENCES students - subject: TEXT - score: INTEGER # 0-100 - assessment_date: DATE - assessment_type: TEXT # PBD, UASA, internal business_rules: - "Always refer to learners as 'children', never 'students'" - "Year 4 is the MOE anchor year from 2026" - "P5/P6 children below P4 baseline score trigger escalation" - "Maximum class size is 10 children" - "Improvement = latest score minus previous score" - "Top improver threshold: 15+ marks between assessments"

seed_generator.py — Seeded Demo Stories

The seed generator builds a realistic database with 9 named children whose data tells specific stories — enabling meaningful demo queries:

seed_generator.py — Seeded Story Children
# 9 seeded children with specific learning trajectories # These stories make demo queries return meaningful results SEEDED_STORIES = [ { "name": "Ahmad Rizqi", "year_group": "P4", "story": "Year 4 anchor child — strong in BM, struggling in Maths", "scores": { "BM": [82, 85, 88], "Maths": [45, 48, 52], "English": [70, 72, 75], "Science": [65, 68, 70] } }, { "name": "Mei Ling Tan", "year_group": "P6", "story": "P6 escalation — below P4 baseline in Science, needs urgent help", "scores": { "BM": [60, 58, 55], "Maths": [55, 52, 50], "English": [65, 63, 60], "Science": [40, 38, 35] } }, { "name": "Darren Lim", "year_group": "P5", "story": "Top improver — jumped 20 marks in English after tutor switch", "scores": { "BM": [70, 72, 75], "Maths": [78, 80, 82], "English": [50, 55, 70], "Science": [72, 74, 76] } }, # ... 6 more seeded stories covering: # - Preschool readiness check # - Consistent high performer (P3) # - Withdrawn child (attendance drop) # - New enrolment (P1, only 1 assessment) # - Multi-subject weakness (P4 anchor concern) # - Tutor quality comparison anchor child ] def seed_database(db_path: str): """Build data.db with 185 children (9 seeded + 176 random)""" conn = sqlite3.connect(db_path) # Insert seeded children first for child in SEEDED_STORIES: insert_student(conn, child) insert_assessments(conn, child) # Fill remaining slots with randomised children for i in range(176): insert_random_student(conn, i) conn.commit() conn.close()

main.py — FastAPI App & NL-to-SQL

The FastAPI backend receives natural language questions, constructs a Gemini prompt with the full schema context, generates SQL, validates it for safety, and returns formatted results:

main.py — Prompt Template
SYSTEM_PROMPT = """You are a data assistant for Bright Path Tuition Centre. You translate natural language questions into SQLite SQL queries. IMPORTANT RULES: - Always refer to learners as "children", never "students" - Year 4 (P4) is the MOE anchor year from 2026 - P5/P6 children scoring below P4 baseline (60%) trigger escalation - Maximum class size is 10 children - Improvement = latest score - previous score - Top improver threshold: 15+ marks between assessments - Always use SELECT only — never INSERT, UPDATE, DELETE, or DROP DATABASE SCHEMA: {schema_context} Given the user's question, return ONLY a valid SQLite SELECT query. Do not include explanations or markdown formatting. """ async def generate_sql(question: str) -> str: """Send question + schema to Gemini, get SQL back""" model = genai.GenerativeModel("gemini-2.5-flash") response = model.generate_content( SYSTEM_PROMPT.format(schema_context=schema_context) + f"\n\nQuestion: {question}", generation_config=genai.GenerationConfig( temperature=0.0, max_output_tokens=500 ) ) return response.text.strip()
main.py — SQL Safety Validation
def is_safe_sql(sql): """Block anything that isn't a read-only SELECT or CTE (WITH ... SELECT).""" upper = sql.upper().strip().rstrip(";").strip() # Allow both plain SELECT and WITH...SELECT (CTE pattern) if not (upper.startswith("SELECT") or upper.startswith("WITH")): return False # Word-boundary check so 'created_at' doesn't trip 'CREATE' dangerous = ["INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE", "EXEC", "REPLACE", "ATTACH"] for keyword in dangerous: if re.search(r"\b" + keyword + r"\b", upper): return False return True @app.post("/ask") async def ask_question(request: QuestionRequest): """Main endpoint: question in, answer out""" sql = await generate_sql(request.question) if not is_safe_sql(sql): return {"error": "Generated query failed safety check"} # Execute validated SQL against data.db conn = sqlite3.connect("data.db") cursor = conn.execute(sql) columns = [desc[0] for desc in cursor.description] rows = cursor.fetchall() conn.close() # Format results for the frontend results = [dict(zip(columns, row)) for row in rows] return {"sql": sql, "results": results, "count": len(results)}

Technical Implementation Notes

Key Design Decisions

Why SQLite?