📊 Time Series Analysis Agent

AI-powered data analytics through natural language queries using Google Gemini

Python 3.8+ · Gemini 2.0 Flash · Pandas & NumPy · Natural Language · Auto Code Generation

📋 Project Overview & Problem Statement

Challenge: Business analysts and data scientists spend significant time writing repetitive data analysis code, translating business questions into SQL/Python queries, and creating visualizations. This manual process is time-consuming, error-prone, and requires deep technical expertise that many stakeholders lack.

Solution: Time Series Analysis Agent uses Google Gemini 2.0's code generation capabilities to convert natural language questions into executable Python/Pandas code, run the analysis, and return insights with visualizations, reducing analysis time from hours to seconds.

Key Benefits

🤖 AI Capabilities & Technical Innovation

💬 Natural Language Understanding

Gemini AI interprets complex business questions and translates them into precise Pandas operations, date filters, and aggregations.

🧠 Automated Code Generation

Generates production-ready Python code with proper error handling, data validation, and visualization logic.

📈 Time Series Analytics

Monthly/quarterly trends, YoY comparisons, rolling averages, seasonal patterns, and growth rate calculations.

📊 Auto Visualization

Creates charts (line, bar, scatter) with proper labels, legends, and formatting based on the query context.
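
As a rough illustration of the natural-language step described above, the sketch below pairs a user question with the DataFrame's column names and dtypes to form a single prompt. The function name `build_prompt`, the prompt wording, and the `Category` column are assumptions made for this example; the project's actual prompt is not shown in this README.

```python
from typing import Mapping


def build_prompt(question: str, schema: Mapping[str, str]) -> str:
    """Combine a user question with the DataFrame schema into one prompt string.

    `schema` maps column names to dtype strings, e.g. {"Revenue": "float64"}.
    The instruction wording below is illustrative, not the project's own.
    """
    column_lines = "\n".join(f"- {name} ({dtype})" for name, dtype in schema.items())
    return (
        "You are a data analyst. A pandas DataFrame named `df` has these columns:\n"
        f"{column_lines}\n\n"
        "Write Python code that answers the question below. "
        "Return only one fenced Python code block.\n\n"
        f"Question: {question}"
    )


# Hypothetical schema for illustration only.
example_schema = {
    "Order_Date": "datetime64[ns]",
    "Category": "object",
    "Revenue": "float64",
}
prompt = build_prompt(
    "Show me the monthly revenue trend for Electronics category in 2024",
    example_schema,
)
print(prompt)
```

With a real DataFrame, a schema mapping of this shape can be obtained with `df.dtypes.astype(str).to_dict()`.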

AI Processing Pipeline

💬 Sample Questions & Use Cases

Revenue & Profit Analysis

  • Show me the monthly revenue trend for Electronics category in 2024
  • Compare Q4 2023 vs Q4 2024 revenue across all regions
  • What is the trend of total profit and revenue during dataset period?
  • Does profit margin fluctuate seasonally throughout the year?

Product Performance

  • List top 10 most profitable products during dataset period
  • Which product category has the highest profit margin?
  • 5 products with highest declining profit margins between January-June 2023
  • Show profit of each product in Home & Furniture by quarter in line chart

Customer Insights

  • Show top 10 repeat customers by profit in bar chart
  • How has the number of active or repeat customers changed over time?
  • List top 5 most frequently purchasing customers and products they bought
  • Show top 10 one-time customers in December 2024 that bring highest profit

Correlation Analysis

  • Show scatter plot for correlation between purchase frequency and value
  • Show correlation between profit margin and sales for each product category
  • Can you give 10 products that show consistent profit margin?

🛠️ Technical Architecture & Implementation

AI & Analytics Stack

Google Gemini 2.0 Flash · Python 3.8+ · Pandas 2.0+ · NumPy · Matplotlib · Seaborn

Code Generation Framework

Natural Language Processing · Code Extraction (Regex) · Sandboxed Execution · Error Handling · Input Validation
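
The regex extraction and guarded execution steps named above could look roughly like the following sketch. The function names, the `result` variable convention, and the `"error"` status value are assumptions for illustration (only a `success` status appears in this README). Note that `exec()` with a separate namespace is not a true security sandbox.

```python
import re
import traceback
from typing import Any, Dict, Optional

# Matches a fenced block optionally tagged python/py.
# The `{3} quantifier avoids writing the fence characters literally.
CODE_BLOCK = re.compile(r"`{3}(?:python|py)?[ \t]*\n(.*?)`{3}", re.DOTALL)


def extract_code(response: str) -> Optional[str]:
    """Return the first fenced code block in a model response, or None."""
    match = CODE_BLOCK.search(response)
    return match.group(1).strip() if match else None


def run_generated_code(code: str, variables: Dict[str, Any]) -> Dict[str, Any]:
    """Execute code in a fresh namespace and report a status instead of raising.

    NOTE: this is NOT a real sandbox; untrusted code should run in an
    isolated process or container.
    """
    namespace = dict(variables)  # copy so the caller's dict is not mutated
    try:
        exec(code, namespace)
    except Exception:
        return {"status": "error", "error": traceback.format_exc(limit=1)}
    # Assumed convention: generated code stores its answer in `result`.
    return {"status": "success", "result": namespace.get("result")}


fence = "`" * 3
reply = (
    f"Here is the analysis:\n{fence}python\n"
    f"result = sum(values) / len(values)\n{fence}\nDone."
)
code = extract_code(reply)
outcome = run_generated_code(code, {"values": [10, 20, 30]})
print(outcome["status"], outcome["result"])  # prints: success 20.0
```

Returning a status dictionary rather than raising lets a caller feed the captured traceback back to the model for another attempt.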

Deployment Options

Google Colab · Jupyter Notebook · Streamlit (Optional) · Local Python

System Architecture

Analysis Pipeline:

Helper Functions Provided

    # Date filtering
    get_date_range(df, start, end) -> DataFrame
    get_quarter_data(df, year, quarter) -> DataFrame

    # Analytics
    calculate_profit_margin(df) -> Series
    get_top_n(df, column, metric, n) -> DataFrame
    detect_outliers(series, threshold) -> Series

    # Visualization
    plt.figure(figsize=(12, 6))
    plt.plot() / plt.bar() / plt.scatter()
    plt.show()
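
Only the helper signatures are listed here, so the bodies below are one plausible sketch rather than the project's implementation. It assumes an `Order_Date` column parseable as dates, a `Revenue` column, and a `Profit` column (the `Profit` and `Product` names are assumptions), and it interprets `get_top_n` as "sum the metric per group, then rank".

```python
import pandas as pd


def get_quarter_data(df: pd.DataFrame, year: int, quarter: int) -> pd.DataFrame:
    """Rows whose Order_Date falls in the given calendar quarter."""
    dates = pd.to_datetime(df["Order_Date"])
    return df[(dates.dt.year == year) & (dates.dt.quarter == quarter)]


def calculate_profit_margin(df: pd.DataFrame) -> pd.Series:
    """Profit as a percentage of Revenue ('Profit' column name is assumed)."""
    return df["Profit"] / df["Revenue"] * 100


def get_top_n(df: pd.DataFrame, column: str, metric: str, n: int) -> pd.DataFrame:
    """Top n groups of `column`, ranked by the summed `metric`."""
    totals = df.groupby(column, as_index=False)[metric].sum()
    return totals.nlargest(n, metric).reset_index(drop=True)


# Tiny synthetic dataset for illustration only.
sales = pd.DataFrame(
    {
        "Order_Date": ["2024-10-05", "2024-11-20", "2024-03-14", "2023-12-01"],
        "Product": ["A", "B", "A", "C"],
        "Revenue": [200.0, 100.0, 50.0, 400.0],
        "Profit": [50.0, 50.0, 12.5, 100.0],
    }
)
q4_2024 = get_quarter_data(sales, 2024, 4)          # the two 2024 Q4 rows
top_products = get_top_n(sales, "Product", "Revenue", 2)
margins = calculate_profit_margin(sales)
print(top_products)
```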

📖 Development Setup & Usage Guide

Quick Start with Google Colab (Recommended)

  1. Open Colab Notebook: Click the "Launch in Google Colab" button above
  2. Add API Key: Add GEMINI_API_KEY to Colab Secrets (🔑 icon in sidebar)
  3. Run Setup Cells: Install dependencies and import libraries
  4. Upload CSV: Upload your e-commerce transaction dataset
  5. Ask Questions: Call the time_series_agent() function with your questions

Local Installation

    # Clone the repository
    git clone https://github.com/lyven81/ai-project.git
    cd ai-project/projects/time-series-analysis-agent

    # Create virtual environment
    python -m venv venv
    source venv/bin/activate  # Windows: venv\Scripts\activate

    # Install dependencies
    pip install -r requirements.txt

    # Set up environment variables
    cp .env.example .env
    # Add your Gemini API key to .env

    # Run in Jupyter
    jupyter notebook time_series_analysis_agent.ipynb

Environment Configuration

    # Required API Configuration
    GEMINI_API_KEY=your_gemini_api_key_here

    # Optional Model Settings
    MODEL_NAME=gemini-2.0-flash-exp
    TEMPERATURE=0.2
    MAX_OUTPUT_TOKENS=8192
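
A minimal sketch of how these settings might be read in Python, assuming the `.env` values have already been loaded into the process environment. The `load_settings` helper is hypothetical; its defaults mirror the values listed above.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the agent settings from environment-style key/value pairs."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # The API key is the only required setting.
        raise ValueError("GEMINI_API_KEY is required")
    return {
        "api_key": api_key,
        "model_name": env.get("MODEL_NAME", "gemini-2.0-flash-exp"),
        "temperature": float(env.get("TEMPERATURE", "0.2")),
        "max_output_tokens": int(env.get("MAX_OUTPUT_TOKENS", "8192")),
    }


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings(
    {"GEMINI_API_KEY": "your_gemini_api_key_here", "TEMPERATURE": "0.3"}
)
print(settings["model_name"], settings["temperature"])
```

Failing fast on a missing key surfaces configuration mistakes before any request is sent.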

Dataset Requirements

Your CSV file should include these columns:

🎯 Advanced Features & Error Handling

Self-Correction System

The agent uses status codes to handle various scenarios:

Visualization Capabilities

Time Series Operations

    # Monthly resampling ('ME' = month-end; this alias needs pandas 2.2+, use 'M' on older versions)
    monthly = df.set_index('Order_Date').resample('ME')['Revenue'].sum()

    # Quarter filtering
    q4_2024 = get_quarter_data(df, 2024, 4)

    # Year-over-year comparison (percent change)
    yoy_growth = ((revenue_2024 - revenue_2023) / revenue_2023 * 100)

    # Rolling average over a 7-row window
    df['Revenue_MA7'] = df['Revenue'].rolling(window=7).mean()
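
To make these operations concrete, here is a small self-contained example on synthetic data (the numbers are invented for illustration). It groups by calendar month with `to_period("M")`, which behaves the same across pandas versions, and applies the year-over-year formula to yearly totals.

```python
import pandas as pd

# Synthetic orders for illustration only.
orders = pd.DataFrame(
    {
        "Order_Date": pd.to_datetime(
            ["2023-01-15", "2023-02-10", "2024-01-20", "2024-01-25", "2024-02-05"]
        ),
        "Revenue": [100.0, 200.0, 80.0, 70.0, 180.0],
    }
)

# Monthly totals keyed by calendar month.
monthly = orders.groupby(orders["Order_Date"].dt.to_period("M"))["Revenue"].sum()

# Yearly totals, then year-over-year growth in percent.
yearly = orders.groupby(orders["Order_Date"].dt.year)["Revenue"].sum()
yoy_growth = (yearly[2024] - yearly[2023]) / yearly[2023] * 100

# Rolling mean over the last 2 monthly totals (the first value is NaN).
rolling = monthly.rolling(window=2).mean()

print(f"YoY growth: {yoy_growth:.1f}%")  # prints: YoY growth: 10.0%
```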

📊 Performance Metrics & Business Impact

Analysis Time per Query: 5-10s
Code Generation Success: 100%
Max Dataset Size: 50MB
Question Types Supported

Business Value Demonstration

Technical Performance

🚀 Example Workflow

Step-by-Step Usage

    from google.colab import files  # available in Google Colab only
    import pandas as pd

    # 1. Upload your CSV file
    uploaded = files.upload()
    df = pd.read_csv('your_file.csv')

    # 2. Ask a question
    result = time_series_agent(
        question="Show me the monthly revenue trend for Electronics in 2024",
        df=df,
        temperature=0.3
    )

    # 3. Get results automatically:
    # - Generated Python/Pandas code
    # - Execution logs
    # - Text insights
    # - Visualizations (if requested)
    # - Data tables

Sample Output

    ✅ Analysis Result (Status: success)

    In 2024, Electronics generated $2.3M in revenue across 1,245 orders, showing 15% growth vs 2023. Top product was MacBook Air ($450K). The line chart above shows consistent upward monthly trend with peak in Q4 2024.

    [Line chart displayed automatically]