📋 Project Overview & Problem Statement
Challenge: Marketing teams struggle with manual customer segmentation, spending days analyzing spreadsheets to identify VIP customers, at-risk segments, and campaign targets. This process is time-consuming, error-prone, and requires SQL/Python expertise that many marketers lack.
Solution: Customer Segmentation and Market Agent uses Google Gemini AI to convert natural language questions into executable Python code. It automatically creates customer profiles, performs RFM (Recency, Frequency, Monetary) analysis, and generates targeted marketing campaigns - all through simple English queries.
Key Benefits
- Zero Coding Required: Ask questions in plain English, get instant insights
- Automatic Segmentation: VIP, Regular, At-Risk, New, and Churned customers identified automatically
- Campaign Creation: Generate targeted marketing campaigns with precise customer lists
- RFM Analysis: Track recency, frequency, and monetary value for every customer
- Transparent Operations: See generated code and database changes in real-time
🛠️ Technical Architecture & Implementation
AI & Code Generation
Google Gemini 2.0 Flash
Code-as-Plan Pattern
Natural Language Processing
Dynamic Query Building
Safe Code Execution
Data Analytics Stack
TinyDB NoSQL
Pandas DataFrames
Matplotlib Charts
Seaborn Visualizations
NumPy Calculations
Deployment Platform
Google Colab
Jupyter Notebook
Cloud Storage
Secrets Management
Customer Segmentation Logic
# Automatic segmentation rules:
VIP: total_spent > $2,000 AND total_orders >= 5
Regular: total_spent between $500-$2,000
At-Risk: days_since_last_order > 180
New: total_orders <= 2
Churned: days_since_last_order > 365
# Each customer profile tracks:
- customer_id, customer_name
- total_orders, total_spent, total_profit_contributed
- average_order_value
- first_order_date, last_order_date, days_since_last_order
- favorite_category, segment, tags
📖 Development Setup & Installation Guide
Quick Start with Google Colab (Recommended)
- Open Colab Notebook: Click "Launch in Google Colab" button above
- Add API Key: Click 🔑 icon in left sidebar → Add Secret → Name: GEMINI_API_KEY
- Run Setup Cells: Install dependencies (TinyDB, Pandas, Gemini)
- Upload CSV: Upload your e-commerce transaction dataset
- Ask Questions: Use marketing_agent() function with natural language queries
Dataset Requirements
Your CSV should include these columns:
- Order_ID, Order_Date, Customer_Name
- Category, Product_Name, Quantity, Unit_Price
- Revenue, Profit, Region, City, State
Local Installation (Alternative)
# Clone the repository
git clone https://github.com/lyven81/ai-project.git
cd ai-project/projects/customer-segmentation-agent
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Add your Gemini API key to .env
# Run in Jupyter
jupyter notebook customer_segmentation_and_market_agent.py
Environment Configuration
# Required API Configuration
GEMINI_API_KEY=your_gemini_api_key_here
# Optional Segmentation Rules
VIP_MIN_SPENT=2000
VIP_MIN_ORDERS=5
AT_RISK_DAYS_THRESHOLD=180
CHURNED_DAYS_THRESHOLD=365
🚀 Deployment Options & Usage Workflow
Google Colab Deployment (Primary Method)
# 1. Upload dataset CSV
uploaded = files.upload()
df = pd.read_csv('your_dataset.csv')
# 2. Create database (3 tables)
db, orders_tbl, customers_tbl, campaigns_tbl = create_marketing_database(df)
# 3. Ask questions in natural language
result = marketing_agent(
"Who are our top 10 VIP customers by total spending? Show in bar chart",
db=db,
orders_tbl=orders_tbl,
customers_tbl=customers_tbl,
campaigns_tbl=campaigns_tbl,
)
# 4. View generated code, execution logs, and results
# The agent shows:
# - Question interpretation
# - Generated Python/TinyDB code
# - Before/after database snapshots
# - Answer with visualization
Workflow Example
- Step 1: Upload CSV with transaction history
- Step 2: Database auto-creates orders_tbl from CSV
- Step 3: Ask "Create customer profiles for all customers"
- Step 4: Agent populates customers_tbl with RFM metrics
- Step 5: Ask "How many VIP customers do we have?"
- Step 6: Create campaigns: "Target At-Risk customers in East region"
Output & Transparency
- Generated Code: See exact Python code before execution
- Database Snapshots: Before/after views of customers_tbl and campaigns_tbl
- Execution Logs: Debug output for transparency
- Visualizations: Bar charts, pie charts, segment distributions