📋 Project Overview & Problem Statement
Challenge: A coffee shop owner receives dozens of reviews per week across Twitter, Instagram, Facebook, and Google, which adds up to hundreds per quarter. The raw feed is impractical to read end-to-end, hard to compare channel by channel, and hard to prioritise. Was that bad week a Banana Bread problem, a Facebook problem, or neither? By the time the pattern becomes visible, a month of customer confidence may already have leaked.
Solution: The Public Sentiment Collection Agent runs social-channel review data through a 5-agent AI pipeline — listening, sentiment classification, visualisation, export, and packaging. This showcase demonstrates the pipeline applied to 500 coffee shop reviews from Q1 2024. Output: ranked product list, channel-by-channel net-sentiment scores, flagged SKUs where negative outweighs positive, and a 60-day action plan — in under 10 minutes.
🚨 Why Channel-Level Analysis Matters
Example: a mid-sized Malaysian coffee shop, 500 reviews in Q1 2024
❌ WITHOUT channel segmentation: "50% positive" — sounds fine, hides which channel is bleeding reputation
✅ WITH channel segmentation:
- Google: +27.6 net sentiment, 55% positive — strong
- Facebook: +4.2 net sentiment, 41% negative — actively bleeding reputation
- Two products (Banana Bread, Matcha Latte) have negative reviews outnumbering positive
- The signal points to 2 SKUs and 1 channel to fix, not "the coffee shop is bad"
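The contrast above comes down to a single group-by. Here is a minimal sketch with pandas, assuming a review table with hypothetical `channel` and `sentiment` columns (the real column names in the demo dataset are not stated here). It uses a tiny made-up sample rather than the Q1 2024 data:

```python
import pandas as pd

# Tiny made-up sample; the column names are assumptions for illustration.
reviews = pd.DataFrame({
    "channel": ["Google"] * 4 + ["Facebook"] * 4,
    "sentiment": ["positive", "positive", "positive", "neutral",
                  "positive", "negative", "negative", "neutral"],
})

def positive_share(frame: pd.DataFrame) -> float:
    """Percentage of reviews labelled positive."""
    return float((frame["sentiment"] == "positive").mean() * 100)

# Aggregate view: one number for the whole shop.
overall = positive_share(reviews)

# Segmented view: the same metric, split by channel.
by_channel = (
    reviews.assign(is_positive=reviews["sentiment"].eq("positive"))
    .groupby("channel")["is_positive"]
    .mean()
    .mul(100)
)

print(f"overall: {overall:.0f}% positive")
print(by_channel.round(0).to_dict())
```

In this toy sample the overall figure looks unremarkable while the two channels sit far apart, which is the same effect the coffee shop case shows at larger scale.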
Key Benefits
- Channel-level segmentation: Separate Google from Facebook, because the two tell very different stories
- Product-level drill-down: Find the specific SKUs where negative sentiment exceeds positive
- Net sentiment score per channel: Single KPI for monthly tracking and trend detection
- Automatic flagging: Surfaces products / channels needing urgent attention
- Comprehensive output: 10 files per run (1 markdown report, 4 charts, 5 CSV exports)
🔍 Channel Behaviour & Net-Sentiment Scoring
Channel Behaviour Classification
The system automatically classifies each social channel's behavioural pattern — in the coffee shop case:
- Google: Long-form, effort-required review — tends to reward positive experiences (55% positive, +27.6 net)
- Instagram: Visual and casual — mostly positive (52% positive, +17.9 net)
- Twitter: Short-form and polarised, but positive on balance (50% positive, +15.6 net)
- Facebook: Complaint-friendly — harshest channel (45% positive, 41% negative, +4.2 net)
Automatic Issue Flags
The system issues warnings when a channel, category, or product signals distress — examples from the coffee shop case:
⚠️ Product negative outweighs positive: Banana Bread (50% neg, 33% pos)
⚠️ Product negative outweighs positive: Matcha Latte (47% neg, 40% pos)
⚠️ Channel net sentiment below +10: Facebook (+4.2 — active reputation repair needed)
⚠️ Category net sentiment below +15: Cold Coffee & Iced Drinks (+12)
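The flag rules above can be sketched as plain threshold checks. This is an illustration rather than the pipeline's actual code: the function name and the dictionary layout are hypothetical, while the thresholds (+10 for channels, +15 for categories) and the sample figures come from the warnings listed above.

```python
from typing import Dict, List

CHANNEL_NET_THRESHOLD = 10.0   # from the channel warning above
CATEGORY_NET_THRESHOLD = 15.0  # from the category warning above

def build_flags(
    products: Dict[str, Dict[str, float]],
    channel_net: Dict[str, float],
    category_net: Dict[str, float],
) -> List[str]:
    """Return warning strings (hypothetical helper).

    products maps name -> {"pos": % positive, "neg": % negative}.
    channel_net and category_net map name -> net sentiment score.
    """
    flags = []
    for name, share in products.items():
        if share["neg"] > share["pos"]:
            flags.append(f"Product negative outweighs positive: {name}")
    for name, net in channel_net.items():
        if net < CHANNEL_NET_THRESHOLD:
            flags.append(f"Channel net sentiment below +10: {name}")
    for name, net in category_net.items():
        if net < CATEGORY_NET_THRESHOLD:
            flags.append(f"Category net sentiment below +15: {name}")
    return flags

flags = build_flags(
    products={
        "Banana Bread": {"pos": 33, "neg": 50},
        "Matcha Latte": {"pos": 40, "neg": 47},
    },
    channel_net={"Google": 27.6, "Instagram": 17.9, "Twitter": 15.6, "Facebook": 4.2},
    category_net={"Cold Coffee & Iced Drinks": 12},
)
for flag in flags:
    print(flag)
```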
Net Sentiment Score Calculation
Net Sentiment Score per channel =
(Positive count − Negative count) / Total × 100
🟢 +20 and above: Healthy (channel ambassadors — amplify content)
🟡 +10 to below +20: Monitor (watch the trend, respond to complaints)
🔴 Below +10: Active repair (reply daily, tactical recovery playbook)
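The formula and bands above translate directly into two small functions. This is a sketch, not the project's implementation: the function names are hypothetical, the total is assumed to include neutral reviews (as the Facebook figures of 45% positive and 41% negative imply), and every score from +10 up to, but not including, +20 is treated as Monitor.

```python
def net_sentiment(positive: int, negative: int, total: int) -> float:
    """(positive - negative) / total * 100, where total includes neutral reviews."""
    if total <= 0:
        raise ValueError("total must be a positive number of reviews")
    return (positive - negative) / total * 100

def health_band(score: float) -> str:
    """Map a net sentiment score to the three bands listed above."""
    if score >= 20:
        return "Healthy"
    if score >= 10:
        return "Monitor"
    return "Active repair"

# Made-up counts for illustration: 45 positive, 41 negative, 14 neutral.
score = net_sentiment(positive=45, negative=41, total=100)
print(f"{score:+.1f} -> {health_band(score)}")
```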
🛠️ Technical Architecture & Implementation
AI & Analytics Stack
Google Gemini 2.0 Flash
Tavily Search API
Python 3.9+
Pandas 2.0+
Matplotlib
Seaborn
Multi-Agent Framework
5 Specialized Agents
Web Search Integration
NLP Sentiment Analysis
Data Quality Scoring
Auto Visualization
Deployment Options
Google Colab
Jupyter Notebook
Local Python
Streamlit (Optional)
System Architecture
Pipeline Flow:
1. Geographic Listening → Web search with location filters
2. Source Analysis → Diversity tracking & bias detection
3. Sentiment Analysis → Gemini AI with cultural context
4. Credibility Scoring → Quality assessment (0-100)
5. Visualization → 4 professional charts
6. Data Export → 5 CSV files for Excel/Sheets
7. Report Packaging → Executive markdown report
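One way to picture the flow above is as a list of stage functions applied in order, each receiving and returning a shared context dictionary. The sketch below is purely illustrative: the stage bodies are stubs that only record that they ran, and none of the function names are taken from the project's code.

```python
from typing import Callable, Dict, List

Context = Dict[str, object]
Stage = Callable[[Context], Context]

# Stage names copied from the pipeline flow listed above.
STAGE_NAMES = [
    "Geographic Listening", "Source Analysis", "Sentiment Analysis",
    "Credibility Scoring", "Visualization", "Data Export", "Report Packaging",
]

def make_stub(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(ctx: Context) -> Context:
        done = list(ctx.get("completed", []))
        done.append(name)
        return {**ctx, "completed": done}
    return stage

def run_pipeline(stages: List[Stage], ctx: Context) -> Context:
    """Apply each stage in order, passing the context along."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx

result = run_pipeline([make_stub(n) for n in STAGE_NAMES], {"period": "Q1 2024"})
print(result["completed"])
```

Keeping each stage as a function of the context makes it straightforward to run a single stage on its own or to swap one out while testing.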
📖 Development Setup & Usage Guide
Quick Start
- Try the AI Assistant Demo: Click the "Launch AI Assistant Demo" button above for a zero-setup interactive walkthrough of the 5-agent pipeline
- Run the full pipeline locally: Clone the repo, then add GOOGLE_API_KEY and TAVILY_API_KEY to a .env file
- Install dependencies:
pip install -r requirements.txt
- Run Analysis:
python public_sentiment_collection_agent.py
- Download Results: Get markdown report, 4 charts, and 5 CSV files
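Before running the script, it can help to confirm that both keys are visible to Python. The check below uses only the standard library and assumes the keys have already been exported into the environment. How the project itself reads the .env file is not specified here, so no particular loader is assumed, and `missing_keys` is a hypothetical helper.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ["GOOGLE_API_KEY", "TAVILY_API_KEY"]

def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required key names that are unset or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]

missing = missing_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("Both API keys found.")
```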
Example Usage — coffee shop review case
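# Example: analyse coffee shop reviews across 4 social channels.
# Assumption: the function is importable from the script named in Quick Start.
from public_sentiment_collection_agent import run_enhanced_sentiment_pipeline

results = run_enhanced_sentiment_pipeline(
    review_dataset="coffee_shop_reviews_q1_2024.csv",
    channels=["Twitter", "Instagram", "Facebook", "Google"],
    period="Q1 2024",
    output_dir=".",
)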
# Output:
# - voice_of_customer_report_q1_2024.md
# - channel_sentiment_comparison.png
# - product_performance_ranking.png
# - category_heatmap.png
# - monthly_volume_trend.png
# - channel_sentiment.csv
# - category_performance.csv
# - flagged_products.csv
# - customer_reviewers.csv
# - monthly_volume.csv
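After a run, a quick way to confirm that all ten files were written is to compare the output directory against the list above. This is a small standard-library sketch: `missing_outputs` is a hypothetical helper, and the file names are copied from the listing for the Q1 2024 example (the report name may differ for other periods).

```python
from pathlib import Path
from typing import List

# File names copied from the example output listing above.
EXPECTED_OUTPUTS = [
    "voice_of_customer_report_q1_2024.md",
    "channel_sentiment_comparison.png",
    "product_performance_ranking.png",
    "category_heatmap.png",
    "monthly_volume_trend.png",
    "channel_sentiment.csv",
    "category_performance.csv",
    "flagged_products.csv",
    "customer_reviewers.csv",
    "monthly_volume.csv",
]

def missing_outputs(output_dir: str = ".") -> List[str]:
    """Return the expected file names not present in output_dir."""
    base = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (base / name).is_file()]

print(missing_outputs("."))
```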
Required API Keys
- Google AI Studio API: Get from Google AI Studio (free tier available)
- Tavily Search API: Get from Tavily (free 1,000 searches/month)
⚠️ Limitations & Disclaimers
Data Collection Limitations
- Self-selection bias: Reviewers are customers motivated to post — silent majority is not represented
- Channel skew: Each platform has its own review culture (Google rewards effort, Facebook rewards complaint)
- Temporal snapshot: Sentiment reflects the period captured — can shift quickly after menu changes or incidents
- Language coverage: Primarily English and Malay reviews, so opinions in other languages are underrepresented in the demo dataset
- Product attribution: Reviews without a specified product roll up to "general" — limits SKU-level precision
Sentiment Classification Challenges
- Sarcasm and irony can flip classification (e.g. a sarcastic "Great, just what I wanted" classified as positive)
- Short reviews (< 5 words) give lower-confidence signals
- Service and product feedback can be conflated: the demo pipeline tracks sentiment by product but does not track service separately
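The short-review caveat above is easy to make operational by marking such reviews for lower weighting or manual checking. A minimal sketch follows, assuming words are counted by splitting on whitespace; the helper name is hypothetical and the 5-word cut-off is the one quoted above.

```python
def is_low_confidence(review: str, min_words: int = 5) -> bool:
    """True when the review has fewer than min_words whitespace-separated words."""
    return len(review.split()) < min_words

samples = ["Too sweet", "Great, just what I wanted", ""]
for text in samples:
    label = "low confidence" if is_low_confidence(text) else "ok"
    print(f"{text!r}: {label}")
```

Note that a length check does nothing for sarcasm: the five-word example from the list above passes the cut-off and would still need a separate review.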
Recommended Use
✅ Good for: Trend detection, SKU flagging, channel health monitoring, monthly voice-of-customer reporting
⚠️ Caution for: Attributing cause to single reviews, punishing individual staff based on reviews, forecasting revenue from sentiment alone
❌ Not for: Statistical inference about silent non-reviewing customers