An AI-powered forensic investigation tool that helps police officers analyze telecom data (CDR, IPDR, Tower Dumps) through natural language conversation. Built for the Police Hackathon.
We use a ReAct-style autonomous agent powered by a large language model that can reason, plan, and execute forensic analysis tools in a loop.
gpt-5.4-mini via OpenAI-compatible API
ReAct Tool-Calling Agent: reasons about the question, picks tools, reads results, and chains up to 8 iterations
Prompted as a forensic analyst specialising in Indian telecom data with knowledge of +91 formats, tower IDs, and investigation procedures
Built-in retry detection: stops after 2 failures on the same tool and force-answers after 3 total errors
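The loop and its limits can be sketched in a few lines. This is a minimal illustration and not the code in agent.py: the `decide` callback stands in for the LLM call, and `run_agent`, `force_answer`, and the constant names are hypothetical. It also assumes that "stops after 2 failures on the same tool" means the tool is no longer invoked, which is one possible reading of the rule.

```python
from collections import Counter

MAX_ITERATIONS = 8      # "chains up to 8 iterations"
MAX_TOOL_FAILURES = 2   # assumption: a tool is no longer invoked after this many failures
MAX_TOTAL_ERRORS = 3    # after this many errors the loop ends with a forced answer


def force_answer(history):
    """Fallback answer built only from the tool calls that succeeded."""
    ok = [name for name, succeeded, _ in history if succeeded]
    return "Partial answer based on: " + (", ".join(ok) if ok else "no tool results")


def run_agent(question, decide, tools):
    """Run a ReAct-style loop.

    decide(question, history, blocked) is a hypothetical stand-in for the LLM.
    It returns ("tool", name, args) or ("answer", text).
    """
    history = []            # entries: (tool name, succeeded, result or error text)
    failures = Counter()    # failures per tool
    total_errors = 0
    for _ in range(MAX_ITERATIONS):
        blocked = {n for n, count in failures.items() if count >= MAX_TOOL_FAILURES}
        step = decide(question, history, blocked)
        if step[0] == "answer":
            return step[1]
        _, name, args = step
        if name in blocked:
            # Tell the model the tool is unavailable instead of calling it again.
            history.append((name, False, "tool disabled after repeated failures"))
            continue
        try:
            history.append((name, True, tools[name](**args)))
        except Exception as exc:
            failures[name] += 1
            total_errors += 1
            history.append((name, False, str(exc)))
            if total_errors >= MAX_TOTAL_ERRORS:
                break
    return force_answer(history)
```

Passing the `blocked` set back to `decide` lets the model pick a different tool rather than repeating a call that has already failed twice.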
The agent has access to 9 specialized tools. Tools 7-9 work across all 3 datasets simultaneously.
| # | Tool | Datasets | What It Does |
|---|---|---|---|
| 1 | filter_records | Single | Filter any dataset by phone number, time range, or tower ID |
| 2 | find_connections | CDR | Find direct call links between a set of suspect numbers |
| 3 | cross_reference | CDR+IPDR+Tower | Build a full profile of one number across all datasets |
| 4 | tower_lookup | Tower | Find all devices near a specific cell tower in a time window |
| 5 | flag_anomaly | Single | Detect high call volume (50+), burner phones, IMEI swaps, burst calling |
| 6 | generate_summary | CDR+IPDR+Tower | Generate structured case report with suspect profiles |
| 7 | correlate_suspects | ALL 3 | Find numbers appearing in multiple datasets, rank by suspiciousness, filter by tower/time |
| 8 | timeline | ALL 3 | Chronological timeline of calls, internet, tower pings for a suspect |
| 9 | deep_analysis | ALL 3 | Crime scene analysis: phones near tower + mutual calls + IMEI cross-match |
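As an illustration of what a tool such as find_connections computes, the sketch below counts direct calls between pairs of suspect numbers. It is written in plain Python over a list of dicts so that it runs without dependencies, whereas the project's tools are pandas-based. The `a_party_number` column comes from the CDR description; `b_party_number` and `duration` are assumed column names, and the return shape is hypothetical.

```python
from collections import defaultdict


def find_connections(cdr_rows, suspects):
    """Return {(number_1, number_2): {"calls": n, "total_duration": s}} for suspect pairs.

    cdr_rows: iterable of dicts. "b_party_number" and "duration" are assumed
    column names, not confirmed by the project documentation.
    """
    suspects = set(suspects)
    links = defaultdict(lambda: {"calls": 0, "total_duration": 0})
    for row in cdr_rows:
        a, b = row["a_party_number"], row["b_party_number"]
        # Keep only calls where both ends are suspects; ignore self-calls.
        if a in suspects and b in suspects and a != b:
            pair = tuple(sorted((a, b)))  # treat A->B and B->A as the same link
            links[pair]["calls"] += 1
            links[pair]["total_duration"] += row.get("duration", 0)
    return dict(links)


rows = [
    {"a_party_number": "+919800000001", "b_party_number": "+919800000002", "duration": 60},
    {"a_party_number": "+919800000002", "b_party_number": "+919800000001", "duration": 30},
    {"a_party_number": "+919800000001", "b_party_number": "+919800000009", "duration": 10},
]
links = find_connections(rows, ["+919800000001", "+919800000002"])
```

Sorting each pair makes the link undirected, which suits a network graph where an edge means "these two numbers were in contact".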
CDR (Call Detail Records): Who called whom, when, call duration, tower used, IMEI. Auto-detected by the a_party_number column.
IPDR (Internet Protocol Detail Records): Internet sessions, source/destination IPs, services accessed, data transferred. Auto-detected by the source_ip or destination_ip column.
Tower Dump: All phones that pinged a cell tower in a time window, with IMEI, signal type, location. Auto-detected by the time_of_activity column.
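The column-based detection described above can be sketched as a small function. The marker columns are the ones named in the text; the function name, the order in which the checks run, and the lowercase/whitespace normalisation are assumptions made for this example.

```python
def classify_dataset(columns):
    """Guess the dataset type from its column names.

    Returns "CDR", "IPDR", "TOWER", or None when no marker column is present.
    """
    # Assumption: headers are compared case-insensitively with whitespace trimmed.
    cols = {str(c).strip().lower() for c in columns}
    if "a_party_number" in cols:
        return "CDR"
    if "source_ip" in cols or "destination_ip" in cols:
        return "IPDR"
    if "time_of_activity" in cols:
        return "TOWER"
    return None


kind = classify_dataset(["A_Party_Number", "b_party_number", "imei"])
```

With a pandas DataFrame the same call would take `df.columns` as its argument. Returning `None` for unrecognised files lets the upload endpoint report an error instead of guessing.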
| Layer | Technology | Purpose |
|---|---|---|
| Backend | FastAPI (Python) | REST API server, file upload, chat endpoint, PDF export |
| Server | Uvicorn | ASGI server running the FastAPI app |
| AI Client | OpenAI SDK | Communicates with LLM via OpenAI-compatible API |
| Data Processing | pandas | All filtering, grouping, anomaly detection on DataFrames |
| PDF Export | fpdf2 | Generates investigation reports as PDF |
| Excel Support | openpyxl | Reads .xlsx/.xls uploads |
| Frontend | HTML/CSS/JS | Single-page app with dark forensic theme |
| Network Graph | vis.js | Interactive node-edge graph for call networks |
| Charts | Chart.js | Bar charts for activity visualization |
| Config | python-dotenv | Loads API keys from .env file |
Agent chains up to 8 tool calls per query: it cross-references, then checks anomalies, then builds a timeline automatically.
Detects IMEI swaps across tower and CDR data β flags suspects using multiple devices.
Deep analysis around a tower: finds who was there, checks if they called each other, cross-matches IMEIs.
Network graphs showing call connections, bar charts for activity patterns, real-time stats.
One-click export of investigation findings as a structured PDF report for court submission.
Auto-classifies uploaded files as CDR, IPDR, or Tower by analyzing column names, so no manual selection is needed.
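The IMEI-swap check mentioned in this list amounts to collecting every IMEI seen for a number across CDR and tower data and flagging numbers with more than one. The sketch below uses plain Python dicts rather than the project's pandas code. The function name, the `msisdn` and `imei` column names, and the return shape are assumptions; only `a_party_number` appears in the documentation.

```python
from collections import defaultdict


def find_imei_swaps(cdr_rows, tower_rows,
                    cdr_number_col="a_party_number",
                    tower_number_col="msisdn",   # assumed column name
                    imei_col="imei"):            # assumed column name
    """Return {number: [imei, ...]} for numbers seen with more than one IMEI."""
    imeis = defaultdict(set)
    for rows, number_col in ((cdr_rows, cdr_number_col), (tower_rows, tower_number_col)):
        for row in rows:
            number, imei = row.get(number_col), row.get(imei_col)
            if number and imei:  # skip rows with a missing number or IMEI
                imeis[number].add(imei)
    return {number: sorted(seen) for number, seen in imeis.items() if len(seen) > 1}


cdr = [
    {"a_party_number": "+919800000001", "imei": "350000000000001"},
    {"a_party_number": "+919800000002", "imei": "350000000000002"},
]
tower = [
    {"msisdn": "+919800000001", "imei": "350000000000009"},
    {"msisdn": "+919800000002", "imei": "350000000000002"},
]
swaps = find_imei_swaps(cdr, tower)
```

Merging both sources before counting is what lets the check catch a swap that is invisible in either dataset alone, as in the example where each file shows only one IMEI per number.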
| File | Lines | Role |
|---|---|---|
| main.py | ~740 | FastAPI app: endpoints, visualization logic, PDF export |
| agent.py | ~170 | AI agent loop: LLM calls + tool execution + error recovery |
| tools.py | ~800 | All 9 forensic analysis tools (pandas-based) |
| tool_definitions.py | ~200 | OpenAI function-calling schemas for the 9 tools |
| templates/index.html | ~810 | Full frontend UI (HTML + CSS + JS) |
| templates/about.html | - | This page: project documentation |
| test_tools.py | ~150 | 43 automated tests for all 9 tools |
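For readers unfamiliar with function-calling schemas, an entry of the kind tool_definitions.py holds might look like the sketch below. The outer structure follows the OpenAI Chat Completions `tools` format. The parameter names (`dataset`, `phone_number`, `start_time`, `end_time`, `tower_id`) are hypothetical, inferred only from the filter_records description in the tools table, and may differ from the real file.

```python
import json

# Hypothetical schema for the filter_records tool; parameter names are assumptions.
FILTER_RECORDS_SCHEMA = {
    "type": "function",
    "function": {
        "name": "filter_records",
        "description": "Filter one dataset by phone number, time range, or tower ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "dataset": {
                    "type": "string",
                    "enum": ["cdr", "ipdr", "tower"],
                    "description": "Which uploaded dataset to filter.",
                },
                "phone_number": {"type": "string", "description": "Number to match, e.g. +91 format."},
                "start_time": {"type": "string", "description": "Start of the time range."},
                "end_time": {"type": "string", "description": "End of the time range."},
                "tower_id": {"type": "string", "description": "Cell tower identifier."},
            },
            "required": ["dataset"],
        },
    },
}

# The schema must be JSON-serialisable, since it is sent with each LLM request.
schema_json = json.dumps(FILTER_RECORDS_SCHEMA)
```

A list of such dicts is what gets passed as the `tools` argument of a chat completion request. The model then replies with the tool name and JSON arguments that the agent loop executes.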
All 9 tools are covered by 43 automated tests that run against the demo dataset (5000 CDR + 5000 IPDR + 5000 Tower records). Tests verify filtering, cross-referencing, anomaly detection, timeline building, deep analysis, and error handling.
Run tests: python test_tools.py