
Argus

The Problem

GDELT is a public database that monitors the world’s news in real time — 300+ sources, updated every 15 minutes, hundreds of thousands of events logged every day. The problem is that most of what it flags as conflict isn’t: crime reports, court cases, and sports stories that got mislabeled. A map built directly on that data is useless. It’s just noise.

What I Did

I needed to make that data usable, which meant deciding what to filter, how, and in what order. Drawing on Palantir’s open-source documentation for the pipeline architecture, I designed a layered approach: simple rules handle the obvious cases first (fast and free), then an AI model reads the source article and scores the genuinely ambiguous events before making a call. Sequencing it that way kept costs low and made every decision traceable: you can always see why something was included or rejected.
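To make the layering concrete, here is a minimal sketch of the idea rather than the production pipeline. The event shape, the keyword lists, the 0.5 threshold, and the injected `classify` function are all assumptions made for illustration; the real rules and model call are not shown here.

```typescript
// Hypothetical event shape; field names are illustrative, not GDELT's schema.
interface RawEvent {
  id: string;
  headline: string;
  sourceUrl: string;
}

// Every decision records which stage made it and why, so it can be audited.
interface Decision {
  eventId: string;
  verdict: "accept" | "reject";
  stage: "rules" | "classifier";
  reason: string;
}

// Placeholder keyword lists (assumed, not the project's actual rules).
const REJECT_TERMS = ["court", "trial", "tournament", "robbery"];
const ACCEPT_TERMS = ["airstrike", "shelling", "artillery"];

// Stage 1: cheap rules. Returns null when the rules cannot decide.
function applyRules(event: RawEvent): Decision | null {
  const words = new Set(event.headline.toLowerCase().split(/[^a-z]+/));
  const rejectHit = REJECT_TERMS.find((t) => words.has(t));
  if (rejectHit) {
    return { eventId: event.id, verdict: "reject", stage: "rules", reason: `matched reject term "${rejectHit}"` };
  }
  const acceptHit = ACCEPT_TERMS.find((t) => words.has(t));
  if (acceptHit) {
    return { eventId: event.id, verdict: "accept", stage: "rules", reason: `matched accept term "${acceptHit}"` };
  }
  return null;
}

// Stage 2: the model call is injected, so the sketch runs without any API.
type Classifier = (event: RawEvent) => Promise<{ relevance: number; rationale: string }>;

async function filterEvents(
  events: RawEvent[],
  classify: Classifier,
  threshold = 0.5, // assumed cut-off
): Promise<Decision[]> {
  const decisions: Decision[] = [];
  for (const event of events) {
    const ruled = applyRules(event);
    if (ruled) {
      decisions.push(ruled); // obvious case: no model call, no cost
      continue;
    }
    const { relevance, rationale } = await classify(event);
    decisions.push({
      eventId: event.id,
      verdict: relevance >= threshold ? "accept" : "reject",
      stage: "classifier",
      reason: `relevance ${relevance.toFixed(2)}: ${rationale}`,
    });
  }
  return decisions;
}
```

Because the classifier only runs when `applyRules` returns null, the expensive step is reserved for the ambiguous remainder, and the `stage` and `reason` fields keep each verdict explainable.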

I also recognized that checking a media report against more media reports is a weak form of verification. So I brought in NASA satellite thermal data as a second, independent source. When a reported explosion lines up with a heat signature at the same location and time, that’s physical evidence — not just another headline saying the same thing. The interface labels every event by its evidence type so there’s no ambiguity about what you’re looking at.
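As a rough illustration of that matching step, the sketch below pairs a reported event with the nearest thermal detection inside a distance and time window. The field names, the 5 km radius, and the 12-hour window are assumptions chosen for the example, not the values Argus uses.

```typescript
// Hypothetical shapes; the field names are illustrative only.
interface ReportedEvent {
  id: string;
  lat: number;
  lon: number;
  time: Date;
}

interface ThermalDetection {
  lat: number;
  lon: number;
  time: Date;
}

const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two points via the haversine formula.
function haversineKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const sinLat = Math.sin(toRad(lat2 - lat1) / 2);
  const sinLon = Math.sin(toRad(lon2 - lon1) / 2);
  const a = sinLat * sinLat + Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * sinLon * sinLon;
  // Clamp to 1 to guard against floating-point overshoot before asin.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(a)));
}

// Returns the closest detection within both windows, or null if none qualifies.
function findCorroboration(
  event: ReportedEvent,
  detections: ThermalDetection[],
  maxKm = 5, // assumed spatial tolerance
  maxHours = 12, // assumed temporal tolerance
): ThermalDetection | null {
  const maxMs = maxHours * 60 * 60 * 1000;
  let best: ThermalDetection | null = null;
  let bestKm = Infinity;
  for (const d of detections) {
    const gapMs = Math.abs(d.time.getTime() - event.time.getTime());
    if (gapMs > maxMs) continue; // outside the time window
    const km = haversineKm(event.lat, event.lon, d.lat, d.lon);
    if (km <= maxKm && km < bestKm) {
      best = d;
      bestKm = km;
    }
  }
  return best;
}
```

A non-null result is what would let the interface label an event as satellite-corroborated; a null result leaves it as media-reported only.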

I owned the full scope from problem definition to deployment. It runs on a 15-minute automated cycle and is live at argusosint.vercel.app.

Argus: live global conflict event map with severity clustering and real-time data

Data Sources

GDELT 2.0: 300+ news sources processed every 15 minutes; the primary event ingestion layer, filtered down from hundreds of thousands of daily records
UCDP GED: Peer-reviewed fatality dataset from Uppsala University, fused with GDELT signals so analysts can distinguish validated records from raw intelligence
NASA FIRMS: Near real-time satellite thermal imagery, matched to reported kinetic events by coordinates and timestamp for physical corroboration independent of media

Stack

React 19 + Mapbox GL JS: Frontend analyst interface with a live clustered event map, source-badged feed, and detail panel showing the full AI classification breakdown
Express.js: API layer serving filtered events and satellite corroboration data
Claude Haiku: AI classifier for ambiguous events. Scores credibility, severity, specificity, novelty, and conflict relevance, and returns auditable output
GitHub Actions: 15-minute cron for automated event ingestion, pipeline execution, and satellite matching
Vercel + Vercel Blob: Deployment and filtered event persistence for predictable frontend latency
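
One way to keep the classifier's output auditable is to validate it against a fixed shape before it enters the pipeline. The sketch below is a guess at what that could look like: the five score names come from the list above, but the JSON layout, the 0 to 1 score range, and the `rationale` field are assumptions.

```typescript
// The five dimensions named in the stack description; key spelling is assumed.
const SCORE_KEYS = ["credibility", "severity", "specificity", "novelty", "conflictRelevance"] as const;
type ScoreKey = (typeof SCORE_KEYS)[number];

interface Classification {
  scores: Record<ScoreKey, number>; // each assumed to lie in [0, 1]
  rationale: string; // free-text explanation kept for the audit trail
}

// Parses a model response and rejects anything that does not fit the shape.
function parseClassification(raw: string): Classification | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const obj = data as Record<string, unknown>;
  const rationale = obj["rationale"];
  const rawScores = obj["scores"];
  if (typeof rationale !== "string") return null;
  if (typeof rawScores !== "object" || rawScores === null) return null;

  const scores = {} as Record<ScoreKey, number>;
  for (const key of SCORE_KEYS) {
    const value = (rawScores as Record<string, unknown>)[key];
    if (typeof value !== "number" || value < 0 || value > 1) return null;
    scores[key] = value;
  }
  return { scores, rationale };
}
```

Rejecting malformed responses outright, instead of patching them, means every stored classification has all five scores and a rationale that the detail panel can display.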