Multilingual news API: structured news data beyond English
Most news APIs are English-first and hand you raw text. NewsAgent Data goes deep on Russian and English — fully scored and classified — and tags coverage across more than a dozen languages and 190+ countries. Here's how it works, and where the enrichment line is.
Why multilingual is harder than a language list
Plenty of APIs advertise "40+ languages." The catch is depth: shallow keyword coverage with no scoring, no classification, and language detection that quietly mislabels Latin-script languages as English or Cyrillic as Russian — polluting your filters. Doing it properly means tagging language at the source level and keeping each language's data honest.
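To make the mislabeling concrete, here is a minimal sketch of how script-based guessing collapses distinct languages, and how a source-level tag avoids it. The function names and the naive script-to-language rule are hypothetical illustrations, not NewsAgent Data's actual detection code.

```python
import unicodedata
from collections import Counter
from typing import Optional


def dominant_script(text: str) -> str:
    """Return the most common script among the letters in text, e.g. 'LATIN'."""
    scripts = Counter(
        # Unicode character names begin with the script: "CYRILLIC SMALL LETTER A".
        unicodedata.name(ch, "UNKNOWN").split(" ")[0]
        for ch in text
        if ch.isalpha()
    )
    return scripts.most_common(1)[0][0] if scripts else "UNKNOWN"


# Hypothetical naive rule: one language per script. This is the failure mode.
NAIVE_SCRIPT_TO_LANGUAGE = {"LATIN": "en", "CYRILLIC": "ru"}


def naive_language(text: str) -> str:
    """Guess a language from script alone (deliberately too coarse)."""
    return NAIVE_SCRIPT_TO_LANGUAGE.get(dominant_script(text), "unknown")


def tagged_language(text: str, source_language: Optional[str]) -> str:
    """Prefer the language tagged on the source; guess only when it is missing."""
    return source_language or naive_language(text)


spanish = "Terremoto de magnitud 6 sacude la costa"
bulgarian = "Земетресение разтърси страната"

print(naive_language(spanish), naive_language(bulgarian))
print(tagged_language(spanish, "es"), tagged_language(bulgarian, "bg"))
```

The naive guess files the Spanish headline under English and the Bulgarian one under Russian, because script alone cannot tell them apart. With a tag carried from the source, both keep their real language codes.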
How we handle it
- Per-source language tagging, with a guard that stops Latin-script Spanish being counted as English and Cyrillic-script Bulgarian being counted as Russian.
- Russian + English: fully enriched — urgency score, political lean, topic, audience, clustering.
- Spanish + Portuguese: scored — the urgency engine understands their trigger vocabulary, so a missile-strike or earthquake headline surfaces with the right score.
- 10+ more languages: coverage — French, German, Italian, Swedish, Polish and others are tagged to their country and language for breadth (not yet scored).
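The three tiers above can be sketched as a small lookup on the client side, which is useful for deciding whether a score filter makes sense for a given language. The tier names, the helper functions, and the choice to treat every unlisted language as coverage-only are assumptions for illustration; the API itself may expose this differently or not at all.

```python
from enum import Enum


class Tier(Enum):
    ENRICHED = "enriched"   # urgency score, political lean, topic, audience, clustering
    SCORED = "scored"       # urgency score only
    COVERAGE = "coverage"   # tagged to country and language, not yet scored


# ISO 639-1 codes for the languages named in the list above.
TIER_BY_LANGUAGE = {
    "ru": Tier.ENRICHED,
    "en": Tier.ENRICHED,
    "es": Tier.SCORED,
    "pt": Tier.SCORED,
}


def tier_for(language: str) -> Tier:
    """Return the enrichment tier for a language code.

    Assumption: anything not listed (fr, de, it, sv, pl, ...) is coverage-only.
    """
    return TIER_BY_LANGUAGE.get(language.lower(), Tier.COVERAGE)


def score_filter_is_meaningful(language: str) -> bool:
    """A score filter only makes sense where articles are actually scored."""
    return tier_for(language) is not Tier.COVERAGE


for code in ("ru", "es", "fr"):
    print(code, tier_for(code).value, score_filter_is_meaningful(code))
```

A check like this keeps a client from applying a score threshold to a coverage-only language and reading an empty result as "nothing urgent happened."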
Filtering by language and country
curl -H "X-API-Key: YOUR_KEY" \
  "https://api.newsagentdata.com/v1/feed?language=es&country=mx&min_score=5"
Language and country are separate axes, so you can ask for Spanish-language news from Mexico, as in the request above, or for all coverage of a country regardless of language. Every record carries a language field and a country_tags field.
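Because the two axes are independent, each filter can be treated as optional when building a request. The sketch below uses the endpoint, query parameters, and X-API-Key header from the curl example; the helper names are hypothetical, and the assumptions that parameters may be omitted and that country_tags is a list of country codes are ours, not confirmed API behavior.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.newsagentdata.com/v1/feed"


def build_feed_url(
    language: Optional[str] = None,
    country: Optional[str] = None,
    min_score: Optional[int] = None,
) -> str:
    """Build a feed URL, leaving out any filter that was not given."""
    params = {"language": language, "country": country, "min_score": min_score}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}?{query}" if query else BASE_URL


def build_request(api_key: str, **filters) -> Request:
    """Prepare (but do not send) an authenticated request."""
    return Request(build_feed_url(**filters), headers={"X-API-Key": api_key})


def matches(record: dict, language: Optional[str] = None,
            country: Optional[str] = None) -> bool:
    """Client-side check of one record against the same two axes.

    Assumes record["country_tags"] is a list of country codes.
    """
    if language is not None and record.get("language") != language:
        return False
    if country is not None and country not in record.get("country_tags", []):
        return False
    return True


# Spanish-language news from Mexico, as in the curl example.
print(build_feed_url(language="es", country="mx", min_score=5))
# All coverage of Mexico, regardless of language.
print(build_feed_url(country="mx"))
```

To send the request, pass the result of build_request to urllib.request.urlopen; the response format is not described here, so parsing is left out.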
The honest part
We don't pretend every language is deeply enriched. Russian and English are the moat; Spanish and Portuguese are scored; the rest is coverage. That honesty is the point — your language counts stay clean and your scored filters mean what they say.