About — The Front Page

What is this?

The Front Page is a free, non-commercial news aggregator that displays top headlines from credible news sources across 108 countries. The site is completely static and has no ads, no algorithmic personalization, and no filter bubbles. Google Analytics is used for aggregate traffic measurement.

How are sources selected?

RSS feeds are pulled from multiple open-source repositories and merged into a single deduplicated feed list:

Primary upstream: yavuz/news-feed-list-of-countries — ~900 feeds across 100+ countries, community-maintained with automated feed validation
Supplementary: vandenbroucke/rss-news-list — 50+ feeds from major international outlets with ISO country codes
Fallback database: 127 countries with curated tier 0/1 sources, including AllAfrica country-specific feeds for 15 African nations, and institutional feeds (WHO, EU Commission) listed under their headquarters countries

All sources are merged and deduplicated by URL. Feeds with bot protection are automatically excluded since they cannot be reliably fetched.
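
As a rough illustration of URL-level merging, the sketch below combines several feed lists in priority order and keeps the first entry seen for each URL. The `normalize_url` helper, the dictionary layout, and the normalization rules (lowercased host, no trailing slash, no fragment) are assumptions made for this example and are not taken from the site's build scripts.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase the scheme and host, drop the fragment and any trailing slash.

    Hypothetical normalization; the real build may compare URLs differently.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def merge_feed_lists(*feed_lists):
    """Merge feed lists in priority order, keeping the first entry per URL."""
    seen = set()
    merged = []
    for feeds in feed_lists:
        for feed in feeds:
            key = normalize_url(feed["url"])
            if key in seen:
                continue  # already provided by a higher-priority list
            seen.add(key)
            merged.append(feed)
    return merged
```

Passing the primary list first means its entries win when the same feed also appears in the supplementary or fallback lists.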

Geographic Coverage

The fallback feed database covers 127 countries across every inhabited region, with particular attention to areas historically underrepresented in English-language news aggregators:

Africa (14+ countries): AllAfrica country-specific feeds provide coverage for Algeria, Botswana, Cameroon, Ethiopia, Libya, Morocco, Mozambique, Namibia, Senegal, Sudan, Tanzania, Tunisia, Uganda, Zambia, and Zimbabwe
Central Asia: Kazakhstan, Kyrgyzstan, Mongolia, Tajikistan, Turkmenistan, Uzbekistan
Caribbean & Central America: Jamaica, Cuba, Dominican Republic, Costa Rica, El Salvador, Guatemala, Honduras, Nicaragua, Panama
Institutional sources: WHO (Switzerland) and EU Commission (Belgium) feeds provide international organization coverage

Source Tiering

Each source is assigned to a tier by matching its name against known patterns. Higher-tier sources (those with lower tier numbers) receive a ranking boost:

Tier 0 (Wire Services & Public Broadcasters): Reuters, Associated Press, AFP, Bloomberg, UPI, BBC, NPR, NHK, Deutsche Welle, France 24, RFI, Al Jazeera, Al Arabiya, CGTN, Voice of America, Xinhua, Yonhap, TASS, ANSA, Kyodo, EFE, CBC, PBS, ABC News, DPA, Press Trust of India, Anadolu Agency, Antara, Bernama, IRNA, Lusa, Agerpres, Ukrinform, Interfax, AllAfrica, YLE, NRK, RNZ, and others
Tier 1 (Papers of Record): New York Times, Washington Post, Wall Street Journal, The Guardian, Financial Times, The Economist, Le Monde, Le Figaro, El País, Der Spiegel, FAZ, Corriere della Sera, Times of India, Hindustan Times, The Hindu, Asahi Shimbun, Nikkei, South China Morning Post, Straits Times, Korea Herald, Sydney Morning Herald, Globe and Mail, Arab News, Haaretz, Dawn, Jakarta Post, Bangkok Post, El Tiempo, La Tercera, Kyiv Independent, Moscow Times, Daily Maverick, and others
Tier 2 (Default): All other sources
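
Name-based tiering could be sketched as a simple substring lookup. The pattern tuples below contain only a handful of the outlets listed above, and the `assign_tier` function and its matching rule are assumptions for illustration; the actual pattern list and matching logic may differ.

```python
# Abbreviated, illustrative pattern lists (not the site's full tables).
TIER_PATTERNS = {
    0: ("reuters", "associated press", "bbc", "deutsche welle", "al jazeera"),
    1: ("new york times", "washington post", "the guardian", "le monde"),
}
DEFAULT_TIER = 2


def assign_tier(source_name: str) -> int:
    """Return the lowest-numbered tier whose pattern appears in the name."""
    name = source_name.lower()
    for tier in sorted(TIER_PATTERNS):
        if any(pattern in name for pattern in TIER_PATTERNS[tier]):
            return tier
    return DEFAULT_TIER
```

Substring matching is convenient but imprecise: a short pattern can match an unrelated outlet whose name happens to contain it, which is one reason tiering by name is listed under Limitations.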

Ranking Algorithm

Each headline is scored using a weighted formula:

Factor	Weight	Description
Consensus	35%	Stories covered by multiple sources score higher. Calculated as min(source_count × 3, 10).
Recency	30%	Exponential decay with a 6-hour half-life (0.693 ≈ ln 2). Formula: 10 × e^(-0.693 × hours / 6)
Source Tier	20%	Higher-tier sources get more points: (5 - tier) × 2, so Tier 0 scores 10, Tier 1 scores 8, and Tier 2 scores 6
Importance	15%	Keyword-based scoring (see below)
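
The table can be read as a weighted sum of four component scores. The sketch below implements the three formulas given in the table and combines them with the listed weights. Treating the final score as a plain linear combination, and passing the importance component in as a ready-made number, are assumptions; the function names are hypothetical.

```python
import math

WEIGHTS = {"consensus": 0.35, "recency": 0.30, "tier": 0.20, "importance": 0.15}


def consensus_score(source_count: int) -> float:
    """min(source_count x 3, 10), as given in the table."""
    return min(source_count * 3, 10)


def recency_score(hours_old: float) -> float:
    """Exponential decay with a 6-hour half-life (0.693 is roughly ln 2)."""
    return 10 * math.exp(-0.693 * hours_old / 6)


def tier_score(tier: int) -> float:
    """(5 - tier) x 2, so tier 0 gives 10 and tier 2 gives 6."""
    return (5 - tier) * 2


def headline_score(source_count, hours_old, tier, importance):
    """Weighted sum of the four components (assumed to be linear)."""
    return (
        WEIGHTS["consensus"] * consensus_score(source_count)
        + WEIGHTS["recency"] * recency_score(hours_old)
        + WEIGHTS["tier"] * tier_score(tier)
        + WEIGHTS["importance"] * importance
    )
```

Under these assumptions, a brand-new story from a single Tier 2 source with no keyword adjustment scores 0.35 × 3 + 0.30 × 10 + 0.20 × 6 = 5.25, and the recency component halves every six hours.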

Importance Keywords

Headlines containing certain keywords receive score adjustments:

Boosted (+6 to +10): breaking, urgent, emergency, killed, dead, death, war, attack, explosion, earthquake, president, prime minister, election, parliament, congress, law, recession, inflation, billion, protest, court, arrest

Penalized (-3 to -10): celebrity, kardashian, instagram, viral, meme, horoscope, "you won't believe", shocking

Additional penalties apply for excessive exclamation marks or ALL CAPS headlines.
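
A keyword scorer along these lines might look like the sketch below. The per-keyword values are hypothetical picks from within the stated ranges, the keyword tables are shortened, and the choices to sum all matches, to penalize two or more exclamation marks, and to subtract 3 for each style penalty are assumptions rather than the site's actual rules.

```python
import re

# Hypothetical values chosen from the +6..+10 and -3..-10 ranges.
BOOSTED = {"breaking": 10, "earthquake": 8, "election": 6}
PENALIZED = {"horoscope": -10, "celebrity": -5, "viral": -3}
STYLE_PENALTY = 3  # assumed size of the exclamation and ALL CAPS penalties


def importance_score(headline: str) -> int:
    """Sum keyword adjustments, then apply style penalties."""
    text = headline.lower()
    score = 0
    for keyword, value in {**BOOSTED, **PENALIZED}.items():
        # Whole-word match so "viral" does not fire inside "antiviral".
        if re.search(rf"\b{re.escape(keyword)}\b", text):
            score += value
    if headline.count("!") >= 2:
        score -= STYLE_PENALTY
    letters = [ch for ch in headline if ch.isalpha()]
    if letters and all(ch.isupper() for ch in letters):
        score -= STYLE_PENALTY
    return score
```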

Source Diversity

To keep a single outlet from dominating the page, each repeated appearance of a source multiplies that headline's score by 0.7 (a 30% penalty per occurrence). If a source's headlines would rank #1, #2, and #3, the first keeps its full score, the second is reduced to 70%, and the third to 49%, giving other sources room to compete.
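
The compounding penalty can be sketched as follows. The tuple layout and the `apply_diversity_penalty` name are assumptions for this example; the arithmetic follows the 70% and 49% figures above.

```python
def apply_diversity_penalty(headlines, penalty=0.30):
    """Scale each repeat appearance of a source by (1 - penalty) ** n.

    headlines: list of (source, title, score) tuples.
    Returns a new list re-sorted by adjusted score, highest first.
    """
    appearances = {}
    adjusted = []
    # Walk headlines from best to worst so the top story is never penalized.
    for source, title, score in sorted(headlines, key=lambda h: h[2], reverse=True):
        n = appearances.get(source, 0)
        adjusted.append((source, title, score * (1 - penalty) ** n))
        appearances[source] = n + 1
    return sorted(adjusted, key=lambda h: h[2], reverse=True)
```

With scores of 10, 9, and 8 from one outlet and 7 from another, the adjusted scores become 10, 6.3, and 3.92 for the first outlet, so the second outlet's headline moves up to second place.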

Regional Wire Backfill

Countries with fewer than 3 local headlines are supplemented by regional wire service feeds. There are currently 25 regional wire feeds from 11 services:

BBC News — 7 regional feeds (Africa, Asia, Europe, Latin America, Middle East, North America, Oceania)
France 24 — 5 feeds (Africa, Middle East, Asia-Pacific, Europe, Americas)
Deutsche Welle — 5 feeds (Europe, Asia, Africa, Latin America, Middle East)
Al Jazeera — Middle East, Africa, Asia
AllAfrica — pan-African aggregator (120+ African news organizations)
AP News, Reuters — global wire services
RFI — Africa, Europe
NHK World — Asia, Oceania
Xinhua, CGTN — Asia, Africa

Wire backfill is limited to 1 headline per wire source per country, and duplicates of existing local headlines are skipped.
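
The backfill rules could be expressed roughly as below. This is a sketch under several assumptions: headlines are plain title strings, a "duplicate" is approximated by a case-insensitive exact match (a stand-in for whatever comparison the site uses), and backfill does not stop early once the threshold is reached, since that detail is not stated. The `backfill` function name is hypothetical.

```python
MIN_LOCAL_HEADLINES = 3


def backfill(local_titles, wire_items, min_local=MIN_LOCAL_HEADLINES):
    """Supplement a country's headlines with at most one item per wire source.

    local_titles: list of headline strings already found for the country.
    wire_items: list of (wire_source, title) pairs, best first.
    """
    if len(local_titles) >= min_local:
        return list(local_titles)  # enough local coverage, nothing to add
    result = list(local_titles)
    seen_titles = {title.casefold() for title in local_titles}
    used_sources = set()
    for source, title in wire_items:
        if source in used_sources or title.casefold() in seen_titles:
            continue  # one headline per wire source, and skip duplicates
        result.append(title)
        used_sources.add(source)
        seen_titles.add(title.casefold())
    return result
```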

Deduplication

Similar headlines are detected using sequence matching (Python's difflib.SequenceMatcher). If the similarity ratio between two headlines exceeds 0.55 (55%), only the higher-scoring one is shown. This prevents the same story from appearing multiple times with slightly different wording.
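
A minimal sketch of this step, assuming headlines arrive as (title, score) pairs and that comparison is case-insensitive (both assumptions), could look like this. `SequenceMatcher.ratio()` is the standard-library call that returns a similarity between 0 and 1.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.55


def dedupe_headlines(headlines, threshold=SIMILARITY_THRESHOLD):
    """Keep the higher-scoring headline of any pair above the threshold.

    headlines: list of (title, score) tuples.
    """
    kept = []
    # Process best-scoring headlines first so they win over near-duplicates.
    for title, score in sorted(headlines, key=lambda h: h[1], reverse=True):
        is_duplicate = any(
            SequenceMatcher(None, title.lower(), other.lower()).ratio() > threshold
            for other, _ in kept
        )
        if not is_duplicate:
            kept.append((title, score))
    return kept
```

Each headline is compared against every headline already kept, so the cost grows quadratically with the number of headlines. That is usually acceptable for a few dozen headlines per country.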

Freshness Filtering

Articles are filtered using news-hours instead of wall-clock hours. Weekend hours (Saturday and Sunday in the article's country timezone) are excluded from the age calculation, so a Friday evening article remains fresh through Monday morning.

Articles older than 72 news-hours are dropped. Articles with future publication dates are clamped to the current time to prevent timestamp manipulation.

Example: An article published Friday at 6pm (New York time), checked Monday at 9am — wall-clock age is ~63 hours, but 48 of those hours fall on the weekend. News-hours age: 15 hours, well within the freshness window.
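
The news-hours calculation can be sketched by walking from the publication time to the current time one calendar day at a time and counting only weekday segments. The sketch assumes both timestamps are naive datetimes already converted to the article's country timezone; the function names are hypothetical.

```python
from datetime import datetime, timedelta

MAX_NEWS_HOURS = 72


def news_hours(published: datetime, now: datetime) -> float:
    """Hours elapsed between two local datetimes, skipping Saturday and Sunday."""
    if published > now:
        published = now  # clamp future timestamps to the current time
    total = 0.0
    cursor = published
    while cursor < now:
        next_midnight = (cursor + timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0
        )
        segment_end = min(next_midnight, now)
        if cursor.weekday() < 5:  # Monday is 0, Saturday is 5, Sunday is 6
            total += (segment_end - cursor).total_seconds() / 3600
        cursor = segment_end
    return total


def is_fresh(published: datetime, now: datetime) -> bool:
    """True if the article is at most 72 news-hours old."""
    return news_hours(published, now) <= MAX_NEWS_HOURS
```

For the example above, `news_hours(datetime(2024, 3, 1, 18), datetime(2024, 3, 4, 9))` counts 6 hours on Friday evening and 9 hours on Monday morning, for 15 news-hours in total.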

Translation

Non-English headlines are automatically translated using the deep-translator library, which uses Google Translate's free web API. Language detection is performed by the langdetect library. The original headline text and detected language are preserved and displayed below the translation.

Translation quality varies by language and is provided as a convenience, not a guarantee of accuracy.

Newspaper Sections

Like a traditional broadsheet, The Front Page organizes headlines into topic-based sections: World, Business, Sport, Science & Tech, Arts & Culture, and Health. Each section shows headlines from around the world filtered to that topic, still organized by region and country.

Headlines are classified using two methods:

RSS category tags: Many news feeds include category metadata (e.g., "sports", "business"). These are extracted and mapped to our sections.
Keyword matching: Headline text is matched against per-section keyword lists as a fallback, ensuring every headline is classified.

A headline can appear in multiple sections. Headlines with political or global significance always appear in the World section.
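
The two-step classification might be sketched as follows. The category map and keyword sets are short illustrative samples, and defaulting to World when nothing matches is an assumption about how "every headline is classified". The rule that politically significant headlines always appear in World is not modelled here.

```python
import re

# Illustrative samples only; the real mappings are more extensive.
CATEGORY_MAP = {
    "sports": "Sport",
    "sport": "Sport",
    "business": "Business",
    "economy": "Business",
    "health": "Health",
}
SECTION_KEYWORDS = {
    "Sport": {"match", "tournament", "championship"},
    "Business": {"bank", "market", "economy"},
    "Health": {"hospital", "vaccine", "outbreak"},
}
DEFAULT_SECTION = "World"  # assumed fallback when nothing matches


def classify(title, rss_categories=()):
    """Return the set of sections for a headline: RSS tags first, then keywords."""
    sections = {
        CATEGORY_MAP[tag.lower()]
        for tag in rss_categories
        if tag.lower() in CATEGORY_MAP
    }
    if not sections:
        words = set(re.findall(r"[a-z]+", title.lower()))
        sections = {
            name for name, keywords in SECTION_KEYWORDS.items() if words & keywords
        }
    return sections or {DEFAULT_SECTION}
```

Returning a set reflects the point that one headline can belong to several sections at once.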

Update Frequency

Headlines are refreshed every 2 hours via GitHub Actions. The timestamp at the bottom of the page shows when the current version was generated.

Technical Architecture

Hosting: GitHub Pages (free, static hosting)
Backend: None — the site is entirely static HTML/CSS/JSON
Build System: Python scripts run by GitHub Actions
RSS Parsing: feedparser library
Feed Sources: Multi-upstream merging with URL-level deduplication across primary, supplementary, and fallback feed lists
Async Fetching: aiohttp with max 50 concurrent connections
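
Capping concurrency at 50 can be illustrated with a semaphore from the standard library. This sketch deliberately avoids aiohttp so that it stays self-contained: the `fetch` argument is a stand-in coroutine supplied by the caller, and `fetch_all` is a hypothetical name. The real build presumably passes an aiohttp-based fetcher or relies on aiohttp's own connection limit.

```python
import asyncio

MAX_CONCURRENT = 50


async def fetch_all(urls, fetch, limit=MAX_CONCURRENT):
    """Run fetch(url) for every URL with at most `limit` requests in flight.

    Returns a list of (url, result) pairs; failed fetches yield (url, None).
    """
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception:
                # One broken feed should not abort the whole build.
                return url, None

    return await asyncio.gather(*(bounded(url) for url in urls))
```

A caller would run this with `asyncio.run(fetch_all(feed_urls, my_fetch))`, where `my_fetch` is whatever coroutine actually downloads a feed.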

Privacy

Google Analytics is enabled to measure aggregate traffic (page views, countries, devices) and sets cookies to do so.
No user accounts or personalization
No third-party scripts except Google Fonts and Google Analytics
We do not sell or share data with third parties beyond Google Analytics

Limitations

Some publications block automated access, require subscriptions, or time out under load
Translation quality depends on Google Translate's capabilities for each language
Source tiering is subjective and based on name pattern matching
The importance keyword list reflects editorial judgment about newsworthiness

What This Site Does NOT Do

No personalized recommendations or filter bubbles
No engagement optimization (no "you might also like")
No ads, sponsored content, or affiliate links
No social media integration or share tracking
No comments or user-generated content