How Recipeas chooses recipes

Recipeas crawls over 2.5 million recipes from chefs and food bloggers around the web. Not all of them are equally good. This page explains, in plain language and with live numbers, how we decide which to surface in browse and which to keep behind search.

Recipes accepted into the catalog: 1,873,527
Accept rate: 35% (3,525,382 discarded at ingest)
Recipes scored so far (rolling pass): 220,777
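
As a quick check on the accept rate, the arithmetic can be sketched as below. It assumes that accepted plus discarded recipes make up everything that reached ingest; the page does not state this explicitly, so treat the total as an inference.

```python
# Figures copied from the counters on this page.
accepted = 1_873_527
discarded = 3_525_382

# Assumption: accepted + discarded is the full ingest total (no third bucket).
ingested = accepted + discarded
accept_rate = accepted / ingested

print(f"{ingested:,} ingested, {accept_rate:.1%} accepted")
```

Under that assumption the rate comes out just under 35%, which matches the rounded figure shown above.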

Two questions every recipe has to answer

Before a recipe even enters the catalog, our crawler asks three things: Does it have a real photo? Does it have at least 3 ingredients and 2 instructions? Does the title look like a recipe and not a blog header? About 35% of what we scrape clears that bar; the rest is discarded immediately.
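
The ingest checks above could be sketched roughly as follows. This is an illustration only, not the crawler's actual code: the `ScrapedRecipe` type, the `has_real_photo` flag, and the `BLOG_HEADER` pattern are all hypothetical stand-ins, and the real title check is likely more involved.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ScrapedRecipe:
    # Hypothetical record shape; field names are assumptions.
    title: str
    ingredients: list[str] = field(default_factory=list)
    instructions: list[str] = field(default_factory=list)
    has_real_photo: bool = False  # assumed to be decided by an earlier step


# Hypothetical pattern for titles that look like site chrome, not recipes.
BLOG_HEADER = re.compile(r"^(home|about|blog|archive|category|page \d+)\b", re.IGNORECASE)


def passes_ingest_gate(recipe: ScrapedRecipe) -> bool:
    """Return True only if the recipe clears every ingest check."""
    if not recipe.has_real_photo:
        return False
    # Thresholds taken from the text: at least 3 ingredients and 2 instructions.
    if len(recipe.ingredients) < 3 or len(recipe.instructions) < 2:
        return False
    title = recipe.title.strip()
    if not title or BLOG_HEADER.match(title):
        return False
    return True
```

A recipe that fails any one check is rejected, which mirrors the all-or-nothing bar described above.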

The accepted recipes then get a discovery score from 0 to 100, which decides whether they show up in the browse feed. Recipes with low scores stay searchable — you can always find them by name or ingredient — they just don't lead the browse feed.

How the discovery score works

It's a small, transparent formula. We're not trying to be clever; we're trying to be honest about what makes a recipe worth recommending.

discovery_score = 50 (baseline)
  + up to 25 if the host is a famous American food brand
  + up to 15 for how many ingredients we successfully canonicalized
  + up to 10 for a clean English title
  − up to 25 for photo problems (placeholder, logo, tiny thumb, broken)
  − up to 10 for ingredient lines that didn't parse cleanly
  − up to 10 for a corrupted (mojibake) title

show_in_feed = discovery_score ≥ 55
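
One way to read the formula in code is sketched below. The weights and the threshold of 55 come from the formula itself; everything else is an assumption. In particular, the page does not say how each "up to" component scales, so this sketch takes each signal as a fraction between 0 and 1 and scales it linearly. The `Signals` type and its field names are hypothetical.

```python
from dataclasses import dataclass

FEED_THRESHOLD = 55  # from the formula: show_in_feed = discovery_score >= 55


@dataclass
class Signals:
    """Each field is a fraction in [0, 1]; linear scaling is an assumption."""
    famous_host: float = 0.0
    canonicalized_ingredients: float = 0.0
    clean_english_title: float = 0.0
    photo_problems: float = 0.0
    unparsed_ingredient_lines: float = 0.0
    mojibake_title: float = 0.0


def _clamp(x: float) -> float:
    """Keep a signal inside [0, 1] so no component exceeds its cap."""
    return max(0.0, min(1.0, x))


def discovery_score(s: Signals) -> float:
    return (
        50.0                                           # baseline
        + 25.0 * _clamp(s.famous_host)                 # bonuses
        + 15.0 * _clamp(s.canonicalized_ingredients)
        + 10.0 * _clamp(s.clean_english_title)
        - 25.0 * _clamp(s.photo_problems)              # penalties
        - 10.0 * _clamp(s.unparsed_ingredient_lines)
        - 10.0 * _clamp(s.mojibake_title)
    )


def show_in_feed(s: Signals) -> bool:
    return discovery_score(s) >= FEED_THRESHOLD
```

Read this way, a recipe with no signals at all sits at the baseline of 50, below the threshold, so it needs at least 5 net bonus points to reach the feed. With every bonus and no penalty the score is 100; with every penalty and no bonus it is 5.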

A few specific notes about that formula:

Is it actually working? The data.

The whole point of the score is to put the bad-looking recipes in the hidden bucket and the good-looking ones in the shown bucket. The fastest test is to look at known quality signals — corrupted titles, ALL-CAPS shouting, broken image URLs — and compare the rates.
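
For readers curious how such signals might be detected, here is a rough sketch. These are common heuristics, not necessarily the checks used on this page: the mojibake test tries to reverse a UTF-8-read-as-Windows-1252 mistake, the minimum of four letters for ALL-CAPS is an arbitrary choice, and the `SUSPICIOUS_WORDS` list is hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; the real definition of "suspicious" is not given.
SUSPICIOUS_WORDS = ("placeholder", "logo", "default", "spacer")


def looks_like_mojibake(title: str) -> bool:
    """Heuristic: the text contains U+FFFD, or it can be re-encoded as a
    legacy single-byte codec and decoded as valid UTF-8 into different text."""
    if "\ufffd" in title:
        return True
    for codec in ("cp1252", "latin-1"):
        try:
            repaired = title.encode(codec).decode("utf-8")
        except UnicodeError:
            continue  # not representable or not valid UTF-8: likely genuine text
        if repaired != title:
            return True
    return False


def is_all_caps(title: str) -> bool:
    """True if the title has at least 4 letters and every letter is uppercase."""
    letters = [c for c in title if c.isalpha()]
    return len(letters) >= 4 and all(c.isupper() for c in letters)


def is_suspicious_image_url(url: str) -> bool:
    """True for non-HTTP(S) or host-less URLs, or paths with a flagged word."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return True
    path = parsed.path.lower()
    return any(word in path for word in SUSPICIOUS_WORDS)
```

Genuine accented or Cyrillic titles pass the mojibake check untouched, because they either cannot be encoded in the legacy codec or do not decode as valid UTF-8 afterwards.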

Hidden from browse: 173,191 recipes (still searchable, hidden from feed)
  Corrupted (mojibake) titles: 14.8%
  ALL-CAPS titles: 8.4%
  Suspicious image URLs: 0.7%

Shown in browse: 42,613 recipes (leading the discovery feed)
  Corrupted (mojibake) titles: 0.0%
  ALL-CAPS titles: 1.3%
  Suspicious image URLs: 0.0%

If the score weren't doing anything useful, those percentages would be similar between the two buckets. Today the gap on mojibake is roughly 1,476×, which means the score is steering broken text away from the feed.
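
A mojibake rate displayed as 0.0% in the shown bucket can still produce a finite ratio if the unrounded rate is small but not zero. The back-of-envelope sketch below assumes the reported gap is the ratio of the two unrounded rates; the implied figures are inferred from the rounded numbers on this page, not measured values.

```python
# Figures copied from this page.
hidden_rate_pct = 14.8   # mojibake rate in the hidden bucket (rounded)
reported_gap = 1476      # hidden rate divided by shown rate, as reported
shown_count = 42_613     # recipes currently shown in browse

# Assumption: gap = hidden_rate / shown_rate, so shown_rate = hidden_rate / gap.
implied_shown_rate_pct = hidden_rate_pct / reported_gap
implied_shown_titles = implied_shown_rate_pct / 100 * shown_count

print(f"implied shown-bucket rate: {implied_shown_rate_pct:.4f}%")
print(f"implied mojibake titles in the feed: about {round(implied_shown_titles)}")
```

Under that assumption the shown-bucket rate is about 0.01%, or a handful of recipes out of 42,613, which would display as 0.0% at one decimal place.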

Which hosts lead each bucket

Top hosts in browse
www.allrecipes.com             2,432
www.food.com                   2,229
www.americastestkitchen.com      667
www.povarenok.ru                 493
www.justapinch.com               461
jamiegeller.com                  442
www.greatbritishchefs.com        441
sunset.com                       427
www.bbcgoodfood.com              415
www.delicious.com.au             415

Top hosts in hidden
www.povarenok.ru              12,484
www.cuisinelolo.fr             1,863
eatsmarter.de                  1,174
www.culinar.ro                   865
migusto.migros.ch                728
varecha.pravda.sk                700
pt.petitchef.com                 622
culinariefy.com                  573
www.kochbar.de                   567
www.kotikokki.net                525

Languages in the catalog

We accept recipes from chefs and food bloggers around the world. The app auto-translates titles and ingredient lines into English by default; a one-tap toggle in the recipe view flips back to the original.

Language     Accepted recipes
en                  1,342,867
(unknown)             411,128
ru                    118,381
es                        274
de                        133
id                        103
pt                         73
zh-CN                      72
it                         54
hi-Latn                    45

What we're still working on