
Batch Processing Marathon Photos at Scale — Tag Bib Numbers

A major marathon generates tens of thousands of photos. Processing them one by one is too slow, and generic batch tagging that adds the same keywords to every photo doesn't solve the problem: runners search for themselves by bib, and the first photographer to publish a searchable gallery tends to win the sales for that event. Slow workflows lose ground to faster ones.

On a mass-participation event, the bottleneck is turning thousands of raw frames into individually identified, sellable photos. The more frames you can reliably tag by bib number and the sooner you can deliver, the more of the field you can sell to. Manual tagging at scale is slow and error-prone; the goal is to automate the repetitive identification step so your time goes to editing and delivery.

Understanding the problem

Batch processing at scale means running thousands of photos through one pass that identifies each one individually — reading the bib number on each runner and matching it to a named participant — rather than applying generic keywords to the whole set. The challenge is doing this accurately and consistently across a massive, varied dataset: different course positions, lighting, poses, and bib visibility.

Marathon photo services are a high-volume business, and revenue is driven by match rate: the share of photos that get correctly identified so a runner can find and buy them. Generic 'marathon / finisher / running' tags make every photo unfindable by individual — a runner searching for themselves gets the whole gallery. Individual bib-level identification is the entire point. Delivery speed compounds this: runners check for their photos right after they finish, so the sooner a searchable gallery is live, the more of that immediate demand you capture.

In this sport specifically

Marathons are among the largest-scale events any race photographer covers. Major city marathons routinely field tens of thousands of participants, and a photographer working multiple course positions plus the finish line can come back with a very large number of frames — far more than any manual workflow can identify one by one. The business model effectively requires automation. Marathon photos are also time-sensitive: same-day delivery of top finishers, then the rest of the field as fast as you can turn it around, is a real competitive differentiator.


Traditional approaches, and why they fall short

Manual tagging with a team working long overnight shifts

Time: Many hours per event; scales linearly with the number of photos and shrinks only by adding people. Accuracy: Good on clean, well-lit photos; quality drops as fatigue sets in over a long shift.

Doesn't scale economically. Trained overnight taggers are hard to staff, and fatigue errors tend to land on exactly the hard photos — folded or partial bibs — that matter most for customer satisfaction.

Basic keyword tagging for all photos, with no individual identification

Time: Quick to set up, then automatic, but the output is not individually usable. Accuracy: No individual identification; every photo gets the same generic tags.

Photos can't be found by individual runner. Every runner searching for their photo finds the whole set. Individual identification is the entire business model of race photo services, so this approach defeats the purpose.

Manually tagging only the top finishers for same-day delivery and deferring the rest of the field to later in the week

Time: A manageable amount of work for the top finishers; a much larger job for the full field afterward. Accuracy: Higher on top finishers (easier lighting, less occlusion), lower across the full field.

A two-tier service means recreational and age-group runners wait longer and feel like an afterthought, and the bulk of the field's revenue gets delayed or lost because those photos take too long to process by hand.

How RaceTagger handles it

RaceTagger processes a folder of photos as a batch. For each photo it detects and reads the bib number, then matches it against the start-list CSV you upload (bib, name, age group, and any other columns you include), so each photo is tied to a named participant rather than a generic keyword. It reads both JPEG and RAW files (via the embedded preview), and writes the result straight into each photo's metadata (EXIF/XMP/IPTC), so your tags travel with the files into Lightroom, Photo Mechanic, Capture One, or your gallery and delivery tools. RaceTagger can also organize the tagged files into folders by number, name, or category.
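To make the matching step concrete, here is a minimal Python sketch of a start-list lookup. It is not RaceTagger's implementation: the column names (`bib`, `name`, `age_group`), the sample rows, and the helper names `load_start_list` and `match_bib` are all assumptions chosen for illustration.

```python
import csv
import io

def load_start_list(csv_text):
    """Index start-list rows by bib number (kept as a string so leading zeros survive)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["bib"].strip(): row for row in reader}

def match_bib(start_list, detected_bib):
    """Return the participant row for a detected bib, or None if it is not on the list."""
    return start_list.get(str(detected_bib).strip())

# Hypothetical start list; a real one would be read from a file on disk.
SAMPLE_CSV = """bib,name,age_group
101,Ana Ruiz,F30-34
0102,Ben Okafor,M40-44
"""

start_list = load_start_list(SAMPLE_CSV)
runner = match_bib(start_list, "101")
```

Keeping bibs as strings rather than integers is a deliberate choice in this sketch: it avoids silently turning a bib such as `0102` into `102`. A detected number that is not on the start list returns `None`, which is a natural candidate for manual review.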

Key advantage

It automates the repetitive identification step — reading the bib and matching it to a named runner — across a whole batch, instead of one photo at a time. When a read is uncertain, it flags that photo for review rather than guessing, so you can trust the confident matches and spend your review time only on the ones it wasn't sure about.

Good conditions: Clean, well-lit bibs in an upright pose (typical at the finish line) read reliably.
Challenging: Mid-course frames with varied lighting and partial bib occlusion are harder; uncertain reads are flagged for review instead of being silently mis-tagged.
Worst case: Rain, heavy occlusion, and motion blur are the hardest cases. RaceTagger flags low-confidence reads for review rather than guessing, so the harder a photo is, the more likely it lands in your review queue rather than the wrong runner's gallery.
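The flag-instead-of-guess behavior can be pictured as a simple threshold split. The sketch below is only an illustration: the text does not say that RaceTagger exposes a numeric confidence score, so the `confidence` field, the `BibRead` record, and the `REVIEW_THRESHOLD` value are assumptions.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.85  # assumed cutoff, purely illustrative

@dataclass
class BibRead:
    photo: str
    bib: str
    confidence: float  # assumed score between 0.0 and 1.0

def route_reads(reads, threshold=REVIEW_THRESHOLD):
    """Split reads into confident matches and a review queue."""
    confident, review = [], []
    for read in reads:
        (confident if read.confidence >= threshold else review).append(read)
    return confident, review

# Hypothetical reads: a clean finish-line frame and a blurred mid-course frame.
reads = [
    BibRead("finish_0001.jpg", "101", 0.98),
    BibRead("km30_0412.jpg", "102", 0.41),
]
confident, review = route_reads(reads)
```

The point of the split is that nothing below the cutoff is published automatically, so a wrong read costs review time rather than putting a photo in the wrong runner's gallery.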

Import your start list once (include bib, name, age group, and anything else you want carried through). Point RaceTagger at a folder and run the batch; process each photographer's batch as their folder lands rather than waiting for the whole event. Tagged metadata is written into the files for import into your editor or gallery tool, and RaceTagger can organize the photos into folders by number, name, or category. Review the flagged, low-confidence photos, fix the few that need it, and deliver — top finishers first, then the rest of the field.
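For readers who want to picture the folder-organization step, the following Python sketch copies photos into one folder per bib number. It only illustrates the idea and is not how RaceTagger does it; the `matches` input shape and the `organize_by_bib` helper are hypothetical.

```python
import shutil
from pathlib import Path

def organize_by_bib(matches, dest_root):
    """Copy each photo into dest_root/<bib>/, creating folders as needed.

    `matches` maps a photo path to its bib number (hypothetical input shape).
    Returns the list of copied file paths.
    """
    dest_root = Path(dest_root)
    copied = []
    for photo, bib in matches.items():
        target_dir = dest_root / str(bib)
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 preserves file timestamps and returns the new file's path
        copied.append(Path(shutil.copy2(photo, target_dir)))
    return copied
```

Copying rather than moving keeps the original shoot folder intact, which is the safer default while you are still reviewing flagged frames.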

Manual vs OCR vs AI vision

| Metric | Manual | Basic OCR | RaceTagger |
| --- | --- | --- | --- |
| Identification approach | Person-by-person tagging: accurate when fresh, but slow and fatigue-prone at scale | Generic keywords on the whole set, with no individual identification | Reads each bib and matches it to a named participant from your start list, in one batch pass |
| Match rate at scale | High on clean photos, declining as the team tires | No individual match rate; photos can't be found by runner | Strong on clean bibs; harder frames are flagged for review rather than mis-tagged |
| Cost model | Labor cost that scales with the size of the team and the length of the shift | Cheap to run but produces output you can't sell individually | Credits per photo analyzed (1 credit = 1 photo), with the repetitive work automated |
| Output and editor integration | Needs separate consolidation and formatting after tagging | Generic tags only; requires complete re-tagging to be usable | Writes bib/name metadata into each file (EXIF/XMP/IPTC) and can organize into folders by number, name, or category |
| Handling hard photos | Hard photos hit at the fatigue point and get the most errors | Can't read occluded or partial bibs at all | Flags low-confidence reads for review instead of guessing, so hard photos surface for a human |
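Because the cost model is one credit per photo analyzed, a rough credit estimate for a batch is just a file count. The Python sketch below shows one way to get that number before running a folder; the extension list and the `credits_needed` helper are assumptions, and whether a RAW+JPEG pair counts once or twice is not stated here, so the sketch simply counts every matching file.

```python
from pathlib import Path

# Assumed set of photo extensions; adjust to the cameras used at the event.
PHOTO_EXTENSIONS = {".jpg", ".jpeg", ".cr2", ".cr3", ".nef", ".arw"}

def credits_needed(folder):
    """Count photo files under `folder` (recursively).

    At 1 credit per photo, the count is the credit estimate for the batch.
    """
    return sum(
        1
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in PHOTO_EXTENSIONS
    )
```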

Practical tips

  1. Process each photographer's batch as their folder lands, instead of waiting for the whole event's files to be present.

    On a multi-photographer event, files arrive in waves. Running each folder as a batch when it's ready means the first uploads are already tagged and reviewed while later ones are still coming in, which trims total delivery time.

  2. Build a start-list CSV that includes name and age group alongside the bib, so matched photos carry that information through.

    The richer your start list, the more useful the tagged output: matching against name and age group lets you segment and organize galleries by category instead of handing every runner one giant undifferentiated set to scroll through.

  3. Tag which course position each batch was shot at so you can keep finish-line, mid-course, and start sets distinct.

    Keeping batches labeled by position helps you organize a runner's frames from different points of the course and makes review and delivery more orderly, especially when the same bib appears across several locations.

  4. Trust the confident matches and spend your review time on the photos RaceTagger flags as low-confidence.

    Because uncertain reads are flagged rather than guessed, your manual review can focus on just the flagged frames — the folded bibs, partial numbers, and hard lighting — instead of re-checking the whole batch.

  5. Budget extra review time for rain races and difficult conditions, and remember those photos are often premium content.

    Wet, dramatic conditions reduce read confidence, so expect more flagged frames and plan a bigger review window. Runners who finish in the rain are proud and want their photos, and those frames often sell well, so the extra review time is worth it.
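As an illustration of the course-position tip, the sketch below groups tagged photos first by bib and then by where they were shot. The record shape `(photo, bib, position)`, the position labels, and the `group_by_runner` helper are assumptions made for the example, not part of RaceTagger.

```python
from collections import defaultdict

def group_by_runner(tagged_photos):
    """Group (photo, bib, position) records into {bib: {position: [photos]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for photo, bib, position in tagged_photos:
        grouped[bib][position].append(photo)
    # Convert the nested defaultdicts to plain dicts for predictable lookups.
    return {bib: dict(positions) for bib, positions in grouped.items()}

# Hypothetical tagged frames from three course positions.
tagged = [
    ("start_0007.jpg", "101", "start"),
    ("km21_0150.jpg", "101", "mid-course"),
    ("finish_0001.jpg", "101", "finish"),
    ("finish_0002.jpg", "205", "finish"),
]
galleries = group_by_runner(tagged)
```

With this shape, `galleries["101"]` holds one runner's frames split by position, which makes it easy to spot a bib that appears at the finish but never mid-course.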

Tag your next marathon batch by bib number — and deliver sooner

Start with your free monthly credits (1 credit = 1 photo). Point RaceTagger at a folder from your last event and see how it tags bib numbers, matches them to your start list, and flags the hard ones for review.


Questions photographers ask

Can RaceTagger handle a very large batch of marathon photos?

Yes — RaceTagger is built for batch processing, so you point it at a folder and it works through the set rather than tagging one photo at a time. On a mass-participation event you'll typically split the work into folders (for example, per photographer or per course position) and run each as a batch. Photos it isn't confident about are flagged for review so you can deliver the confident matches quickly and check the rest.

What if photos arrive from multiple photographers over several hours? Do I have to wait for everyone before processing?

No. Process each photographer's folder as a batch when it lands, instead of waiting for the entire event. That way the first uploads are tagged and reviewed while later ones are still coming in, which shortens overall delivery time.
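One way to keep track of which folders still need a run is a small bookkeeping helper like the Python sketch below. It assumes one subfolder per photographer under a shared incoming directory, which is a hypothetical layout, and it leaves out launching RaceTagger itself because no command-line interface or API is described here.

```python
from pathlib import Path

def new_batches(incoming_root, already_processed):
    """Return subfolders of incoming_root that have not been processed yet, in name order.

    `already_processed` is a set of folder names that were already run.
    """
    return sorted(
        p
        for p in Path(incoming_root).iterdir()
        if p.is_dir() and p.name not in already_processed
    )
```

This sketch does not check whether a folder is still uploading, so in practice you would only run a batch once the photographer's transfer has finished.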

How does RaceTagger turn bib numbers into named galleries?

You upload your start-list CSV — bib, name, age group, and any other columns you include — and RaceTagger matches each detected bib number to that participant, so the photo is tied to a named runner rather than just a number. It can then organize the tagged files into folders by number, name, or category for delivery.

What happens to photos where the bib is hard to read?

When a read is uncertain — a folded or partial bib, heavy occlusion, motion blur, rain — RaceTagger flags that photo for review instead of guessing a number. This keeps the confident matches trustworthy and concentrates your manual review on just the photos that actually need a human eye.

Does RaceTagger work with RAW files and my existing editor?

Yes. It reads both JPEG and RAW files (using the embedded preview) and writes the bib/name tags into each photo's metadata (EXIF/XMP/IPTC), so the tags travel with the files into Lightroom, Photo Mechanic, Capture One, or your gallery and delivery tools. It sits between the shoot and your editor — it doesn't replace your catalog or culling tools.
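If you want to spot-check that tags really travel with a file, one option is to read the keyword fields back with the ExifTool command-line utility. The Python sketch below assumes ExifTool is installed and on your PATH, and it assumes the tags of interest live in the IPTC `Keywords` and XMP `Subject` fields; which specific fields RaceTagger writes is not stated here, so treat the field choice as a guess to adapt.

```python
import json
import subprocess

def read_keywords(photo_path):
    """Ask exiftool for keyword fields as JSON (requires exiftool on PATH)."""
    result = subprocess.run(
        ["exiftool", "-json", "-IPTC:Keywords", "-XMP-dc:Subject", str(photo_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_keywords(result.stdout)

def parse_keywords(exiftool_json):
    """Collect keyword values from exiftool's JSON output into a list of strings."""
    record = json.loads(exiftool_json)[0]
    keywords = []
    for field in ("Keywords", "Subject"):  # assumed fields, see note above
        value = record.get(field, [])
        # exiftool emits a bare value when there is one entry and a list otherwise
        values = value if isinstance(value, list) else [value]
        keywords.extend(str(v) for v in values)
    return keywords
```

The `str()` conversion matters for bib numbers: a keyword that looks numeric can come back from the JSON output as a number rather than a string.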

Does RaceTagger run offline?

No. Number recognition runs in the cloud, so RaceTagger needs an internet connection to analyze your photos. Plan for connectivity when you're processing a batch on event day.
