Metadata Tagging in Marathon Photography — How AI Solves It

After a marathon, you come back with thousands of photos, and each one needs metadata: the runner's name, bib number, race name, date, location. Metadata is how participants find their own photos in your gallery — and how delivery platforms like SmugMug and PhotoShelter make those photos searchable. Without per-photo metadata, your work is effectively invisible to the people who'd buy it.

Untagged photos don't sell. Metadata is the link between your work and the runner's wallet: a participant searches for their name or bib and either finds their photos or doesn't. Generic, whole-folder keywords leave every photo unfindable by individual, so the goal is per-photo identification that turns a folder of frames into a searchable, sellable archive.

Understanding the problem

Metadata tagging is the process of writing standardized fields — title, keywords, description, photographer, copyright — into each photo file. In marathon photography the per-photo identifiers that matter are the runner's name, bib number, race name, date, and location, plus any category (age group, wave) you have in your start list. The hard part isn't the format; it's getting the right runner's identity onto each of thousands of individual frames.

Marathon photography is largely a business-to-consumer model: runners buy their own photos, directly from you or through a delivery platform. Galleries are searchable by metadata, so a runner who searches their name or bib expects to land on their own photos immediately. If every photo carries only generic 'Berlin Marathon / running' keywords, that search returns the whole gallery — or nothing useful — and the sale is lost. Per-photo, per-participant metadata is what makes the archive work as a storefront.

In this sport specifically

A big-city marathon fields tens of thousands of runners, and a photographer working several course positions plus the finish line comes back with far more frames than any manual workflow can identify one at a time. Several wrinkles are specific to the discipline: relay teams can share a bib across legs; common names recur many times across a large field; and bibs get folded, angled, or partly hidden behind arms and other runners. Metadata has to be specific enough that each runner finds only their own photos, which means reliably reading each bib and matching it to the right entry in your start list.

Traditional approaches, and why they fall short

Manual metadata tagging in Lightroom or Capture One, one photo at a time

Slow and linear: it scales only with how many hours and people you throw at it, which is why it breaks down at marathon volume. Careful manual entry is accurate while the tagger is fresh, but quality drops as fatigue sets in over a long session.

Doesn't scale. Labor is the bottleneck, and at an event that produces tens of thousands of photos, hand-tagging every frame can't hit a same-day or next-day delivery window.

Batch-applying the same generic keywords to the whole folder

Fast to set up and effectively instant to run, but the output isn't individually usable. There is no individual identification: every photo gets identical generic keywords.

Runners can't find their specific photos, so the gallery doesn't function as a storefront. It forces a manual re-tagging pass afterward, which defeats the purpose.

Outsourcing identification and tagging to a third-party metadata service

Adds a turnaround delay while files go out and come back, which pushes delivery later. The result depends on the service and, critically, on the quality of the underlying bib reads it's working from.

Costs add up per event and the turnaround delays delivery to runners. The output is only as good as the bib detection feeding it — garbage in, garbage out.

How RaceTagger handles it

RaceTagger processes a folder of photos as a batch. For each photo it detects and reads the bib number, then matches it against the start-list CSV you upload — bib, name, age group, and any other columns you include — so the photo is tied to a named participant instead of a generic keyword. It reads both JPEG and RAW files (via the embedded preview) and writes the result straight into each photo's metadata (EXIF/XMP/IPTC), so the tags travel with the files into Lightroom, Photo Mechanic, Capture One, or your gallery and delivery tools. When a read is uncertain, it flags that photo for review rather than guessing a number.
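The matching step described above can be pictured as a keyed lookup. What follows is a minimal sketch, not RaceTagger's implementation: the column names (`bib`, `name`, `age_group`), the sample rows, and both helper functions are assumptions made for illustration.

```python
import csv
import io

# Hypothetical start list; the column names are assumptions, not a required schema.
START_LIST_CSV = """bib,name,age_group
101,Anna Schmidt,F30
102,Luis Ortega,M45
"""

def load_start_list(csv_text):
    """Index start-list rows by bib, kept as a string so leading zeros survive."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["bib"].strip(): row for row in reader}

def match_bib(detected_bib, start_list):
    """Return the participant row for a detected bib, or None if it is not listed."""
    return start_list.get(detected_bib.strip())

start_list = load_start_list(START_LIST_CSV)
runner = match_bib("102", start_list)
print(runner["name"], runner["age_group"])  # Luis Ortega M45
```

Keeping the bib as a string rather than an integer is a deliberate choice in this sketch: bibs are identifiers, not quantities, and some events print leading zeros or letter prefixes.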

Key advantage

It automates the repetitive identification step (reading the bib and matching it to a named runner) across a whole batch, instead of one photo at a time. You can trust the confident matches, and uncertain ones are surfaced for review, so your manual time goes only to the frames that actually need a human eye, not the whole set.

Good conditions
Clean, well-lit, upright bibs (typical at the finish line) read reliably.
Challenging
Folded, angled, or partly occluded bibs and mixed lighting are harder; uncertain reads are flagged for review instead of being silently mis-tagged.
Worst case
Heavy occlusion, motion blur, and bibs hidden behind other runners are the hardest cases — RaceTagger flags low-confidence reads for review rather than guessing, so the harder a photo is, the more likely it lands in your review queue rather than the wrong runner's gallery.
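The flag-instead-of-guess behavior across these three conditions amounts to a confidence cut-off. The sketch below is illustrative only: the `BibRead` structure, the 0.0 to 1.0 confidence scale, and the 0.80 threshold are assumptions, since the guide does not say how RaceTagger scores a read.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.80  # assumed cut-off, chosen only for this example

@dataclass
class BibRead:
    photo: str
    bib: str
    confidence: float  # assumed scale: 0.0 (no idea) to 1.0 (certain)

def triage(reads, threshold=REVIEW_THRESHOLD):
    """Split reads into accepted matches and a review queue; nothing is guessed."""
    accepted, review = [], []
    for read in reads:
        (accepted if read.confidence >= threshold else review).append(read)
    return accepted, review

reads = [
    BibRead("finish_0001.jpg", "101", 0.97),  # clean, upright bib
    BibRead("km30_0412.jpg", "102", 0.41),    # folded or partly hidden bib
    BibRead("km35_0090.jpg", "205", 0.80),    # exactly at the threshold
]
accepted, review = triage(reads)
print([r.photo for r in review])  # ['km30_0412.jpg']
```

Raising the threshold sends more photos to review and fewer to the wrong gallery; lowering it does the opposite.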

After shooting and culling, import your start-list CSV once (include bib, name, age group, and anything else you want carried through), then point RaceTagger at a folder and run the batch. For each detected bib it looks up the participant and writes name and bib into the photo's metadata. Photos it isn't confident about are flagged for review — you confirm the read, or leave a genuinely unreadable bib untagged. Export writes the tags into the files (EXIF/XMP/IPTC) so you can upload to SmugMug, PhotoShelter, or your delivery tool of choice, where runners can search by name or bib and find their own photos.
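To make the export step concrete, here is one way per-photo keywords could be written into a file outside RaceTagger, using the ExifTool command line as a stand-in. This is an assumption for illustration, not a description of how RaceTagger writes metadata. The sketch only builds the command; it does not run it, and the file name and runner are invented.

```python
import shlex

def exiftool_args(photo_path, bib, name, race):
    """Build an ExifTool command that adds per-runner keywords to one photo."""
    keywords = [name, bib, race]
    # -overwrite_original edits the file in place and keeps no backup copy.
    args = ["exiftool", "-overwrite_original"]
    for keyword in keywords:
        # "+=" appends to a list-type tag instead of replacing existing entries.
        args.append(f"-IPTC:Keywords+={keyword}")
        args.append(f"-XMP-dc:Subject+={keyword}")
    args.append(photo_path)
    return args

cmd = exiftool_args("finish_0001.jpg", "102", "Luis Ortega", "Berlin Marathon")
print(shlex.join(cmd))
```

Writing the same keywords to both the IPTC and XMP fields is a common precaution, because delivery platforms differ in which one they index. Checking a handful of files on your own platform settles which one it reads.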

Manual vs OCR vs AI vision

Metadata tagging approach at marathon scale
Manual: One photo at a time; accurate when fresh, but slow and fatigue-prone across tens of thousands of frames.
Basic OCR: Generic keywords on the whole folder, with no per-runner identity.
RaceTagger: Reads each bib and matches it to a named participant from your start list, in one batch pass.

Cost model
Manual: Labor cost that scales with team size and shift length.
Basic OCR: Cheap to run, but produces output you can't sell individually.
RaceTagger: Credits per photo analyzed (1 credit = 1 photo), with the repetitive identification automated.

Per-runner findability
Manual: Accurate identity, but only on the photos you have time to reach.
Basic OCR: None; every photo carries the same generic keywords.
RaceTagger: Per-photo name and bib written into the file, so runners can search and find their own photos.

Delivery-platform compatibility (SmugMug, PhotoShelter, Zenfolio)
Manual: Requires manual per-photo entry to make galleries searchable.
Basic OCR: Generic tags only, not searchable by individual.
RaceTagger: Writes standard EXIF/XMP/IPTC metadata into each file that delivery platforms read.

Handling duplicate participant names
Manual: Manual disambiguation required, photo by photo.
Basic OCR: No matching available.
RaceTagger: Anchors identity on the detected bib number, which is unique even when names repeat.

Practical tips

  1. Build a start-list CSV with name and age group alongside the bib, so matched photos carry that information through.

    The bib is what RaceTagger reads off the photo; everything else in the matched metadata comes from your CSV. The richer the start list, the more useful the tagged output: you can segment and organize galleries by category instead of handing every runner one undifferentiated set.

  2. Run a small test batch and check how the metadata lands in your delivery platform before processing the whole event.

    Export a handful of tagged photos and upload them to SmugMug, PhotoShelter, or whichever platform you use, then confirm names and bibs show up the way you expect and are searchable. Sorting out any formatting on a test set is far cheaper than discovering it after the full event is processed.

  3. For relay teams or group runners who share a bib, put a leg or group identifier in your CSV.

    Reading the shared bib alone can't tell legs apart. If that structure exists as columns in your start list, the matched metadata can carry it, so relay athletes aren't all collapsed into one undifferentiated set.

  4. Anchor identity on the bib, not the name, when your field has lots of common names.

    Names repeat across a large marathon field; the bib is the unique key. Because RaceTagger matches on the detected bib number, the per-photo identity stays unambiguous even when many runners share a name.

  5. Trust the confident matches and spend your review time on the photos RaceTagger flags as low-confidence.

    Because uncertain reads are flagged rather than guessed, your manual review can focus on just the flagged frames (folded bibs, partial numbers, hard lighting) instead of re-checking the entire batch.
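The relay and common-name tips both come down to which column you key on. The sketch below uses a made-up start list with a hypothetical `leg` column to show that a name can match several runners while an individual bib matches one, and that a shared relay bib returns every leg listed for it. It is a sketch under those assumptions, not RaceTagger's matching logic.

```python
import csv
import io
from collections import defaultdict

# Hypothetical start list: bib 2001 is a relay team, and two runners share a name.
START_LIST_CSV = """bib,name,leg
2001,Maria Keller,1
2001,Jonas Weber,2
3050,Anna Schmidt,
3051,Anna Schmidt,
"""

def index_rows(csv_text, key):
    """Group start-list rows by the given column."""
    index = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        index[row[key]].append(row)
    return index

by_bib = index_rows(START_LIST_CSV, "bib")
by_name = index_rows(START_LIST_CSV, "name")

# A name can be ambiguous; an individual runner's bib is not.
print(len(by_name["Anna Schmidt"]))  # 2
print(len(by_bib["3050"]))           # 1

# A shared relay bib returns every leg, so the metadata can carry all of them.
relay_legs = [(row["name"], row["leg"]) for row in by_bib["2001"]]
print(relay_legs)
```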

Turn a marathon folder into a searchable, sellable archive

Start with your free monthly credits (1 credit = 1 photo). Point RaceTagger at a folder from your last event and see how it reads bib numbers, matches them to your start list, writes per-photo metadata, and flags the hard ones for review.

Questions photographers ask

Can RaceTagger generate metadata for a bib that isn't in my CSV (last-minute registrants)?

If a bib is detected but not present in your start list, RaceTagger can still record the bib number it read, but it has no name or category to attach because those come from the CSV. You can add the missing entries to the start list and re-process the affected photos.
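As a rough sketch of that add-and-re-process loop: the function, the data shapes, and the sample entries below are all hypothetical, and only illustrate keeping the bib on an unmatched photo and then re-running just the affected files once the start list is updated.

```python
def tag_photos(detections, start_list):
    """Tag each (photo, bib) pair; photos with an unlisted bib keep only the bib."""
    tagged = {}
    unmatched = []
    for photo, bib in detections:
        runner = start_list.get(bib)
        if runner is None:
            tagged[photo] = {"bib": bib}  # no name or category to attach yet
            unmatched.append((photo, bib))
        else:
            tagged[photo] = {"bib": bib, **runner}
    return tagged, unmatched

start_list = {"101": {"name": "Anna Schmidt", "age_group": "F30"}}
detections = [("finish_0001.jpg", "101"), ("finish_0002.jpg", "9001")]

tagged, unmatched = tag_photos(detections, start_list)

# Add the late registrant, then re-process only the affected photos.
start_list["9001"] = {"name": "Luis Ortega", "age_group": "M45"}
retagged, still_unmatched = tag_photos(unmatched, start_list)
tagged.update(retagged)
print(tagged["finish_0002.jpg"]["name"])  # Luis Ortega
```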

What if a photo has several runners in frame? Does RaceTagger tag all of them?

RaceTagger detects the bib numbers it can read in the photo and matches each to your start list, so a finish-line frame with multiple visible bibs can be tied to multiple runners. Bibs that are hidden or unreadable in that frame are flagged rather than guessed.
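A multi-runner frame can be modeled as a list of reads for a single photo. In this illustrative sketch, `None` stands for a bib that was seen but could not be read; the function name and data shapes are assumptions, not RaceTagger's API.

```python
def tags_for_frame(bib_reads, start_list):
    """Return the names to attach to one photo and whether it needs review.

    bib_reads holds one entry per bib seen in the frame; None marks an
    unreadable bib. A bib missing from the start list also triggers review.
    """
    names = []
    needs_review = False
    for bib in bib_reads:
        runner = start_list.get(bib) if bib is not None else None
        if runner is None:
            needs_review = True  # flag it rather than guess an identity
        else:
            names.append(runner["name"])
    return names, needs_review

start_list = {
    "101": {"name": "Anna Schmidt"},
    "102": {"name": "Luis Ortega"},
}
names, needs_review = tags_for_frame(["101", "102", None], start_list)
print(names, needs_review)  # ['Anna Schmidt', 'Luis Ortega'] True
```

The photo still gets the two names that were read with confidence; the review flag only marks that a third runner in the frame is unidentified.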

How does RaceTagger turn bib numbers into named, searchable galleries?

You upload your start-list CSV — bib, name, age group, and any other columns you include — and RaceTagger matches each detected bib to that participant, writing the name and bib into the photo's metadata. When you upload to a delivery platform, runners can search by name or bib and find their own photos.

What happens to photos where the bib isn't readable?

When a read is uncertain — a folded or partial bib, heavy occlusion, motion blur — RaceTagger flags that photo for review instead of guessing a number. During review you can confirm the bib yourself, or leave a genuinely unreadable photo untagged so it never lands in the wrong runner's gallery.
