What file formats does RaceTagger support?

RaceTagger supports RAW files from major camera manufacturers (Canon, Nikon, Sony) as well as JPEG formats. The desktop application processes both formats simultaneously for complete workflow integration.

How does the CSV integration work?

Import official race starting lists in CSV format to enable automatic driver identification. When race numbers are detected, RaceTagger matches them with driver names from your CSV data for accurate photo tagging.
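As a rough illustration of the lookup described above, here is a minimal sketch that loads a starting list and attaches names to detected numbers. The column names (`number`, `name`, `team`) and the helper functions are assumptions made for this example, not RaceTagger's actual CSV schema or API.

```python
import csv
import io

# Hypothetical starting-list layout; real column names depend on the organizer's export.
SAMPLE_CSV = """number,name,team
12,Anna Rossi,Team Alpha
27,Marc Dubois,Team Beta
105,Jon Smith,Team Alpha
"""

def load_start_list(csv_text):
    """Return a dict mapping race number (as a string) to its CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["number"].strip(): row for row in reader}

def tag_photo(detected_numbers, start_list):
    """Attach a name to each detected number; unknown numbers stay unnamed."""
    tags = []
    for number in detected_numbers:
        key = str(number).strip()
        entry = start_list.get(key)
        tags.append({"number": key, "name": entry["name"] if entry else None})
    return tags

start_list = load_start_list(SAMPLE_CSV)
print(tag_photo(["27", "999"], start_list))
```

Keeping numbers as strings avoids losing leading zeros, which some organizers use in their exports.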

What's included with early access?

Early access includes 500 free tokens on signup, 100 free analyses every month, priority support, and a direct feedback channel.

How long does photo processing take?

Average processing time is approximately 4 seconds per image, though this varies based on image complexity and race car visibility. Batch processing allows efficient handling of entire race photo folders.

Bib Detection in Cycling Photography — Back Bibs, Saddle Bags, and Peloton Density

Author: RaceTagger

Understanding the Problem

Cycling race numbers (bibs) are printed cloth sewn to the back of the rider's jersey, not the front. The number is typically 4-5 digits, positioned horizontally across the lower back. The challenge is that the number is almost never fully visible: saddle bags, water bottle cages, and jersey fit cover portions of it. In peloton photos, riders are 30-50 deep, and numbers at distance become illegible pixels.

Cycling has two distinct photography markets: professional Grand Tour media (Tour de France, Giro d'Italia, Vuelta) where photo tagging is essential for media delivery, and sportive/gran fondo events where participants are the customers and expect to find their photos quickly. Untagged photos mean lost sales and angry customers. Professional cyclists and teams also need photos of themselves identified correctly for social media.

In cycling specifically:

Cycling is unique among sports in that the primary race number is on the BACK, not the front. This is by design — TV cameras film from motorcycles following the peloton, shooting into the riders' backs. However, it creates a massive challenge for still photographers: photos from the front, side, or from ahead of the peloton cannot see the number at all. Additionally, cycling numbers appear in multiple places: back bib (primary), sometimes on the bike frame (frame number, smaller), and on leg bands or helmets (alternate ID, used in some tours). Photographers need to identify numbers from multiple sources.

Common Scenarios

Peloton pass: 40 riders in frame, camera angle from the side, bibs visible but only 15-20 pixels high at distance

very common

✗ Individual bibs are illegible without zooming. OCR can't read text that's only 15-20 pixels high. The photo shows the peloton but doesn't identify a single rider.

Rider with a large saddle bag covering the middle digits of the back bib (5 digits, but 2-3 are hidden)

very common

✗ With only 2-3 digits visible, OCR fails or returns a wrong number. Manual tagging requires knowing if it's rider '12345' or '12945' — the starting list can have both.

Frame number shot — camera focused on the bike instead of the rider, catching the small frame number on the down tube (20-30 pixels high)

common

✗ OCR struggles with frame numbers because they're typically smaller, lower contrast, and variable in placement. Traditional OCR accuracy on frame numbers is under 40%.

Breakaway photo with 5-6 riders strung out — some angles show back bibs clearly, others show bike frames, one rider has the number partially hidden under a rain jacket

occasional

✗ The photo shows multiple identification points (back bibs, frame numbers, rider positions), but OCR has to be run separately for each type. Integration becomes a manual matching puzzle.

Traditional Approaches (And Why They Fall Short)

Manual identification by cycling reference data — using race position, team jersey color, and rider appearance

Time: 3-5 minutes per high-quality peloton shot when bibs are unreadable
Accuracy: 75-85% (relies on recognizing individual cyclists by face and jersey, plus knowing race positions from the moment in the event timeline)

⚠ Only works for professional races where the photographer is deeply familiar with the field. Impossible for sportives, where you don't know which riders are in the photo. Labor-intensive and doesn't scale.

Basic OCR on visible text only (back bibs when readable, frame numbers separately)

Time: 2-3 minutes per 100 photos for manual verification and consolidation of multiple number sources
Accuracy: 55-68% on back bibs (occlusion by bags/jersey is constant), 25-40% on frame numbers

⚠ Peloton density makes OCR worthless for group shots. Frame number OCR is particularly weak due to size and contrast variation.

GPS/timing-based matching — use rider GPS data or timing chip location data to match position in photo to identity

Time: Requires integration with race timing data and GPS trackers (not always available)
Accuracy: High if data available, but only for timed checkpoints, not mid-race action

⚠ Only works if the event has live GPS tracking or timing mats at the photo location. Most sportives don't have this infrastructure. Doesn't help with mid-race action photos.

How AI Vision Solves It

AI vision models trained on cycling imagery recognize that numbers are on the BACK of jerseys, not the front. The model learns to identify partial bibs (reading 2-3 visible digits and inferring the complete number from the starting list and race context), detect frame numbers as secondary identification sources, and estimate rider identity from multiple cues: jersey team color, riding position, bike characteristics, and partial number data. For peloton shots, the AI clusters nearby riders and associates them with visible bibs, using body position and proximity to assign numbers to individuals.
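The partial-bib step can be pictured as a constrained lookup against the starting list. The sketch below is only an illustration of that idea, not RaceTagger's model: it assumes the hidden digits are marked with `?` and that the starting list is available as a set of number strings.

```python
import re

def candidates_for_partial(pattern, start_numbers):
    """Return start-list numbers consistent with a partially read bib.

    `pattern` uses '?' for each hidden digit, e.g. '12?45'.
    """
    regex = re.compile(
        "".join(r"\d" if ch == "?" else re.escape(ch) for ch in pattern)
    )
    return sorted(n for n in start_numbers if regex.fullmatch(n))

# Hypothetical start list containing two look-alike numbers.
start_numbers = {"12345", "12945", "12346", "22345"}

print(candidates_for_partial("12?45", start_numbers))  # ['12345', '12945']
print(candidates_for_partial("1234?", start_numbers))  # ['12345', '12346']
```

When more than one candidate survives, as with `12?45` here, the remaining cues (frame number, team jersey, race context) have to break the tie, or the photo is flagged for review.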

Key advantage

Multi-source identification. The AI doesn't rely on OCR alone — it integrates back bib (even if partial), frame number, team jersey color, and riding position to converge on identity. When the back bib is hidden, it can still read the frame number. When the frame number isn't visible, it uses the back bib. This redundancy is essential in cycling where no single number source is always available.
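One simple way to picture this redundancy is a fusion step over per-source readings. The sketch below is an assumption-laden illustration: the `Reading` type, the source labels, and the rule for combining confidences (treating sources as independent) are chosen for the example and are not taken from RaceTagger.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    source: str        # e.g. "back_bib" or "frame_number" (labels are hypothetical)
    number: str
    confidence: float  # 0.0 to 1.0

def fuse(readings):
    """Pick one number for a rider from several sources.

    Agreeing sources are combined as 1 - prod(1 - c), which assumes the
    sources fail independently; that is a simplification.
    """
    if not readings:
        return None
    miss = {}
    for r in readings:
        miss[r.number] = miss.get(r.number, 1.0) * (1.0 - r.confidence)
    scores = {number: 1.0 - m for number, m in miss.items()}
    best = max(scores, key=scores.get)
    return {
        "number": best,
        "confidence": round(scores[best], 3),
        "sources": [r.source for r in readings if r.number == best],
        "conflict": len(scores) > 1,  # sources disagreed; worth a manual look
    }

print(fuse([
    Reading("back_bib", "12345", 0.6),
    Reading("frame_number", "12345", 0.7),
]))
```

With a single source available, the function simply returns that reading, which mirrors the fallback behavior described above.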

Good conditions: 93-96% (clear back bib, upright position, decent lighting)

Challenging: 86-91% (partial bib occlusion, peloton context, frame number fallback)

Worst case: 75-83% with confidence flags (dense peloton, both bib and frame number obscured, need team context)

Import the starting list as a single CSV (all riders, no class distinction needed). RaceTagger processes all photos and detects numbers from multiple sources: back bibs, frame numbers, and contextual position. The output flags which number source had the highest confidence and sets aside flagged photos (typically 6-10% of peloton shots) for manual confirmation. XMP sidecars include all detected number sources with confidence scores.
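If you post-process the results yourself, the split between auto-tagged and flagged photos amounts to a confidence threshold. The sketch below assumes each detection is a plain dict with `photo` and `confidence` keys and uses a cutoff of 0.80; both the record layout and the threshold are illustrative assumptions, not RaceTagger's export format or defaults.

```python
def triage(detections, threshold=0.80):
    """Split detections into auto-tagged and needs-review lists."""
    auto, review = [], []
    for d in detections:
        (auto if d["confidence"] >= threshold else review).append(d)
    return auto, review

# Hypothetical detections for three photos.
detections = [
    {"photo": "IMG_0001.jpg", "number": "12345", "confidence": 0.95},
    {"photo": "IMG_0002.jpg", "number": "12945", "confidence": 0.62},
    {"photo": "IMG_0003.jpg", "number": "22345", "confidence": 0.88},
]

auto, review = triage(detections)
print(f"{len(review)} of {len(detections)} photos flagged for manual confirmation")
```

Raising the threshold sends more photos to review and lowers the risk of a wrong tag reaching the gallery.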

Manual vs OCR vs AI Vision

Metric	Manual	Basic OCR	AI Vision (RaceTagger)
Processing time (3000-8000 photos per stage)	10-18 hours (team of 2-3, requires cycling knowledge)	60-90 minutes (plus 120+ minutes manual review of failed bibs)	~120-150 minutes (batch processing with multi-source identification)
Accuracy: clear back bib, upright position	92-96%	68-75%	93-96%
Accuracy: back bib partially hidden by saddle bag	80-88% (requires knowledge of the field or starting list cross-reference)	30-45%	86-91% (partial number inference)
Frame number detection and integration	Manual lookup, extremely slow	25-40% accuracy, not integrated with bib data	Automatic detection and used as backup when bib is obscured
Cost per 5000 photos	€200-350 (cycling expertise required)	€10-15 (compute)	€40-60 (tokens, multi-source inference)

Practical Tips

Tip 1 (beginner)

Position yourself behind and slightly to the side of the peloton for the clearest back bib angles

Back-left 45° angle shows the full width of the rider's back without foreshortening. Front angles show only the jersey collar. Side profiles show the bib but at an angle. Photographers working from motorcycles know this — position for bib visibility, not for the most dramatic angle.

Tip 2 (beginner)

Scout the course for frame number visibility — some sections shoot bike details better than others

Climb sections, where riders are hunched, give good frame number visibility. Fast descents, where riders are tucked, hide the frame number behind the rider's body. Plan your shooting positions to capture frame numbers on climbs as a backup identification source.

Tip 3 (intermediate)

Use team jersey color as a secondary filter — import a team-to-number-range mapping if available

Team jerseys are color-coded (for example, EF Education pink or Jumbo-Visma yellow and black). If the AI detects a partial bib that could match two riders on different teams, it can use jersey color to disambiguate. This requires adding a team field to your starting list.
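The jersey-color filter can be sketched as a second lookup on top of the candidate numbers. Everything named here (the `team` field, the color-to-team table, the function itself) is hypothetical and only illustrates the idea; it applies when the candidates belong to different teams.

```python
# Hypothetical start list with a team field, plus a hypothetical color-to-team table.
START_LIST = {
    "12345": {"name": "Rider A", "team": "Team Alpha"},
    "12945": {"name": "Rider B", "team": "Team Beta"},
}
JERSEY_COLOR_TO_TEAM = {"pink": "Team Alpha", "yellow": "Team Beta"}

def disambiguate_by_jersey(candidates, jersey_color,
                           start_list=START_LIST,
                           color_to_team=JERSEY_COLOR_TO_TEAM):
    """Keep only candidate numbers whose team matches the detected jersey color."""
    team = color_to_team.get(jersey_color)
    if team is None:
        return list(candidates)  # unknown color: no filtering possible
    return [n for n in candidates if start_list.get(n, {}).get("team") == team]

print(disambiguate_by_jersey(["12345", "12945"], "yellow"))  # ['12945']
```

If the filter leaves more than one candidate, or none, the photo is better routed to manual review than tagged on a guess.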

Tip 4 (intermediate)

For peloton shots, increase burst shooting and let the AI select the clearest frame from the sequence

In a 5-frame burst through a peloton, riders shift positions slightly and bibs are revealed/hidden as they move. The AI can analyze all 5 frames and pick the one where the target rider's bib is most visible, then tag that single frame.
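Selecting the clearest frame per rider is essentially a maximum over the burst. The sketch below assumes each frame comes with a mapping of detected number to confidence; that data shape and the function name are assumptions for illustration, not RaceTagger's output format.

```python
def best_frame_per_rider(burst):
    """For each detected number, return the frame where it was read most confidently.

    `burst` is a list of (frame_name, {number: confidence}) pairs.
    """
    best = {}
    for frame_name, detections in burst:
        for number, conf in detections.items():
            if number not in best or conf > best[number][1]:
                best[number] = (frame_name, conf)
    return best

# Hypothetical three-frame burst: rider 41 is clearest in the second frame.
burst = [
    ("burst_01.jpg", {"41": 0.55, "87": 0.90}),
    ("burst_02.jpg", {"41": 0.82}),
    ("burst_03.jpg", {"87": 0.60, "41": 0.70}),
]
print(best_frame_per_rider(burst))
```

Different riders can end up with different best frames from the same burst, so a single sequence may yield several tagged photos.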

Tip 5 (advanced)

For sportive events, plan for 8-12% manual review on peloton shots, and brief your photographers on this in advance

Peloton density makes perfect AI accuracy impossible. Set expectations: the AI will identify breakaway riders (95%+ accuracy) and isolated riders (92%+ accuracy) reliably, but peloton shots (75-83% accuracy) will need spot checks. Budget the review time and use it as a quality control pass, not a correction step.
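To budget the review pass, a back-of-the-envelope estimate is usually enough. The sketch below applies the 8-12% review range to a count of peloton shots; the 20 seconds per reviewed photo is purely an assumed figure for illustration, so substitute your own measured pace.

```python
def review_budget(peloton_shots, rate_low=0.08, rate_high=0.12, seconds_per_review=20):
    """Estimate how many peloton shots need manual review and how long that takes.

    `seconds_per_review` is an assumption, not a measured value.
    """
    low = round(peloton_shots * rate_low)
    high = round(peloton_shots * rate_high)

    def to_minutes(n):
        return n * seconds_per_review / 60

    return {"photos": (low, high), "minutes": (to_minutes(low), to_minutes(high))}

print(review_budget(1500))
```

For 1,500 peloton shots this works out to 120-180 photos, or roughly 40-60 minutes at the assumed pace.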

Tag your cycling event gallery in 2 hours, not 8

500 free tokens. Upload a batch of peloton and breakaway shots from any cycling event — see how the AI handles back bibs, frame numbers, and partial occlusion.


Frequently Asked Questions

How does the AI know to look for numbers on the BACK of jerseys and not the front?

The AI model is trained specifically on cycling race photos where numbers are positioned on the back. It learns to identify jersey back panels and associated number regions through examples in the training data. When you specify 'cycling' as the sport, the model applies cycling-specific rules. This is why sport-specific AI is critical — a generic OCR tool trained on documents doesn't know cycling conventions.

Can it detect and match both back bibs AND frame numbers in the same photo?

Yes. The AI detects numbers from both sources and outputs both readings with confidence scores. If the back bib is partially hidden but the frame number is clear, it uses the frame number. If both are visible, it cross-validates them. The CSV output includes which source each detection came from.

What's the accuracy on dense peloton shots with 30+ riders in frame?

Peloton shots are inherently challenging because numbers become small pixels at distance. Accuracy on cluster identification is 75-83% in very dense packs. The AI does best with breakaway groups (2-5 riders) at 92-96% accuracy. Plan your workflow to flag peloton shots for manual verification as a quality control step, rather than expecting 95% accuracy on group shots.

Does the AI work for gran fondo/sportive cycling events or only professional races?

The AI works equally well for both. The difference is in your workflow: professional races have annotated starting lists with rider names; sportives typically have just participant numbers. The detection accuracy is the same. The output format differs: for sportives, you get just the number tags ready for a web gallery; for pros, you can add rider name, team, and social media handles.
