Algiers Airport Trolley OBB Dataset

A hybrid synthetic and real-world dataset for robust Oriented Bounding Box (OBB) detection of airport logistics assets, featuring a Digital Twin of Algiers International Airport.


Airport trolleys with bounding boxes visualization

About the Dataset

Understanding the challenge and our solution

The Problem

The sim-to-real gap remains a significant challenge for computer vision in airport logistics. Existing datasets lack CCTV surveillance footage that realistically captures the complexity of airport environments, which has hindered the development of robust models for detecting and tracking luggage trolleys in real-world deployments.

Our Solution

The Algiers Airport Trolley OBB Dataset addresses this gap by combining synthetic pre-training data, generated from a high-fidelity Digital Twin of Algiers International Airport, with a carefully curated set of real-world testing data. This hybrid approach is designed to help models trained on the dataset generalize better in production environments.

Dataset Distribution & Statistics

Key metrics and composition of the dataset

Total Images

4,500+

High-resolution annotated images

Annotated Trolleys

44,500+

Oriented Bounding Box annotations

Format

YOLO OBB

Standard for rotated object detection
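
For readers new to the format, the sketch below parses a single label line. It assumes the labels follow the Ultralytics YOLO OBB convention: a class index followed by four corner points, with every coordinate normalized to the image size. The function names, the sample line, and the image size are illustrative and are not taken from the dataset.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_obb_line(line: str, img_w: int, img_h: int) -> Tuple[int, List[Point]]:
    """Parse one label line: class index, then four normalized (x, y) corners."""
    parts = line.split()
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}")
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # Scale each normalized corner to pixel coordinates.
    corners = [(coords[i] * img_w, coords[i + 1] * img_h) for i in range(0, 8, 2)]
    return class_id, corners


def polygon_area(corners: List[Point]) -> float:
    """Area of the box in square pixels, computed with the shoelace formula."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical label line for a 1000x800 image (not a real annotation).
sample = "0 0.25 0.25 0.75 0.25 0.75 0.5 0.25 0.5"
class_id, corners = parse_obb_line(sample, img_w=1000, img_h=800)
area = polygon_area(corners)
```

Storing four corners rather than a centre, size, and angle avoids angle-wrapping ambiguities, at the cost of not guaranteeing that the four points form a perfect rectangle.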

Data Composition

Synthetic Data

3,248 images (71.3%)

Real Data

1,305 images (28.7%)

About the Research

Automated logistical monitoring in airports is frequently bottlenecked by the high cost, privacy concerns, and strict security regulations associated with collecting large-scale CCTV data.

"To overcome this data scarcity, our research introduces a highly data-efficient framework leveraging a synthetic Digital Twin of the Algiers International Airport."

By generating a large, automatically annotated synthetic dataset of luggage trolleys, we address the geometric challenges of detecting dense, nested, and overlapping objects using Oriented Bounding Boxes (OBB). This paper systematically evaluates how synthetic data can bridge the "Sim-to-Real" domain gap.

Experimental Design

We evaluated a YOLO-OBB architecture across varying subsets of real-world data (5% to 50%), benchmarking three domain adaptation strategies:

Strategy A: Linear Probing

Pre-training on synthetic data and fine-tuning only the prediction head (frozen backbone) on real data.

Strategy B: Full Fine-Tuning

Pre-training on synthetic data and updating the entire network (unfrozen backbone) on real data.

Strategy C: Mixed Training

Training from scratch using a combined dataset of full synthetic data + incremental fractions of real data.
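
The data side of Strategy C can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the page does not say how the real-data subsets were drawn, so uniform random sampling with a fixed seed is an assumption, and the function names and file names are hypothetical.

```python
import random
from typing import List, Sequence


def sample_real_subset(real_images: Sequence[str], fraction: float, seed: int = 0) -> List[str]:
    """Draw a reproducible random subset of the real images (assumed sampling scheme)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(real_images) * fraction))
    rng = random.Random(seed)  # fixed seed so every run sees the same subset
    return sorted(rng.sample(list(real_images), k))


def build_mixed_train_list(
    synthetic_images: Sequence[str],
    real_images: Sequence[str],
    fraction: float,
    seed: int = 0,
) -> List[str]:
    """Full synthetic set plus a fraction of the real set, as in mixed training."""
    return list(synthetic_images) + sample_real_subset(real_images, fraction, seed)


# Hypothetical file names, used only to exercise the functions.
synthetic = [f"synthetic_{i:03d}.jpg" for i in range(10)]
real = [f"real_{i:03d}.jpg" for i in range(100)]
mixed = build_mixed_train_list(synthetic, real, fraction=0.30)
```

The same `sample_real_subset` helper could supply the real-data portion for Strategies A and B; what differs there is which network layers are updated, which depends on the training framework and is not shown here.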

Key Findings

The Data Efficiency Multiplier

Mixed Training achieved an mAP of 0.73 using only 30% of the available real data. This outperformed a Real-Only baseline trained on 40% of the real data, which translates to a 25% relative reduction in manual annotation effort (30% of the real data annotated instead of 40%).
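
The 25% figure is a relative saving, computed from the two real-data fractions quoted above. A short check, with an illustrative function name:

```python
def annotation_reduction(baseline_fraction: float, mixed_fraction: float) -> float:
    """Relative drop in annotated real data when moving from baseline to mixed training."""
    return (baseline_fraction - mixed_fraction) / baseline_fraction


# 40% real data for the Real-Only baseline vs. 30% for Mixed Training.
reduction = annotation_reduction(0.40, 0.30)
print(f"{reduction:.0%}")  # prints 25%
```

In absolute terms the saving is 10 percentage points of the real-data pool; the 25% is that saving expressed relative to the 40% baseline.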

Bridging the Texture Gap

Linear Probing performed poorly, revealing that while synthetic data teaches excellent geometry (shape/orientation), the backbone must be unfrozen to learn real-world textures (lighting, reflections).

The Recall Advantage

While Mixed Training dominated in low-data regimes (5–30%), Full Fine-Tuning achieved the highest overall Recall (91.1%) once the real-world data fraction reached 50%.

Conclusion

Blending Digital Twin synthetic data with a small, strategic subset of real-world data provides a highly scalable pathway for deploying robust computer vision systems, significantly reducing setup costs and labeling burdens.

Citation

Please cite this dataset in your research

@dataset{algiers_airport_trolley_2026,
  title={Evaluating Synthetic Data for Baggage Trolley Detection in Airport Logistics},
  author={Abdeldjalil Taibi and Mohmoud Badlis and Amina Bensalem and Belkacem Zouilekh and Mohammed Brahimi},
  year={2026},
  institution={National Higher School of Artificial Intelligence},
  note={Available at https://github.com/djallilou13/yolo-obb-experiments}
}