ShelfSight - Hackology II
ShelfSight is a retail shelf product detector for the Hackology II object detection challenge. The repository is prepared around the private evaluator contract: clone the repo, install locked dependencies with uv, run predict.py, and receive a valid COCO detection JSON.
Problem
Given shelf images, detect products from the provided taxonomy.json and output COCO detection records with valid image_id, category_id, [x, y, width, height] boxes, and confidence scores. The final metric is private-test mAP@0.5.
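As a minimal sketch of one output record, assuming the standard COCO results keys (image_id, category_id, bbox, score); the helper name make_detection and the sample values are illustrative and not taken from predict.py:

```python
import json


def make_detection(image_id, category_id, box_xywh, score):
    """Build one COCO detection record with an [x, y, width, height] box."""
    x, y, w, h = (float(v) for v in box_xywh)
    return {
        "image_id": int(image_id),
        "category_id": int(category_id),
        "bbox": [x, y, w, h],
        "score": float(score),
    }


# Illustrative values only; real IDs come from test_images.json and taxonomy.json.
records = [make_detection(17, 3, (12.5, 40.0, 96.0, 210.0), 0.87)]

# The evaluator expects a JSON array of such records.
payload = json.dumps(records)
```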
Approach
The current jury path is a single YOLO detector fine-tuned on the competition data:
input images -> YOLO detector -> clipped COCO boxes -> predictions.json
The default inference settings are intentionally conservative for the private evaluator GPU:
weights: weights/latest_best.pt
imgsz: 1024
conf: 0.001
iou: 0.70
max_det: 500
batch: 1
half: enabled on CUDA
Optional tiling and ensembling code exists in src/shelfsight/inference/, but it is not enabled by default because the final private run is timeout-limited and must fit on a 16 GB GPU.
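For readers unfamiliar with tiled inference, the general idea can be sketched as follows. This is a generic illustration, not the code in src/shelfsight/inference/; the function name tile_windows and the tile and overlap values are assumptions.

```python
def tile_windows(width, height, tile=1024, overlap=128):
    """Return (x, y, w, h) windows that cover the image with overlapping tiles."""
    step = tile - overlap

    def starts(size):
        # A single window is enough when the image fits inside one tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        # Always include a final window flush with the far edge.
        positions.append(size - tile)
        return positions

    w = min(tile, width)
    h = min(tile, height)
    return [(x, y, w, h) for y in starts(height) for x in starts(width)]
```

Each window would be run through the detector separately and the boxes shifted back by (x, y) before merging, which multiplies inference time by the number of windows. That cost is the reason such a path competes with the timeout.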
Repository Structure
configs/ Training and inference configs
src/shelfsight/data/ COCO inspection, split, and conversion helpers
src/shelfsight/training/ Training entry points and wrappers
src/shelfsight/inference/ Prediction validation and post-processing helpers
src/shelfsight/evaluation/ Local scoring and error-analysis tools
scripts/ Reproducible workflow commands
weights/ Committed final-weight candidate and manifest
reports/ Experiment log, analysis notes, and figures
notebooks/ Exploration and Colab training workflow
tests/ CLI, schema, mapping, and inference helper tests
task/, data/, and generated runs/ are intentionally ignored.
Setup
uv sync --locked
Development check:
uv run pytest
Inference
Required evaluator command:
uv run python predict.py --input <dir> --output predictions.json
predict.py discovers images under the input directory and resolves each image_id from test_images.json when available, falling back to numeric filename stems or a deterministic index. It then loads category IDs from taxonomy.json, runs the detector one image at a time, clips boxes to image bounds, and writes a COCO detection JSON array.
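The ID fallback and the box clipping can be sketched as below. This is an illustration under stated assumptions, not the code in predict.py: the helper names, the file-name-to-ID mapping, the fallback order (mapping, then numeric stem, then index), and the corner-format input boxes are all hypothetical.

```python
from pathlib import Path


def resolve_image_id(path, known_ids, index):
    """Look up the file name, then try a numeric stem, then use the index."""
    name = Path(path).name
    if name in known_ids:
        return known_ids[name]
    stem = Path(path).stem
    if stem.isascii() and stem.isdigit():
        return int(stem)
    return index


def clip_xyxy_to_xywh(x1, y1, x2, y2, width, height):
    """Clip corner coordinates to the image and convert to [x, y, w, h]."""
    x1 = float(min(max(x1, 0), width))
    y1 = float(min(max(y1, 0), height))
    x2 = float(min(max(x2, 0), width))
    y2 = float(min(max(y2, 0), height))
    # Degenerate boxes collapse to zero width or height instead of going negative.
    return [x1, y1, max(x2 - x1, 0.0), max(y2 - y1, 0.0)]
```

Sorting the discovered paths before enumerating them would be one way to keep the index fallback deterministic across runs.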
Validate a generated file:
uv run python -m shelfsight.inference.validate_predictions --predictions predictions.json --input <dir>
Public-test helper:
bash scripts/05_predict_public.sh
Final local gate:
bash scripts/07_final_submission_check.sh
Environment overrides for measured experiments:
SHELFSIGHT_IMGSZ=1280 SHELFSIGHT_CONF=0.005 SHELFSIGHT_IOU=0.75 \
uv run python predict.py --input data/public_test/images --output /tmp/predictions.json
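One way such overrides could be resolved is sketched below. The variable names SHELFSIGHT_IMGSZ, SHELFSIGHT_CONF, and SHELFSIGHT_IOU and the default values come from this README; the function load_settings and its parsing rules are assumptions, not the actual logic in predict.py.

```python
import os

# Defaults listed in the Approach section.
DEFAULTS = {"imgsz": 1024, "conf": 0.001, "iou": 0.70}
CASTS = {"imgsz": int, "conf": float, "iou": float}


def load_settings(environ=None):
    """Return the defaults, replaced by any SHELFSIGHT_* values that are set."""
    environ = os.environ if environ is None else environ
    settings = dict(DEFAULTS)
    for key, cast in CASTS.items():
        raw = environ.get(f"SHELFSIGHT_{key.upper()}")
        if raw:  # unset or empty variables keep the default
            settings[key] = cast(raw)
    return settings
```

Passing a plain dict instead of os.environ keeps the function easy to test without touching the process environment.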
Model Weights
The task PDF recommends committing weights under 100 MB. The current model is committed directly:
weights/latest_best.pt
size: 54,230,693 bytes
sha256: b970115580153b8d256e6b476c58cf2210c7aaf34d22f3a69f5e2b428db95d0f
No external weight download is required for the default path. SHELFSIGHT_MODEL can point to another local checkpoint for experiments.
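The recorded digest can be checked after cloning with a few lines of standard-library Python. This is a minimal sketch; the helper names sha256_of and matches_checksum are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_checksum("weights/latest_best.pt", "b970115580153b8d256e6b476c58cf2210c7aaf34d22f3a69f5e2b428db95d0f") should return True for an intact checkout.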
Training
Training workflows are in scripts/ and notebooks/02_train_colab.ipynb. The current champion was trained in Colab from pretrained YOLO weights with a rare-synthetic-to-real curriculum, Drive backups, and final candidate generation.
Main data recipes:
bash scripts/01_prepare_data.sh
bash scripts/02_train_yolo_real.sh
bash scripts/03_train_yolo_synth_then_real.sh
Every meaningful run should be logged in reports/experiments.md with model path, settings, validation score, runtime, and SHA256.
Validation Strategy
The repository uses real-image validation for local model selection and keeps synthetic data out of the main validation split. Public leaderboard submissions are treated as limited confirmation, not as the only optimization signal.
The current public leaderboard best for team-02 was mAP@0.5 = 0.5497 on 2026-05-24. The selected checkpoint reached local validation mAP@0.5 = 0.72667. The private score can differ, so the default inference path prioritizes reproducibility and memory safety.
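For context on the metric, mAP@0.5 counts a prediction as a match when its intersection over union (IoU) with a same-class ground-truth box is at least 0.5. The sketch below shows that overlap test for [x, y, width, height] boxes; it is illustrative and is neither the organizer's scorer nor the code in src/shelfsight/evaluation/.

```python
def iou_xywh(a, b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the predicted box overlaps the ground truth enough to count."""
    return iou_xywh(pred_box, gt_box) >= threshold
```

A box shifted by half its width against an identical ground-truth box has an IoU of 1/3 and would not count, which is one reason dense neighboring products are hard at this threshold.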
Runtime
The organizer environment is expected to provide Python 3.11, CUDA 12.x, one 16 GB GPU, internet access, and a 30-minute timeout that covers both install and inference. predict.py uses batch 1 and imgsz=1024 by default to avoid the CUDA out-of-memory errors seen at 1280 px during local and Colab testing.
Error Analysis
Known hard cases are dense neighboring products, rare classes, very small boxes, visually similar packaging variants, and public/private distribution shift. See reports/error_analysis.md and reports/figures/ for the current analysis artifacts.
Presentation
The organizer task requires a 5-7 slide PDF/PPTX presentation in the final repository. The old placeholder presentation/ folder was removed because it was not a real deliverable. Before final handoff, add the actual PDF/PPTX artifact with results, visuals, metrics, error analysis, and limitations.
Model-derived charts and example overlays for the deck are in reports/presentation_assets/ and reports/figures/model_*.
External Data And Tools
- Competition data: provided by Hackology II; not redistributed in this repository.
- Pretrained models: Ultralytics YOLO family weights used as initialization.
- AI coding tools: Codex/ChatGPT assisted with code and documentation.
- Cloud: Google Colab used for training; no paid inference API is required.
License Notes
Model weights are derived from competition training data plus Ultralytics pretrained initialization. Verify final organizer and upstream model-license constraints before public release outside the hackathon repository.