dtecx/hackology2026-full



ShelfSight - Hackology II

ShelfSight is a retail shelf product detector for the Hackology II object detection challenge. The repository is prepared around the private evaluator contract: clone the repo, install locked dependencies with uv, run predict.py, and receive a valid COCO detection JSON.

Problem

Given shelf images, detect products from the provided taxonomy.json and output COCO detection records with valid image_id, category_id, [x, y, width, height] boxes, and confidence scores. The final metric is private-test mAP@0.5.
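
For orientation, one record in the standard COCO detection-results layout might look like the sketch below. The numeric values are illustrative only, and `make_record` is a hypothetical helper, not a function taken from this repository.

```python
import json


def make_record(image_id, category_id, box_xywh, score):
    """Build one COCO detection record; the box is [x, y, width, height] in pixels."""
    x, y, w, h = (float(v) for v in box_xywh)
    return {
        "image_id": int(image_id),
        "category_id": int(category_id),
        "bbox": [x, y, w, h],
        "score": float(score),
    }


# Illustrative values only; real IDs come from test_images.json and taxonomy.json.
records = [make_record(17, 3, [12.0, 40.5, 80.0, 120.0], 0.87)]
print(json.dumps(records, indent=2))
```

The output file is a JSON array of such records, one per detected box.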

Approach

The current jury path is a single YOLO detector fine-tuned on the competition data:

input images -> YOLO detector -> clipped COCO boxes -> predictions.json
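
The "clipped COCO boxes" step can be sketched as follows, assuming the detector returns corner-format (x1, y1, x2, y2) boxes in pixel coordinates. `clip_xyxy_to_coco` is a hypothetical name; the repository's own helper may differ, for example in how it treats degenerate boxes.

```python
def clip_xyxy_to_coco(x1, y1, x2, y2, img_w, img_h):
    """Clip a corner-format box to the image and return [x, y, w, h], or None if empty."""
    img_w, img_h = float(img_w), float(img_h)
    x1 = min(max(float(x1), 0.0), img_w)
    y1 = min(max(float(y1), 0.0), img_h)
    x2 = min(max(float(x2), 0.0), img_w)
    y2 = min(max(float(y2), 0.0), img_h)
    w, h = x2 - x1, y2 - y1
    # A box that lies entirely outside the image collapses to zero area.
    if w <= 0 or h <= 0:
        return None
    return [x1, y1, w, h]
```

Dropping zero-area boxes is an assumption made here so that every emitted record has a positive width and height.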

The default inference settings are intentionally conservative for the private evaluator GPU:

weights: weights/latest_best.pt
imgsz: 1024
conf: 0.001
iou: 0.70
max_det: 500
batch: 1
half: enabled on CUDA

Optional tiling and ensembling code exists in src/shelfsight/inference/, but it is not enabled by default because the final private run is timeout-limited and must fit on a 16 GB GPU.
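
A minimal sketch of how the defaults listed above could be passed to the Ultralytics `YOLO.predict` call. This assumes the detector is loaded through the `ultralytics` package; `build_predict_kwargs` and `run_detector` are hypothetical names, and the actual wiring inside predict.py may differ.

```python
DEFAULTS = {
    "imgsz": 1024,
    "conf": 0.001,
    "iou": 0.70,
    "max_det": 500,
    "batch": 1,
}


def build_predict_kwargs(cuda_available):
    """Return keyword arguments for the detector call; FP16 only when CUDA is present."""
    kwargs = dict(DEFAULTS)
    kwargs["half"] = bool(cuda_available)
    kwargs["device"] = 0 if cuda_available else "cpu"
    return kwargs


def run_detector(image_path, weights="weights/latest_best.pt"):
    """Run the detector on one image and return the Ultralytics results list."""
    # Imported lazily so the settings helper stays usable without a GPU stack.
    import torch
    from ultralytics import YOLO

    model = YOLO(weights)
    kwargs = build_predict_kwargs(torch.cuda.is_available())
    return model.predict(source=image_path, verbose=False, **kwargs)
```

Keeping the settings in one dictionary makes it easy to log exactly what was used for a given submission.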

Repository Structure

configs/                  Training and inference configs
src/shelfsight/data/      COCO inspection, split, and conversion helpers
src/shelfsight/training/  Training entry points and wrappers
src/shelfsight/inference/ Prediction validation and post-processing helpers
src/shelfsight/evaluation/ Local scoring and error-analysis tools
scripts/                  Reproducible workflow commands
weights/                  Committed final-weight candidate and manifest
reports/                  Experiment log, analysis notes, and figures
notebooks/                Exploration and Colab training workflow
tests/                    CLI, schema, mapping, and inference helper tests

task/, data/, and generated runs/ are intentionally ignored.

Setup

uv sync --locked

Development check:

uv run pytest

Inference

Required evaluator command:

uv run python predict.py --input <dir> --output predictions.json

predict.py discovers images under the input directory and resolves each image_id from test_images.json when available, falling back to a numeric filename stem or a deterministic index. It then loads category IDs from taxonomy.json, runs the detector one image at a time, clips boxes to image bounds, and writes a COCO detection JSON array.
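
The image_id fallback chain can be sketched as below. The layout of test_images.json is not described here, so the sketch takes a ready-made filename-to-ID mapping (`name_to_id`, a hypothetical parameter) and assumes the fallbacks are tried in the order mapping, numeric stem, sorted index.

```python
from pathlib import Path


def resolve_image_ids(image_paths, name_to_id=None):
    """Map each image path to an integer image_id using three fallbacks in order."""
    name_to_id = name_to_id or {}
    resolved = {}
    # Sorting makes the index fallback deterministic across runs and machines.
    for index, path in enumerate(sorted(Path(p) for p in image_paths)):
        if path.name in name_to_id:        # 1. explicit mapping
            resolved[str(path)] = int(name_to_id[path.name])
        elif path.stem.isdecimal():        # 2. numeric filename stem
            resolved[str(path)] = int(path.stem)
        else:                              # 3. position in sorted order
            resolved[str(path)] = index
    return resolved
```

This sketch does not guard against an index fallback colliding with a real ID; a production version would need to check for duplicates.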

Validate a generated file:

uv run python -m shelfsight.inference.validate_predictions --predictions predictions.json --input <dir>
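
The kind of checks such a validator performs can be sketched as follows. This is an illustrative stand-in under the schema stated in the Problem section, not the code of `shelfsight.inference.validate_predictions`; the function name and message wording are assumptions.

```python
import math

REQUIRED_KEYS = {"image_id", "category_id", "bbox", "score"}


def validate_predictions(records, valid_image_ids, valid_category_ids):
    """Return a list of human-readable problems; an empty list means the file passed."""
    if not isinstance(records, list):
        return ["top-level JSON value must be an array"]
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: not a JSON object")
            continue
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            problems.append(f"record {i}: missing keys {sorted(missing)}")
            continue
        if rec["image_id"] not in valid_image_ids:
            problems.append(f"record {i}: unknown image_id {rec['image_id']!r}")
        if rec["category_id"] not in valid_category_ids:
            problems.append(f"record {i}: unknown category_id {rec['category_id']!r}")
        bbox = rec["bbox"]
        numeric = (
            isinstance(bbox, list)
            and len(bbox) == 4
            and all(isinstance(v, (int, float)) and math.isfinite(v) for v in bbox)
        )
        if not numeric:
            problems.append(f"record {i}: bbox must be four finite numbers")
        elif bbox[2] <= 0 or bbox[3] <= 0:
            problems.append(f"record {i}: bbox width and height must be positive")
        score = rec["score"]
        if not (isinstance(score, (int, float)) and 0.0 <= score <= 1.0):
            problems.append(f"record {i}: score must be in [0, 1]")
    return problems
```

Returning all problems at once, rather than raising on the first, makes a failed submission check easier to debug.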

Public-test helper:

bash scripts/05_predict_public.sh

Final local gate:

bash scripts/07_final_submission_check.sh

Environment overrides for measured experiments:

SHELFSIGHT_IMGSZ=1280 SHELFSIGHT_CONF=0.005 SHELFSIGHT_IOU=0.75 \
  uv run python predict.py --input data/public_test/images --output /tmp/predictions.json
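
One plausible way to read these overrides is sketched below. Only the variable names that appear in this README are used; the `load_settings` helper and its return shape are hypothetical, and the defaults mirror the values listed under Approach.

```python
import os


def load_settings(env=None):
    """Read inference settings from environment variables, with README defaults."""
    env = os.environ if env is None else env
    return {
        "weights": env.get("SHELFSIGHT_MODEL", "weights/latest_best.pt"),
        "imgsz": int(env.get("SHELFSIGHT_IMGSZ", 1024)),
        "conf": float(env.get("SHELFSIGHT_CONF", 0.001)),
        "iou": float(env.get("SHELFSIGHT_IOU", 0.70)),
    }
```

Passing the environment as a parameter keeps the helper testable without mutating `os.environ`.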

Model Weights

The task PDF recommends committing weights under 100 MB. The current model is committed directly:

weights/latest_best.pt
size: 54,230,693 bytes
sha256: b970115580153b8d256e6b476c58cf2210c7aaf34d22f3a69f5e2b428db95d0f

No external weight download is required for the default path. SHELFSIGHT_MODEL can point to another local checkpoint for experiments.
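
The committed checkpoint can be checked against the digest above with a few lines of standard-library Python. The helper names are hypothetical; the expected value is copied from this section.

```python
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "b970115580153b8d256e6b476c58cf2210c7aaf34d22f3a69f5e2b428db95d0f"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def weights_match(path="weights/latest_best.pt", expected=EXPECTED_SHA256):
    """Return True only if the file exists and its digest equals the expected value."""
    return Path(path).is_file() and sha256_of(path) == expected
```

Running `weights_match()` from the repository root before inference catches a truncated or swapped checkpoint early.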

Training

Training workflows are in scripts/ and notebooks/02_train_colab.ipynb. The current champion was trained in Colab from pretrained YOLO weights using a rare-synthetic-to-real curriculum. The same workflow covers Drive backups and final candidate generation.

Main data recipes:

bash scripts/01_prepare_data.sh
bash scripts/02_train_yolo_real.sh
bash scripts/03_train_yolo_synth_then_real.sh

Every meaningful run should be logged in reports/experiments.md with model path, settings, validation score, runtime, and SHA256.
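
A log entry with those fields could be produced by a small helper like the one below. The table layout of reports/experiments.md is not specified here, so the Markdown row format and the function name are assumptions, and the example values are placeholders rather than real results.

```python
def format_experiment_row(model_path, settings, val_map50, runtime_s, sha256):
    """Format one run as a Markdown table row with the five required fields."""
    # Sorting the settings keeps rows comparable between runs.
    settings_text = ", ".join(f"{key}={value}" for key, value in sorted(settings.items()))
    return f"| {model_path} | {settings_text} | {val_map50:.5f} | {runtime_s:.0f} s | {sha256} |"


# Placeholder values for illustration only.
row = format_experiment_row(
    "runs/example/best.pt", {"imgsz": 1024, "conf": 0.001}, 0.5, 90, "<sha256>"
)
print(row)
```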

Validation Strategy

The repository uses real-image validation for local model selection and keeps synthetic data out of the main validation split. Public leaderboard submissions are treated as limited confirmation, not as the only optimization signal.
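
The split policy can be illustrated with a short sketch. It assumes each image record carries a `file_name` and a boolean `synthetic` flag, both hypothetical field names; the repository's split helper in src/shelfsight/data/ may use a different representation and ratio.

```python
import random


def split_real_only_val(images, val_fraction=0.2, seed=0):
    """Split so that validation holds real images only; synthetic images go to train."""
    real = sorted(
        (img for img in images if not img["synthetic"]),
        key=lambda img: img["file_name"],
    )
    synthetic = [img for img in images if img["synthetic"]]
    # A seeded shuffle over a sorted list gives the same split on every run.
    random.Random(seed).shuffle(real)
    n_val = max(1, round(len(real) * val_fraction)) if real else 0
    val, train_real = real[:n_val], real[n_val:]
    return train_real + synthetic, val
```

Because validation never sees synthetic images, the local score reflects performance on the kind of data the private test is expected to contain.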

As of 2026-05-24, the best public leaderboard score for team-02 is mAP@0.5 = 0.5497, while the selected checkpoint reached local validation mAP@0.5 = 0.72667. The private score can differ from both, so the default inference path prioritizes reproducibility and memory safety.
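
As a reminder of what the 0.5 in mAP@0.5 refers to: a detection is normally counted as a match when its intersection-over-union (IoU) with a ground-truth box of the same class is at least 0.5. The sketch below computes IoU for COCO-style [x, y, width, height] boxes; the exact matching rule is defined by the organizer's scorer, not by this snippet.

```python
def iou_xywh(a, b):
    """Intersection-over-union of two [x, y, width, height] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes offset by 2 px overlap enough to match; offset by 5 px they do not.
print(iou_xywh([0, 0, 10, 10], [2, 0, 10, 10]) >= 0.5)
print(iou_xywh([0, 0, 10, 10], [5, 0, 10, 10]) >= 0.5)
```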

Runtime

The organizer environment is expected to provide Python 3.11, CUDA 12.x, one 16 GB GPU, internet access, and a 30-minute timeout that covers both install and inference. predict.py uses batch 1 and imgsz=1024 by default to avoid the CUDA out-of-memory errors seen at 1280 px during local and Colab testing.

Error Analysis

Known hard cases are dense neighboring products, rare classes, very small boxes, visually similar packaging variants, and public/private distribution shift. See reports/error_analysis.md and reports/figures/ for the current analysis artifacts.

Presentation

The organizer task requires a 5-7 slide PDF/PPTX presentation in the final repository. The old placeholder presentation/ folder was removed because it was not a real deliverable. Before final handoff, add the actual PDF/PPTX artifact with results, visuals, metrics, error analysis, and limitations.

Model-derived charts and example overlays for the deck are in reports/presentation_assets/ and reports/figures/model_*.

External Data And Tools

Competition data: provided by Hackology II; not redistributed in this repository.
Pretrained models: Ultralytics YOLO family weights used as initialization.
AI coding tools: Codex/ChatGPT assisted with code and documentation.
Cloud: Google Colab used for training; no paid inference API is required.

License Notes

Model weights are derived from competition training data plus Ultralytics pretrained initialization. Verify final organizer and upstream model-license constraints before public release outside the hackathon repository.