KDD Lab Mirror. Canonical repository and issue tracker: AhsanZaidi12/GRAZE
Forked from AhsanZaidi12/GRAZE · KDD Research Lab, Kansas State University
Official implementation of GRAZE, accepted at CVSports @ CVPR 2026.
GRAZE is a training-free pipeline for First Point of Contact (FPOC) detection in American football tackle videos. It combines open-vocabulary grounding (GroundingDINO), promptable segmentation (SAM2), and motion-aware temporal reasoning, with no task-specific training required.
Part of the KSU Tackle Safety Study: a research program developing automated biomechanical assessment tools for American football, targeting injury risk reduction through computer vision analysis of practice footage.
GRAZE pipeline: text-prompted grounding → SAM2 segmentation → multi-prompt temporal search → motion-aware backward refinement → FPOC prediction
GRAZE detects the First Point of Contact (FPOC) frame in tackle video clips without any task-specific training:
- Grounding: GroundingDINO localizes the tackler and ball carrier via text prompts
- Segmentation: SAM2 produces per-frame masks for both players
- Temporal Search: Multi-prompt progressive search identifies candidate contact windows
- Motion-Aware Scoring: Optical flow and mask overlap signals rank candidate frames
- Backward Refinement: Temporal consistency check refines the FPOC prediction
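To make the last three stages concrete, here is a minimal sketch of a forward search followed by backward refinement over per-frame mask overlap. It is not the code in `segment_tacklesV3.py`: it uses mask overlap only (no optical flow), and the names `overlap_score`, `first_contact_frame`, `box`, and the `threshold` parameter are hypothetical, chosen for illustration.

```python
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int]   # (row, col)
Mask = Set[Pixel]         # foreground pixels of one player in one frame


def overlap_score(a: Mask, b: Mask) -> float:
    """Intersection-over-union of two masks; 0.0 when both are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def first_contact_frame(
    tackler: List[Mask], carrier: List[Mask], threshold: float = 0.05
) -> Optional[int]:
    """Hypothetical FPOC estimate from mask overlap alone."""
    scores = [overlap_score(t, c) for t, c in zip(tackler, carrier)]
    # Forward search: first frame whose overlap clearly exceeds the threshold.
    candidate = next((i for i, s in enumerate(scores) if s >= threshold), None)
    if candidate is None:
        return None
    # Backward refinement: step back while the previous frame still overlaps.
    while candidate > 0 and scores[candidate - 1] > 0.0:
        candidate -= 1
    return candidate


def box(top: int, left: int, height: int, width: int) -> Mask:
    """Rectangular toy mask standing in for a SAM2 mask."""
    return {(r, c) for r in range(top, top + height)
            for c in range(left, left + width)}


# Toy clip: the tackler slides right toward a stationary ball carrier.
carrier = [box(0, 10, 4, 4)] * 5
tackler = [box(0, left, 4, 4) for left in (0, 3, 6, 7, 9)]
print(first_contact_frame(tackler, carrier, threshold=0.5))  # prints 3
```

The strict threshold first lands on frame 4, where the overlap is large; the backward pass then moves the estimate to frame 3, the earliest frame in which the two masks touch at all.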
GRAZE/
├── segment_tacklesV3.py # ★ GRAZE pipeline (full method, matches paper results)
├── CombineXls.py # Aggregates per-video predictions to Excel
├── verify_setup.py # Checks weights and environment
├── setupenv.sh # Conda environment setup
├── submit.sh # SLURM single-job submission
├── run_array.sh # SLURM array job (one job per video)
├── configs/ # SAM2 model configuration files
│ └── sam2.1/
└── requirements.txt
git clone https://github.com/kddresearch/TackleStudy_CVSports_GRAZE.git
cd TackleStudy_CVSports_GRAZE
bash setupenv.sh
# OR manually:
conda create -n graze python=3.10 -y
conda activate graze
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
pip install git+https://github.com/IDEA-Research/GroundingDINO.git
pip install git+https://github.com/facebookresearch/segment-anything-2.git
mkdir -p weights
# GroundingDINO
wget -P weights/ https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget -P weights/ https://raw.githubusercontent.com/IDEA-Research/GroundingDINO/main/groundingdino/config/GroundingDINO_SwinT_OGC.py
# SAM2 Large
wget -P weights/ https://dl.fbaipublicfiles.com/segment_anything_2/sam2_hiera_large.pt
python verify_setup.py
conda activate graze
python segment_tacklesV3.py \
--video_path /path/to/clip.mp4 \
--output_dir ./Results \
--weights_dir ./weights
Set your paths as environment variables before submitting; do not hardcode them in the scripts:
export VIDEO_DIR="/your/hpc/path/to/videos"
export OUTPUT_DIR="/your/hpc/path/to/Results"
export GRAZE_DIR="/your/hpc/path/to/TackleStudy_CVSports_GRAZE"
export CONDA_ENV="graze"
bash submit.sh
This queues 4 parallel array jobs (one per video batch) plus an automatic merge job that runs after all batches succeed.
For a single array submission without the merge step:
sbatch run_array.sh
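Because the SLURM scripts read their paths from the environment, a quick pre-flight check can catch an unset variable before a job is queued. The helper below is a hypothetical convenience, not part of the repository; it only assumes the four variable names listed above.

```python
import os
from typing import List, Mapping

# Variable names taken from the export lines in this README.
REQUIRED_VARS = ("VIDEO_DIR", "OUTPUT_DIR", "GRAZE_DIR", "CONDA_ENV")


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
print(missing_vars({"VIDEO_DIR": "/data/videos"}))
```

Calling `missing_vars()` with no argument inspects the current shell environment; an empty list means all four variables are set.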
FPOC detection examples: GRAZE correctly localizes the first point of contact frame across diverse tackle scenarios in practice footage
python CombineXls.py --results_dir ./Results --output graze_results.xlsx
The results in the paper are evaluated on TackleNet, 738 annotated American football practice tackle clips with FPOC ground truth labeled using the SATT biomechanical rubric. The dataset spans multiple teams and seasons of practice footage.
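For readers with their own annotated clips, one common way to compare predicted and annotated FPOC frames is the share of clips whose prediction falls within a frame tolerance. The sketch below assumes that metric for illustration; it is not necessarily the protocol used in the paper, and the function names, clip names, and frame numbers are made up.

```python
from typing import Dict


def frame_errors(pred: Dict[str, int], truth: Dict[str, int]) -> Dict[str, int]:
    """Signed error (predicted - annotated frame) for clips present in both."""
    return {clip: pred[clip] - truth[clip] for clip in truth if clip in pred}


def accuracy_within(errors: Dict[str, int], tolerance: int) -> float:
    """Fraction of clips whose absolute frame error is <= tolerance."""
    if not errors:
        return 0.0
    hits = sum(1 for e in errors.values() if abs(e) <= tolerance)
    return hits / len(errors)


# Made-up annotations and predictions for four clips.
truth = {"clip_a": 10, "clip_b": 20, "clip_c": 30, "clip_d": 40}
pred = {"clip_a": 10, "clip_b": 23, "clip_c": 28, "clip_d": 41}

errors = frame_errors(pred, truth)
print(accuracy_within(errors, tolerance=2))  # prints 0.75
```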
Raw video data cannot be released due to athlete privacy and institutional data agreements. Researchers interested in access may contact [email protected].
This repository is part of a multi-year research program at KSU on automated sports safety assessment using American football practice footage.
| Year | Component | Authors | Description | Venue |
|---|---|---|---|---|
| 2022 | Risky Tackle Detection (3D-CNN) | Nafi, Dietrich, Hsu | 3-stage pipeline: anomaly detection → object detection → 3D conv classification of safe/risky tackles | MLDM 2022 |
| 2023 | Instance Segmentation for Tackle Detection | Nafi, Rediger, Dietrich, Hsu | Relevant instance segmentation of player + dummy as pretext task to improve tackle classification | ICMLA 2023 |
| 2026 | Risky Tackle Classifier (ViViT) | Zaidi et al. | ViViT-based clip classification with focal loss and Taguchi L18 augmentation against SATT rubric | ICPR 2026 |
| 2026 | GRAZE (this repo) | Zaidi, Shamir, Hsu, Dietrich, Zaidi | Training-free zero-shot FPOC localization via GroundingDINO + SAM2 + motion-aware temporal refinement | CVSports @ CVPR 2026 |
| TBD | TackleNet | Zaidi et al. | Dataset & benchmark for spatiotemporal tackle analysis — 738 annotated clips with FPOC ground truth | In preparation |
If you use this work, please cite:
@misc{zaidi2026grazegroundedrefinementmotionaware,
title={GRAZE: Grounded Refinement and Motion-Aware Zero-Shot Event Localization},
author={Syed Ahsan Masud Zaidi and Lior Shamir and William Hsu and Scott Dietrich and Talha Zaidi},
year={2026},
eprint={2604.01383},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.01383},
}
MIT License. See LICENSE.
GroundingDINO and SAM2 are Apache 2.0 licensed by their respective authors.
Ahsan Zaidi — PhD Candidate, Computer Science, Kansas State University
[email protected] · AhsanZaidi12/GRAZE
Advised by Dr. Lior Shamir · Co-advised by Dr. William Hsu and Dr. Scott Dietrich · KDD Research Lab · Kansas State University