I need an explanation #6
Open
w2507154692
opened on Apr 29, 2026
- A detection model typically outputs box coordinates, class labels, and confidence scores. I would like to understand how the image content inside the predicted bounding boxes is passed to the encoder through differentiable operations so that the post-regression (post-reg) loss can be computed. The current implementation runs Non-Maximum Suppression (NMS) and then crops the image directly at the predicted coordinates. As far as I know, direct cropping is not differentiable with respect to the box coordinates. How is the gradient backpropagated to the detection head in this case?
- Could you explain why lreg is left out when the final loss is calculated in the YOLOv5 loss.py file? It appears that lreg is not used for optimization at all. Furthermore, even if it were included in the optimization objective, it seems it could not backpropagate gradients to the detection head, because the cropping step mentioned in the first point is non-differentiable. What is the rationale for this?
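To make the concern in the first point concrete, here is a minimal sketch in plain Python. It is not taken from the repository: the synthetic image, the helper names (`hard_crop_mean`, `soft_crop_mean`, `finite_diff`), and the box values are all hypothetical. It compares a crop that rounds the box to integer pixel indices with a crop that samples the box at fractional positions via bilinear interpolation, and estimates the gradient of a simple scalar (the mean crop intensity) with respect to the left box edge using finite differences.

```python
import math

def make_image(h=16, w=16):
    # Hypothetical synthetic image whose intensity varies smoothly with position.
    return [[math.sin(0.3 * x) + 0.1 * y for x in range(w)] for y in range(h)]

def hard_crop_mean(img, x1, y1, x2, y2):
    # Integer cropping: the box is rounded to pixel indices before slicing,
    # so the result is piecewise constant in the box coordinates.
    xs = range(int(round(x1)), int(round(x2)))
    ys = range(int(round(y1)), int(round(y2)))
    vals = [img[y][x] for y in ys for x in xs]
    return sum(vals) / len(vals)

def bilinear(img, x, y):
    # Bilinear interpolation at a fractional (x, y), clamped to the image.
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bot = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bot

def soft_crop_mean(img, x1, y1, x2, y2, n=4):
    # Sampling-based crop: an n x n grid of fractional points inside the box,
    # so the result changes continuously as the box moves.
    total = 0.0
    for i in range(n):
        for j in range(n):
            sx = x1 + (j + 0.5) * (x2 - x1) / n
            sy = y1 + (i + 0.5) * (y2 - y1) / n
            total += bilinear(img, sx, sy)
    return total / (n * n)

def finite_diff(fn, img, box, idx, eps=1e-3):
    # Central finite-difference estimate of d fn / d box[idx].
    hi, lo = list(box), list(box)
    hi[idx] += eps
    lo[idx] -= eps
    return (fn(img, *hi) - fn(img, *lo)) / (2 * eps)

img = make_image()
box = (3.2, 2.2, 9.7, 8.6)  # hypothetical predicted box (x1, y1, x2, y2)
hard_grad = finite_diff(hard_crop_mean, img, box, 0)
soft_grad = finite_diff(soft_crop_mean, img, box, 0)
print("hard crop d/dx1:", hard_grad)
print("soft crop d/dx1:", soft_grad)
```

With the rounded crop, a small shift of the box edge selects the same pixels, so the estimated gradient is zero almost everywhere and nothing useful can reach the coordinates. With the sampled crop, the estimate is nonzero. In PyTorch, `torch.nn.functional.grid_sample` is one commonly used operation of this second kind, since it is differentiable with respect to the sampling grid. Whether the code in this repository uses anything like it is exactly what the question asks, and this sketch does not answer that.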