[WIP] Optimize task and job viewsets by zhiltsov-max · Pull Request #10446 · cvat-ai/cvat

zhiltsov-max · 2026-04-02T17:08:26Z

Motivation and context

Improved DB queries in Task and Job endpoints. The main focus was to remove extra requests from the non-CRUD endpoints in these endpoint groups (e.g. /metadata, /annotations). The CRUD endpoints were also slightly optimized.

How has this been tested?

Checklist

I submit my changes into the develop branch
I have created a changelog fragment
I have updated the documentation accordingly
I have added tests to cover my changes
I have linked related issues (see GitHub docs)

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.

zhiltsov-max · 2026-04-02T17:09:01Z

@archibald1418, you can check perf tests on this branch.

Copilot

Pull request overview

This PR focuses on reducing database query overhead in the Task and Job API endpoints by shifting heavier related-object loading away from base querysets (especially for list/count paths) and into page-scoped prefetching.

Changes:

Refactors TaskViewSet and JobViewSet querysets to avoid heavy joins/prefetches for non-CRUD endpoints and list COUNT queries.
Updates Task/Job metadata endpoints to reuse get_object() (permission-checked) and then populate relations via prefetch_related_objects, removing the extra .get(pk=...) query.
Adds page-scoped prefetching and lightweight bulk-loading in list serializers to prevent N+1 queries while keeping list COUNT queries cheaper.
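
As a framework-free illustration of the N+1 problem that page-scoped bulk-loading addresses, here is a minimal sketch. It is not CVAT or Django code: `FakeTable`, `serialize_page_naive` and `serialize_page_bulk` are hypothetical names, and the "queries" are only counted on an in-memory dictionary.

```python
from collections import Counter


class FakeTable:
    """Hypothetical in-memory stand-in for a DB table that counts queries."""

    def __init__(self, rows):
        self._rows = {row["id"]: row for row in rows}
        self.queries = Counter()

    def get(self, pk):
        # One query per call, like a lazy foreign-key access.
        self.queries["get"] += 1
        return self._rows[pk]

    def filter_in(self, pks):
        # One query for many ids, like an `id__in` filter.
        self.queries["filter_in"] += 1
        return [self._rows[pk] for pk in pks if pk in self._rows]


def serialize_page_naive(tasks, owners):
    # N+1 pattern: one lookup per task on the page.
    return [
        {"task": t["id"], "owner": owners.get(t["owner_id"])["name"]}
        for t in tasks
    ]


def serialize_page_bulk(tasks, owners):
    # Page-scoped bulk load: collect the ids used on this page only,
    # fetch them in a single query, then resolve from a dict.
    owner_ids = {t["owner_id"] for t in tasks}
    by_id = {o["id"]: o for o in owners.filter_in(owner_ids)}
    return [
        {"task": t["id"], "owner": by_id[t["owner_id"]]["name"]}
        for t in tasks
    ]
```

Because the bulk load runs only over the rows of the current page, the base queryset used for counting does not need to carry the related objects.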

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
`cvat/apps/engine/views.py`	Reworks viewset query construction and metadata endpoints to reduce redundant DB hits and limit heavy relation loading to where it’s needed.
`cvat/apps/engine/serializers.py`	Adds page-only prefetching/bulk-loading in list serializers to avoid N+1 queries without inflating list COUNT queries.


Copilot · 2026-04-06T13:18:02Z

cvat/apps/engine/views.py

-            'segment__task__project__annotation_guide',
-        )
+    queryset = Job.objects.select_related(
+        # prefetch data for permission checks


The inline comment says "prefetch data for permission checks", but this queryset only uses select_related on assignee/segment/task/project and does not fetch task.data. Please update the comment to reflect what is actually loaded (or include task.data if that was the intent).

Suggested change

-        # prefetch data for permission checks
+        # load related objects commonly used by permission checks

Copilot · 2026-04-06T13:18:02Z

cvat/apps/engine/serializers.py

+            )
+            storages = {
+                s.id: s for s in models.Storage.objects.filter(id__in=page_storage_ids)
+            }


page_storage_ids includes None when a task has no source/target storage, which causes an unnecessary DB query (id__in with NULL). Filter out null IDs and skip the Storage.objects.filter(...) query entirely when the set is empty.

Suggested change

-            )
-            storages = {
-                s.id: s for s in models.Storage.objects.filter(id__in=page_storage_ids)
-            }
+                if v is not None
+            )
+            storages = (
+                {
+                    s.id: s for s in models.Storage.objects.filter(id__in=page_storage_ids)
+                }
+                if page_storage_ids else {}
+            )
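
The idea in the comment above (drop null IDs, and skip the lookup when nothing is left) can be sketched independently of Django. This is a minimal sketch under assumptions: `load_storages` and `fetch_by_ids` are hypothetical names, `fetch_by_ids` stands in for a call like `models.Storage.objects.filter(id__in=...)`, and storages are modelled as plain dicts rather than model instances.

```python
def load_storages(storage_ids, fetch_by_ids):
    """Return {id: storage}, querying only when there is something to fetch.

    `fetch_by_ids` is a hypothetical callable standing in for a bulk
    `id__in` lookup; it receives a set of non-null ids.
    """
    # Tasks without a source/target storage contribute None; drop those.
    ids = {sid for sid in storage_ids if sid is not None}
    if not ids:
        # Nothing to load: avoid issuing a query at all.
        return {}
    return {s["id"]: s for s in fetch_by_ids(ids)}


# Usage with a stub that records how often it is called.
calls = []


def fake_fetch(ids):
    calls.append(set(ids))
    return [{"id": i, "location": "local"} for i in sorted(ids)]


empty = load_storages([None, None], fake_fetch)   # no query issued
loaded = load_storages([3, None, 5, 3], fake_fetch)  # one query for {3, 5}
```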

Copilot · 2026-04-06T13:18:03Z

cvat/apps/engine/views.py

+        def prefetch():
+            data_queryset = (
+                models.Data.objects
+                .select_related("validation_layout", "video")
+                .prefetch_related(
+                    Prefetch(
+                        'images',
+                        queryset=(
+                            models.Image.objects
+                            .prefetch_related('related_files')
+                            .order_by('frame')
+                        )
+                    )
+                )
+            )
+
+            prefetch_related_objects(
+                [db_task],
+                "segment_set",
+                Prefetch("data", queryset=data_queryset)
+            )
+
+        prefetch()



The nested prefetch() function is defined and immediately invoked once. Inlining this logic would simplify control flow and make it clearer what queries are being executed for this endpoint.

Suggested change

-        def prefetch():
-            data_queryset = (
-                models.Data.objects
-                .select_related("validation_layout", "video")
-                .prefetch_related(
-                    Prefetch(
-                        'images',
-                        queryset=(
-                            models.Image.objects
-                            .prefetch_related('related_files')
-                            .order_by('frame')
-                        )
-                    )
-                )
-            )
-
-            prefetch_related_objects(
-                [db_task],
-                "segment_set",
-                Prefetch("data", queryset=data_queryset)
-            )
-
-        prefetch()
+        data_queryset = (
+            models.Data.objects
+            .select_related("validation_layout", "video")
+            .prefetch_related(
+                Prefetch(
+                    'images',
+                    queryset=(
+                        models.Image.objects
+                        .prefetch_related('related_files')
+                        .order_by('frame')
+                    )
+                )
+            )
+        )
+
+        prefetch_related_objects(
+            [db_task],
+            "segment_set",
+            Prefetch("data", queryset=data_queryset)
+        )

Copilot · 2026-04-06T13:18:03Z

cvat/apps/engine/views.py

@@ -1967,7 +1981,13 @@ def metadata(self, request: ExtendedRequest, pk: int):
                    )
                )
            )
-        ).get(pk=pk)
+
+            prefetch_related_objects(
+                [db_job],
+                Prefetch("segment__task__data", queryset=data_queryset)
+            )
+
+        prefetch()


Same as above: the nested prefetch() helper is only called once. Consider inlining to reduce indirection and keep the metadata endpoint logic easier to follow.

archibald1418 · 2026-04-08T11:30:29Z

perf testing:

zhiltsov-max added 2 commits April 1, 2026 15:09

Optimize prefetch in Task and Job viewsets

7eb385d

Add missing list prefetches

b24ed3a

zhiltsov-max requested a review from SpecLad as a code owner April 2, 2026 17:08

archibald1418 requested a review from Copilot April 6, 2026 13:13

Copilot started reviewing on behalf of archibald1418 April 6, 2026 13:14

zhiltsov-max wants to merge 2 commits into develop from zm/optimize-viewsets2
