Optimize vid_vec_rep_clip operator for long videos and add profiling … by Ishaswm · Pull Request #624 · tattle-made/feluda

Ishaswm · 2025-05-02T11:09:46Z

Optimization Notes for `vid_vec_rep_clip`

Problem Statement

The current implementation of the vid_vec_rep_clip operator lacks support for processing longer videos efficiently and reliably. Specifically, we wanted to investigate:

Can the operator process longer videos (1min to 1hr) without breaking or exhausting system resources?
Is the model itself a bottleneck, or is the limitation due to code inefficiencies?
How does the operator perform in terms of CPU and memory usage for large video inputs?

Goals

Determine if the operator can process videos of varying lengths (1, 5, 10, 20, 30, 45, 60 mins).
Profile memory and CPU usage during execution.
Fix inefficiencies (if any) in the original implementation.
Ensure output vector correctness post-refactor.

Findings from Original Implementation

Inconsistencies: Memory usage for the 60-min video is unexpectedly lower than for the 30-min video, suggesting inefficient memory handling or potential leaks in intermediate steps.

Results After Refactor

Note: Longer videos show memory increase due to more efficient baseline measurement.

Performance Comparison

Memory-Optimized vs Original Implementation

Memory Optimization Highlights:

81% reduction for 30-minute videos (1917MB → 365MB)
73.7% savings for 10-minute videos (1858MB → 488MB)
More stable memory profile across all durations.
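
As a quick check on the percentages quoted above, the reduction can be recomputed from the before and after figures. This is only an arithmetic sketch; `percent_reduction` is a hypothetical helper and not part of the operator.

```python
def percent_reduction(before_mb: float, after_mb: float) -> float:
    """Return the relative drop from before_mb to after_mb, in percent."""
    return (before_mb - after_mb) / before_mb * 100.0


# Figures quoted in the highlights above, in MB.
print(round(percent_reduction(1917, 365), 1))  # 30-minute video -> 81.0
print(round(percent_reduction(1858, 488), 1))  # 10-minute video -> 73.7
```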

Processing Tradeoffs:

37-121% longer processing for videos ≤30 minutes.
14-22% faster for very long videos (>45 minutes)
More accurate performance measurements

Key Improvements

Efficient I-Frame Sampling

Switched to extracting only I-frames using ffmpeg, reducing unnecessary frame processing and improving memory efficiency.
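
A minimal sketch of what I-frame extraction through ffmpeg could look like, assuming the frames are written to disk as JPEG files. The function names and the `frame_%05d.jpg` naming scheme are hypothetical and are not taken from the operator. The `select` filter and `-vsync vfr` are standard ffmpeg options, although newer ffmpeg releases prefer `-fps_mode vfr`.

```python
import subprocess
from pathlib import Path


def build_iframe_command(video_path: str, out_dir: str) -> list[str]:
    """Build an ffmpeg command that writes only I-frames as JPEG files."""
    pattern = str(Path(out_dir) / "frame_%05d.jpg")  # hypothetical naming scheme
    return [
        "ffmpeg",
        "-i", video_path,
        # Keep only frames whose picture type is I (intra-coded).
        "-vf", "select='eq(pict_type,I)'",
        # Emit frames as they are selected instead of padding to a constant rate.
        "-vsync", "vfr",
        pattern,
    ]


def extract_iframes(video_path: str, out_dir: str) -> list[Path]:
    """Run ffmpeg and return the extracted frame paths in order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError on a non-zero exit code;
    # capture_output=True keeps stderr available for error reporting.
    subprocess.run(
        build_iframe_command(video_path, out_dir),
        check=True,
        capture_output=True,
    )
    return sorted(Path(out_dir).glob("frame_*.jpg"))
```

Passing the arguments as a list avoids shell quoting issues, which keeps the call portable across platforms.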

Built-in Memory Profiling:

Integrated psutil and tracemalloc to monitor memory usage before and after processing.
Reports net memory change, helping diagnose scaling issues.
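
The measurement described above might be wrapped roughly as follows. This is a sketch, not the PR's actual test code: `profile_call` and `rss_mb` are hypothetical names, and psutil is treated as optional so the snippet still runs when it is not installed.

```python
import os
import time
import tracemalloc

try:
    import psutil  # third-party; used for process-level (RSS) memory
except ImportError:
    psutil = None


def rss_mb():
    """Resident set size of this process in MB, or None without psutil."""
    if psutil is None:
        return None
    return psutil.Process(os.getpid()).memory_info().rss / (1024 * 1024)


def profile_call(func, *args, **kwargs):
    """Run func and return (result, stats) with memory and CPU figures."""
    rss_before = rss_mb()
    cpu_before = time.process_time()
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # Peak size of Python allocations traced since start().
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    rss_after = rss_mb()
    stats = {
        "cpu_seconds": time.process_time() - cpu_before,
        "peak_traced_mb": peak_bytes / (1024 * 1024),
        "rss_before_mb": rss_before,
        "rss_after_mb": rss_after,
        "net_rss_mb": None if rss_before is None else rss_after - rss_before,
    }
    return result, stats
```

Note that tracemalloc only sees allocations made through Python's allocator, so memory used by ffmpeg subprocesses or native libraries shows up in the RSS figures, if at all, rather than in the traced peak.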

Scalable to Long Videos:

Successfully tested on videos up to 1 hour, showing stable memory growth.

Enhanced Test Coverage:

Includes test cases for:
- Local long videos (e.g., 1 hr)
- Sample short videos
- Remote video URLs

Summary of Changes

This PR introduces the following enhancements to the vid_vec_rep_clip operator:

I-Frame Sampling Strategy:

Instead of decoding every frame or relying on precomputed metadata, the updated operator uses ffmpeg to extract only I-frames for vector representation. This reduces redundancy and improves scalability.

Streaming Feature Extraction:

Frames are now loaded and processed in a streaming manner (one at a time) using temporary storage, preventing memory bloat.

Detailed Profiling Added to Tests:

The unittest suite has been enhanced to capture:
- Memory usage before/after processing
- Net memory consumption
- CPU time and usage
- Peak memory (from tracemalloc)
- Total I-frames and vectors generated

Average Vector Addition:

The final output includes a mean vector of all I-frame features, maintaining consistency with prior behavior.
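
The streaming extraction and the trailing mean vector described above could be combined along these lines. This is a sketch under stated assumptions: `embed` stands in for whatever produces a feature vector from a frame, and the `vector` / `is_avg` keys are hypothetical output fields, not confirmed from the operator's code.

```python
from typing import Callable, Iterable, Iterator, List


def stream_vectors(
    frame_paths: Iterable[str],
    embed: Callable[[str], List[float]],
) -> Iterator[List[float]]:
    """Yield one feature vector per frame, holding a single frame at a time."""
    for path in frame_paths:
        yield embed(path)


def vectors_with_mean(vectors: Iterable[List[float]]) -> Iterator[dict]:
    """Re-yield each vector, then a final mean vector computed incrementally."""
    mean: List[float] = []
    count = 0
    for vec in vectors:
        count += 1
        if not mean:
            mean = [0.0] * len(vec)
        # Incremental mean: m_n = m_{n-1} + (x_n - m_{n-1}) / n
        mean = [m + (x - m) / count for m, x in zip(mean, vec)]
        yield {"vector": vec, "is_avg": False}
    if count:
        yield {"vector": mean, "is_avg": True}
```

Because the mean is updated per frame, the full set of vectors never has to be held in memory just to compute the average.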

Limitations

I-Frame Distribution: The number of I-frames is determined by video encoding, so a shorter video could occasionally have more I-frames than a longer one. This is expected and valid behavior.
Processing Time: The new implementation may take slightly longer for short videos due to I-frame extraction overhead, but this tradeoff is acceptable given the lower memory usage and improved scalability.
No Parallelism Yet: Current implementation processes frames sequentially. There’s room for future speedup via batching or multithreading.
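
One possible shape for the batching mentioned above is to group the frame stream into fixed-size chunks before feature extraction. This is only a sketch of the idea; `batched` is a hypothetical helper and nothing in this PR implements it.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a stream into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each batch could then be passed to the model in one forward pass, trading a bounded amount of extra memory for fewer model calls.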

Checklist

✅ Code handles long videos (1 min to 1 hour)
✅ Memory and CPU profiling included
✅ Documented tradeoffs (time vs. memory)
✅ Old and new results clearly documented
✅ Known limitations acknowledged

aatmanvaidya · 2025-05-03T20:13:24Z

hello @Ishaswm thank you for the fix, there are still some changes needed to be made
please give me some time, I will get back with more detailed feedback soon

aatmanvaidya · 2025-05-04T18:38:33Z

hi @Ishaswm I have left some more comments above

since all the code changes you have made are related to benchmarking, I think you should do the following

create a folder called "benchmark" at root
create a folder for the operator here and all your profiling code
don't add the operator file to the benchmark folder, instead call the run() function from the operator in the profiling code in the benchmark folder

…benchmarking
- Replaced shell-based ffmpeg call with cross-platform subprocess.run() for robust I-frame extraction. Improved error handling with check=True and stderr capture.
- Added comprehensive benchmark documentation with performance stats table.

…add benchmarking docs
- Replaced shell-based ffmpeg call with subprocess.run() for robust I-frame extraction.
- Improved error handling with check=True and stderr capture.
- Added benchmark README with performance stats table.

Ishaswm · 2025-05-11T02:52:16Z

@aatmanvaidya I would love to join the Tattle Slack to stay in the loop. Could you please send me an invite to your workspace at my email address: ishaswami52003@gmail.com

dennyabrain · 2025-05-11T03:46:28Z

Hi @Ishaswm, have sent you an invite.

aatmanvaidya · 2025-05-18T12:44:41Z

+import sys
+sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '../../..')))


this is a Windows-specific change, right? If yes, let's remove it

aatmanvaidya · 2025-05-18T12:46:03Z

can you move this file to the benchmark folder?

aatmanvaidya · 2025-05-18T12:47:42Z

hi @Ishaswm have left some minor comments above

omkar-334 · 2025-05-18T14:00:28Z

Tests are failing here because frame_sample_rate is not defined in the class's __init__() but is used later on.

Optimize vid_vec_rep_clip operator for long videos and add profiling …

5dbde69

…results

aatmanvaidya self-requested a review May 2, 2025 11:10

aatmanvaidya reviewed May 2, 2025

Comment thread operators/vid_vec_rep_clip/vid_vec_rep_clip.py Outdated

refactor(vid_vec_rep_clip): remove inline benchmarking from analyze()

393a1de

aatmanvaidya reviewed May 4, 2025

Comment thread operators/vid_vec_rep_clip/vid_vec_rep_clip.py Outdated

aatmanvaidya reviewed May 4, 2025

Comment thread operators/vid_vec_rep_clip/vid_vec_rep_clip.py Outdated

aatmanvaidya reviewed May 4, 2025

Comment thread operators/vid_vec_rep_clip/vid_vec_rep_clip.py Outdated

aatmanvaidya reviewed May 4, 2025

Comment thread operators/vid_vec_rep_clip/test.py

feat: add benchmarking system

7d91265

- Create dedicated benchmark module for profiling
- Move performance tests to operators/benchmark/
- Keep operator code focused on core functionality

aatmanvaidya reviewed May 8, 2025

Comment thread operators/benchmark/__init__.py Outdated

aatmanvaidya reviewed May 8, 2025

Comment thread operators/vid_vec_rep_clip/1.png Outdated

Isha Swami added 2 commits May 11, 2025 07:13

aatmanvaidya reviewed May 17, 2025

Comment thread operators/vid_vec_rep_clip/1.png Outdated

aatmanvaidya reviewed May 17, 2025

Comment thread operators/vid_vec_rep_clip/OPTIMIZATION_NOTES.md Outdated

aatmanvaidya reviewed May 17, 2025

Comment thread operators/vid_vec_rep_clip/TECHNICAL_ANALYSIS.md

Isha Swami added 2 commits May 18, 2025 07:00

Remove markdown file from the operator

8e31b19

Remove unused frame_sample_rate argument from VideoAnalyzer

ddffd0b

aatmanvaidya reviewed May 18, 2025

Comment thread operators/vid_vec_rep_clip/test.py Outdated

aatmanvaidya reviewed May 18, 2025

Comment thread operators/vid_vec_rep_clip/1.png Outdated

aatmanvaidya reviewed May 18, 2025

Comment thread operators/vid_vec_rep_clip/TECHNICAL_ANALYSIS.md

aatmanvaidya reviewed May 18, 2025

Comment thread operators/vid_vec_rep_clip/vid_vec_rep_clip.py Outdated

Isha Swami and others added 4 commits May 19, 2025 06:16

Clean up: remove unnecessary files, update vid_vec_rep_clip logic and…

291957b

… test.py file

chore: remove operator changes

69fd6be

refactor: readme file

da650a7

chore: fix lint issues

641ab0b

aatmanvaidya changed the base branch from main to development May 19, 2025 06:31

aatmanvaidya merged commit 03df960 into tattle-made:development May 19, 2025
3 of 4 checks passed

aatmanvaidya linked an issue May 19, 2025 that may be closed by this pull request

Video Operator should process video of any length and size #323

Closed
