Feature Request: Add model selection and smart video trimming for talk videos #5

@DrKyro

Feature Request

1. Model ID Selection

Allow users to choose their preferred model ID instead of using hardcoded defaults.

Current behavior:

  • Qwen provider uses qwen3.5-flash by default
  • OpenRouter uses stepfun/step-3.5-flash:free by default

Desired behavior:

  • Add a --model or --model-id CLI argument
  • Allow users to specify any model ID the provider supports (e.g., qwen-max, gpt-4o)
  • In the Streamlit UI, add a dropdown to select the model

Example:

uv run python video_orchestrator.py --model-id qwen-max "VIDEO_URL"
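
A minimal sketch of how the argument could be wired up with argparse, falling back to the current defaults when no model is given. The --provider flag, the provider keys, and the helper names are assumptions for illustration; the real video_orchestrator.py may be structured differently.

```python
import argparse

# Current defaults as quoted in this issue. The dictionary keys are assumed names.
DEFAULT_MODELS = {
    "qwen": "qwen3.5-flash",
    "openrouter": "stepfun/step-3.5-flash:free",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="video_orchestrator.py")
    parser.add_argument("video_url", help="URL of the video to process")
    # Hypothetical flag: how the provider is actually chosen is not stated in the issue.
    parser.add_argument("--provider", choices=sorted(DEFAULT_MODELS), default="qwen")
    # Both spellings from the request map to the same destination.
    parser.add_argument(
        "--model", "--model-id",
        dest="model_id",
        default=None,
        help="model ID to use; defaults to the provider's built-in choice",
    )
    return parser


def resolve_model_id(args: argparse.Namespace) -> str:
    # An explicit --model-id wins; otherwise keep today's per-provider default.
    return args.model_id or DEFAULT_MODELS[args.provider]
```

Keeping the default as None and resolving it afterwards means existing invocations without --model-id behave exactly as they do today.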

2. Smart Video Trimming for Talk Videos (Smart Cut)

Automatically remove breath sounds (气口, the audible inhalations between phrases) and silent segments from talk videos to produce tighter edits.

Desired behavior:

  • Add a --smart-cut or --trim-silence flag
  • Detect silence segments (configurable threshold, e.g., -40dB)
  • Detect breath sounds between sentences
  • Automatically cut out these segments from the final clip
  • Optionally keep a small buffer (e.g., 100ms) at cut points for natural transitions
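
To illustrate the buffer idea, here is a rough sketch of turning detected silent spans into the spans to keep. The function name and signature are hypothetical, and it assumes the silent spans are already known as (start, end) pairs in seconds.

```python
def keep_segments(silences, total_duration, buffer_s=0.1):
    """Return the (start, end) spans to keep, in seconds.

    `silences` is a list of (start, end) silent spans. Each cut is shrunk by
    `buffer_s` on both sides so a little room tone survives at every cut point.
    """
    kept = []
    cursor = 0.0
    for start, end in sorted(silences):
        cut_start = start + buffer_s
        cut_end = end - buffer_s
        if cut_end <= cut_start:
            continue  # silence is shorter than two buffers, so nothing is cut
        if cut_start > cursor:
            kept.append((cursor, cut_start))
        cursor = max(cursor, cut_end)
    if cursor < total_duration:
        kept.append((cursor, total_duration))
    return kept
```

The kept spans could then be handed to whatever tool does the actual cutting and concatenation.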

Example:

uv run python video_orchestrator.py --smart-cut "VIDEO_URL"

Parameters to consider:

  • --silence-threshold: dB threshold for silence detection (default: -40dB)
  • --min-silence-duration: minimum silence duration to cut (default: 0.3s)
  • --breath-detection: enable breath sound detection
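
For reference, a minimal sketch of how the threshold and minimum-duration parameters could interact, using a simple per-frame RMS level check on mono samples normalized to [-1, 1]. The function name, the 10 ms frame size, and the input format are assumptions. Breath detection is not covered here: breaths usually sit above a -40 dB threshold, so they would likely need a separate detector.

```python
import math


def detect_silence(samples, sample_rate, threshold_db=-40.0,
                   min_duration=0.3, frame_ms=10):
    """Return (start_s, end_s) spans whose RMS level stays below threshold_db."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = math.ceil(len(samples) / frame_len)
    silences = []
    run_start = None  # start time of the current quiet run, if any

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        # Level in dB relative to full scale; digital silence maps to -inf.
        level_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
        t = i * frame_len / sample_rate

        if level_db < threshold_db:
            if run_start is None:
                run_start = t
        elif run_start is not None:
            # A loud frame ends the run; keep it only if it lasted long enough.
            if t - run_start >= min_duration:
                silences.append((run_start, t))
            run_start = None

    # Close a quiet run that extends to the end of the audio.
    if run_start is not None:
        end = len(samples) / sample_rate
        if end - run_start >= min_duration:
            silences.append((run_start, end))
    return silences
```

FFmpeg's silencedetect audio filter exposes a similar noise threshold and minimum duration, so it may be a simpler basis than a hand-rolled detector.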

Thanks for considering these features!
