Skip to content

Releases: HKUDS/DeepTutor

DeepTutor-v1.0.3

12 Apr 19:22
b815e4c

Choose a tag to compare

DeepTutor v1.0.3 Release Notes

Release Date: 2026.04.13

Highlights

Question Notebook — Unified Quiz Review System

Replaced the single-purpose "Wrong Answer Note" with a full Question Notebook that stores every quiz question (correct and incorrect) with rich metadata — question type, options, explanation, and difficulty. Each entry supports bookmarking and category tagging for organised review. A new dedicated /notebook page provides filtering (all / bookmarked / wrong), category management (create, rename, delete), and direct links back to the originating session. The QuizViewer component now integrates inline bookmark and category controls so users can organise questions without leaving the quiz flow.

Mermaid Diagram Support in Visualize

Extended the Visualize capability with a third render type — Mermaid. The analysis agent now chooses between svg, chartjs, and mermaid, preferring Mermaid for structured diagrams (flowcharts, sequence diagrams, class diagrams, mindmaps, etc.). The code generator and review agents received corresponding prompt updates, and the frontend VisualizationViewer renders Mermaid diagrams via a dedicated <Mermaid> component.

Embedding Model Mismatch Detection

Knowledge bases now record the embedding model and dimension used at index time. On load, the system compares stored fingerprints against the currently configured embedding model; mismatched KBs are flagged with embedding_mismatch and needs_reindex. The RAG search pipeline surfaces a warning when querying a mismatched KB, and the Knowledge page displays an alert badge so users know to re-index.

System Message Merging for Qwen / vLLM Compatibility

Consolidated multiple system messages into a single merged system message in both AgenticChatPipeline and ChatAgent, fixing compatibility with Qwen models served via vLLM that reject multi-system-message conversations. The context builder now filters duplicate system messages from stored history to avoid redundancy.

LM Studio & llama.cpp Provider Support

Added first-class ProviderSpec entries for LM Studio (localhost:1234) and llama.cpp (localhost:8080) with automatic base-URL detection and the openai_compat backend, plus embedding-provider alias mapping.

Glass Theme

Introduced a new Glass theme with frosted-glass card surfaces, gradient backgrounds, and glow-accent buttons. The theme switcher now cycles through light → dark → glass.

Deep Research Reporting Agent Resilience

Extracted a shared _call_llm_json helper with configurable retry logic in the reporting agent, replacing three identical inline LLM-call-then-parse blocks for introduction, section body, and conclusion generation.

Documentation Migration

Removed the legacy VitePress docs/ folder; documentation has been migrated to the project website.

Community Contributions

Full Changelog: v1.0.2...v1.0.3

DeepTutor-v1.0.2

11 Apr 08:25

Choose a tag to compare

DeepTutor v1.0.2 Release Notes

Release Date: 2026.04.11

Highlights

Search Consolidation Simplification & SearXNG Fallback

Removed the explicit consolidation_type parameter — consolidation now runs automatically for any provider that doesn't return its own answer. A new generic fallback formatter handles providers (e.g. SearXNG) that lack a dedicated Jinja2 template, fixing the "no template consolidation available" error. The CONSOLIDATION_TYPES constant and related config fields have been removed.

Provider Switch Fix

Settings page now always overwrites base_url when the user selects a different provider, instead of only filling it when the field was previously empty. This prevents stale base URLs from persisting across provider changes.

Explicit Runtime Config in Test Runner

ConfigTestRunner now builds LLM, Embedding, and Search configs directly from the resolved runtime catalog instead of relying on the global config cache, ensuring test runs always reflect the current active selection.

Frontend Resource Leak Fixes

  • Added AbortController cleanup across all Playground testers (ToolExecutor, DeepQuestionTester, DeepResearchTester, CapabilityTester) and the SaveToNotebookModal, preventing orphaned fetch requests on unmount or re-execution.
  • Introduced a MAX_CACHED_SESSIONS = 20 eviction policy in UnifiedChatContext to prevent unbounded session memory growth.
  • WebSocket runners and retry timers are now properly cleaned up on provider unmount.
  • Fixed auto-scroll throttle timer leak by returning a cleanup function from the throttle effect.

Docker Build Fix

  • Fixed setTimeout return type mismatch (number vs NodeJS.Timeout) in UnifiedChatContext that caused the frontend compilation to fail during Docker multi-platform builds.

What's Changed

New Contributors

Full Changelog: v1.0.1...v1.0.2

DeepTutor-v1.0.1

10 Apr 15:00

Choose a tag to compare

DeepTutor v1.0.1 Release Notes

Release Date: 2026.04.10

Highlights

Visualize Capability with Chart.js/SVG Rendering Pipeline

Added a new Visualize capability that turns natural-language data descriptions into interactive Chart.js or inline SVG visualizations. The backend runs a three-stage agent pipeline (analysis → code generation → review) with bilingual prompt support (en/zh). The frontend ships two new components — VisualizeConfigPanel for request configuration and VisualizationViewer for rendering — wired into the workspace home page and chat composer.

Explicit Reference Picker in Chat Composer

Added an explicit Reference dropdown button to the chat composer toolbar, sitting alongside the existing Tools dropdown. Users can now directly click the @ Reference button to attach Notebook records or Chat History sessions as context — no need to discover the hidden @ trigger in the input field. The dropdown shows per-category selection counts and a total badge, making it clear at a glance how much context is attached. The original @ keyboard shortcut remains functional as a power-user alternative.

Quiz Duplicate Prevention & Generation History

Fixed repeated quiz questions by introducing a dedicated previous_questions parameter through the Generator pipeline, cleanly separated from conversation history_context. A MAX_PREVIOUS_QUESTIONS=20 cap keeps prompt size bounded, and language labels are moved into YAML templates to avoid language mixing across locales.

o4-mini & Future o-Series Model Support

Extended the o-series regex in LLM config to recognize o4-mini and future o-series model identifiers, ensuring max_completion_tokens is set correctly instead of the unsupported max_tokens parameter (closes #274).

Server Logging Improvements

  • Suppressed noisy uvicorn WebSocket connection/disconnection logs that cluttered server output.
  • Added selective HTTP access logging middleware that only logs non-200 responses, reducing log noise while preserving actionable error visibility.
  • Added MiniMax model override (supports_response_format: false) for providers that do not support structured output.

What's Changed

  • fix:CoWriterEditor scroll sync by @Frant1cc in #175
  • Fix/i18n improvement by @Frant1cc in #176
  • Feature/llm hardening core slim (#52) by @scrrlt in #183
  • docs: update ru translate by @oshliaer in #184
  • feat: Add OpenRouter search provider by @infstellar in #194
  • Restrict code execution scope and enforce import whitelist by @RinZ27 in #196
  • feat/vision_slover by @kms9 in #191
  • Let's meet DeepTutor 1.0.0! by @pancacake in #238
  • update tag by @pancacake in #239
  • fix(deps): gate oauth-cli-kit to Python 3.11+ by @2023Anita in #251
  • fix(question-generator): support nested MinerU output in mimic mode by @2023Anita in #250
  • fix: add missing imports for mimic websocket router by @YizukiAme in #253
  • fix: invalidate runtime caches after settings changes by @YizukiAme in #254
  • fix(start-tour): tolerate non-UTF-8 subprocess output by @2023Anita in #259
  • fix: use parse_json_response for LLM outputs to handle markdown fences by @kagura-agent in #263
  • Fix Windows compatibility for Math Animator renderer by @kevinmw in #256
  • fix: Windows compatibility + Guided Learning improvements by @kevinmw in #266
  • docs: clarify github copilot provider login semantics by @LocNguyenSGU in #262
  • fix: use lowercased filename in mimetypes.guess_type() for consistent MIME validation by @kuishou68 in #272
  • fix: extend o-series regex to cover o4-mini and future o-series models by @kuishou68 in #275
  • Prevent duplicate quiz questions by removing duplicates and adding history by @Leadernelson in #281

New Contributors

DeepTutor-1.0.0-beta.4

09 Apr 16:57

Choose a tag to compare

DeepTutor v1.0.0-beta.4 Release Notes

Release Date: 2026.04.10

Highlights

Embedding Progress Tracking & Rate Limit Retry

Added real-time embedding progress reporting during knowledge base initialization — the UI now shows batch N/M complete as documents are embedded. HTTP 429 (Too Many Requests) responses are automatically retried with exponential back-off, and a configurable batch_delay parameter lets free-tier users throttle requests to stay within rate limits. Progress callbacks are properly cleaned up in finally blocks to prevent leaking into subsequent search calls.

Cross-Platform Start Tour Dependency Management

The onboarding start tour now auto-installs bootstrap dependencies (e.g. PyYAML) if missing, and supports system-dependency installation across macOS (Homebrew), Linux (apt/dnf/yum), and Windows (winget/Chocolatey) for Math Animator prerequisites like LaTeX, FFmpeg, Cairo, and CMake. The typer[all] dependency was also simplified to typer to avoid pulling unnecessary extras.

Case-Insensitive MIME Validation

Fixed a platform-dependent bug where files with uppercase extensions (e.g. report.PDF, data.JSON) bypassed MIME type validation on Linux. mimetypes.guess_type() now receives the lowercased filename, consistent with the extension whitelist check.

What's Changed

  • fix: use lowercased filename in mimetypes.guess_type() for consistent MIME validation by @kuishou68 in #272

Contributors

  • @oxkage — Embedding progress tracking and HTTP 429 rate limit retry (#268)
  • @kuishou68 — Case-insensitive MIME type validation fix (#272, closes #271)

Full Changelog: v1.0.0-beta.3...v1.0.0-beta.4

v1.0.0-beta.3

08 Apr 12:24

Choose a tag to compare

DeepTutor v1.0.0-beta.3 Release Notes

Release Date: 2026.04.08

Highlights

Remove LiteLLM Dependency

Replaced the litellm abstraction layer with native openai and anthropic SDKs across both the services and TutorBot layers. Added a new OpenAICompatProvider (covering OpenAI, DeepSeek, Mistral, StepFun, XiaoMi-MiMo, Qianfan, oVMS, and more) and a dedicated AnthropicProvider. The settings UI now includes a provider dropdown with auto base-URL filling. Auto-fallback to streaming is triggered when tool-call format errors occur (fixes #265).

Windows Math Animator Compatibility

Fixed SelectorEventLoop incompatibility on Windows by replacing asyncio.create_subprocess_exec with subprocess.Popen + reader threads + asyncio.Queue, preserving real-time line-by-line progress output. Also applied ProactorEventLoop policy for subprocess support on Windows.

Robust JSON Parsing for LLM Outputs

Seven agent modules (planner, idea, design, note, reporting, citation, data structures) now use parse_json_response() instead of raw json.loads(), correctly handling LLM responses wrapped in markdown code fences. A _UNSET sentinel was introduced for the fallback parameter so callers can explicitly request None as the failure value.

Guided Learning Fixes

  • Fixed KaTeX math rendering by configuring $...$ and $$...$$ delimiters, removing broken SRI integrity hashes, and adding parent-window fallback rendering for bare LaTeX text nodes.
  • Fixed backend poll (fetchPageStatuses) overwriting user's tab navigation by only accepting current_index when the user hasn't navigated yet.
  • Increased guide agent max_tokens from 8192 to 16384 to prevent HTML truncation.

Full Internationalization

Completed i18n coverage for the web UI — all hardcoded strings across workspace, utility, sidebar, and component pages are now translation-keyed with full English and Chinese locale files.

What's Changed

  • fix(start-tour): tolerate non-UTF-8 subprocess output by @2023Anita in #259
  • fix: use parse_json_response for LLM outputs to handle markdown fences by @kagura-agent in #263
  • Fix Windows compatibility for Math Animator renderer by @kevinmw in #256
  • fix: Windows compatibility + Guided Learning improvements by @kevinmw in #266
  • docs: clarify github copilot provider login semantics by @LocNguyenSGU in #262

New Contributors

Full Changelog: v1.0.0-beta.2...v1.0.0-beta.3

v1.0.0-beta.2

07 Apr 03:16

Choose a tag to compare

DeepTutor v1.0.0-beta.2 Release Notes

Release Date: 2026.04.07

Highlights

Hot Settings Reload

Model settings changes (API keys, model selection, endpoints) now take effect immediately — no server restart required. The runtime LLM, embedding, and config caches are automatically invalidated after saving via the Settings page or onboarding tour.

MinerU Nested Output Support

The question extractor now discovers parsed markdown in nested MinerU output directories (e.g. hybrid_auto/), fixing cases where MinerU successfully parsed a document but question generation still failed because the markdown was not found.

Mimic WebSocket Fix

Fixed a NameError crash on the /mimic WebSocket endpoint caused by missing sys and Path imports.

Python 3.11+ Minimum

Dropped Python 3.10 support. The minimum required version is now Python 3.11. CI matrix, pyproject.toml, and all documentation have been updated accordingly.

CI & Maintenance

  • Removed Dependabot automatic dependency update PRs
  • Streamlined CI test matrix to Python 3.11 / 3.12
  • Added regression tests for question extractor, mimic WebSocket router, and settings cache invalidation

What's Changed

New Contributors

Full Changelog: v1.0.0-beta.1...v1.0.0-beta.2

DeepTutor-1.0.0-beta.1

04 Apr 05:13
540400c

Choose a tag to compare

🚀 DeepTutor v1.0.0-beta1 Release Notes

Release Date: 2026.04.04

We're thrilled to announce DeepTutor v1.0.0-beta1 — the first beta of the brand new DeepTutor architecture. This is a ground-up rewrite that transforms DeepTutor from a monolithic RAG tutor into an agent-native learning platform with a two-layer plugin model (Tools + Capabilities), three unified entry points (CLI / WebSocket / Python SDK), and a completely rebuilt web application shell. Find more surprises by your own!

⚠️ Beta Notice: This is beta 1 of v1.0.0. The core architecture is stable, but some UI interactions and edge-case workflows may still contain bugs. We appreciate your patience and welcome bug reports via Issues.

📌 Knowledge Base Note: In this release, the RAG pipeline has been simplified to LlamaIndex only. LightRAG and RAG-Anything pipelines along with their related knowledge base content have been temporarily removed to focus on stability. They will be re-introduced in upcoming releases.

Tip

Call for Feedback: It is aware that some of the old features are not included in the newest version. If you have any helpful comments, encounter any bugs or have any feature requests, please open an issue! PRs are also welcome — see our Contributing Guide.

Diff Scope: main...dev (903 files changed, 92,701 insertions, 73,749 deletions)


Quick Summary

  • Architecture — Complete rewrite from src/ to deeptutor/ + deeptutor_cli/ with agent-native runtime (Tools + Capabilities).
  • Entry Points — Three unified entry points: standalone CLI (deeptutor), WebSocket API (/api/v1/ws), and Python SDK facade.
  • Capabilities — Five built-in capabilities: chat, deep_solve, deep_question, deep_research, math_animator.
  • Tools — Seven LLM-callable tools: rag, web_search, code_execution, reason, brainstorm, paper_search, geogebra_analysis.
  • Web App — Rebuilt Next.js app with workspace/utility route groups, new Playground, Co-Writer, Agents, and Guide pages.
  • TutorBot — Multi-channel bot agent supporting 12 messaging platforms.
  • Infra — SQLite-backed session persistence, turn runtime, provider-level LLM traffic control and telemetry.

✨ Highlights

🏗️ Agent-Native Runtime (Tools + Capabilities)

Introduced a two-layer plugin model that decouples tool execution from high-level agent workflows:

  • Core Contracts: ToolProtocol, CapabilityProtocol, UnifiedContext, StreamEvent, and StreamBus — the foundation of all runtime execution.
  • ChatOrchestrator: Central coordinator with two registries:
    • ToolRegistry — tool discovery, OpenAI-style schema export, and execution.
    • CapabilityRegistry — capability routing, manifest management, and stage-aware streaming.

🖥️ Unified Entry Points: CLI / WebSocket / Python SDK

Three entry points share a single ChatOrchestrator runtime:

Entry Point Description
CLI (deeptutor) Typer-based CLI with sub-commands: run, chat, bot, kb, memory, session, notebook, plugin, config, provider, serve
WebSocket (/api/v1/ws) Unified endpoint with turn lifecycle: start_turn, subscribe_turn, subscribe_session, resume_from, cancel_turn
Python SDK (deeptutor.app.facade) Programmatic facade for SDK-style integrations

🧠 Capability Layer

Five built-in capabilities, each a multi-step agent pipeline:

Capability Stages Description
chat responding Default tool-augmented conversation
deep_solve planning → reasoning → writing Multi-stage problem solving
deep_question ideation → evaluation → generation → validation Intelligent question generation with follow-up mode
deep_research search → analyze → synthesize → report Multi-agent research with report generation
math_animator analysis → design → codegen → review → render Manim-based math concept video generation

🔧 Tooling System

Seven unified LLM-callable tools with bilingual prompt hints (en/zh):

Tool Description
rag Knowledge base retrieval via LlamaIndex
web_search 10 search providers: Tavily, Exa, Jina, Serper, Perplexity, Brave, Baidu, SearXNG, DuckDuckGo, OpenRouter
code_execution Sandboxed Python execution with AST-based safety guards
reason Dedicated deep-reasoning LLM call
brainstorm Breadth-first idea exploration with structured rationale
paper_search arXiv academic paper search

🤖 TutorBot — Multi-Channel Bot Agent

New autonomous bot system (deeptutor/tutorbot/) that brings DeepTutor to messaging platforms:

  • 12 Channels: Telegram, Discord, Slack, WeChat Work (WeCom), Feishu, DingTalk, WhatsApp, Matrix, QQ, Email, MoChat
  • Agent Loop: Tool-augmented LLM loop with memory, subagent spawning, and team collaboration
  • Built-in Tools: Shell, filesystem, web, MCP, cron, and message tools
  • Background Services: Heartbeat health checks and cron-based scheduled tasks

🌐 Web Application Restructure

Complete rebuild of the Next.js frontend with new route groups:

Workspace Routes ((workspace)/):

Page Description
Home (/) Main chat interface with tool-augmented conversation
Guide (/guide) Interactive learning guide with session history, progress tracking, and completion summaries
Playground (/playground) Unified deep capability UI (deep_solve, deep_question, deep_research, math_animator)
Co-Writer (/co-writer) AI-assisted collaborative writing with edit and narrator agents
Agents (/agents) TutorBot management — create, configure, and chat with custom bots

Utility Routes ((utility)/):

Page Description
Knowledge (/knowledge) Knowledge base management with LlamaIndex pipeline
Memory (/memory) User memory and preference management
Settings (/settings) Unified configuration for LLM, Embedding, TTS, and Search services

🏭 Service Infrastructure Rebuild

Refactored services into clearer domains:

deeptutor/services/
├── config/       # Environment store, model catalog, provider runtime
├── llm/          # Multi-provider LLM: factory, registry, traffic control, telemetry
├── embedding/    # Adapter-based: OpenAI-compatible, Cohere, Jina, Ollama
├── rag/          # LlamaIndex pipeline with component-based architecture
├── search/       # 10 web search providers with result consolidation
├── session/      # SQLite store, turn runtime, context builder
├── memory/       # User memory persistence
├── notebook/     # Notebook management
├── prompt/       # Bilingual prompt template manager (en/zh)
├── settings/     # Interface settings
├── setup/        # Application initialization
├── tutorbot/     # TutorBot management
└── path_service  # Centralized data path resolution

🔒 Security & Stability

  • Code Execution Safety: AST-based import/call guards with configurable allowlists.
  • LLM Traffic Control: Provider-level circuit breaker, error rate tracking, and retry mechanisms.
  • Startup Validation: Capability-to-tool consistency checks at boot time.

🧪 Test Coverage

53+ new test files across all major layers: runtime (tool/capability registry, orchestrator), services (LLM provider/factory/routing/telemetry, RAG pipeline, embedding, search, session, memory, notebook, config), agents (chat, solve, question, math_animator), API (knowledge, memory, solve, WebSocket turn runtime), CLI, and tools (code executor safety).


⚠️ Breaking Changes

  • Package layout: src/deeptutor/ + deeptutor_cli/. Old src/ directory fully removed (140 files).
  • Package renamed: ai-tutordeeptutor, version 1.0.0.
  • Runtime model: Capability-native orchestration. chat is the default; deep modes selected explicitly via run command or WebSocket.
  • Web routes: All pages reorganized under (workspace)/ and (utility)/. Legacy pages (/solver, /question, /research, /ideagen, /notebook, /history) removed.
  • RAG pipeline: Only LlamaIndex available. LightRAG and RAG-Anything temporarily removed.
  • Data layout: Runtime data centered under data/user/workspace/....
  • Dependencies: Split into layered requirements: cli.txt, server.txt, dev.txt, math-animator.txt, tutorbot.txt.

📦 What's Changed

  • Complete codebase rewrite with agent-native architecture (DeepTutor 2.0).
  • Two-layer plugin model (Tools + Capabilities) with ChatOrchestrator coordinator.
  • Standalone CLI package (deeptutor_cli/) with 11 sub-commands via Typer.
  • Unified WebSocket endpoint with turn lifecycle and session streaming.
  • 5 built-in capabilities and 7 LLM-callable tools with bilingual prompt hints.
  • TutorBot multi-channel bot agent with 12 platform integrations.
  • Rebuilt web app with workspace/utility route groups and new Playground, Co-Writer, Agents, and Guide pages.
  • Service infrastructure rebuild: LLM provider registry, embedding adapters, SQLite session store, memory, notebook, and search consolidation.
  • AST-based code execution safety, LLM traffic control, and provider telemetry.
  • 53+ test files across runtime, services, agents, API, CLI, and tools.
  • Updated Docker configuration and layered dependency management.
Read more

ver0.6.0

22 Jan 17:48

Choose a tag to compare

DeepTutor v0.6.0 Release Notes

Release Date: 2026.01.23

Highlights

Frontend State Persistence

Implemented robust session persistence across the application:

  • Solver, Guide, and other sessions now persist across browser refreshes
  • Improved state management with dedicated persistence layer
  • Better user experience with session continuity

Incremental Document Upload

Enhanced knowledge base with incremental document processing:

  • Add new documents to existing knowledge bases without full re-indexing
  • Significant performance improvement for large document collections
  • Smarter document change detection

Flexible RAG Pipeline Import

Refactored RAG initialization for better compatibility:

  • On-demand loading of RAG libraries (RAG-Anything, LlamaIndex)
  • Reduced startup time and memory footprint
  • Graceful fallback when optional dependencies are unavailable

Full Chinese Localization (i18n)

Added complete Chinese language support for the web interface:

  • Comprehensive translation across all pages and components
  • Dynamic language switching without page reload
  • i18n audit tools for translation consistency

Bug Fixes & Improvements

  • Enhanced LLM retry mechanism for complex agent operations
  • Fixed temperature parameter handling issues
  • Docker build optimizations and npm compatibility fixes
  • Added api_version parameter for Azure OpenAI support

Full Changelog: v0.5.2...v0.6.0

ver0.5.2

18 Jan 07:49

Choose a tag to compare

DeepTutor v0.5.2 Release Notes

Release Date: 2026.01.18

Highlights

Docling Support for RAG-Anything

Added alternative RAG-Anything initialization using Docling as the document parser:

  • For users whose local environment is not suitable for MinerU
  • Provides a lightweight alternative for document processing
  • Same multimodal graph capabilities with different backend

Logging System Optimization

Refactored the logging system for better management:

  • Improved log output control across all modules
  • Better structured logging adapters
  • Enhanced console, file, and WebSocket handlers

Bug Fixes & Code Improvements

  • Optimized code structure across multiple modules
  • Fixed several bugs affecting user experience
  • Improved CI/CD workflows with Python 3.10/3.11 matrix testing

Full Changelog: v0.5.1...v0.5.2

ver0.5.1

18 Jan 04:28

Choose a tag to compare

DeepTutor v0.5.1 Release Notes

Release Date: 2026.01.18

Hey everyone! We just released v0.5.1!

Highlights

Docling Support for RAG-Anything

Added alternative RAG-Anything initialization using Docling as the document parser:

  • For users whose local environment is not suitable for MinerU
  • Provides a lightweight alternative for document processing
  • Same multimodal graph capabilities with different backend

Logging System Optimization

Refactored the logging system for better management:

  • Improved log output control across all modules
  • Better structured logging adapters
  • Enhanced console, file, and WebSocket handlers

Bug Fixes & Code Improvements

  • Optimized code structure across multiple modules
  • Fixed several bugs affecting user experience
  • Improved CI/CD workflows with Python 3.10/3.11 matrix testing

Full Changelog: v0.5.0...v0.5.1

What's Changed

New Contributors

Full Changelog: v0.5.0...v0.5.1