Impact Prediction System - Team Presentation Plan

Team Size: 5 people
Presentation Time: ~7 minutes per person (35 minutes total)
Language: Chinese + English versions


Team Division Strategy

Person 1: Project Overview & Data Processing (7 mins)

Focus: Introduction, Dataset, Feature Engineering

Responsibilities:

  • Project background and objectives
  • Dataset overview (1.2M papers, 836K authors)
  • Data processing pipeline
  • Feature engineering (63 paper features, 44 author features)
  • Data quality and validation

Key Deliverables:

  • Explain the scientific impact prediction problem
  • Describe data sources (Semantic Scholar dataset)
  • Show feature extraction process
  • Demonstrate data statistics
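
One way to make the feature extraction step concrete is a single worked feature. This plan names the author h-index as a prediction target, so the sketch below computes an h-index from a list of per-paper citation counts. Whether h-index is also one of the 44 author features is an assumption, and the function name `h_index` is hypothetical rather than taken from the project code.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers each have at least h citations."""
    # Sort descending so the paper at rank r is the r-th most cited.
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank
        else:
            # Later papers have even fewer citations, so h cannot grow further.
            break
    return h


if __name__ == "__main__":
    # Illustrative citation counts only, not taken from the dataset.
    print(h_index([10, 8, 5, 4, 3]))
```

The same pattern (a small pure function per feature) keeps each feature easy to show on a slide and easy to unit test.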

Person 2: Baseline Models & Evaluation Metrics (7 mins)

Focus: Fundamental Models, Metrics Design

Responsibilities:

  • Evaluation metrics (MAPE, R², MSE, MAE, RMSE, PA-R²)
  • 4 Baseline models:
    • Constant (F)
    • PlusK (PK)
    • Variable K
    • Simple Linear (SM)
  • Model comparison framework
  • Performance analysis
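
To support the metrics discussion, here is a minimal sketch of the standard definitions of MSE, RMSE, MAE, MAPE, and R² in plain Python. PA-R² is left out because this plan does not define it. The function name `regression_metrics` and the choice to skip zero-valued targets when computing MAPE are assumptions, not the project's confirmed behavior.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE (as a fraction) and R2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined when the true value is 0; skipping those items is one
    # common convention (an assumption here, not the project's stated rule).
    pct = [abs(e) / abs(t) for e, t in zip(errors, y_true) if t != 0]
    mape = sum(pct) / len(pct) if pct else float("nan")

    # R2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape, "R2": r2}


if __name__ == "__main__":
    # Illustrative numbers only.
    print(regression_metrics([10, 20, 30, 40], [12, 18, 33, 36]))
```

MAPE weights errors relative to the true value while MSE weights them in absolute terms, which is one reason different models can win on different metrics.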

Key Deliverables:

  • Explain why each metric matters
  • Show baseline model implementations
  • Present comparative results
  • Establish performance baselines
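
For the baseline implementations, a sketch of two of the four baselines may help. It assumes that Constant (F) predicts no change from the last observed value and that PlusK (PK) adds one fixed yearly increment `k` estimated from training data. These semantics are assumptions inferred from the model names, and all function names are hypothetical. Variable K and Simple Linear (SM) are not covered.

```python
def constant_baseline(current_value, horizon_years):
    """Assumed Constant (F): the measure stays at its last observed value."""
    return float(current_value)


def fit_plus_k(current_values, future_values, horizon_years):
    """Estimate one per-year increment k as the mean observed yearly growth."""
    if horizon_years <= 0:
        raise ValueError("horizon_years must be positive")
    if len(current_values) != len(future_values) or len(current_values) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    growth = [(f - c) / horizon_years for c, f in zip(current_values, future_values)]
    return sum(growth) / len(growth)


def plus_k_baseline(current_value, horizon_years, k):
    """Assumed PlusK (PK): add the same increment k for every year ahead."""
    return current_value + k * horizon_years


if __name__ == "__main__":
    # Illustrative values only, not project data.
    k = fit_plus_k([10, 0], [20, 4], horizon_years=2)
    print(constant_baseline(8, 3), plus_k_baseline(8, 3, k))
```

Baselines this small are useful on a slide because the audience can see that any gain from the ML models has to beat a one-line rule.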

Person 3: Advanced ML Models & Model Evaluation (7 mins)

Focus: Machine Learning, Visualization, Analysis

Responsibilities:

  • Advanced ML models:
    • Lasso Regression (LAS)
    • Random Forest (RF)
    • Gradient Boosting (GBRT)
  • Model training and optimization
  • Comprehensive evaluation (ANALYSIS_REPORT.md)
  • Visualization system
  • Model selection recommendations
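
The training and comparison step for the three ML models could be sketched with scikit-learn as below. The use of scikit-learn, the hyperparameter values, the random hold-out split, and the synthetic data are all assumptions made for illustration. The project's own pipeline, features, and split (for example a time-based one) may differ.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def train_and_score(X, y, random_state=0):
    """Fit LAS, RF and GBRT on a hold-out split and return test-set R2 per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )
    # Placeholder hyperparameters, not the project's tuned values.
    models = {
        "LAS": Lasso(alpha=0.1),
        "RF": RandomForestRegressor(n_estimators=50, random_state=random_state),
        "GBRT": GradientBoostingRegressor(n_estimators=50, random_state=random_state),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data: 200 samples, 5 features, a noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(scale=0.1, size=200)
scores = train_and_score(X, y)

if __name__ == "__main__":
    for name, score in scores.items():
        print(f"{name}: R2 = {score:.3f}")
```

Keeping all models behind one dictionary makes it straightforward to add the baselines to the same comparison table.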

Key Deliverables:

  • Explain model architectures
  • Show training process and hyperparameters
  • Present evaluation results and visualizations
  • Provide model selection guide

Person 4: FastAPI Backend Development (7 mins)

Focus: REST API, Server Architecture

Responsibilities:

  • FastAPI server architecture
  • API endpoint design:
    • GET /healthz
    • GET /version
    • POST /predict/paper
    • POST /predict/author
  • Request/response validation (Pydantic)
  • Model loading and caching
  • API documentation (Swagger UI)
  • Error handling

Key Deliverables:

  • Show API architecture diagram
  • Demonstrate API endpoints
  • Explain validation and caching strategies
  • Live API documentation demo

Person 5: Web Frontend Development (7 mins)

Focus: User Interface, Visualization

Responsibilities:

  • Flask web application
  • Interactive prediction interfaces:
    • Homepage with status monitoring
    • Paper citation prediction UI
    • Author h-index prediction UI
  • Real-time visualization (Chart.js)
  • Export functionality (CSV/JSON)
  • Responsive design
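
The CSV/JSON export can be illustrated with a small server-side helper built on the Python standard library. This is a sketch under assumptions: the function name `export_predictions` and the row fields are hypothetical, and the real application may perform the export in the browser instead.

```python
import csv
import io
import json


def export_predictions(rows, fmt):
    """Serialize a list of flat dicts to a CSV or JSON string."""
    if fmt == "json":
        return json.dumps(rows, ensure_ascii=False, indent=2)
    if fmt == "csv":
        if not rows:
            return ""
        buffer = io.StringIO()
        # Column order follows the keys of the first row.
        writer = csv.DictWriter(
            buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    # Illustrative rows only.
    sample = [
        {"year": 2006, "predicted_citations": 14.5},
        {"year": 2007, "predicted_citations": 17.0},
    ]
    print(export_predictions(sample, "csv"))
```

In a Flask view, the returned string could be sent back with a `text/csv` or `application/json` content type so the browser offers it as a download.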

Key Deliverables:

  • Show UI/UX design
  • Demonstrate prediction workflow
  • Live visualization demo
  • Explain user interaction features

Presentation Flow

┌─────────────────────────────────────────────────────────────┐
│ Person 1: Introduction & Data (7 min)                       │
│ - Project overview                                          │
│ - Dataset & features                                        │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ Person 2: Baseline Models & Metrics (7 min)                 │
│ - Evaluation framework                                      │
│ - Baseline implementations                                  │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ Person 3: Advanced ML & Evaluation (7 min)                  │
│ - ML model architectures                                    │
│ - Training & analysis                                       │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ Person 4: FastAPI Backend (7 min)                           │
│ - API architecture                                          │
│ - Endpoint demonstrations                                   │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ Person 5: Web Frontend (7 min)                              │
│ - UI/UX design                                              │
│ - Live system demo                                          │
└─────────────────────────────────────────────────────────────┘

Key Statistics to Highlight

  • Dataset: 1,193,650 papers + 836,024 authors
  • Features: 63 paper features + 44 author features
  • Time Window: 1975-2005 training → 2006-2015 prediction
  • Models: 7 models (4 baseline + 3 ML)
  • Best Results (best model per metric):
    • Constant (F): MAPE 47%
    • Lasso (LAS): R² 0.89
    • Random Forest (RF): PA-R² 0.27
  • Code: ~12,000 lines (Python + JS + HTML/CSS)
  • Documentation: ~5,000 lines

Presentation Tips

  1. Person 1: Start strong with motivation, show compelling statistics
  2. Person 2: Focus on "why these metrics matter" for scientific predictions
  3. Person 3: Highlight surprising findings (e.g., the Constant (F) baseline posting the best MAPE, 47%)
  4. Person 4: Demo live API calls, show Swagger UI
  5. Person 5: End with impressive live demo of full system

Documents to Create

For each person:

  1. Detailed Report (Chinese + English)
    • Full technical content
    • Code snippets
    • Detailed explanations
  2. Presentation Slides Content (Chinese + English)
    • Bullet points
    • Key visuals
    • Speaker notes
  3. 7-Minute Script (Chinese + English)
    • Timed talking points
    • Transition phrases
    • Q&A preparation

Created: 2025-10-10
Project: Impact Prediction System
Version: 1.0