Team Size: 5 people
Presentation Time: ~7 minutes per person (35 minutes total)
Language: Chinese + English versions
Person 1: Introduction & Data
Focus: Introduction, Dataset, Feature Engineering
Responsibilities:
- Project background and objectives
- Dataset overview (1.2M papers, 836K authors)
- Data processing pipeline
- Feature engineering (63 paper features, 44 author features)
- Data quality and validation
Key Deliverables:
- Explain the scientific impact prediction problem
- Describe data sources (Semantic Scholar dataset)
- Show feature extraction process
- Demonstrate data statistics
Person 2: Baseline Models & Metrics
Focus: Fundamental Models, Metrics Design
Responsibilities:
- Evaluation metrics (MAPE, R², MSE, MAE, RMSE, PA-R²)
- 4 baseline models:
  - Constant (F)
  - PlusK (PK)
  - Variable K
  - Simple Linear (SM)
- Model comparison framework
- Performance analysis
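The point metrics listed above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the function names are hypothetical, skipping zero targets in MAPE is an assumption, and PA-R² is left out because this plan does not define it.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumption: targets equal to zero are skipped to avoid division by zero.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

MAPE is scale-free while MSE and RMSE are dominated by highly cited papers, which is one reason a model can rank well on one metric and poorly on another.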
Key Deliverables:
- Explain why each metric matters
- Show baseline model implementations
- Present comparative results
- Establish performance baselines
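A rough sketch of two of the baselines, for use when showing implementations. The definitions here are assumptions, since this plan does not spell them out: Constant is taken to carry the current value forward unchanged, and PlusK is taken to add a fixed increment k per year, with k fitted as the mean yearly increase on training data. All function names are hypothetical.

```python
def predict_constant(current, years_ahead):
    """Assumed Constant (F) baseline: the value does not change."""
    return current

def fit_plus_k(current_values, future_values, years_ahead):
    """Assumed PlusK (PK) fit: mean per-year increase over the training pairs."""
    deltas = [(f - c) / years_ahead for c, f in zip(current_values, future_values)]
    return sum(deltas) / len(deltas)

def predict_plus_k(current, years_ahead, k):
    """Assumed PlusK (PK) prediction: current value plus k per elapsed year."""
    return current + k * years_ahead

# Usage: fit k on observed (current, future) pairs, then predict a new item.
k = fit_plus_k([10, 20], [20, 40], years_ahead=5)
prediction = predict_plus_k(10, years_ahead=5, k=k)
```

Variable K and Simple Linear are not sketched here because their definitions cannot be inferred from this plan.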
Person 3: Advanced ML & Evaluation
Focus: Machine Learning, Visualization, Analysis
Responsibilities:
- Advanced ML models:
  - Lasso Regression (LAS)
  - Random Forest (RF)
  - Gradient Boosting (GBRT)
- Model training and optimization
- Comprehensive evaluation (ANALYSIS_REPORT.md)
- Visualization system
- Model selection recommendations
Key Deliverables:
- Explain model architectures
- Show training process and hyperparameters
- Present evaluation results and visualizations
- Provide model selection guide
Person 4: FastAPI Backend
Focus: REST API, Server Architecture
Responsibilities:
- FastAPI server architecture
- API endpoint design:
  - GET /healthz
  - GET /version
  - POST /predict/paper
  - POST /predict/author
- Request/response validation (Pydantic)
- Model loading and caching
- API documentation (Swagger UI)
- Error handling
Key Deliverables:
- Show API architecture diagram
- Demonstrate API endpoints
- Explain validation and caching strategies
- Live API documentation demo
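One way to illustrate the caching strategy is memoizing the model loader so each model is deserialized once per process. This is a sketch under assumptions: `load_model_from_disk`, `get_model`, and the model names are hypothetical stand-ins, and the real server may cache differently.

```python
from functools import lru_cache

LOAD_COUNT = {"n": 0}  # counts real loads, only to make the caching visible

def load_model_from_disk(name):
    """Hypothetical stand-in for an expensive deserialization step."""
    LOAD_COUNT["n"] += 1
    return {"name": name}

@lru_cache(maxsize=8)
def get_model(name):
    """Return the model for `name`, loading it only on the first request."""
    return load_model_from_disk(name)
```

Repeated calls such as `get_model("rf")` return the same object without touching disk again, so request handlers can call it on every prediction.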
Person 5: Web Frontend
Focus: User Interface, Visualization
Responsibilities:
- Flask web application
- Interactive prediction interfaces:
  - Homepage with status monitoring
  - Paper citation prediction UI
  - Author h-index prediction UI
- Real-time visualization (Chart.js)
- Export functionality (CSV/JSON)
- Responsive design
Key Deliverables:
- Show UI/UX design
- Demonstrate prediction workflow
- Live visualization demo
- Explain user interaction features
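The CSV/JSON export can be illustrated with the standard library. This sketch assumes the export is produced on the Python side and that prediction results are a list of flat dictionaries; the field names and both helper functions are hypothetical, and the actual export may run in the browser instead.

```python
import csv
import io
import json

def to_json(rows):
    """Serialize prediction rows to a JSON string."""
    return json.dumps(rows, ensure_ascii=False, indent=2)

def to_csv(rows):
    """Serialize prediction rows to CSV text, using the first row's keys as header."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```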
┌──────────────────────────────────────────────┐
│ Person 1: Introduction & Data (7 min)        │
│ - Project overview                           │
│ - Dataset & features                         │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ Person 2: Baseline Models & Metrics (7 min)  │
│ - Evaluation framework                       │
│ - Baseline implementations                   │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ Person 3: Advanced ML & Evaluation (7 min)   │
│ - ML model architectures                     │
│ - Training & analysis                        │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ Person 4: FastAPI Backend (7 min)            │
│ - API architecture                           │
│ - Endpoint demonstrations                    │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ Person 5: Web Frontend (7 min)               │
│ - UI/UX design                               │
│ - Live system demo                           │
└──────────────────────────────────────────────┘
- Dataset: 1,193,650 papers + 836,024 authors
- Features: 63 paper features + 44 author features
- Time Window: 1975-2005 training → 2006-2015 prediction
- Models: 7 models (4 baseline + 3 ML)
- Best Results:
  - Constant (F): MAPE 47%
  - Lasso (LAS): R² 0.89
  - Random Forest (RF): PA-R² 0.27
- Code: ~12,000 lines (Python + JS + HTML/CSS)
- Documentation: ~5,000 lines
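The time window above can be made concrete with a small sketch: citations observed through 2005 become the input signal, and citations received in 2006-2015 become the prediction target. The record layout (`year`, `citations_by_year`) and the function name are assumptions for illustration, not the project's actual schema.

```python
TRAIN_START, TRAIN_END = 1975, 2005
TARGET_START, TARGET_END = 2006, 2015

def split_by_window(paper):
    """Return (observed, future) citation counts, or None if out of window.

    `paper` is assumed to look like:
        {"year": 2000, "citations_by_year": {2001: 3, 2006: 4}}
    """
    if not (TRAIN_START <= paper["year"] <= TRAIN_END):
        return None  # published outside the training window
    by_year = paper["citations_by_year"]
    observed = sum(c for y, c in by_year.items() if TRAIN_START <= y <= TRAIN_END)
    future = sum(c for y, c in by_year.items() if TARGET_START <= y <= TARGET_END)
    return observed, future
```

Keeping the two windows disjoint is what prevents target leakage: no citation counted in the target can also appear among the inputs.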
- Person 1: Start strong with motivation, show compelling statistics
- Person 2: Focus on "why these metrics matter" for scientific predictions
- Person 3: Highlight surprising findings (e.g., Constant model's MAPE performance)
- Person 4: Demo live API calls, show Swagger UI
- Person 5: End with impressive live demo of full system
For each person:
- Detailed Report (Chinese + English)
  - Full technical content
  - Code snippets
  - Detailed explanations
- Presentation Slides Content (Chinese + English)
  - Bullet points
  - Key visuals
  - Speaker notes
- 7-Minute Script (Chinese + English)
  - Timed talking points
  - Transition phrases
  - Q&A preparation
Created: 2025-10-10
Project: Impact Prediction System
Version: 1.0