Commit 25cfb9a

Revise RLix description for clarity and detail
Updated the description of RLix to clarify its functionality and benefits in managing RL jobs and GPU utilization.
1 parent 9c6ce5c, commit 25cfb9a

1 file changed

Lines changed: 1 addition & 1 deletion

README.md

@@ -156,7 +156,7 @@ Leveraging a multi-role distributed architecture with Ray for flexible resource
 ---
 
 ## 🏆 Notable work based on ROLL
-- [RLix](https://github.com/rlops/rlix): RLix is an open-source control plane for concurrent reinforcement learning that time-shares GPUs across jobs to improve utilization without changing each pipeline’s on-policy or off-policy behavior.
+- [RLix](https://github.com/rlops/rlix): RLix is an RL job manager that lets more RL jobs run concurrently with less waiting by sharing GPU capacity across jobs, while preserving each pipeline’s training behavior and improving GPU utilization.
 - [TurningPoint-GRPO](https://arxiv.org/abs/2602.06422): A GRPO framework for Flow Matching models in text-to-image generation that alleviates step-wise reward sparsity by modeling step-level incremental rewards and explicitly captures long-term effects via turning points detection, providing dense learning signals for each denoising action.
 - [STAgent](https://arxiv.org/abs/2512.24957): An agentic LLM specialized for spatio-temporal understanding and complex tasks like constrained POI discovery and itinerary planning, featuring hierarchical data curation with 1:10,000 filter ratio and cascaded training (seed SFT + difficulty-aware SFT + RL), achieving strong performance on TravelBench while preserving general capabilities.
 - [IPRO](https://arxiv.org/abs/2510.14255): A novel video diffusion framework using reinforcement learning to enhance identity preservation in human-centric I2V generation, optimizing diffusion models with face identity scorer and KL-divergence regularization.
