Python
Stable
May 12, 2026
A Chinese researcher has published 100+ hand-crafted SVG architecture diagrams covering LLMs, reinforcement learning, RLHF, GRPO, and more. If you've ever struggled to find a clear visual explanation of PPO in the context of language model training, this repo probably has what you need.
3,997 stars
changyeyu/LLM-RL-Visualized
7 min read