PKU-Alignment/safe-rlhf (Python)
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback