⚠ Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python sail-sg/oat

oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

70.8/100
★ 637 Forks: 60

Similar Projects

HALOs

56

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Python ★ 907

LlamaFactory

92

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python ★ 68.0K

InternLM

68

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

Python ★ 7.2K

MedicalGPT

77

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains a medical large language model, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

Python ★ 4.9K