Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · NVIDIA-NeMo/Curator

Curator

Scalable data preprocessing and curation toolkit for LLMs

78.2/100
1.6K · Forks: 283

Similar Projects

llama_index

93

LlamaIndex is the leading document agent and OCR platform

Python · 50.0K

Edit-Banana

74

Edit Banana: A framework for converting statistical formats into editable.

Python · 5.3K

DataFlow

82

Easy Data Preparation with latest LLMs-based Operators and Pipelines.

Python · 4.7K

DeepAnalyze

62

DeepAnalyze is the first agentic LLM for autonomous data science. 🎈 Your AI data analyst: automatically analyzes large volumes of data and generates professional analysis reports with one click!

Python · 4.2K