Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Rust · kreuzberg-dev/kreuzberg

kreuzberg

A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via the CLI, REST API, or MCP server.

88.3/100
Stars: 8.2K · Forks: 476

Similar Projects

EmbedAnything

81

Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust 🦀

Rust · 1.2K

pdf_oxide

81

The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs. MIT/Apache-2.0.

Rust · 720

turbovec

70

A vector index built on TurboQuant, written in Rust with Python bindings

Rust · 542

WrenAI

88

The open context layer that gives AI agents grounded, governed SQL across 20+ data sources.

Rust · 15.1K