opendataloader-pdf

Community

PDF Parser for AI-ready data. Extract Markdown, JSON, and HTML from any PDF.

Installation

To install this package, run one of the following:

Conda
$ conda install conda-forge::opendataloader-pdf
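
After installing, a quick check along these lines can confirm that the package is visible to the active environment's Python interpreter. This is a minimal sketch, not part of the package itself: it assumes the conda package ships standard Python distribution metadata under the name `opendataloader-pdf`, and the helper `installed_version` is a hypothetical name introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "opendataloader-pdf") -> Optional[str]:
    """Return the installed version string, or None if the distribution is absent.

    Assumption: the distribution is registered under the same name as the
    conda package. Adjust dist_name if your environment differs.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("opendataloader-pdf is not installed in this environment")
    else:
        print(f"opendataloader-pdf {found} is installed")
```

Run it with the interpreter from the environment where the install command was executed. A `None` result usually means a different environment is active.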

Description

OpenDataLoader PDF is an open-source PDF parser that extracts structured Markdown, JSON (with bounding boxes), and HTML from any PDF. It features deterministic local extraction with correct reading order, table detection, heading hierarchy, and built-in AI safety filters. Hybrid mode adds OCR, complex table extraction, formula extraction, and chart descriptions.

About

Last Updated

May 6, 2026 at 16:26

License

Apache-2.0

Supported Platforms

noarch