datasketches

Community

The Apache DataSketches Library for Python

Installation

To install this package, run one of the following:

Conda
$ conda install sfe1ed40::datasketches

Usage Tracking

Versions shown: 5.2.0, 4.1.0 (2 of 8)
Downloads (Last 6 months): 0

Description

A software library of stochastic streaming algorithms. In the analysis of big data, some queries do not scale because they require huge compute resources and time to generate exact results. Examples include count distinct, quantiles, most-frequent items, joins, matrix computations, and graph analysis. If approximate results are acceptable, there is a class of specialized algorithms, called streaming algorithms or sketches, that can produce results orders of magnitude faster and with mathematically proven error bounds. For interactive queries there may not be other viable alternatives, and in the case of real-time analysis, sketches are the only known solution. This package provides a variety of sketches. Wherever a specific type of sketch exists in Apache DataSketches packages for other languages, the sketches are portable between languages (for platforms with the same endianness).
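To make the idea of an approximate count-distinct sketch concrete, here is a minimal standard-library illustration of a K-Minimum-Values (KMV) estimator. It is not the DataSketches implementation and does not use the package's API: the class name `KMVSketch`, its methods, and the default `k` are hypothetical names chosen for this sketch. It only shows how a fixed-size summary can estimate the number of distinct items in a stream.

```python
import hashlib
import heapq


class KMVSketch:
    """Illustrative K-Minimum-Values sketch (hypothetical, not the library API)."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # negated hashes: a max-heap of the k smallest values
        self._members = set()  # hashes currently retained, for duplicate detection

    @staticmethod
    def _hash(item):
        # Map the item to a pseudo-uniform value in [0, 1).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def update(self, item):
        h = self._hash(item)
        if h in self._members:
            return  # already retained, duplicates do not change the sketch
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest retained one: swap it in.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            # Fewer than k distinct hashes seen: the count is exact.
            return float(len(self._heap))
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest


sketch = KMVSketch(k=1024)
for i in range(200_000):
    sketch.update(f"user-{i % 100_000}")  # 100,000 distinct values, each seen twice
print(f"estimated distinct count: {sketch.estimate():.0f}")
```

The sketch keeps at most `k` hash values regardless of stream length, and its relative error shrinks roughly as 1/sqrt(k). That is the memory-versus-accuracy trade-off the description refers to.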

About

Summary

The Apache DataSketches Library for Python

Last Updated

Apr 2, 2025 at 09:04

License

Apache-2.0

Total Downloads

22.6K

Supported Platforms

linux-aarch64
macOS-64
macOS-arm64
linux-64
win-64

Unsupported Platforms

linux-ppc64le (last supported version: 4.1.0)
linux-s390x (last supported version: 4.1.0)