pyminhash

Efficient MinHashing

Installation

To install this package, run one of the following:

Conda
$ conda install conda-forge::pyminhash

Description

MinHashing is an efficient way to find similar records in a dataset based on Jaccard similarity. PyMinHash implements minhashing for pandas DataFrames. To get started, see the usage instructions and the example notebook in the project documentation.
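
As a rough illustration of the underlying idea (not PyMinHash's own API or implementation), the sketch below estimates the Jaccard similarity of two strings from MinHash signatures in plain Python. The function names, the 3-character shingles, and the seeded BLAKE2b hashes are all assumptions chosen for this example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of character k-grams of a lowercased string."""
    text = text.lower()
    if len(text) < k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _seeded_hash(token, seed):
    """Hypothetical helper: a 64-bit hash of the token, varied by seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, n_hashes=128):
    """For each seeded hash function, keep the minimum hash over all tokens.

    Assumes `tokens` is a non-empty set.
    """
    return [min(_seeded_hash(t, seed) for t in tokens) for seed in range(n_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


a = shingles("Frits Hermans")
b = shingles("Fritz Hermans")
exact = jaccard(a, b)
approx = estimate_jaccard(minhash_signature(a), minhash_signature(b))
print(f"exact={exact:.3f} estimated={approx:.3f}")
```

The estimate approaches the exact value as `n_hashes` grows. Signatures have a fixed length regardless of record size, which is what makes comparing many records cheap.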

Developed by Frits Hermans

PyPI: https://pypi.org/project/PyMinHash/

About

Summary

Efficient MinHashing

Last Updated

Jan 6, 2023 at 19:56

License

MIT

Total Downloads

15.0K

Version Downloads

2.1K

Supported Platforms

noarch