r-textreuse

Community

Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
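As a rough illustration of three ideas named above (shingled n-grams, Jaccard similarity, and minhash), here is a minimal Python sketch. It is not the package's R API: every function name below is hypothetical, and the SHA-1 based hashing is an assumption chosen only to keep the example self-contained.

```python
import hashlib


def shingled_ngrams(text, n=3):
    """Return overlapping (shingled) word n-grams from lowercased text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def jaccard_similarity(a, b):
    """Jaccard similarity of two token collections, treated as sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def minhash_signature(tokens, num_hashes=64):
    """For each seeded hash function, keep the minimum hash over all tokens.

    Assumes `tokens` is non-empty. The seeded SHA-1 hash stands in for a
    family of independent hash functions (an illustrative choice).
    """
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in tokens
        ))
    return signature


def estimated_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates Jaccard."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


a = shingled_ngrams("the quick brown fox jumps over the lazy dog", n=3)
b = shingled_ngrams("the quick brown fox leaps over the lazy dog", n=3)
print(jaccard_similarity(a, b))  # 4 shared n-grams out of 10 distinct: 0.4
print(estimated_similarity(minhash_signature(a), minhash_signature(b)))
```

The minhash estimate approaches the exact Jaccard value as the number of hash functions grows, which is what makes fixed-length signatures useful for comparing many documents.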

Installation

To install this package, run one of the following:

Conda
$ conda install russh::r-textreuse

Usage Tracking

0.1.4
Downloads (Last 6 months): 0

About

Last Updated

Jan 10, 2020 at 11:44

License

MIT + file LICENSE

Total Downloads

11

Supported Platforms

linux-64