Polars extension for string similarity
copied from cf-post-staging / polars-strsim

This package provides Python bindings for computing string similarity measures directly on a Polars dataframe. All measures are implemented in Rust and computed in parallel.

The implemented similarity measures are Levenshtein, Jaro, Jaro-Winkler, Jaccard, and Sørensen-Dice.

Each similarity measure returns a value normalized between 0.0 and 1.0 (inclusive), where 0.0 means the inputs are maximally different and 1.0 means they are maximally similar.
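
To make the 0.0 to 1.0 normalization concrete, here is a minimal pure-Python sketch of a normalized Levenshtein similarity. It is an illustration only, not this package's Rust implementation or its API: the function names `levenshtein_distance` and `normalized_levenshtein` are hypothetical, and the convention used here (one minus the edit distance divided by the length of the longer string) is an assumed, common normalization that may differ in detail from what the package computes.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, and substitutions."""
    # Keep the shorter string in the inner loop so the row buffers stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute (or match)
                )
            )
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Similarity in [0.0, 1.0]; assumed convention: 1 - distance / longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        # Two empty strings are treated as identical (an assumption of this sketch).
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


if __name__ == "__main__":
    for left, right in [("kitten", "sitting"), ("abc", "abc"), ("abc", "xyz")]:
        print(left, right, round(normalized_levenshtein(left, right), 3))
```

Under this convention, identical strings score 1.0 and strings with no characters in common at any position score 0.0, matching the range described above. The package itself applies its measures to dataframe columns in parallel rather than to one pair of Python strings at a time.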