Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Splink is a Python package for probabilistic record linkage (entity resolution) that lets you deduplicate and link records across datasets that lack unique identifiers.
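
To give a feel for what "probabilistic" means here, the sketch below scores a pair of records by summing per-field match weights, in the style of the Fellegi-Sunter model. It is plain Python and does not use Splink's API. The field names, the `m`/`u` values, the prior, and the helper names `match_weight` and `match_probability` are all illustrative assumptions, not values or functions taken from Splink.

```python
import math

# Made-up parameters for illustration only:
# m = P(field agrees | the two records are the same entity)
# u = P(field agrees | the two records are different entities)
FIELD_PARAMS = {
    "first_name": {"m": 0.9, "u": 0.01},
    "surname": {"m": 0.9, "u": 0.005},
    "dob": {"m": 0.95, "u": 0.001},
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum the log2 Bayes factors over all fields (exact-match comparison)."""
    total = 0.0
    for field, p in params.items():
        if rec_a[field] == rec_b[field]:
            # Agreement: evidence in favour of a match.
            total += math.log2(p["m"] / p["u"])
        else:
            # Disagreement: evidence against a match.
            total += math.log2((1 - p["m"]) / (1 - p["u"]))
    return total


def match_probability(weight, prior=0.001):
    """Combine an assumed prior match probability with the match weight."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight
    return posterior_odds / (1 + posterior_odds)


a = {"first_name": "john", "surname": "smith", "dob": "1980-01-01"}
b = {"first_name": "john", "surname": "smith", "dob": "1980-01-01"}
c = {"first_name": "jane", "surname": "doe", "dob": "1975-05-05"}

print(match_probability(match_weight(a, b)))  # close to 1
print(match_probability(match_weight(a, c)))  # close to 0
```

No single field acts as a unique identifier in this sketch. Agreement on several weakly identifying fields accumulates into strong evidence that two records refer to the same entity.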