Image Deduplicator
imagededup is a Python package that simplifies the task of finding exact and near duplicates in an image collection.
About
This package provides hashing algorithms, which are particularly good at finding exact duplicates, as well as convolutional neural networks, which are also adept at finding near duplicates. An evaluation framework is also provided to judge the quality of deduplication for a given dataset.
The package provides the following functionality:
- Finding duplicates in a directory using one of the following algorithms:
  - Convolutional Neural Network (CNN)
  - Perceptual hashing (PHash)
  - Difference hashing (DHash)
  - Wavelet hashing (WHash)
  - Average hashing (AHash)
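To give a feel for how the hashing methods listed above compare images, here is a minimal sketch of average hashing and difference hashing on tiny grayscale grids, with duplicates judged by the Hamming distance between hashes. This is not imagededup's implementation: the function names (`average_hash`, `difference_hash`, `hamming_distance`) and the 4x4 sample grids are illustrative assumptions, and real implementations typically resize and grayscale the image first.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values, one inner list per row


def average_hash(pixels: Grid) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def difference_hash(pixels: Grid) -> int:
    """One bit per horizontal neighbour pair: 1 if left is brighter than right."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical sample data: a gradient, a slightly brightened copy,
# and an inverted version that should count as a different image.
original = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
brightened = [[p + 5 for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

for name, hash_fn in [("AHash", average_hash), ("DHash", difference_hash)]:
    near = hamming_distance(hash_fn(original), hash_fn(brightened))
    far = hamming_distance(hash_fn(original), hash_fn(inverted))
    print(f"{name}: brightened copy distance={near}, inverted distance={far}")
```

A distance of 0 means the hashes are identical, which is what makes these methods well suited to exact duplicates; a small non-zero threshold on the distance would also admit near duplicates.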
imagededup is compatible with Python 3.6 and is distributed under the Apache 2.0 license.