

Ruby User Group Berlin

That Looks Oddly Familiar

This topic will be presented by Jan Stępień

at the March 2019 Meetup, hosted by Tobias Pfeiffer

Perceptual hashing is a fascinating technique for summarising media files. It has little in common with cryptographic hashes such as SHA-1. Two input files that look similar will have completely different cryptographic hashes, yet similar perceptual hashes. By similar, we mean that most bits of the two hashes are set the same way.
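To make "most bits set the same way" concrete, here is a minimal sketch of counting the bits in which two hashes differ, commonly called the Hamming distance. It assumes hashes are held as non-negative Ruby Integers; the name `hamming_distance` and the 16-bit sample values are made up for illustration and are not output of pHash.

```ruby
# Number of bit positions in which two hashes differ.
# Assumes both hashes are non-negative Integers.
def hamming_distance(a, b)
  # XOR leaves a 1 wherever the two inputs disagree; count those 1s.
  (a ^ b).to_s(2).count("1")
end

# Hypothetical 16-bit hashes, purely for illustration.
original  = 0b1011_0110_1100_0011
resized   = 0b1011_0110_1100_0111 # differs from original in one bit
unrelated = 0b0100_1001_0011_1100 # every bit flipped

puts hamming_distance(original, resized)   # prints 1
puts hamming_distance(original, unrelated) # prints 16
```

A small distance suggests the two pictures look alike; what counts as "small" depends on the hash length and is a threshold you would tune.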

In this talk we'll combine pHash and a BK-tree to efficiently search the metric space of perceptual hashes. We will use Ruby to implement a simple command-line tool. It will scan our photo library, hash all the pictures, and look for similarities. By the end of the talk, we'll have a complete list of similar and near-duplicate images needlessly occupying space on our hard drive.
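As a rough idea of how a BK-tree supports this kind of search, here is a minimal sketch. It is not the speaker's implementation: the class name `BKTree`, its methods, and the 4-bit sample hashes are assumptions chosen for illustration. Each child is stored under its distance to the parent, and the triangle inequality lets a query skip every subtree whose edge label is too far from the query's own distance to the current node.

```ruby
# Minimal BK-tree sketch over any metric given as a block or lambda.
class BKTree
  Node = Struct.new(:value, :children)

  def initialize(&distance)
    @distance = distance
    @root = nil
  end

  # Insert a value; a child is keyed by its distance to the parent.
  def add(value)
    if @root.nil?
      @root = Node.new(value, {})
      return self
    end
    node = @root
    loop do
      d = @distance.call(value, node.value)
      return self if d.zero? # value already stored; keep a single copy
      child = node.children[d]
      if child
        node = child
      else
        node.children[d] = Node.new(value, {})
        return self
      end
    end
  end

  # All stored values within max_distance of query, as [value, distance].
  def search(query, max_distance)
    results = []
    stack = @root ? [@root] : []
    until stack.empty?
      node = stack.pop
      d = @distance.call(query, node.value)
      results << [node.value, d] if d <= max_distance
      # Triangle inequality: only edges within max_distance of d can match.
      node.children.each do |edge, child|
        stack << child if (edge - d).abs <= max_distance
      end
    end
    results
  end
end

# Hamming distance on non-negative Integers as the metric.
hamming = ->(a, b) { (a ^ b).to_s(2).count("1") }

tree = BKTree.new(&hamming)
[0b0000, 0b0001, 0b0011, 0b1111].each { |h| tree.add(h) } # made-up hashes
p tree.search(0b0000, 1).sort # prints [[0, 0], [1, 1]]
```

In a duplicate finder along these lines, one would presumably keep a separate mapping from each hash to the files that produced it, since this sketch stores each distinct hash only once.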

