Minimal Perfect Hashing

mphfA perfect hash is a one-to-one map. A key is taken and hash (into an integer) and looked up in a single location in the map. If it is in the map, it will be found in that location. This makes perfect hashes incredibly fast and very light on the CPU cache and the virtual memory system of the OS.

A minimal perfect hash is the smallest perfect hash that can be found for a set of keys. Determining the minimal perfect hash by brute force is time consuming. The paper describes an algorithm for generating a minimal perfect hash in linear time. That means that the time to determine the minimal perfect hash is directly proportional to the number of keys.

I found a sample implementation of the algorithm in python in the article Throw away the keys: Easy, Minimal Perfect Hashing. This program can be run as a benchmark, and takes about 1.7s to run on my laptop. The article points to another implementation in python, this time of the MOS Algorithm II CACM92, by the scientist who discovered the algorithm. When I ran this version, which is supposed to have a compacter representation, the program never came back.

Update: I looked into why the MOS II algorithm went ballistic. It is not a linear algorithm. The part of the code it gets stuck in is a nested loop of O(n2). I left it running, and six hours later, it was still in this loop (the inner loop of which is labelled with the comment “Step 3: rotate patterns and search for suitable displacement”). It’s author claims “This code runs fast linearly”. I beg to differ.


About jimbelton

I'm a software developer, and a writer of both fiction and non-fiction, and I blog about movies, books, and philosophy. My interest in religious philosophy and the search for the truth inspires much of my writing.
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s