Member-only story
How to Build Countable Classes in Python
Or how to make ‘hashable’ objects that can act as keys in dicts

Consider a situation when you’re given some objects of type Point(x,y)
and you want to check if there are any duplicates in there, or better yet, to get a count for each distinct point. For example:
Point(1,2), Point(2,3), Point(1,2) -> {Point(1,2): 2, Point(2,3): 1}
This can be done using a dict
(a bit more code), a collections.defaultdict
(less code), or a collections.Counter
(one-liner). But no matter which one you choose, you’ll run into a problem that Python treats points with the same coordinates as different objects:
Even if you override the ==
operator (see ‘How to Build Comparable Classes’), Counter
(as well as defaultdict
and dict
) won’t use it for its internal machinery. This time, however, they’ll give a hint as to what went wrong:
Hashability explained
In order to be used in a dict-like structure (such as dict
, set
, defaultdict
, Counter
, etc) as a key, it is not enough for an object to be comparable.
Internally, those structures distribute the stored objects in the ‘buckets’ according to the result of a hash function applied to the objects. And only if the hashes collide, Python falls back to a linear search based on direct comparisons. By default (just as for the comparisons), Python does not care about the values and even about existence of the fields. What gets hashed for a user-defined class is its location in memory.
To give Python a better idea of what should be considered when calculating the hash, you need to define the __hash__
method. There’s no need to fine-tune the actual hash function; you can re-use the hashing…