Suppose that tuppy and guppy are:
- both instantiated from the
tupleclass - the only elements of
tuppyandguppyare integers such as0,1,2, etc...
Is it the case that hash(tuppy) == hash(guppy) if and only if tuppy == guppy?
Note that the following all amount to nearly the same thing:
number = hash(t)
number = t.__hash__()
number = tuple.__hash__(t)
Will hash(x) == hash(y) always return True for examples like the following?
x = tuple([0, 1])
y = tuple(range(0, 2))
print("x == ".ljust(20), type(x), repr(x))
print("y == ".ljust(20), type(y), repr(y))
print("hash(x) == hash(y)".ljust(20), hash(x) == hash(y))
I have two concerns:
two different tuples hashing to the same value. For example
(1234, 5)and(1, 2345)might, theoretically, both hash to the same value.hash(copy.copy(x))might be different thanhash(x).Do tuples have the behavior that
hash(x) == hash(y)if and only ifx == y, provided that:- the instances of
tuplecome directly fromtupleand are not instantiated from a sub-class oftuple? - the elements of the
tupleare integers?
- the instances of
The above assumes that we are working with tuples of integers.
What about nested trees of tuples?
import copy
t1a = (1, (2, (3, (4, 5), 6)))
t1b = tuple(eval("[1, (2, (3, (4, 5), 6))]"))
t2a = ((((1, 2), 3), 4), 5, 6)
t2b = copy.deepcopy(t2a)
print("hash(t1a) == hash(t1b)", hash(t1a) == hash(t1b))
print("hash(t1a) == hash(t2a)", hash(t1a) == hash(t2a))
print("hash(t2a) == hash(t2b)", hash(t2a) == hash(t2b))
I know the answer for a smattering of 3 or 4 test-cases, but what about in general?
If you have huge tuples (say, 1 gigabyte each) will the hash values (int) get truncated? I could imagine that the concatenation of x and y has the same hash value as x if x is sufficiently long.
Suppose we had a million different tuples in memory. We might begin to have many different tuples associated with the same hash value.
CodePudding user response:
As usual for well-formed hash functions, a == b implies hash(a) == hash(b), but hash(a) == hash(b) does not necessarily imply that a == b.
CodePudding user response:
Regardless of type, a __hash__() implementation is playing by the rules if x == y implies hash(x) == hash(y). There is no reliable implication in the other direction: from hash(x) == hash(y), nothing follows about how x compares to y.
This is all true even for simple types like ints. There are an unbounded number of possible Python ints, but only a finite number of possible hash codes. Therefore there must exist distinct ints i and j such that hash(i) == hash(j).
