The hash keys are only 31 bit positive integers. From someone else's analysis I see that:
"Assuming a 32-bit hash and k=10,000 items, a collision will occur with a probablility of 1.2%. For 77,163 samples the probability becomes 50%!"
It does not appear that there's any code to compare the original keys and detect these conflicts. Even with a moderately sized cache errors go undetected.
I would suggest either a 64-bit hash or a mechanism for detecting the errors.
The hash keys are only 31 bit positive integers. From someone else's analysis I see that:
"Assuming a 32-bit hash and k=10,000 items, a collision will occur with a probablility of 1.2%. For 77,163 samples the probability becomes 50%!"
It does not appear that there's any code to compare the original keys and detect these conflicts. Even with a moderately sized cache errors go undetected.
I would suggest either a 64-bit hash or a mechanism for detecting the errors.