Recalling this slighly modified passage from Hamlet: Act II Scene 2
Hamlet: We must abstract! Functions of hash for all!
Polonius: [Aside] Though this be madness, yet there is method in't.
For some reason, we specified a hashFunction to use when implementing the NotThreadSafeHashTable. One might ask: Why not just use Objects::hashCode(object) directly?
class HashUtils
int hashCodeOf(K key, ToIntFunction<K> hashFunction)
int arrayIndexForKey(K key, int arrayLength, ToIntFunction<K> hashFunction)
E itemInArrayForKey(K key, E[] array, ToIntFunction<K> hashFunction)
class NotThreadSafeHashTable<K, V> implements Map<K, V>
NotThreadSafeHashTable(int chainCount, ToIntFunction<K> hashFunction) {
Today's assignment also specifies a hashFunction to use when implementing MatrixMapReduceFramework.
public class MatrixMapReduceFramework<E, K, V, A, R> implements MapReduceFramework<E, K, V, A, R>
public MatrixMapReduceFramework(Mapper<E, K, V> mapper,
AccumulatorCombinerReducer<V, A, R> accumulatorCombinerReducer,
int mapAndAccumulateTaskCount,
int combineAndReduceTaskCount,
ToIntFunction<K> hashFunction)
Both HashMap and MatrixMapReduceFramework use the hashCode of a key to distribute the load. HashMap uses hashCode to determine which bucket a key belongs to. MatrixMapReduceFramework use the hashCode to determine which column's HashMap a key belongs to.
The MatrixMapReduceFramework uses the number of processors for the number of columns. This is reasonable. Typically the number of processors is a power of 2. Let’s say it is 8 for this scenario.
HashMap starts with an initial capacity, say 16:
public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V> {
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
and doubles bucket count when the “load” gets too high. This is also reasonable. By maintaining a power of 2 bucket count, the relatively expensive floorMod can be replaced by: (buckets.length - 1) & hash
.