Important point to remember about the HashMap in java?

  • A primitive type cannot be a key or a value of the HashMap.
  • Only one null allowed as the key and multiple null are allowed as the value object.
  • The key must be unique within a HashMap i.e., duplicity of keys are not allowed
  • HashMap store key-value pair into the indexed bucket (index calculate using the hashing technique), so insertion order is not maintained.
  • HashMap in Java is not synchronized among the multiple threads, so we need to make it synchronized by ourselves.
  • HashMap is marked with two Marker interfaces those are Cloneable and Serializeable

Time complexity of HashMap in Java? :

Time complexity to store and retrieve data from the HashMap is O(1) in the Best Case. But it can be O(n) in the worst case and after the changes made in Java 8 the worst case time complexity can be O(log n) atmost.

Internal working of HashMap in java

HashMap maintains an array of the buckets, where each bucket is a linked-list and the linked list is a list of nodes wherein each node contains key-value pairs.

HashMap in Java works on the principle of hashing, where hashing is used to calculate the index of the bucket.

To store and retrieve any key-value pair, hashing method applied onto the key object to calculate the index of the bucket.

Hashing Principle :

Simply, Hashing is a method used to produce an integer value from an object and this integer value known as the hash value. In HashMap, the key object is used for Hashing.

Note : After Java 8, a bucket can be a linked-list or a binary tree depending upon the Threshold.

Bucket :

A bucket is a linked list of nodes where each node is an instance of class Node<K,V>. The key of the node is used to generate the hash value and this hash value is used to find the bucket from Bucket Table.

We know the internal structure of HashMap, that HashMap maintains an array of the bucket. But when we store or retrieve any key-value pair, HashMap calculates the index of the bucket for each and every operation. The Key object is used to calculate the index of the bucket. By using this key, the hash value is calculated using the  hash(key) private method of HashMap.

What happens when there is a hash collision?

Now, as we know that two unequal objects can have the same hash code value, how two different objects will be stored in the same array location called a bucket.

The answer is LinkedList. If you remember, the Entry class had an attribute “next.” This attribute always points to the next object in the chain. This is exactly the behavior of the LinkedList.

put() method :

1. The key is used to calculate the hash value by calling private hash(key) method, internally hash(key) method call hashCode() method of key.

hash = hash(key);

Note : HashMap can store one null value as the key. In that case, the hash value returned by the hash(key) method will be 0 and 0th bucket location will be used.

2. The above hash is reduced from 0 to n-1 to calculate the index of bucket (where n is the size of an array of the bucket).

index = hash & (n-1);

How Collisions Are Resolved

In case of collision, Entry objects are stored in a linked list form. When an Entry object needs to be stored in a particular index, HashMap checks whether there is already an entry. If there is no entry already present, the entry object is stored in this location.

hashcode is equal, both objects are equal and HashMap  will throw exception or not store them again etc, Then you might want to remind them about equals() and hashCode() contract  that two unequal objects in Java can have same hashcode. Some will give up at this point and few will move ahead and say “Since hashcode is same, bucket location would be same and collision will occur in HashMap Since HashMap uses LinkedList to store object, this entry (object of Map.Entry comprise key and value )  will be stored in LinkedList.

What if we add another value object with the same key as entered before?

Logically, it should replace the old value. How it is done? Well, after determining the index position of the Entry object, while iterating over LinkedListd on the calculated index, HashMap calls equals method on the key object for each entry object.

All these entry objects in LinkedList will have similar hashcode but  equals()  method will test for true equality. If key.equals(k) will be true then both keys are treated as the same key object. This will cause the replacement of value objects inside the entry object only.

How get method works in HashMap in Java

Why String, Integer and other wrapper classes are considered good keys? And Why Object are not considered?

String, Integer and other wrapper classes are natural candidates of HashMap key, and String is most frequently used key as well because String is immutable and final, and overrides equals and hashcode() method. Other wrapper class also shares similar property. Immutability is required, in order to prevent changes on fields used to calculate hashCode() because if key object returns different hashCode during insertion and retrieval than it won’t be possible to get an object from HashMap. 

Immutability is best as it offers other advantages as well like thread-safety, If you can  keep your hashCode same by only making certain fields final, then you go for that as well. Since equals() and hashCode() method is used during retrieval of value object from HashMap, it’s important that key object correctly override these methods and follow contact. If unequal object returns different hashcode than chances of collision will be less which subsequently improve the performance of HashMap.

Can we use any custom object as a key in HashMap?

This is an extension of previous questions. Of course you can use any Object as key in Java HashMap provided it follows equals and hashCode contract and its hashCode should not vary once the object is inserted into Map. If the custom object is Immutable than this will be already taken care because you can not change it once created.

Can we use ConcurrentHashMap in place of Hashtable?

This is another question which getting popular due to increasing popularity of ConcurrentHashMap. Since we know Hashtable is synchronized but ConcurrentHashMap provides better concurrency by only locking portion of map determined by concurrency level. ConcurrentHashMap is certainly introduced as Hashtable and can be used in place of it, but Hashtable provides stronger thread-safety than ConcurrentHashMap.

How null key is handled in HashMap? Since equals() and hashCode() are used to store and retrieve values, how does it work in case of the null key?
The null key is handled specially in HashMap, there are two separate methods for that putForNullKey(V value) and getForNullKey(). Later is an offloaded version of get() to look up null keys.  Null keys always map to index 0.

This null case is split out into separate methods for the sake of performance in the two most commonly used operations (get and put), but incorporated with conditionals in others. In short, equals() and hashcode() method are not used in case of null keys in HashMap.

here is how nulls are retrieved from HashMap

private V getForNullKey() {

        if (size == 0) {
            return null;
        }
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null)
                return e.value;
        }
        return null;
    }

In terms of usage, Java HashMap is very versatile and I have mostly used HashMap as cache in an electronic trading application I have worked. Since finance domain used Java heavily and due to performance reason we need caching HashMap and ConcurrentHashMap  comes as very handy there.

NOTE

From JDK 1.8 onwards HashMap has introduced an improved strategy to deal with a high collision rate. 

Since a poor hash function e.g. which always return the location of the same bucket, can turn a HashMap into linked list, i.e. converting get() method to perform in O(n) instead of O(1) and someone can take advantage of this fact, Java now internally replace linked list to a binary true once certain threshold is breached. This ensures performance or order O(log(n)) even in the worst case where a hash function is not distributing keys properly

100% LikesVS
0% Dislikes

One thought on “HashMap Internal Implementation”

Leave a Reply

Your email address will not be published. Required fields are marked *