Hadoop is a Java-based programming framework. It supports the processing of large data sets in a distributed computing environment. This makes it possible to run applications on systems with thousands of nodes involving thousands of terabytes. Its distributed file system makes rapid data transfer rates among nodes possible. It also allows the system to continue operating uninterrupted in case of a node failure.