Hadoop Tutorials


The tutorials show the steps to follow, gradually increasing the level of difficulty. For that reason, the tutorials below should be studied in the given order, so that you can understand all the components.

You can select the tutorial you want to read from the menu, or use this list:


  • Hadoop tutorial
  • HBase tutorial
  • Hive tutorial


Bidoop Tutorial

Bidoop uses the Bidoop Layer, a layer of Big Data components for Hadoop that makes it possible to analyse various information sources and to speed up the development of new analytical models and operational processes, extracting value from large volumes of data, even in real time.



Hadoop Tutorial

Apache Hadoop is a software ecosystem of distributed applications released under an open source license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's papers on MapReduce and the Google File System (GFS).

Hadoop is a top-level Apache project, built and used by a global community of contributors, and written mainly in the Java programming language.



HBase Tutorial

Apache HBase is a low-latency, distributed, open source NoSQL database: the Hadoop ecosystem's Java implementation of Google's well-known NoSQL database, BigTable. Its main features include column-oriented storage, data versioning, consistent reads and writes, and automatic recovery in case of failure. Facebook chose it, among other uses, to store all the messages exchanged by users of its platform.
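The data model behind the versioning feature can be sketched in a few lines (this models the BigTable/HBase idea, not the HBase client API): each cell is addressed by a row key and a `family:qualifier` column, and keeps several timestamped versions. Class and method names here are made up for illustration.

```python
from collections import defaultdict

class VersionedTable:
    """Toy model of an HBase-style table: cells keep timestamped versions."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), newest first
        self.cells = defaultdict(list)

    def put(self, row, column, value, ts):
        versions = self.cells[(row, column)]
        versions.append((ts, value))
        versions.sort(key=lambda v: v[0], reverse=True)
        # keep only the newest max_versions entries, as HBase does per column
        del versions[self.max_versions:]

    def get(self, row, column):
        """Return the newest value for the cell, like a default HBase Get."""
        versions = self.cells.get((row, column))
        return versions[0][1] if versions else None

table = VersionedTable()
table.put("user1", "msg:body", "hello", ts=1)
table.put("user1", "msg:body", "hello again", ts=2)
# table.get("user1", "msg:body") returns the newest version, "hello again"
```

Reads return the newest version by default, while older versions remain available until they are evicted, which is what makes features like message history cheap to support.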



Hive Tutorial

Hive is an open source data warehouse and analysis package that runs on top of Hadoop. Hive provides a SQL-based language called HiveQL, which lets users structure, summarize and query data sources stored in HDFS. Hive adds metadata to the information to ease its handling, creating what is called a warehouse. It was originally developed at Facebook.
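As an illustration of HiveQL's SQL-like syntax, the snippet below declares a table over files already sitting in HDFS and then summarizes it; the table name, columns and path are invented for this example.

```sql
-- Declare a table whose data lives as tab-separated text files in HDFS
-- (table name, columns and location are hypothetical).
CREATE EXTERNAL TABLE page_views (
  user_id STRING,
  url     STRING,
  ts      BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/page_views';

-- Summarize: count hits per URL; Hive compiles this into MapReduce jobs.
SELECT url, COUNT(*) AS hits
FROM page_views
GROUP BY url;
```

Note that the `CREATE EXTERNAL TABLE` statement only adds metadata: the files in HDFS are left untouched, which is what "adding metadata to the information" means in practice.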