Big Data, Hadoop and Big SQL: A Crash Course at IOD 2013

Big Data and Hadoop have long been buzzwords in the data processing community. Yet few database practitioners understand what these technologies are, how to use them productively, and how to integrate them into a conventional data processing landscape. This is no wonder, as nearly all resources on these topics target software developers rather than data professionals.

At this year’s IBM Information on Demand Conference, November 3-7 in Las Vegas, I will give a tutorial that addresses exactly this concern: we will approach Big Data and Hadoop technologies from the perspective of data professionals. We will introduce the key elements of the Hadoop ecosystem and IBM’s enhancements to it, and highlight the impact of these technologies on data systems and practices in the enterprise.

For this tutorial, we use the IBM BigInsights Hadoop system and, besides exploring the common Hadoop features, delve into some of its unique enhancements.

Here is an overview of what we are going to talk about:

  • What is Big Data? You surely could not have escaped the Big Data buzzword, but do you know what Big Data really is? Is your data Big? How about Medium data? Could you, and should you, apply Hadoop and its tooling to it? There are benefits even if your data is not huge!
  • MapReduce algorithm. At the heart of Hadoop is MapReduce, a programming model for processing large data sets in parallel across the machines of a cluster. Learn about this algorithm, which brings scalability and fault tolerance to a variety of applications.
  • Hadoop. Hadoop is the framework that implements the common parts of MapReduce. It provides the environment in which to run user Big Data programs. It is fault tolerant, it scales, it is cost effective, and it can enable thousands of computers to jointly process data in parallel.
  • Hive and Pig. While the Java APIs for Hadoop allow for a lot of flexibility, they are at a fairly low level. For data professionals, the productive way of approaching Hadoop is at a higher level: Hive allows a subset of SQL to be run over the files stored in the Hadoop Distributed File System (HDFS), while Pig is a data flow language. See the characteristics of both and their strengths and weaknesses.
  • HBase. The database for Hadoop. Complementing traditional Hadoop processing, which falls into the category of batch processing, HBase is a database that provides online / real-time performance. It sits on top of the rest of the Hadoop infrastructure and is a distributed, column-oriented database.
  • Big SQL. Of course, the most productive approach for a data practitioner would be the familiar, trusted SQL, but plain Hadoop does not have this feature. IBM’s Big SQL extension to Hadoop gives SQL users a familiar environment in which to become productive with Hadoop, and even lets them use the JDBC APIs. You will learn how to use Big SQL and quickly become productive with Big Data applications.
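
To make the MapReduce item above a little more concrete, here is a minimal, single-process sketch of the idea in Python. It is not Hadoop code and does not use any Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are made up for illustration. In a real cluster the framework would run many map and reduce tasks in parallel on different machines and take care of the grouping step itself.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; each string stands in for one line of a file.
    sample = ["big data big sql", "hadoop and big data"]
    print(word_count(sample))
    # prints {'big': 3, 'data': 2, 'sql': 1, 'hadoop': 1, 'and': 1}
```

Because each call to map_phase looks at only one line and each call to reduce_phase looks at only one key, the work can be split across many machines, which is where the scalability mentioned above comes from.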

How about the labs? In the tutorial we will show, hands-on, how to start exploiting the benefits of Hadoop using the IBM BigInsights Hadoop distribution. We will use the QuickStart edition, with which you can begin exploring Hadoop in a virtual machine: just unpack and run. You will receive instructions on how to obtain it after the tutorial, so that you can run the examples yourselves.

I am looking forward to seeing you at the tutorial at the Information on Demand Conference, November 7th 2013. The tutorial is part of the Big Data and Analytics Tutorial Series. Register now here.


  1. Hi Vladimir,

    What is your take on Big SQL vs. Impala? How would you characterize Big SQL in Matt Aslett's taxonomy? "7 Hadoop questions. Q5: SQL in Hadoop, SQL on Hadoop, or SQL and Hadoop?"
    Jim Tommaney

  8. Big data can be used to improve training and to understand competitors, using sports sensors. It is also possible to predict the winner of a match using big data analytics. The future performance of players can be predicted as well. Thus, players' value and salaries are determined by data collected throughout the season.

  12. Although Hadoop is a full-fledged platform for developing all kinds of applications, it is most often used for data storage and, specifically, for SQL solutions. This is not surprising: large amounts of data almost always mean analytics, and analytics is much easier to do over tabular data. In addition, it is much easier to find tools and people for SQL databases than for NoSQL solutions. Despite the popularity of SQL solutions for Hadoop-based analytics, you sometimes still have to deal with problems for which NoSQL databases are better suited. In addition, both Hive and Impala work better with large batches of data.
