What Is Hadoop?
Apache Hadoop is a free and open source implementation of frameworks for reliable, scalable, distributed computing and data storage. It enables applications to work with thousands of nodes and petabytes of data as such is a great tool for research and business operations. Hadoop was inspired by Google's MapReduce Google File System (GFS) papers.Its being increasingly common to have data sets that are to large to be handled by traditional database or by any technique that runs on a single computer or even a small cluster of computers. In the age of Big-Data, Hadoop has evolved as the library of choice for handling it.
Why Hadoop is needed?
Companies continue to generate large amounts of data. Here are some statistics of 2011.-- Facebook : 6 billion messages per day.
-- EBay : 9 petabytes of storage per day.
Existing tools were not designed to handle such large amount of Big Data.
No comments:
Post a Comment