Posts

HDFS & Its Architecture

HDFS - Hadoop Distributed File System

HDFS is a file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware. Inspired by the Google File System, which Google developed in C++ around 2003 to enhance its search engine, the Hadoop Distributed File System (HDFS) is a Java-based file system and a core component of Hadoop. With its fault-tolerant and self-healing features, HDFS lets Hadoop harness the true capability of distributed processing by turning a cluster of industry-standard or commodity servers into a massively scalable pool of storage. In addition, HDFS can store structured, semi-structured, or unstructured data in any format regardless of schema, and it is specially designed for environments where scalability and throughput are critical.

HDFS Concepts

Blocks

A block of HDFS is 64MB, ...
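
To make the 64MB block size concrete, here is a minimal Python sketch of how a file's bytes would divide into blocks of that size. It is plain arithmetic, not HDFS code: the function name `split_into_blocks` and the 200MB example file are assumptions made up for illustration, and real clusters may be configured with a different block size.

```python
# Illustrative arithmetic only; this does not talk to an HDFS cluster.
BLOCK_SIZE = 64 * 1024 * 1024  # 64MB, the block size quoted in the excerpt


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the size in bytes of each block a file of file_size bytes would span.

    Hypothetical helper: every block is full except possibly the last one,
    which holds only the remaining bytes.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    mb = 1024 * 1024
    blocks = split_into_blocks(200 * mb)  # assumed example: a 200MB file
    print(len(blocks), [size // mb for size in blocks])
```

Under these assumptions, a 200MB file spans three full 64MB blocks plus one final 8MB block; the last block takes only as much space as the data it holds.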

Big Data Introduction Part 2

What is Big Data?

Big data is just what the name suggests: a collection of large datasets that cannot be processed using traditional computing techniques. Big data is not merely data; it has become a complete subject involving various tools, techniques, and frameworks.

What Comes Under Big Data?

Big data involves the data produced by different devices and applications. Given below are some of the fields that come under the umbrella of Big Data.

· Black Box Data: The black box is a component of helicopters, airplanes, jets, etc. It captures the voices of the flight crew, recordings from microphones and earphones, and the performance information of the aircraft.
· Social Media Data: Social media such as Facebook and Twitter hold information and the views posted by millions of people across the globe.
· Stock Exchange Data: The stock exchange da...