javapandit.net
  • Home
  • Quick Java
    • Exception Handling
    • Collections Framework
  • Java Best Practices
  • Web Services
    • Web Service Basics
    • Ten Basic webservice concepts
    • XML
    • Apache Axis
    • Restful Web Services
  • JMS Concepts
    • JMS- MySQL
  • Hadoop
    • NoSQL DATABASEs
    • Apache Sqoop
    • Hadoop Interview Questions
  • Java 5
  • Java 8
    • Java 8 : Lambda Expressions
  • JDBC
  • Java Architect
    • Enterprise application re-platforming strategies
    • Java Memory Management
  • Java Programs
  • Technical Tips
    • How to set JAVA_HOME environment variable
    • How to create an auto increment field in Oracle?
    • Linux Commands
  • Best Java Interview Questions
    • Java Interview Questions- YouTube
  • Interview Questions
    • Java Tech interview Questions
    • Core Java Interview Questions >
      • core tech questions1
      • Java Collection interview questions
      • Java Concurrency
    • Servlets Interview Questions
    • JSP Interview Questions
    • Java Web Services Interview Questions
    • EJB Interview Questions
    • XML Interview Questions
    • JMS Interview Questions
  • Struts Interview Questions
    • Struts 2 Interview Questions
  • Java EE Architects Interview Questions
    • Java Architect Interview Questions
    • Top 10 reasons for Java Enterprise Application Performance Problems
    • Web Application Scalability Questions for IT Architect
  • JavaPandit's Blog
  • Web Services Interview Questions
  • Servlets And JSP
  • Oracle SOA Interview Questions
    • Open ESB /JBI
    • BPEL Language
  • Log4J
  • Ant
  • Maven
  • JMeter
  • JUnit
  • Apache POI Framework
  • ORCALE SERVICE BUS (OSB) Interview Questions
  • J2EE Patterns
    • Model-View-Controller (MVC)
    • Front Controller
    • DAO
    • Business Delegate
    • Session Facade
    • Service Locator
    • Transfer Object
    • Design Patterns >
      • Creational Patterns >
        • Singleton
      • Behavioural Patterns
      • Structural Patterns
    • Intercepting Filter
  • SQL Interview Questions/Lab
  • Best Wall Papers
    • Devotional Songs
  • Java Community
  • HIBERNATE
  • ORACLE CC&B
    • Oracle CC&B Interview Questions
  • Docker
  • Little Princess
    • Sai Tanvi Naming Ceremony Celebrations
    • Rice Feeding Ceremony
    • Sai Tanvi Gallery
  • APPSC Career Guidance
    • AP History
    • Indian Polity
    • Indian Economy
    • Science & Technology
    • Mental Ability and Reasoning
    • Disaster Management
    • Current Affairs and Events
    • General Sciences >
      • Biology
      • Physics
      • Chemistry
    • Previous Question Papers
  • About Us
  • Contact US


1. What Is Apache Hadoop? 

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.
​

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-avaiability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-availabile service on top of a cluster of computers, each of which may be prone to failures.
The project includes these modules:
  • Hadoop Common: The common utilities that support the other Hadoop modules.
  • Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.
  • Hadoop YARN: A framework for job scheduling and cluster resource management.
  • Hadoop MapReduce: A YARN-based system for parallel processing of large data sets. Other Hadoop-related projects at Apache include:
  • Avro: A data serialization system.
  • Cassandra: A scalable multi-master database with no single points of failure.
  • Chukwa: A data collection system for managing large distributed systems.
  • HBase A scalable, distributed database that supports structured data storage for large tables.
  • Hive: A data warehouse infrastructure that provides data summarization and ad hoc querying.
  • Mahout: A Scalable machine learning and data mining library.
  • Pig: A high-level data-flow language and execution framework for parallel computation.
  • ZooKeeper: A high-performance coordination service for distributed applications. 


Powered by Create your own unique website with customizable templates.