Apache HCatalog
===============
HCatalog is a table and storage management service for data created using Apache 
Hadoop.

The vision of HCatalog is to provide table management and storage management layers
for Apache Hadoop. This includes:

 * Providing a shared schema and data type mechanism.
 * Providing a table abstraction so that users need not be concerned with where
   or how their data is stored.
 * Providing interoperability across data processing tools such as Pig,
   MapReduce, Streaming, and Hive.

Data processors using Apache Hadoop have a common need for table management
services. The goal of this table management service is to track data that exists in
a Hadoop grid and present that data to users in a tabular format. HCatalog
provides a single input and output format to users so that individual users need
not be concerned with the storage formats that are chosen for particular data
sets. Data is described by a schema and shares a datatype system.
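
As a rough illustration of the idea in this paragraph, the toy sketch below
shows a catalog that records each table's schema and storage format, so that
a reader asks for a table by name and gets typed rows back without knowing how
the data is stored. Everything here (`CATALOG`, `STORAGE`, `read_table`, the
table names, and the two formats) is hypothetical and invented for the example;
it is not HCatalog's API and does not touch Hadoop.

```python
import csv
import io
import json

# Hypothetical toy catalog: table name -> shared schema plus storage format.
# None of these names come from HCatalog itself.
CATALOG = {
    "clicks": {"schema": [("user", str), ("count", int)], "format": "csv"},
    "visits": {"schema": [("user", str), ("count", int)], "format": "jsonl"},
}

# Stand-in for files on the grid, keyed by table name.
STORAGE = {
    "clicks": "alice,3\nbob,5\n",
    "visits": '{"user": "alice", "count": 2}\n{"user": "carol", "count": 7}\n',
}


def read_table(name):
    """Return rows as dicts typed by the catalog schema, whatever the storage format."""
    entry = CATALOG[name]
    schema = entry["schema"]
    raw = STORAGE[name]
    if entry["format"] == "csv":
        # CSV rows carry no column names, so pair fields with the schema order.
        columns = [col for col, _ in schema]
        records = [dict(zip(columns, fields)) for fields in csv.reader(io.StringIO(raw))]
    elif entry["format"] == "jsonl":
        # One JSON object per line; keys already match the column names.
        records = [json.loads(line) for line in raw.splitlines() if line]
    else:
        raise ValueError(f"unknown storage format: {entry['format']}")
    # Apply the shared type system so every caller sees the same row shape.
    return [{col: typ(rec[col]) for col, typ in schema} for rec in records]
```

Calling `read_table("clicks")` and `read_table("visits")` returns rows of the
same shape even though one table is stored as CSV and the other as JSON lines,
which is the kind of separation between table and storage format described
above.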

Users are free to choose the best tools for their use cases. The Hadoop project
includes MapReduce, Streaming, Pig, and Hive, and additional tools exist such
as Cascading. Each of these tools has users who prefer it, and there are use
cases best addressed by each of these tools. With HCatalog, two users on the
same grid who share data are not constrained to use the same tool; each is free
to choose the best tool for their use case. HCatalog presents data in the same
way to all of the tools, providing interfaces to each of them.

For the latest information about HCatalog, please visit our website at:

   http://incubator.apache.org/hcatalog

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HCATALOG