Pig provides the query language Pig Latin. A Pig Latin script specifies a sequence of steps, each of which defines a single, high-level data transformation. When a script is executed, Pig first transforms it into a logical plan that describes its execution. This plan is then compiled into one or more MapReduce jobs, which are executed on the Hadoop cluster.
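A script of this kind might look like the following minimal sketch. The input file `visits.txt`, its field names, the filter threshold and the output directory `url_counts` are assumptions made up for illustration.

```pig
-- Hypothetical input: tab-separated records of (user, url, time).
visits = LOAD 'visits.txt' AS (user:chararray, url:chararray, time:long);

-- Each statement is one high-level transformation on the previous relation.
recent = FILTER visits BY time > 1000L;
by_url = GROUP recent BY url;
counts = FOREACH by_url GENERATE group AS url, COUNT(recent) AS hits;

-- Writing the result is what triggers planning and execution.
STORE counts INTO 'url_counts';
```

The statements before `STORE` only define relations. Pig runs jobs once a `STORE` or `DUMP` is reached, and `EXPLAIN counts;` can be used to inspect the plans Pig generates for a relation.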
Key features of Pig Latin include:
- user-defined functions as first-class citizens
- arbitrary input and output file formats
- nested data model
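The three points above can be illustrated together in one short sketch. The jar `myudfs.jar`, the function `myudfs.DistinctUrls`, and the file names are hypothetical; `PigStorage` is one of Pig's built-in load/store functions.

```pig
-- Hypothetical jar containing user-defined functions.
REGISTER myudfs.jar;

-- The load function selects the input format; here, comma-delimited text.
visits = LOAD 'visits.csv' USING PigStorage(',') AS (user:chararray, url:chararray);

-- GROUP yields one tuple per user: the key plus a nested bag of that user's visit tuples.
by_user = GROUP visits BY user;

-- A UDF (hypothetical here) is called like a built-in and can take a bag as its argument.
summary = FOREACH by_user GENERATE group AS user, myudfs.DistinctUrls(visits) AS n;

-- The store function selects the output format; here, tab-delimited text.
STORE summary INTO 'summary' USING PigStorage('\t');
```

Running `DESCRIBE by_user;` after the `GROUP` statement prints the schema of the grouped relation, which makes the nesting visible.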
References
- Apache Pig - official web site
- Wikipedia Article - Apache Pig
- C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. "Pig Latin: A Not-So-Foreign Language for Data Processing". Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD '08), pp. 1099-1110. ACM, New York, NY, USA, 2008.