PigPen is map-reduce for Clojure, or distributed Clojure. It compiles to Apache Pig or Cascading, but you don't need to know much about either of them to use it.
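As a rough illustration of what "map-reduce for Clojure" looks like, here is a minimal sketch of a query run locally at the REPL. It assumes PigPen 0.3.3 is on the classpath; `even-squares` is a hypothetical name chosen for this example, not part of the library.

```clojure
(require '[pigpen.core :as pig])

;; Hypothetical query: square each number, then keep only the even results.
;; The operators mirror clojure.core, but build a query instead of running eagerly.
(defn even-squares [data]
  (->> data
       (pig/map (fn [x] (* x x)))
       (pig/filter (fn [x] (even? x)))))

;; pig/return wraps in-memory data as a relation;
;; pig/dump executes the query locally and returns the results.
(pig/dump (even-squares (pig/return [1 2 3 4 5])))
```

The same `even-squares` function could be handed to one of the platform libraries instead of `pig/dump` to run on a cluster; the query code itself would not change.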

Getting Started, Tutorials & Documentation

Getting started with Clojure and PigPen is really easy.

Note: It is strongly recommended to familiarize yourself with Clojure before using PigPen.

Note: PigPen is not a Clojure wrapper for writing Pig scripts that you can edit by hand. While hand-editing is entirely possible, the generated scripts are not intended for human consumption.

Questions & Complaints


pigpen is available from Maven:

With Leiningen:

;; core library
[ "0.3.3"]

;; pig support
[ "0.3.3"]

;; cascading support
[ "0.3.3"]

;; rx support
[ "0.3.3"]

The platform libraries all reference the core library, so you only need to reference the platform-specific library that you require; the core library is then included transitively.

Note: PigPen requires Clojure 1.5.1 or greater


To use the parquet loader, add this to your dependencies:

[ "0.3.3"]

Here is an example of how to write Parquet data:

(require '[pigpen.core :as pig])
(require '[pigpen.parquet :as pqt])

;; assuming that `data` is in tuples
;; [["John" "Smith" 28]
;;  ["Jane" "Doe"   21]]

(defn save-to-parquet
  [output-file data]
  (->> data
       ;; turning tuples into a map
       (pig/map (partial zipmap [:firstname :lastname :age]))
       ;; then storing to Parquet files
       (pqt/store-parquet
        output-file
        (pqt/message "test-schema"
                     ;; the field names here MUST match the map's keys
                     (pqt/binary "firstname")
                     (pqt/binary "lastname")
                     (pqt/int64  "age")))))
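To see what the tuple-to-map step produces, the same `zipmap` transformation can be tried with plain `clojure.core/map` at a REPL, with no PigPen involved. This is only a local sketch: `field-names`, `tuples->records`, `schema-fields`, and `keys-match-schema?` are hypothetical names introduced for illustration, and the final check simply mirrors the rule that the map's keys must match the schema's field names.

```clojure
;; Hypothetical helper names; plain Clojure only, no PigPen required.
(def field-names [:firstname :lastname :age])

(defn tuples->records
  "Turns each positional tuple into a map keyed by field-names."
  [tuples]
  (map (partial zipmap field-names) tuples))

(def records
  (tuples->records [["John" "Smith" 28]
                    ["Jane" "Doe"   21]]))
;; records is:
;; ({:firstname "John", :lastname "Smith", :age 28}
;;  {:firstname "Jane", :lastname "Doe", :age 21})

;; The field names declared in the Parquet schema, as strings.
(def schema-fields #{"firstname" "lastname" "age"})

(defn keys-match-schema?
  "True when the record's keys, as strings, equal the schema's field names."
  [record]
  (= schema-fields (set (map name (keys record)))))

(every? keys-match-schema? records)
```

Running a check like this on a small sample before storing can catch a misspelled key early, before the job is submitted.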

And how to load the records back:

(defn load-from-parquet
  [input-file]
  ;; the output will be a sequence of maps
  (pqt/load-parquet
   input-file
   (pqt/message "test-schema"
                (pqt/binary "firstname")
                (pqt/binary "lastname")
                (pqt/int64  "age"))))

And check out the pigpen.parquet namespace for usage.

Note: Parquet is currently only supported by Pig


To use the avro loader (alpha), add this to your dependencies:

[ "0.3.3"]

And check out the pigpen.avro namespace for usage.

Note: Avro is currently only supported by Pig

Release Notes