Making Pinterest — Powering big data at Pinterest
- We orchestrate all our jobs (whether Hive, Cascading, Hadoop Streaming, or otherwise) in such a way that they keep the Hive Metastore consistent with what data exists on disk. This makes it possible to update data on disk across multiple clusters and workflows without having to worry about any consumer getting partial data.
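A minimal sketch of the write-then-publish pattern that keeps the metastore consistent with data on disk. The `Metastore` class below is an in-memory stand-in, not the real Hive Metastore API, and the S3 paths are hypothetical; the point is that consumers resolve data through the metastore, so a partition location only flips after its data is fully written.

```python
# Toy metastore: maps (table, partition) -> data location.
# Readers always resolve through it, never list directories directly.
class Metastore:
    def __init__(self):
        self.partitions = {}

    def publish(self, table, partition, location):
        # Single pointer swap: readers see either the old complete
        # location or the new complete one, never a half-written dir.
        self.partitions[(table, partition)] = location

    def resolve(self, table, partition):
        return self.partitions.get((table, partition))


def rewrite_partition(metastore, storage, table, partition, rows):
    # 1. Write new data to a fresh, unpublished location (path is made up).
    new_location = f"s3://warehouse/{table}/{partition}/run-2"
    storage[new_location] = list(rows)  # fully materialized first
    # 2. Only then flip the metastore pointer.
    metastore.publish(table, partition, new_location)


# Simulated S3 plus an already-published partition.
storage = {"s3://warehouse/events/dt=2013-01-01/run-1": ["old"]}
ms = Metastore()
ms.publish("events", "dt=2013-01-01",
           "s3://warehouse/events/dt=2013-01-01/run-1")

rewrite_partition(ms, storage, "events", "dt=2013-01-01", ["new-a", "new-b"])
loc = ms.resolve("events", "dt=2013-01-01")
print(storage[loc])  # consumers read the complete new data
```

Because every consumer goes through the metastore pointer, a job on another cluster that resolves the partition mid-rewrite still reads the old, complete copy.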
- To balance flexibility, speed and isolation, we created an isolated working directory for each developer on S3.
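One way the per-developer isolation could look: each developer gets a private S3 prefix so experimental job output never collides with production data or with another developer's runs. The bucket name and layout below are assumptions for illustration, not Pinterest's actual scheme.

```python
def dev_workdir(bucket, username, workflow):
    # Private prefix per developer per workflow (hypothetical layout).
    return f"s3://{bucket}/dev/{username}/{workflow}/"


print(dev_workdir("pinterest-scratch", "alice", "daily_signups"))
# s3://pinterest-scratch/dev/alice/daily_signups/
```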
- As we scaled to a few hundred nodes, EMR became less stable and we started running into limitations of EMR’s proprietary versions of Hive.