Rackspace gradually improved their in-house logging over a few phases and eventually ended up forwarding logs to a distributed filesystem with Hadoop running on top of it. They could then distribute a MapReduce analysis job across the logs to answer pretty much any question in a predictable runtime.
See:
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
http://blog.racklabs.com/?p=66
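
To give a feel for the kind of analysis job described above, here is a minimal single-process sketch of the map and reduce steps in Python. It is not Rackspace's code and it does not use Hadoop; the log line format, the sample data, and the names map_line, reduce_counts and run_job are all assumptions made up for illustration. On a real cluster the framework would run the map step on many machines and do the grouping between the two steps.

```python
from collections import defaultdict


def map_line(line):
    """Map step: emit ((host, level), 1) for each parseable log line.

    Assumed (hypothetical) line format: "<timestamp> <host> <level> <message>".
    """
    parts = line.split(None, 3)
    if len(parts) >= 3:
        yield (parts[1], parts[2]), 1


def reduce_counts(key, values):
    """Reduce step: sum all the counts emitted for one key."""
    return key, sum(values)


def run_job(lines):
    """Run map, group by key (the 'shuffle'), then reduce, all in one process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_line(line):
            groups[key].append(value)
    return dict(reduce_counts(key, values) for key, values in groups.items())


# Made-up sample data, only to show the shape of the input.
sample_logs = [
    "2009-04-08T10:00:00 mail1 ERROR connection refused",
    "2009-04-08T10:00:01 mail1 INFO delivered",
    "2009-04-08T10:00:02 mail2 ERROR timeout",
    "2009-04-08T10:00:03 mail1 ERROR connection refused",
]

if __name__ == "__main__":
    for (host, level), count in sorted(run_job(sample_logs).items()):
        print(host, level, count)
```

Because each line is mapped independently and each key is reduced independently, the same two functions could be spread over as many machines as there are log chunks, which is what makes the runtime scale with cluster size rather than with a single box.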
Wednesday, April 8, 2009