- NDB MySQL clusters: handles transactions, is roughly as fast as MEMORY tables, and supports replica and master-master topologies. Because it is distributed across multiple servers, it is fundamentally more scalable as a transaction engine than a single-server setup.
- MySQL + Hadoop: recently there has been a lot of noise about using the two together as a way to dramatically scale the kinds of read operations you can run in a replicated relational database environment, e.g. when the data in the tables is enormous and you want to run complex operations over all of it at once, the kind that would typically blow up a single database server (even if the data fits). I'm not sure anyone has this working out of the box, but the idea is that you load a "snapshot" into Hadoop as you would when provisioning a new replica, then process binlogs (row-based replication updates are ideal for this) to keep the dataset in the Hadoop cluster current. That would work pretty smoothly with, say, our ML database. I wonder if the inverse could be done too: take a dataset in a Hadoop cluster, compute some set of updates, and then generate the appropriate row updates to send back to your live DB cluster.
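The binlog-replay idea above can be sketched with a toy example: keep an offline copy of a table (standing in for the Hadoop-side dataset) in sync by applying row-based change events on top of an initial snapshot. This is a minimal illustration, not a working pipeline; the event format and the `apply_row_event` helper are hypothetical, and a real setup would parse actual MySQL binlogs with a binlog client library rather than hand-built dicts.

```python
def apply_row_event(snapshot, event):
    """Apply one row-based replication event to a dict keyed by primary key.

    `snapshot` maps primary key -> full row; `event` is a hypothetical
    decoded binlog entry of the form {"op": ..., "row": {...}}.
    """
    op, row = event["op"], event["row"]
    pk = row["id"]
    if op in ("insert", "update"):
        snapshot[pk] = row          # row-based events carry the full row image, so upsert
    elif op == "delete":
        snapshot.pop(pk, None)      # drop the row if we have it
    else:
        raise ValueError("unknown op: %r" % op)
    return snapshot

# Start from a "snapshot" loaded as you would when provisioning a new replica...
snapshot = {1: {"id": 1, "score": 0.2}, 2: {"id": 2, "score": 0.9}}

# ...then replay subsequent binlog events to keep the offline copy current.
events = [
    {"op": "update", "row": {"id": 1, "score": 0.7}},
    {"op": "insert", "row": {"id": 3, "score": 0.4}},
    {"op": "delete", "row": {"id": 2}},
]
for e in events:
    apply_row_event(snapshot, e)

print(sorted(snapshot))  # primary keys now present in the copy
```

Row-based (as opposed to statement-based) replication is what makes this clean: each event already contains the final row values, so the consumer never has to re-execute SQL to reconstruct state.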