Sunday, December 6, 2009

Replication from drizzle to memcached / project voldemort

For the last few days I've been working on a way to replicate changes from drizzle into a key-value store, currently project voldemort or memcached. It is built into my rabbit replication project, which means that the transactions are transferred over a message bus (currently rabbitmq). The picture below shows an example of how the components involved could be set up (though you are unlikely to want both memcached and project voldemort):




Current feature list of rabbitreplication:

  • Replication from drizzle into drizzle (or any database with a JDBC driver) / memcached / project voldemort.
  • Map inserts and updates onto Java objects using annotated classes (see below for example).
  • Two different ways of marshalling objects: JSON and Java object serialization.
  • Full control over how the key is generated (just implement the KeyAware interface in your target object)
  • Simple interface to build new marshallers. 
  • Simple interface to build new object stores.
  • Simple interface to build new transports. (I will blog about these extension points later.)
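
To give a feel for the marshalling extension point in the list above, here is a minimal sketch of a marshaller based on plain Java object serialization. The Marshaller interface, its method names, and the Person class are all invented for this illustration; they are assumptions and not rabbitreplication's actual API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class MarshallerSketch {

    // Hypothetical interface, invented for illustration only.
    interface Marshaller {
        byte[] marshal(Object value) throws IOException;
        Object unmarshal(byte[] bytes) throws IOException, ClassNotFoundException;
    }

    // Marshals objects with the JDK's built-in object serialization.
    static class JavaSerializationMarshaller implements Marshaller {
        @Override
        public byte[] marshal(Object value) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            return buffer.toByteArray();
        }

        @Override
        public Object unmarshal(byte[] bytes) throws IOException, ClassNotFoundException {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return in.readObject();
            }
        }
    }

    // Hypothetical target object; it must be Serializable for this marshaller.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final int ssn;
        final String name;

        Person(int ssn, String name) {
            this.ssn = ssn;
            this.name = name;
        }
    }

    public static void main(String[] args) throws Exception {
        Marshaller marshaller = new JavaSerializationMarshaller();
        byte[] bytes = marshaller.marshal(new Person(42, "alice"));
        Person copy = (Person) marshaller.unmarshal(bytes);
        System.out.println(copy.ssn + " -> " + copy.name); // prints "42 -> alice"
    }
}
```

The byte array is what would end up as the value in the key-value store. A JSON marshaller would implement the same two methods but produce text that non-Java clients can read.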
Example:
The class below will catch any statements on the table unittests.test1, take the "id" column and set it on the ssn field, and take the "test" column and set it on the name field. It uses the field annotated with @Id as the key in the store and the JSONMarshaller to marshal the object.




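// Catches statements on unittests.test1 and marshals the object as JSON.
@Entity(schema = "unittests", table = "test1", marshaller = JSONMarshaller.class)
public class ExampleRepl {
    // The "id" column is set on ssn; @Id makes this field the key in the store.
    @Id
    @Column("id")
    private int ssn;

    // The "test" column is set on name.
    @Column("test")
    private String name;

    /*...*/
}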
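
To make the mapping described above more concrete, here is a rough, self-contained sketch of how a replicated row could be copied onto annotated fields using reflection. The @Column and @Id annotations declared here are stand-ins written for this example, and populate and keyOf are hypothetical helpers; none of this is taken from the rabbitreplication source, so treat it as an illustration of the idea only.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class ColumnMappingSketch {

    // Stand-in annotations (assumptions, not the project's real ones).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        String value();
    }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Id {
    }

    // Hypothetical target object, mirroring the fields of ExampleRepl.
    static class Person {
        @Id
        @Column("id")
        private int ssn;

        @Column("test")
        private String name;

        String name() {
            return name;
        }
    }

    // Copies each row value onto the field whose @Column name matches.
    static <T> T populate(Class<T> type, Map<String, Object> row)
            throws ReflectiveOperationException {
        T target = type.getDeclaredConstructor().newInstance();
        for (Field field : type.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column != null && row.containsKey(column.value())) {
                field.setAccessible(true);
                field.set(target, row.get(column.value()));
            }
        }
        return target;
    }

    // Returns the value of the @Id field as a string, to be used as the store key.
    static String keyOf(Object target) throws ReflectiveOperationException {
        for (Field field : target.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(Id.class)) {
                field.setAccessible(true);
                return String.valueOf(field.get(target));
            }
        }
        throw new IllegalStateException("no @Id field on " + target.getClass());
    }

    public static void main(String[] args) throws Exception {
        // A row as it might arrive from an insert on unittests.test1.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 42);
        row.put("test", "alice");

        Person person = populate(Person.class, row);
        System.out.println(keyOf(person) + " -> " + person.name()); // prints "42 -> alice"
    }
}
```

Columns without a matching @Column field are simply ignored in this sketch, and fields without a matching column keep their default values.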



Add this to your config to use it:
managed_classes = org.drizzle.managedclasses.ExampleRepl, ...


Then you just start your slave like this: 
java -jar replication.jar objectslave.properties


You need to put your managed classes on the classpath (drop them in the lib dir).
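
As a rough illustration of why the classes have to be on the classpath, the sketch below reads a comma-separated managed_classes property and loads each class by name. The loadManagedClasses helper is hypothetical and only shows the general technique; I am not claiming the slave is implemented this way. JDK classes stand in for managed classes so the example runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class ManagedClassesSketch {

    // Hypothetical helper: resolves every class named in managed_classes.
    // Class.forName throws ClassNotFoundException if a class is not on the classpath.
    static List<Class<?>> loadManagedClasses(Properties config) throws ClassNotFoundException {
        List<Class<?>> classes = new ArrayList<>();
        String value = config.getProperty("managed_classes", "");
        for (String name : value.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                classes.add(Class.forName(trimmed));
            }
        }
        return classes;
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        // JDK classes are used here in place of real managed classes.
        Properties config = new Properties();
        config.load(new StringReader("managed_classes = java.lang.String, java.util.ArrayList\n"));
        for (Class<?> type : loadManagedClasses(config)) {
            System.out.println("loaded " + type.getName());
        }
    }
}
```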


(See earlier posts about rabbitreplication on how to get started)


Todo:
  • Clean up configuration, quite messy right now
  • Write blog posts about how to roll your own transport/marshalling/key-value store implementations
  • Expand the test suite and set up hudson for continuous integration
  • Write proper usage documentation
  • Build more backends, marshallers and transports, and evolve the APIs
  • Write a MySQL binlog master (needs to transform the mysql binlog into drizzle's protobuf-based log; not even sure it is possible)
  • Create a way to avoid writing code on the slave (pin tables to a hash and store it)
  • ...
Getting involved
  • Get the code: bzr branch lp:rabbitreplication
  • Use it, give me feedback (krummas@gmail.com) <- most important!
Download
http://marcus.no-ip.biz/rabbitrepl.zip (yes, I will soon set up a proper download page).



