Which databases or styles of database can use git (or another DVCS architecture) to manage cross-site replication?

database-design database-recommendation

I'm looking for a database software package for a dictionary (like the hunk(s) of dead tree that you might have somewhere at home). I would like the database to

  1. scale well to a million key-value pairs. The keys would normally be fewer than 20 characters, and the values normally fewer than 1000 characters.
  2. have perfect UTF-8 support.
  3. deal with distributed version control. I would like it to deal well with having lots of branches and doing merges of branches. This could be built-in or provided independently by something like git.
  4. have a well-supported interop with Ruby. I plan to write a desktop program in Ruby that talks to the database. (Ruby would also work for writing the web part if it ever happens.)
  5. be fairly responsive (though I would like it to be fast).
  6. deal well with changes both while just on a desktop and while providing data for something on the web. (Hopefully this project will get to the web part.)
  7. be free software (as defined by RMS) and be available at no charge.
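
To make requirement 3 concrete, here is a rough sketch of the "provided independently by something like git" option: each entry lives in its own UTF-8 text file, so git can diff, branch, and merge entries one at a time. The `EntryStore` class, the hashed file layout, and the two-line file format are all hypothetical choices made for illustration. They do not come from any existing package.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical store: one UTF-8 text file per dictionary entry, so that a
# DVCS such as git can track and merge entries independently of each other.
class EntryStore
  def initialize(root)
    @root = root
    FileUtils.mkdir_p(@root)
  end

  # Write (or overwrite) the definition for a headword.
  # File format (an assumption): first line is the key, the rest is the value.
  def put(key, value)
    path = path_for(key)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, "#{key}\n#{value}\n", encoding: "UTF-8")
  end

  # Return the definition, or nil if the headword is absent.
  def get(key)
    path = path_for(key)
    return nil unless File.exist?(path)
    File.read(path, encoding: "UTF-8").split("\n", 2).last.chomp
  end

  private

  # Hash the key so any UTF-8 headword maps to a safe file name, and shard
  # by the first two hex digits to keep each directory small.
  def path_for(key)
    digest = Digest::SHA1.hexdigest(key.encode("UTF-8"))
    File.join(@root, digest[0, 2], "#{digest}.txt")
  end
end

store = EntryStore.new(Dir.mktmpdir)
store.put("naïve", "lacking experience or judgement")
puts store.get("naïve")
```

With a layout like this, running `git init` in the root directory and committing after each editing session would give branches and merges for free, and two branches would conflict only when they change the same entry. Whether a plain file system stays responsive enough at a million entries is something I have not measured.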

All of this data would be entered by hand.

It wouldn't hurt if there were something like GitHub for this database, though I don't know whether anyone provides such a service for any database.

No one's job, career, or company rides on this decision.

Any suggestions?

Best Answer

For simple key/value storage, you might consider Berkeley DB (BDB), though I don't think Git would be the best fit for the 'version control' part. Ruby bindings are available for BDB.

Since your edit, I'm left wondering whether you want an open-source implementation of something like Amazon S3.