[geeks] Versioning Filesystem

Dave Kimmel crisco_kid at shaw.ca
Sat May 24 10:50:01 CDT 2003


On Sat, 24 May 2003, Jonathan C. Patschke wrote:

> The only reason storing diffs works is because you'd have to read all
> the versioning data in anyway just to -read- the file, so it'd be
> trivial to write it back the Right Way.  However, you could still be
> antisocial and just flatten the diffs into the new file.

If you store each version of the file as the complete data, you won't
need to read the whole history just to read the current version.  You just
need some way of quickly identifying the current version.

The problem with storing only the diffs is that you need to keep all of
the old data, or make the purge command flatten the data as it works
towards the present version.  It also limits the number of versions you
can realistically keep, because every stored diff has to be applied to
rebuild the current version, and at some point that becomes unacceptably
slow.  The advantage is, of course, the space savings.
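
To make that trade-off concrete, here is a rough sketch in Python of a
diff-only store.  It is only an illustration under my own assumptions, not
anyone's actual filesystem design: the names DiffStore, make_delta and
apply_delta are made up, and a "file" is just a list of lines.  Reading the
current version replays every stored delta, and purging old versions means
flattening them into the base.

```python
import difflib


def make_delta(old, new):
    """Record what is needed to turn `old` into `new` (both lists of lines)."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: store only the range to copy from the old version.
            delta.append(("copy", i1, i2))
        else:
            # replace/insert/delete: store the new lines (empty for a delete).
            delta.append(("insert", new[j1:j2]))
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class DiffStore:
    """Hypothetical store: one full base version plus a chain of deltas."""

    def __init__(self, base):
        self.base = list(base)
        self.deltas = []

    def current(self):
        # Cost grows with the number of versions: every delta is replayed.
        lines = self.base
        for delta in self.deltas:
            lines = apply_delta(lines, delta)
        return lines

    def commit(self, new):
        self.deltas.append(make_delta(self.current(), list(new)))

    def purge(self, count):
        # Dropping the oldest versions means flattening them into the base.
        for delta in self.deltas[:count]:
            self.base = apply_delta(self.base, delta)
        del self.deltas[:count]
```

A store that keeps full copies would skip the loop in current() entirely,
at the price of storing every line of every version.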

I've written a program at work that stores old versions of its data in a
database.  I use two timestamp fields, startdate and enddate, to identify
the period during which the data was active.  Each record stores the
complete data as it was during that period.  If I need the current version
of something, I can just say "select * from stuff where
startdate <= now() and enddate > now() and stuffid = 'the_stuff';".
Finding out what the data looked like at a specific point is a simple
matter of replacing the two "now()"s with the date/time that you need.
If I need diffs, I generate them on the fly.  It's certainly not perfect,
but it works pretty well for what we use it for.
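
For anyone who wants to play with the idea, here is a minimal sketch using
Python's sqlite3 and difflib modules.  Several things in it are assumptions
made for the example rather than details of the real program: the "data"
column, the helper names, storing timestamps as ISO-style text (which
compares correctly as strings), and marking the current row with a
far-future enddate.

```python
import difflib
import sqlite3

FOREVER = "9999-12-31 23:59:59"  # assumed sentinel meaning "still current"


def open_store():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE stuff ("
        " stuffid TEXT NOT NULL,"
        " startdate TEXT NOT NULL,"
        " enddate TEXT NOT NULL,"
        " data TEXT NOT NULL)"
    )
    return db


def save(db, stuffid, data, when):
    # Close the row that is active now, then add the new full copy.
    db.execute(
        "UPDATE stuff SET enddate = ? WHERE stuffid = ? AND enddate = ?",
        (when, stuffid, FOREVER),
    )
    db.execute(
        "INSERT INTO stuff VALUES (?, ?, ?, ?)",
        (stuffid, when, FOREVER, data),
    )


def as_of(db, stuffid, when):
    # Same shape as the query above, with now() replaced by a parameter.
    row = db.execute(
        "SELECT data FROM stuff"
        " WHERE startdate <= ? AND enddate > ? AND stuffid = ?",
        (when, when, stuffid),
    ).fetchone()
    return row[0] if row else None


def diff_between(db, stuffid, older, newer):
    # Diffs are not stored; they are generated from two full copies.
    a = as_of(db, stuffid, older) or ""
    b = as_of(db, stuffid, newer) or ""
    return list(
        difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm="")
    )
```

Calling as_of() with the present time gives the current version, and any
earlier timestamp gives the data as it stood then, without touching the
other rows.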

-- Dave Kimmel
   crisco_kid at shaw.ca


