So I attempted to explain what I had in mind at work today, and managed to determine that my ideas are pretty half-baked. The more I think about it, the more it’s clear that the new model needs to do everything that the tree model does.

The tree idea is pretty mature at this stage, and does almost everything. What it doesn’t do is the main take-away of the list idea: for the current version of a tree, every item is numbered in sequence. You can get all the Pinaceae by finding the taxon, asking it what row range it covers, and then selecting and ordering by row number over that range.
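
To make the row-range idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `node`, the columns `row_no` and `last_row`, and the sample taxa are all assumptions made up for illustration; they are not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: every item is numbered in sequence (row_no), and
# last_row is the final row covered by that item's subtree.
conn.execute(
    "CREATE TABLE node (row_no INTEGER PRIMARY KEY, name TEXT, last_row INTEGER)"
)
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [
        (1, "Pinales", 6),
        (2, "Pinaceae", 5),
        (3, "Pinus", 4),
        (4, "Pinus radiata", 4),
        (5, "Abies", 5),
        (6, "Cupressaceae", 6),
    ],
)


def members(conn, taxon):
    """Find the taxon, ask what row range it covers, select over that range."""
    first, last = conn.execute(
        "SELECT row_no, last_row FROM node WHERE name = ?", (taxon,)
    ).fetchone()
    rows = conn.execute(
        "SELECT name FROM node WHERE row_no BETWEEN ? AND ? ORDER BY row_no",
        (first, last),
    )
    return [name for (name,) in rows]


pinaceae = members(conn, "Pinaceae")
```

The point is that no tree walk is needed: one lookup for the range, one range scan for the contents.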

Can this notion be back-fitted onto the tree idea? Yes it can.

First, the subnodes of a node need to have a definite order.

Next, the notion of “a checklist” corresponds to “a tree root”. That is to say, I need to distinguish between the whole history of a tree, and a particular point in that history. I’m thinking of calling them ‘tree’ and ‘checklist’. ‘Tree’ becomes a partition of the data set, to which things like permissions and user groups might be attached.

Each node, in addition to belonging to a ‘tree’, belongs to some particular checklist and has a sequence within that checklist. This needs to get updated when a versioning is performed. Basically, these tree roots are much more explicitly managed by the system.
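
As a rough illustration of what "updating the sequence" could look like, here is a sketch in Python that renumbers a tree in preorder once the subnodes have a definite order. The `Node` class and the field names `seq` and `last_seq` are hypothetical; the real thing would presumably be an update query rather than an in-memory walk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # order is significant
    seq: int = 0       # position within the checklist
    last_seq: int = 0  # last position covered by this node's subtree


def renumber(node, start=1):
    """Assign seq and last_seq in preorder; return the next unused number."""
    node.seq = start
    next_free = start + 1
    for child in node.children:
        next_free = renumber(child, next_free)
    node.last_seq = next_free - 1
    return next_free


pinaceae = Node("Pinaceae", [Node("Pinus"), Node("Abies")])
root = Node("Pinales", [pinaceae, Node("Cupressaceae")])
next_free = renumber(root)
```

After the walk, each node's `seq` to `last_seq` is exactly the row range its subtree covers, which is what the range select relies on.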

Old checklists still need to be tree-walked, as we are doing now. There’s no easy way to get item 50 in checklist 9 once checklist 9 has been replaced. But although there’s no easy way of doing it, it can still be done with a recursive query which can zip down the tree to exactly the right spot. The trick is to note the offset between where checklist 9 says the node is and where the current checklist (checklist 11) says it is, and to carry that offset down the tree walk. To tell the truth, I suspect it will work rather well, and it will work far better than the current model which just has the nodes floating around in space.
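
Here is one possible reading of the offset trick, sketched in Python rather than as a recursive query. It assumes (my assumption, not something settled above) that each node records the size of its subtree, and that nodes shared with the current checklist carry their current row number while superseded nodes carry none. All names (`Node`, `size`, `current_seq`, `current_rows`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # ordered subnodes
    size: int = 1                      # number of nodes in this subtree
    current_seq: Optional[int] = None  # row in the current checklist, if any


def make(name, *children, current_seq=None):
    size = 1 + sum(c.size for c in children)
    return Node(name, list(children), size, current_seq)


# Nodes shared by both versions carry their row in the current checklist.
a1 = make("a1", current_seq=3)
a2 = make("a2", current_seq=5)
b1 = make("b1", current_seq=7)
b2 = make("b2", current_seq=8)
B = make("B", b1, b2, current_seq=6)
# Superseded nodes belong only to the old checklist: no current row.
A_old = make("A", a1, a2)
root_old = make("root", A_old, B)
# The current checklist inserted aX between a1 and a2.
aX = make("aX", current_seq=4)
A_new = make("A", a1, aX, a2, current_seq=2)
root_new = make("root", A_new, B, current_seq=1)
current_rows = {n.current_seq: n for n in (root_new, A_new, a1, aX, a2, B, b1, b2)}


def item_in_old_checklist(old_root, target):
    """Return the node that sat at row `target` in the old checklist."""
    if not 1 <= target <= old_root.size:
        raise IndexError(target)
    node, pos = old_root, 1  # pos is the node's row in the OLD checklist
    while True:
        if node.current_seq is not None:
            # Shared subtree: the offset is constant all the way down,
            # so jump straight to the matching row of the current checklist.
            offset = pos - node.current_seq
            return current_rows[target - offset]
        if target == pos:
            return node
        child_pos = pos + 1
        for child in node.children:
            if target < child_pos + child.size:
                node, pos = child, child_pos
                break
            child_pos += child.size  # skip this child's whole subtree
```

For example, `item_in_old_checklist(root_old, 6)` walks from the old root to `B`, notes that `B` sat at row 5 then and sits at row 6 now, and looks up current row 7, which is `b1`.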

Yep – retain the existing versioning model, which I am confident about, and add a notion of ‘absolute position in the current checklist’. Keeping that correct will be an interesting and important addition to the existing update queries, but it probably won’t need a load of completely new stuff. One extra table to explicitly hold the checklist roots, and some new fields to hold numbers.
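
A guess at what "one extra table and some new fields" might amount to, written as SQLite DDL run from Python. Every table and column name here is hypothetical, and the existing versioning columns are left out entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tree (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE node (
        id           INTEGER PRIMARY KEY,
        tree_id      INTEGER NOT NULL REFERENCES tree(id),
        name         TEXT NOT NULL,
        checklist_id INTEGER,  -- new: the checklist this node is numbered in
        seq          INTEGER   -- new: absolute position in that checklist
    );
    -- The one extra table: explicitly managed tree roots.
    CREATE TABLE checklist (
        id           INTEGER PRIMARY KEY,
        tree_id      INTEGER NOT NULL REFERENCES tree(id),
        root_node_id INTEGER NOT NULL REFERENCES node(id)
    );
    -- Makes "select by row range within a checklist" an index scan.
    CREATE INDEX node_by_position ON node (checklist_id, seq);
    """
)
tables = sorted(
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
```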

(note to self – store the depth as well. Makes it easy to produce an indented list with just the select by row range.)
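
A small sketch of why storing the depth helps: with depth on each row, the indented list falls straight out of the row-range select. The tuples below are made-up sample data of the form (row number, depth, name).

```python
# Hypothetical rows: (row_no, depth, name)
rows = [
    (1, 0, "Pinales"),
    (2, 1, "Pinaceae"),
    (3, 2, "Pinus"),
    (4, 3, "Pinus radiata"),
    (5, 2, "Abies"),
    (6, 1, "Cupressaceae"),
]


def indented(rows, first, last):
    """Select by row range, order by row number, indent by relative depth."""
    selected = sorted(r for r in rows if first <= r[0] <= last)
    base = selected[0][1] if selected else 0
    return ["  " * (depth - base) + name for _, depth, name in selected]


listing = indented(rows, 2, 5)
```

Here `listing` holds Pinaceae at the left margin, with Pinus and Abies indented one step and Pinus radiata two steps.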


One Response to Tree-ing

  1. Paul Murray says:

    Oh my god – this is amazing! What a great idea! I should do it!
