Archives, eh
-
# Some notes on my Threading model
In the last post I mentioned a threading model, and how I had forgotten how it worked. I have more recently discovered that others have developed the same model and even given it a name, but I have forgotten where I read about it.
With some small hope that someone will recognise it and remind me what it is called, I present
A Threading model
Posit the following set of nodes

The nodes are stored in the following tables:
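nodes

| id | parent_id | title |
|----|-----------|-------|
| 1 | NULL | A Post |
| 2 | 1 | First! |
| 3 | 2 | Lame! |
| 4 | 2 | re: First! |
| 5 | 1 | re: A Post |
| 6 | 1 | re: A Post_2 |
| 7 | 6 | re: A Post_3 |

threads

| root_id | node_id |
|---------|---------|
| 1 | 2 |
| 2 | 3 |
| 1 | 3 |
| 2 | 4 |
| 1 | 4 |
| 1 | 5 |
| 1 | 6 |
| 6 | 7 |
| 1 | 7 |

root_id is the id of a node, node_id is the id of any descendants.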
For example, Node 3 (Lame!) is a descendant of nodes 2 (First!) and 1 (A Post), so two records appear, one with root_id 1 and another with root_id 2.
To select the descendants of a root node:
select n.* from nodes n, threads t where n.id = t.node_id and t.root_id = @root_id
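To make the single-query lookup concrete, here is a minimal sketch that loads the example data into an in-memory SQLite database and runs the query above. The choice of Python and SQLite, the column types, and the helper names `build_db` and `descendants` are assumptions for illustration; the `@root_id` variable is written as the named placeholder `:root_id`.

```python
import sqlite3

# Example data copied from the nodes and threads tables above.
NODES = [
    (1, None, "A Post"),
    (2, 1, "First!"),
    (3, 2, "Lame!"),
    (4, 2, "re: First!"),
    (5, 1, "re: A Post"),
    (6, 1, "re: A Post_2"),
    (7, 6, "re: A Post_3"),
]
THREADS = [(1, 2), (2, 3), (1, 3), (2, 4), (1, 4), (1, 5), (1, 6), (6, 7), (1, 7)]


def build_db():
    """Create an in-memory database holding the example tables (hypothetical helper)."""
    db = sqlite3.connect(":memory:")
    db.execute("create table nodes (id integer primary key, parent_id integer, title text)")
    db.execute("create table threads (root_id integer, node_id integer)")
    db.executemany("insert into nodes values (?, ?, ?)", NODES)
    db.executemany("insert into threads values (?, ?)", THREADS)
    return db


def descendants(db, root_id):
    """Return the ids of every node below root_id using one query, no recursion."""
    rows = db.execute(
        "select n.* from nodes n, threads t "
        "where n.id = t.node_id and t.root_id = :root_id",
        {"root_id": root_id},
    ).fetchall()
    # Sort in Python so the result order does not depend on the database.
    return sorted(row[0] for row in rows)


db = build_db()
print(descendants(db, 2))  # [3, 4]
print(descendants(db, 1))  # [2, 3, 4, 5, 6, 7]
```

Whatever the depth of the thread, the lookup stays a single join against the threads table.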
If we were then to add an 8th node, and make it a child of node 3, the nodes table would become
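| id | parent_id | title |
|----|-----------|-------|
| 1 | NULL | A Post |
| 2 | 1 | First! |
| 3 | 2 | Lame! |
| 4 | 2 | re: First! |
| 5 | 1 | re: A Post |
| 6 | 1 | re: A Post_2 |
| 7 | 6 | re: A Post_3 |
| 8 | 3 | re: Lame! |

and the threads table would become:

| root_id | node_id |
|---------|---------|
| 1 | 2 |
| 2 | 3 |
| 1 | 3 |
| 2 | 4 |
| 1 | 4 |
| 1 | 5 |
| 1 | 6 |
| 6 | 7 |
| 1 | 7 |
| 3 | 8 |
| 2 | 8 |
| 1 | 8 |

the query for updating the threading table being: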
insert into threads (root_id, node_id) select @parent_id, @node_id union select root_id, @node_id from threads where node_id = @parent_id
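As a rough check of that insert, the sketch below adds the 8th node to an in-memory SQLite copy of the example data and prints the rows it gains in the threads table. Python, SQLite, the column types and the helper name `add_node` are assumptions for illustration; `@parent_id` and `@node_id` become the named placeholders `:parent_id` and `:node_id`.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table nodes (id integer primary key, parent_id integer, title text)")
db.execute("create table threads (root_id integer, node_id integer)")
db.executemany(
    "insert into nodes values (?, ?, ?)",
    [(1, None, "A Post"), (2, 1, "First!"), (3, 2, "Lame!"), (4, 2, "re: First!"),
     (5, 1, "re: A Post"), (6, 1, "re: A Post_2"), (7, 6, "re: A Post_3")],
)
db.executemany(
    "insert into threads values (?, ?)",
    [(1, 2), (2, 3), (1, 3), (2, 4), (1, 4), (1, 5), (1, 6), (6, 7), (1, 7)],
)


def add_node(db, node_id, parent_id, title):
    """Insert a node and index it under its parent and all of the parent's ancestors."""
    with db:  # commit both inserts together, or roll both back on error
        db.execute(
            "insert into nodes (id, parent_id, title) values (?, ?, ?)",
            (node_id, parent_id, title),
        )
        db.execute(
            "insert into threads (root_id, node_id) "
            "select :parent_id, :node_id "
            "union "
            "select root_id, :node_id from threads where node_id = :parent_id",
            {"parent_id": parent_id, "node_id": node_id},
        )


add_node(db, 8, 3, "re: Lame!")
new_rows = db.execute(
    "select root_id, node_id from threads where node_id = 8 order by root_id"
).fetchall()
print(new_rows)  # [(1, 8), (2, 8), (3, 8)]
```

Adding a node only ever appends rows: one for the parent, plus one for each ancestor the parent already has. No existing row is touched.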
The reason I came up with this system[1] was that I didn’t like having to recursively call back to the database for every node to see if it had any child nodes. I was working in a bubble, and I’m not sure that I would have found information on Nested Sets if I had searched. Even if I had, I’ve always had a problem with the idea that an INSERT requires follow-up UPDATEs to the same table, with potential locking problems, &c.
A slight variant that would probably make Database Normalisers cry
I actually used to use a slightly different model back when I wasn’t using ActiveRecord to abstract away queries. ActiveRecord insists that I use foreign keys – even if in the (MySQL) database they aren’t actually formally declared as such. When I originally designed this model, I didn’t worry so much about it and did this:
| root_id | parent_node_id |
|---------|----------------|
| 1 | 2 |
| 1 | 6 |

and
select distinct n.* from nodes n, threads t where (n.parent_id = t.parent_node_id and t.root_id = 2) union select * from nodes where parent_id = 2
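A minimal sketch of the variant lookup, again assuming Python with an in-memory SQLite database. The hard-coded root id 2 in the query above is swapped for a `:root_id` placeholder, and the helper name `descendants` is hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table nodes (id integer primary key, parent_id integer, title text)")
db.execute("create table threads (root_id integer, parent_node_id integer)")
db.executemany(
    "insert into nodes values (?, ?, ?)",
    [(1, None, "A Post"), (2, 1, "First!"), (3, 2, "Lame!"), (4, 2, "re: First!"),
     (5, 1, "re: A Post"), (6, 1, "re: A Post_2"), (7, 6, "re: A Post_3")],
)
# The variant only records which descendants of a root have children of their own.
db.executemany("insert into threads values (?, ?)", [(1, 2), (1, 6)])

VARIANT_QUERY = (
    "select distinct n.* from nodes n, threads t "
    "where (n.parent_id = t.parent_node_id and t.root_id = :root_id) "
    "union select * from nodes where parent_id = :root_id"
)


def descendants(db, root_id):
    """Children of indexed parents under root_id, plus root_id's direct children."""
    rows = db.execute(VARIANT_QUERY, {"root_id": root_id}).fetchall()
    return sorted(row[0] for row in rows)


print(descendants(db, 2))  # [3, 4]
print(descendants(db, 1))  # [2, 3, 4, 5, 6, 7]
```

The threads table is much smaller here (two rows instead of nine), at the cost of the extra union to pick up the root's direct children.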
Again, if I added an eighth node as a child of Node 3, the new threads table would be:

| root_id | parent_node_id |
|---------|----------------|
| 1 | 2 |
| 1 | 6 |
| 2 | 3 |
| 1 | 3 |

and the query to perform that update would be
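insert into threads(root_id, parent_node_id) select n.parent_id, @parent_id from nodes n where n.id = @parent_id and n.parent_id is not null union select t.root_id, @parent_id from threads t, nodes n where t.parent_node_id = n.parent_id and n.id = @parent_id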
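Here is a hedged sketch of that update, run against an in-memory SQLite copy of the variant tables (Python, SQLite and the helper name `add_child` are assumptions). Note the branch that selects the new parent's own parent: without it only the (1, 3) row is produced, not (2, 3). The check that skips the update when the parent already has children is also an assumption of this sketch, added so that a second reply to the same node does not duplicate rows.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table nodes (id integer primary key, parent_id integer, title text)")
db.execute("create table threads (root_id integer, parent_node_id integer)")
db.executemany(
    "insert into nodes values (?, ?, ?)",
    [(1, None, "A Post"), (2, 1, "First!"), (3, 2, "Lame!"), (4, 2, "re: First!"),
     (5, 1, "re: A Post"), (6, 1, "re: A Post_2"), (7, 6, "re: A Post_3")],
)
db.executemany("insert into threads values (?, ?)", [(1, 2), (1, 6)])


def add_child(db, node_id, parent_id, title):
    """Insert a node; index its parent the first time that parent gains a child."""
    with db:  # one transaction for the whole operation
        # Assumption of this sketch: a parent that already has children is already indexed.
        had_children = db.execute(
            "select 1 from nodes where parent_id = ? limit 1", (parent_id,)
        ).fetchone() is not None
        db.execute(
            "insert into nodes (id, parent_id, title) values (?, ?, ?)",
            (node_id, parent_id, title),
        )
        if not had_children:
            db.execute(
                "insert into threads (root_id, parent_node_id) "
                # the parent's own parent, if it has one
                "select n.parent_id, :parent_id from nodes n "
                "where n.id = :parent_id and n.parent_id is not null "
                "union "
                # every root that already indexes the parent's parent
                "select t.root_id, :parent_id from threads t, nodes n "
                "where t.parent_node_id = n.parent_id and n.id = :parent_id",
                {"parent_id": parent_id},
            )


add_child(db, 8, 3, "re: Lame!")
rows = sorted(db.execute("select root_id, parent_node_id from threads").fetchall())
print(rows)  # [(1, 2), (1, 3), (1, 6), (2, 3)]
```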
I have never done any performance testing on either of these two models, versus Nested Set or even each other. I don’t really need to, I’m not getting enough traffic to need any optimisation on that level. OTOH, I’m not getting enough traffic to warrant coming up with this boondoggle just to avoid multiple database hits to display a comment thread. But it was there.
[1] Independently came up with. I know that it’s been developed elsewhere and long before I thought of it.
-
# Some navel-gazing
The software I use to blog with has always been custom written by me, from scratch. I started off with some PHP scripts I slammed out in an afternoon and deployed that very night. I added to those scripts for a while, and then let it lie fallow for a couple of years. When I started playing with Ruby I started and threw away a couple of ambitious redesigns before Rails came along and I built the current software. Which I have then left to lie fallow for a couple of years. At the moment there is another version that I have been fitfully working on every few weeks for the last year. It may well be left to lie fallow for a few years without ever being deployed.
I have this tendency to write just enough to do what I need to do, even if the “just enough” involves some very rough edges that need to be stepped carefully around. The first cut of the PHP software acknowledged this by having a script that allowed me to dump SQL statements into a text box and then execute them against the database. This being before I was on a host with phpMyAdmin, and before I had bothered to write an admin interface. Before, for that matter, I had even bothered to have anything hiding behind a username and password. Ahh, memories. I also remember that at one stage with the PHP code, whenever a comment came in I had to manually add it to the threading, because I had broken the code for doing it and had forgotten how my threading worked[1].
It’s not that I am incompetent or dangerously lax. I do cross all these Ts, dot these Is when I am performing work for my employer. But when writing software for myself, I want to move onto the fruits of my labours. I was the same way when I was a kid and trying to DM: I was much more interested in the act than the process. I would always leap to playing the game and forget about doing the prep work, leading to some games with a suck rating that would have interested Hawking.
So why do I write my own code if I’m not deriving such a level of enjoyment from the process of creating the code that I am encouraged to push on through and do a complete job? Well, I am employed as a coder, so there is a not insignificant element of not being able to face more of the same when I get home from work. There’s also an element of stubbornness, a feeling that I should be able to, analogously, fix my own car. And a large element of Because It Is There, that irrational need to do something because you can.
Because any rational examination of the matter would lead me to install Movable Type. It’s madness that I should be solving – or rather not solving – problems that are Movable Type’s core business. Cross Site Scripting vulnerabilities? Fixing them is a high priority for them. Not having URLs fall over because some non-ASCII characters fell prey to a URI decode? I reckon MT might be all over that. The list can go on and on; it makes no sense for me to try and maintain my own software when there are well-resourced groups to whom I can outsource the effort.
So? Why should that worry me? I’m under no delusions that this blog will ever be read by anyone other than my friends and acquaintances. I’m nobody, I can afford to run a lax process for my own stuff, because it doesn’t matter to anyone but me.
It is a quantum of self-discovery I made while I was trying to fall asleep. I hope it allows me moments of comfort in the middle of spells of self-doubt.
[1] It’s not a nested set, there’s a separate table that indexes each sub tree, so I can extract with a single query all nodes that are descended from any arbitrary node. I stumbled across a paper discussing the concept, but I can’t find the link any more.

