Sunday, May 9, 2010

Retrieving Data Hierarchies on a SQL Table: Adjacency List Model

Adjacency List model is more popularly identified by an obvious parent-to-child relationship in the table schema itself. This model represents the hierarchy tree in the form of a list.

Using the following table, we will discuss how to manipulate data hierarchies using this model.

+-------------+---------------+--------+
| category_id | name | parent |
+-------------+---------------+--------+
| 1 | COMPUTERS | NULL |
| 2 | NOTEBOOK | 1 |
| 3 | NETBOOK | 1 |
| 4 | MULTIMEDIA | 2 |
| 5 | BASIC | 2 |
| 6 | SILVER CASING | 5 |
+-------------+---------------+--------+

Full Hierarchy Tree
To retrieve the full hierarchy tree, find all the leaf nodes and a single path there are at least two ways to do it.

1) If you already know the depth of your tree, you can do a self-join based on that number. This is a "pure" SQL way of doing it.

SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.parent is NULL

+-----------+----------+------------+---------------+
| lev1 | lev2 | lev3 | lev4 |
+-----------+----------+------------+---------------+
| COMPUTERS | NOTEBOOK | MULTIMEDIA | NULL |
| COMPUTERS | NOTEBOOK | BASIC | SILVER CASING |
| COMPUTERS | NETBOOK | NULL | NULL |
+-----------+----------+------------+---------------+

2) On the application level, iterate from the parent to the children until it hits the leaf nodes (those with no children). This same technique can also be used to determine the depth of a full tree.

SELECT * FROM  category WHERE parent_id = NULL - Zero if it's defaulted to it. First iteration.
SELECT * FROM category WHERE parent_id = [result from prev iterations until the leaf node is hit]


Finding all the Leaf Nodes

We can find all the leaf nodes in our tree (those with no children) by using a LEFT JOIN query:

SELECT t1.category_id FROM
category AS t1 LEFT JOIN category as t2
ON t1.category_id = t2.parent
WHERE t2.category_id IS NULL;

+-------------+
| category_id |
+-------------+
| 3 |
| 4 |
| 6 |
+-------------+

Finding the Depth of Your Tree

Using the method of recursively iterating from child to parent in the application layer, counting the number of iterations will yield you the depth from where you started. If you target the leaf nodes, you will be able to come-up with the full-depth of tree. Knowing the depth of the tree reduces the number of recursive iterations that you have to do. Because once determined, you can then apply the self-join method and construct the SQL query based on the depth. You can cache the value as well, so your application doesn't have to do it the next time.

See the following example. The resulting value can be used to produce the number of self-joins required to produce a full tree or a single path.
function getDepth($child_id)
{
   $sql = "SELECT parent_id FROM category WHERE id = $child_id";
   $result = $db->query($sql); //assuming that there exist $db as database adapter
   $depth = 0;
   if ($result) {
    $depth += 1;
    getDepth($result[0]);
   }
   return $depth;
}

There's also a posting in MySQL for implementing the Nested Set Model for hierarchical data sets. Click here if you need more info.
  • Related Links Widget for Blogspot

No comments: