Library tutorials & articles

Tree structures in ASP.NET and SQL Server

Populating the TreeNode Children

The last thing we'll cover is actually re-creating our exact structure by populating the Children collection of our TreeNode. As you've seen - in many situations you don't actually need to do this - but it's still a very useful thing to understand.

We can use a similar method that we used in dfTreeGetSubChildren (except with less restrictions) to fetch all rows of a tree (underneath a specified root node). It will then be up to our .NET class to actually create the appropriate structure of TreeNode objects.

CREATE PROCEDURE dfTreeGetTree ( @id INT ) AS
SELECT * FROM dfTree
    WHERE lineage LIKE (SELECT lineage
        FROM dfTree
        WHERE id = @id) + '%'
ORDER BY lineage, name

As before, we're going to ensure that when we query the database, the rows are sorted in ascending order by the lineage column. This makes things much easier to deal with, as it will mean that

  • The root of the tree we're fetching will be the first row in the resultset (because it will have the shortest lineage value).
  • The rows will be ordered such that if the depth column increases in value, then we'll be dealing with children of the last node we saw, and as such the depth will only have increased by one.

With this knowledge in hand, we can devise an algorithm using a stack in order to create the TreeNode's we need. Let's also suppose we have a tree and a corresponding resultset that looks something like the ones below - with any luck we'll then be able to relate the steps in the algorithm using this concrete example.

A diagram displaying the structure of the table, with these fields: id (int, primary key), parentId (int), name (varchar), depth (int), lineage (varchar).

id parentId name depth lineage
1 NULL Root Node 0 /1/
2 1 A (Child of Root) 1 /1/2/
3 2 B (Child of A) 2 /1/2/3/
4 3 C (Child of B) 3 /1/2/3/4/
5 2 D (Child of Root) 1 /1/5/

One of our requirements was that the first row in the result set is the root of the tree- so we can use our existing TreeNodeFromDataReader function to create our root TreeNode object - this will represent the row with id 1 in the table above. In addition, a variable currentNode is set to this node, a currentDepth variable initialized to zero, and a nodes variable initialized to an empty stack.

  • Now, for each subsequent row in the resultset, we'll be adding the rows to the Children property of whatever TreeNode object is at the top of the stack.

    Obviously, on the first iteration we don't have anything in the stack, so we're going to need to consider some cases before trying to do this:
    • if the depth of the row we're looking at has increased since the last iteration (which it certainly will have in the first iteration), we now know that we're dealing with a child of the current node. Therefore, we add currentNode to the stack to save its current value (and so that the top of the stack contains the thing we want to add children to), and increment our currentDepth variable by one.
    • if the depth of the row has decreased then we know we've finished with the children of whatever was at the top of the stack. We now need to chuck items back off the stack so we can get back to the correct element for adding children to - and we simply chuck off the amount the depth has decreased by.

      For example, suppose using the resultset listed above we have reached node D (row 5). At this point, currentDepth is 3, currentNode points to the row with id 4, and the nodes stack will contain the root node, node A, node B and node C. The difference between currentDepth and the depth of node D is 2 - so we pop 2 items off the stack - leaving us with node A at the top - which is exactly the node we want to add D to.
  • We then set currentNode to be a new TreeNode representing the node we've just read from the data reader - and add it to the children collection of whatever is at the top of the stack.

Once we've run out of items, the original TreeNode will contain all appropriate children - each with their own children populated. The code is below.

protected TreeNode PopulateTreeNodesFromReader(SqlDataReader dr)
{
    //Stack<TreeNode> nodes = new Stack();
    Stack nodes = new Stack();
    // if we have no rows, then we don't have a tree!
    if (!dr.Read())
        return null;
    TreeNode rootNode = TreeNodeFromDataReader(dr);
   
    // we're currently at the root, so the
    // depth is zero
    int currentDepth = 0;
    TreeNode currentNode = rootNode;
    // now that we've got a "root" of our tree,
    // start populating its children
    while (dr.Read())
    {
        if ((int)dr["depth"] > currentDepth)
        {
            // assert dr["depth"] = currentDepth + 1
            // assert currentCategory!=null
            // we now have a child of the last category
            nodes.Push(currentNode);
            // set the current depth
            currentDepth = (int)dr["depth"];
        }
        else if ((int)dr["depth"] < currentDepth)
        {
            // pop off appropriate number of items from stack
            while ((int)dr["depth"] < currentDepth)
            {
                --currentDepth;
                nodes.Pop();
            }
        }
        currentNode = TreeNodeFromDataReader(dr);
        ((TreeNode)nodes.Peek()).Children.Add(currentNode);
    }
   
    nodes.Clear();
    // return the children
    return rootNode;
}

So, to get a TreeNode representing the a tree structure in the database, we can use this method:

public TreeNode GetTree(int uniqueID)
{
    return ProcessTree("dfTreeGetTree",
        new SqlParameter("id",uniqueID));
}

This TreeNode could then be passed to your business logic layer - for example - whilst totally abstracting the underlying representation with which you used to store the tree. In fact - we could modify all our existing calls to ProcessList (used by GetChildren, GetPath etc) and replace them with ProcessTree - changing the return type from ArrayList to TreeNode, you'll then get the expected structure as a TreeNode with populated Children rather than simply a list.

Conclusion

Hopefully by now you've got a much better idea of how you can deal with tree structures - and how you could create a web directory-like system in ASP.NET using the code provided here. I've also included a demonstration project available for download.

If you've got any questions - or think something in the article needs clarifying, then just drop me a line!

Comments

  1. 12 May 2009 at 12:46

    i think this is missing SP:

    create PROCEDURE [dbo].[dfTreeGetNode]
        (
        	@id INT
        )
    AS
    
    SELECT * FROM dfTree WHERE id=@id
    
  2. 12 May 2009 at 12:41

    hey guys nice article. i have problem adn i think you forgot one Sp: Could not find stored procedure 'dfTreeGetNode'

    any help?

  3. 23 Sep 2008 at 13:53

     Hi James,
    I have read your article in developer fusion about Tree structures in asp.net and sql server and I have successfully implemented into some diffuicult task.So,thank you about it...
    However,I would like to ask what are the words child,old and parent and how Sql server knows about them?
     
    Thanks in advance,
    Kostis.

  4. 11 Mar 2007 at 14:56
    Here is how id did it to use with asp:tree:

    add this methods to the TreeNode class

    public string GetXml
        {
            get
            {
                XmlDocument xDoc = new XmlDocument();
                XmlElement root = (XmlElement) xDoc.AppendChild(xDoc.CreateElement("node"));
                root.SetAttribute("id", this.UniqueID.ToString());
                root.SetAttribute("name", this.Name);
                foreach (TreeNode tn in this.Children)
                {
                    AppendChildren(root, tn);
                }
                return xDoc.OuterXml;
            }
        }

        private void AppendChildren(XmlElement root, TreeNode tn)
        {
            XmlElement node = (XmlElement)root.AppendChild(root.OwnerDocument.CreateElement("node"));
            node.SetAttribute("id", tn.UniqueID.ToString());
            node.SetAttribute("name", tn.Name);
            foreach (TreeNode child in tn.Children)
            {
                AppendChildren(node, child);
            }
        }

    the xml has all the info i need for the tree control, feel free to change it to your needs
    hope this helps

































  5. 24 Apr 2006 at 11:44

    Thanks this article has been a great help, I have one question.

    How hard would it be to get the tree from SQL in XML so that you could use ASP.net 2 tree control?

    Regards Geraint

  6. 04 Nov 2005 at 05:13
    Here is my standard "cut & paste" on the Nested sets model for hierarchies.

    There are many ways to represent a tree or hierarchy in SQL.  This is called an adjacency list model and it looks like this:

    CREATE TABLE OrgChart
    (emp CHAR(10) NOT NULL PRIMARY KEY,
     boss CHAR(10) DEFAULT NULL REFERENCES OrgChart(emp),
     salary DECIMAL(6,2) NOT NULL DEFAULT 100.00);

    OrgChart
    emp       boss      salary
    ===========================
    'Albert'  NULL    1000.00
    'Bert'    'Albert'   900.00
    'Chuck'   'Albert'   900.00
    'Donna'   'Chuck'    800.00
    'Eddie'   'Chuck'    700.00
    'Fred'    'Chuck'    600.00

    Another way of representing trees is to show them as nested sets.

    Since SQL is a set oriented language, this is a better model than the usual adjacency list approach you see in most text books. Let us define a simple OrgChart table like this.

    CREATE TABLE OrgChart
    (emp CHAR(10) NOT NULL PRIMARY KEY,
     lft INTEGER NOT NULL UNIQUE CHECK (lft > 0),
     rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1),
     CONSTRAINT order_okay CHECK (lft < rgt) );

    OrgChart
    emp         lft rgt
    ======================
    'Albert'      1   12
    'Bert'        2    3
    'Chuck'       4   11
    'Donna'       5    6
    'Eddie'       7    8
    'Fred'        9   10

    The organizational chart would look like this as a directed graph:

               Albert (1, 12)
               /        \
             /            \
       Bert (2, 3)    Chuck (4, 11)
                      /    |   \
                    /      |     \
                  /        |       \
                /          |         \
           Donna (5, 6) Eddie (7, 8) Fred (9, 10)

    The adjacency list table is denormalized in several ways. We are modeling both the Personnel and the organizational chart in one table. But for the sake of saving space, pretend that the names are job titles and that we have another table which describes the Personnel that hold those positions.

    Another problem with the adjacency list model is that the boss and employee columns are the same kind of thing (i.e. names of personnel), and therefore should be shown in only one column in a normalized table.  To prove that this is not normalized, assume that "Chuck" changes his name to "Charles"; you have to change his name in both columns and several places. The defining characteristic of a normalized table is that you have one fact, one place, one time.

    The final problem is that the adjacency list model does not model subordination. Authority flows downhill in a hierarchy, but If I fire Chuck, I disconnect all of his subordinates from Albert. There are situations (i.e. water pipes) where this is true, but that is not the expected situation in this case.

    To show a tree as nested sets, replace the nodes with ovals, and then nest subordinate ovals inside each other. The root will be the largest oval and will contain every other node.  The leaf nodes will be the innermost ovals with nothing else inside them and the nesting will show the hierarchical relationship. The (lft, rgt) columns (I cannot use the reserved words LEFT and RIGHT in SQL) are what show the nesting. This is like XML, HTML or parentheses.

    At this point, the boss column is both redundant and denormalized, so it can be dropped. Also, note that the tree structure can be kept in one table and all the information about a node can be put in a second table and they can be joined on employee number for queries.

    To convert the graph into a nested sets model think of a little worm crawling along the tree. The worm starts at the top, the root, makes a complete trip around the tree. When he comes to a node, he puts a number in the cell on the side that he is visiting and increments his counter.  Each node will get two numbers, one of the right side and one for the left. Computer Science majors will recognize this as a modified preorder tree traversal algorithm. Finally, drop the unneeded OrgChart.boss column which used to represent the edges of a graph.

    This has some predictable results that we can use for building queries.  The root is always (left = 1, right = 2 * (SELECT COUNT(*) FROM TreeTable)); leaf nodes always have (left + 1 = right); subtrees are defined by the BETWEEN predicate; etc. Here are two common queries which can be used to build others:

    1. An employee and all their Supervisors, no matter how deep the tree.

    SELECT O2.*
      FROM OrgChart AS O1, OrgChart AS O2
     WHERE O1.lft BETWEEN O2.lft AND O2.rgt
       AND O1.emp = :myemployee;

    2. The employee and all their subordinates. There is a nice symmetry here.

    SELECT O1.*
      FROM OrgChart AS O1, OrgChart AS O2
     WHERE O1.lft BETWEEN O2.lft AND O2.rgt
       AND O2.emp = :myemployee;

    3. Add a GROUP BY and aggregate functions to these basic queries and you have hierarchical reports. For example, the total salaries which each employee controls:

    SELECT O2.emp, SUM(S1.salary)
      FROM OrgChart AS O1, OrgChart AS O2,
           Salaries AS S1
     WHERE O1.lft BETWEEN O2.lft AND O2.rgt
       AND O1.emp = S1.emp
     GROUP BY O2.emp;

    4. To find the level of each emp, so you can print the tree as an indented listing.  Technically, you should declare a cursor to go with the ORDER BY clause.

    SELECT COUNT(O2.emp) AS indentation, O1.emp
      FROM OrgChart AS O1, OrgChart AS O2
     WHERE O1.lft BETWEEN O2.lft AND O2.rgt
     GROUP BY O1.lft, O1.emp
     ORDER BY O1.lft;

    5. The nested set model has an implied ordering of siblings which the adjacency list model does not. To insert a new node, G1, under part G.  We can insert one node at a time like this:

    BEGIN ATOMIC
    DECLARE rightmost_spread INTEGER;

    SET rightmost_spread -- can be put into the UPDATE
       = (SELECT rgt
            FROM Frammis
           WHERE part = 'G');
    UPDATE Frammis
      SET lft = CASE WHEN lft > rightmost_spread
                     THEN lft + 2
                     ELSE lft END,
          rgt = CASE WHEN rgt >= rightmost_spread
                     THEN rgt + 2
                     ELSE rgt END
    WHERE rgt >= rightmost_spread;

    INSERT INTO Frammis (part, lft, rgt)
    VALUES ('G1', rightmost_spread, (rightmost_spread + 1));
    COMMIT WORK;
    END;

    The idea is to spread the (lft, rgt) numbers after the youngest child of the parent, G in this case, over by two to make room for the new addition, G1.  This procedure will add the new
  7. 03 Nov 2005 at 23:48
    Okay, point taken ... but as far as I'm aware if you moved the code I talk about out of triggers just into stored procs (or whatever), then the SQL is pretty standard stuff?

    I know that you're a very well respected author on this area, and I'm still *very* new to this, but I had difficulty with the fact that it was so relatively "hard" to work out things like who a nodes parent is, and what its depth/level was with your solutions.... am I missing something obvious here? Thanks for your time!
  8. 03 Nov 2005 at 20:06
    Yoiu can do your heirarchies & trees in pure declarative SQL without using any proprieary 4GL like PL/SQL,  T-SQL, etc.
  9. 03 Nov 2005 at 15:14
    Unless I've misunderstood you... the point of the lineage field was not to soley indicate its parent, but the entire path up the tree, so we could easily perform queries against this... i'm not sure how easy the same queries would be to perform using the method you describe?
  10. 03 Nov 2005 at 15:12
    It's still there - you just need to change the drop down on the right hand side to extend the range of old posts that are displayed.
  11. 03 Nov 2005 at 15:11
    Hi Joe - thanks for posting! What do you mean by "proprietary" code exactly? I know this isn't the same method that you recommend (which I actually didn't come across until after I wrote this article)... but it's all still just SQL stuff....
  12. 22 Sep 2005 at 19:20

    Hasn't anyone read a copy of TREES & HIERARCHIES IN SQL?  There are sooo many better ways of doing this without any proprietary code.


  13. 19 Sep 2005 at 16:50

    Why was my post about making the database and code more efficient removed?

  14. 29 Apr 2005 at 09:54

    good article but wanted to know which of the two ways are more optimized
    using xml or the example you have given

  15. 22 Mar 2005 at 08:08
    HI. Is there a way to sort the Hierarchies? You used something like this:
    ''''''''''''''''''''''''''
    CREATE PROCEDURE dfTreeGetSubChildren ( @id INT, @depth INT ) AS
    SELECT t.* FROM dfTree AS children INNER JOIN dfTree AS actualNode
       ON children.lineage LIKE actualNode.lineage + '_%'
       WHERE actualNode.id = @id
    ORDER BY children.lineage, children.name
    '''''''''''''''''''''''''

    Since this do order by children.lineage, children.name, and the children.lineage is unique, the children.name order will not run.
    Is there a way to overcome this? Example, sort by the start of lineage except the last id /1/3/5 --> /1/3 this way we could sort by this field, cobined with name. But I do not know how to overcome this with T-SQL.

    I would appreciate if anyone could help me.

    Rgs
    Vidar
  16. 19 Feb 2005 at 18:33

    It's very performant.  Feel free to try it with a 10000 node, 10 level deep tree.  Works fine.

  17. 18 Feb 2005 at 10:19

    I'm not sure this is actually much better performance wise...


    Yes, the method I suggest requires some overhead when adding to the tree - but SELECTing from the tree is fast. Looking at the recursive function there, with multiple SELECT and DELETE statements... just to retrieve the tree, doesn't look especially nice to me when compared to


    SELECT t.* FROM dfTree t ORDER BY t.lineage


    ...

  18. 16 Feb 2005 at 20:34

    James:


    That was good reading!


    Good Job!

  19. 15 Feb 2005 at 20:27
    A much better way of doing this is to use a stack:

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/acdata/ac_8_qd_14_5yk3.asp

    No triggers needs, no column needed to track depth...
  20. 31 Jan 2005 at 11:11

    The problem is fixed. Change the line in the UPDATE trigger that reads


    INNER JOIN deleted old ON child.lineage LIKE old.lineage + '%'


    to


    INNER JOIN inserted old ON child.lineage LIKE old.lineage + '%'


    I've updated the download.

  21. 27 Jan 2005 at 18:20

    Strange. I'll look into that and let you know....

  22. 27 Jan 2005 at 18:19

    Hello again,


    I was playing around with the updating trigger and I uncovered another little problem.


    It seems that I need to run the UPDATE statement twice in order to get the trigger to execute properly and update the depth and lineage columns.


    I have been playing around with the trigger to see if I could correct this, but haven't come up wiht anything yet.


    I guess its not a big deal as it isnt that hard to just run the statement twice, but do you have any ideas on how to get it working in a single pass?


    Thanks,
    Max

  23. 27 Jan 2005 at 17:01

    huh... isn't that interesting. Firefox WAS caching the zip I guess. Who knows... but I fired up Internet Explorer and I got the proper version. Thanks for the fixes, everything worked straight off when I fired it up this time. ;-)


    I do have a question though. In the original version of the zip file, you used  the computed columns. Intuitively it makes sense to not have to comput this each time we touch a row in the DB (as you say in the article), but have you done any tests (or heard anything back from anyone) as to the performance gain by calculating this via the triggers as opposed to the computed column approach.


    I'm just interested in any different thinking on the idea. Like I said before I've never really worked with computed columns before and while they don't really strike me as a good idea in general. better to do somehting once the first time, rather than recomputer on each access. I was wondering if you might know of any circumstances where computed columns would be the best approach (besides the obvious storage space considerations).


    Again, great article. I have done a bunch of different projects with hierarchies like this, and this article has a lot of good ideas for dealing with this kind of data.


    Thanks,
    Max

  24. 27 Jan 2005 at 15:51

    Hi - these problems have been fixed in the new ZIP file I uploaded. Are you sure your browser is not caching the old ZIP file?

  25. 27 Jan 2005 at 15:34

    Hi, I still have all the same problems with the sample code.


    TREEDEMOA:




    There is a missing Button for "CreateNodes" on the skin for TreeDemoA.



    TREEDEMOD:




    Get SQL error for unknown "id" parameter on execution of an SPROC.


    I looked into this and the problem is in SqlServerTreeProvider.cs


    There are a bunch of places where the SQL Params do not have the @ before the parameter names. so the "id" should be "@id"


    After fixing these too, I can get TreeDemoD to run.



    SQL SCRIPT:




    I get the following error on the execution:


    Server: Msg 271, Level 16, State 1, Procedure dfTree_UpdateTrigger, Line 10
    Column 'depth' cannot be modified because it is a computed column.


    It is not creating the UpdateTrigger. The problem here is that the Computed Column seems to be a late addition to the design and the code that originally updated the "depth" column in the UpdateTrigger has not been removed from the UpdateTrigger (corresponding code WAS removed from the InsertTrigger)



    Also, there is a reference in the file to the SPROC dfTreeGetValidParents, but that SPROC is not included in the SQL Script


           public ArrayList GetValidParents(int rootID, int uniqueID)
           {
               return ProcessList("dfTreeGetValidParents",
                   new SqlParameter("@rootId",rootID),
                   new SqlParameter("@id",uniqueID));
           }



    If you could update the Zip package with these changes it be most helpful for others trying the download.


    Again, very interesting article, and now I'm going to dive in and see how I can use the ideas and techniques in my projects. javascript:smilie('')
    smile


    Thanks,
    Max

  26. 27 Jan 2005 at 12:08

    Hey,


    Apologies! Thanks for pointing that out - I think I've fixed the sample download now - if you could try downloading it again, and let me know how it goes, that would be great


    Cheers,


    ~ James

  27. 26 Jan 2005 at 21:56

    Hi, very interesting article.


    I have run into a number of problems wiht the sample source code though.


    1. The SQL Statement will not fully execute. I get the following error:


    Server: Msg 271, Level 16, State 1, Procedure dfTree_UpdateTrigger, Line 10
    Column 'depth' cannot be modified because it is a computed column.


    2. The createNodes button was not on the page for TreeDemoA.



    3. when I try to run TreeDemoD I get the following error:


    id is not a parameter for procedure dfTreeGetTree.


    I don't know if this is due to the SQL statement failing or what.


    Any advice on getting the code running properly would be much appreciated.


    Thanks,
    Max

  28. 04 Jan 2005 at 17:11

    First of all, great article.  I agree with it and would like your feedback in regards to simplifying your approach on the node levels.  You use the 'lineage' varchar field to track where a node is in regards to order and depth.  Considering you have already created a depth field, and you have a unique node (somewhere in the structure), you have 2 out of the 3 pieces that you need to know where the node is.  You are simply missing the placement of the node at its depth.   So, why wouldn't you use an int field named NodeOrder instead of the lineage field?  This would help the doubling of data, and would be easier to maintain.  There are many similar fields in other databases that you could rob code from.  Just look for databases that use the 'sortorder' field.  By using a sort order index, you can fetch for a particular node, know its unique id, parent id, its depth and at what placement the node will sit.  and furthermore, you will know this for each node you retrieve in your select, which will in effect, create the /1/2/6/ .  The slash being the depth, and the number being the position and the unique id is in behind the postition number.... in other words (/position(unique_id)/)


    What do you think of this?


    Regards,


    Jay

  29. 01 Jan 1999 at 00:00

    This thread is for discussions of Tree structures in ASP.NET and SQL Server.

Leave a comment

Sign in or Join us (it's free).

James Crowley James first started this website when learning Visual Basic back in 1999 whilst studying his GCSEs. The site grew steadily over the years while being run as a hobby - to a regular monthly audience ...
AddThis

Related podcasts

Events coming up

  • Nov 19

    SQLBits V

    Newport, United Kingdom

    SQLBits is Europe's largest SQL Server conference, and SQLBits V will be the biggest and best yet. On November 19th we are holding a day of pre-conference seminars; on November 20th we have a pay-to-attend day of SQL Server 2008 and R2 content; and on Saturday November 21st we have our usual free community conference.

We'd love to hear what you think! Submit ideas or give us feedback