Friday 29 November 2013

Understanding Splay Trees

Splay Trees are an example of a binary search tree, which is a data structure for arranging and sorting data, which is sorted into nodes. They make good use of pointers. Splay Trees share the same structure as a standard binary search tree, but provide benefits such as the most accessed node always being the root or top of the tree. There three standard algorithms for splaying the node to the root of the tree, and these depend upon the positioning and depth of the node within the Splay Tree.

Splay Trees are used in IFS (Installable File Systems).

The length of the time taken to complete a algorithm for a Splay Tree, can be calculated with the Big O notation. There's no randomisation within Splay Trees, and therefore some operations can be expensive and time consuming, this is especially true when Splay Trees become unbalanced and very tall. It could take many Zig operations to splay a node to the root of the tree. Each rotation is called a Splay Step, and the process is called Splaying.

Here is an example of a very tall Splay Tree:



Let's discuss the three algorithms, used to search or walk through a Splay Tree, and then Splay this tree to rebalance it.

Zig Zag Algorithm:

The algorithm, begins like a ordinary binary search tree, you start at the root, and then begin to walk down the tree, deciding if you take the left child or the right child of the current grandparent (root), depending upon the data sorted within the node your looking for. You continue this process, until you come across the node and the key your looking for, or until you reach a dead end, since the node may not be sorted within the search tree.

 The general rule here, is if x (node) which we are splaying up the tree, is the left child of right child, or the right child of the left child. It's easier to understand it, in a picture or in a video.


Zig Zig Algorithm:

The node (x) is left child of the left child, or right child of the right child. The Parent is rotated through the Grandparent, before the splay of the node takes place. The Grandparent is the root of the tree, and the Parent is the child which the node is the child of: right child of the right child.


Zig Algorithm:

This is the most easy algorithm, the node (x) is the child of the Grandparent or the root.



Splay Tree Node Insertion:

Now, let's discuss the Insertion and Deletion of nodes within a Splay Tree. The general rules apply to a binary search tree, and since a Splay Tree is a binary search tree, it inherits these properties. You can use your own defined relationships (properties) with the insertion of new nodes.
  • The left subtree of a node is less than the key stored in the Parent node.
  • The right subtree of a node is greater than the key stored in the Parent node.
  • There can't be any duplicate nodes. 
Remember the Parent Node, would the node above the subtree. All subtrees must be binary search trees too.  A key typically refers to the data stored within the tree.


This Splay Tree contains one node, with a data value of 60. It is the Root Node, and as a result will not have any associated Parents, but it doesn't have any children either. Until, we insert another node into the Splay Tree. RtlIsRoot can be called to decide if the current node is the root node, or techically the Grandparent node. Each node contains usually three different pointers. The Parent, Left Child and Right Child.

RtlSplay is used to call, and add a node to the Splay Tree, and rebalance the Splay Tree, so the node becomes the root.


This process can continue with the adding of new nodes. Remember you will need to call
RtlInitializeSplayLinks, to create a new link between the newly inserted node and the original node, before even adding the node to the tree and calling other routines.

Splay Tree Node Deletion:

A node can be deleted with the RtlDelete, which deletes a node and then splays the tree. You can specify not to re-balance the tree, with the RtlDeleteNoSplay.


Synchronization in Splay Trees:

 This really depends upon the IRQL Level, if your going to need to raise the IRQL Level to greater or to DISPATCH_LEVEL (IRQL Level 2), then spinlocks must be used along with non-paged pool. Otherwise, if lower than DISPATCH_LEVEL, then general dispatcher objects can be used along with paged pool. 

References: 

Kernel Mode Basics: Splay Trees

Binary Search Trees

Binary Search Trees - Wikipedia

Splay Tree - Wikipedia

All the routines should be fully documented within the WDK. 

3 comments: