# Trees

## Traverse

## Binary Search Tree

BST.

### Time complexity

Search, Insert and Delete in average O(log `n`) and worst O(`n`).

## B-tree

- B-tree is a self-balancing tree that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time O(log
`n`). - B-tree is a generalization of a
*binary search tree*in that a node can have more than two children (within some pre-defined range).- E.g. 2-3 B-tree, each internal node may have only 2 or 3 child nodes.
- Because a range of child nodes is permitted, B-trees do not need re-balancing as frequently as other self-balancing search trees, but may waste some space, since nodes are not entirely full.

- It is commonly used in databases and filesystems.

### Structure

#### Keys

Each internal node of a B-tree will contain a number of keys. The keys act as separation values which divide its subtrees.

The following is a 3-5 B-tree.The numbers here are the "keys", and the dot denotes the branching point.

### Etymology

Rudolf Bayer and Ed McCreight invented the B-tree while working at Boeing Research Labs in 1971.

Donald Knuth speculated that the "B" may have originated from Boeing or from Bayer's name.

## Red-Black tree

### Intro

A red–black tree is a kind of self-balancing binary search tree.

Each node of the binary tree has an extra bit, and that bit is often interpreted as the color (red or black) of the node.

These color bits are used to ensure the tree remains *approximately* balanced during insertions and deletions.

When the tree is modified, the new tree is subsequently rearranged and repainted to restore the coloring properties.

The properties are designed in such a way that this rearranging and recoloring can be performed efficiently.

The balancing of the tree is not perfect, but it is good enough to allow it to guarantee the following operations in O(log n) time:

- searching.
- insertion
- deletion
- the tree rearrangement and recoloring

Red–black trees, like all binary search trees, allow efficient in-order traversal.

Space complexity: O(n).

红黑树和AVL树类似。区别在于它使用颜色来标识结点的高度，它所追求的是局部平衡而不是AVL树中的非常严格的平衡。

典型的用途是实现关联数组。在STL中set和multiset都是基于红黑树实现的。

### History

- In a 1978 paper, "A Dichromatic Framework for Balanced Trees", Leonidas J. Guibas and Robert Sedgewick derived the red-black tree from the symmetric binary B-tree.
- In 1993, Andersson introduced the idea of right leaning tree to simplify insert and delete operations.
- Sedgewick originally allowed nodes whose two children are red making his trees more like 2-3-4 trees but later this restriction was added making new trees more like 2-3 trees.

### Detail

#### Leaf nodes

The leaf nodes of red–black trees do not contain data.

Usually null child pointer can encode the fact that this child is a leaf. When the leaves really need to be explicit nodes, to save memory, a single sentinel node performs the role of all leaf nodes; all references from internal nodes to leaf nodes then point to the sentinel node.

#### Properties

- A node is either red or black.
- The root is black.
- All leaves (NIL) are black.
- (Prop 4) If a node is red, then both its children are black.
- (Prop 5) Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes.
- Requirements imposed on a binary search tree.

说它不严格是因为它不是严格控制左、右子树高度或节点数之差小于等于1。但红黑树高度依然是平均log(n)，且最坏情况高度不会超过2log(n)，数学证明见下。

A critical property of red–black trees: the path from the root to the farthest leaf is no more than twice as long as the path from the root to the nearest leaf. The result is that the tree is roughly height-balanced.

##### Proof of critical property

It is guanranteed by properties 4 and 5 together.

- For a red–black tree T, let the shortest possible path from the root of T to any leaf consist of B black nodes.
- Longer possible paths may be constructed by inserting red nodes. However, property 4 makes it impossible to insert more than one consecutive red node. Therefore, the longest possible path consists of 2*B nodes, alternating black and red (this is the worst case).

The shortest possible path has all black nodes, and the longest possible path alternates between red and black nodes.

#### Definitions

The number of black nodes from the root to a node is *the node's black depth*;

The uniform number of black nodes in all paths from root to the leaves is called the *black-height of the red–black tree*.