All notes
Algorithms

Trees

Traverse

Binary Search Tree

BST.

Time complexity

Search, Insert and Delete in average O(log n) and worst O(n).

B-tree

Structure

Keys

Each internal node of a B-tree will contain a number of keys. The keys act as separation values which divide its subtrees.

The following is a 3-5 B-tree.The numbers here are the "keys", and the dot denotes the branching point.

3-5 B-tree

Etymology

Rudolf Bayer and Ed McCreight invented the B-tree while working at Boeing Research Labs in 1971.

Donald Knuth speculated that the "B" may have originated from Boeing or from Bayer's name.

Red-Black tree

Intro

A red–black tree is a kind of self-balancing binary search tree.

Each node of the binary tree has an extra bit, and that bit is often interpreted as the color (red or black) of the node.
These color bits are used to ensure the tree remains approximately balanced during insertions and deletions.

When the tree is modified, the new tree is subsequently rearranged and repainted to restore the coloring properties.
The properties are designed in such a way that this rearranging and recoloring can be performed efficiently.

The balancing of the tree is not perfect, but it is good enough to allow it to guarantee the following operations in O(log n) time:

Red–black trees, like all binary search trees, allow efficient in-order traversal.

Space complexity: O(n).

红黑树和AVL树类似。区别在于它使用颜色来标识结点的高度,它所追求的是局部平衡而不是AVL树中的非常严格的平衡。

典型的用途是实现关联数组。在STL中set和multiset都是基于红黑树实现的。

History

  1. In a 1978 paper, "A Dichromatic Framework for Balanced Trees", Leonidas J. Guibas and Robert Sedgewick derived the red-black tree from the symmetric binary B-tree.
  2. In 1993, Andersson introduced the idea of right leaning tree to simplify insert and delete operations.
  3. Sedgewick originally allowed nodes whose two children are red making his trees more like 2-3-4 trees but later this restriction was added making new trees more like 2-3 trees.

Detail

Leaf nodes

The leaf nodes of red–black trees do not contain data.
Usually null child pointer can encode the fact that this child is a leaf. When the leaves really need to be explicit nodes, to save memory, a single sentinel node performs the role of all leaf nodes; all references from internal nodes to leaf nodes then point to the sentinel node.

Properties

说它不严格是因为它不是严格控制左、右子树高度或节点数之差小于等于1。但红黑树高度依然是平均log(n),且最坏情况高度不会超过2log(n),数学证明见下。

A critical property of red–black trees: the path from the root to the farthest leaf is no more than twice as long as the path from the root to the nearest leaf. The result is that the tree is roughly height-balanced.

Proof of critical property

It is guanranteed by properties 4 and 5 together.

  1. For a red–black tree T, let the shortest possible path from the root of T to any leaf consist of B black nodes.
  2. Longer possible paths may be constructed by inserting red nodes. However, property 4 makes it impossible to insert more than one consecutive red node. Therefore, the longest possible path consists of 2*B nodes, alternating black and red (this is the worst case).

The shortest possible path has all black nodes, and the longest possible path alternates between red and black nodes.

Definitions

The number of black nodes from the root to a node is the node's black depth;
The uniform number of black nodes in all paths from root to the leaves is called the black-height of the red–black tree.

Hashing

Consistent hashing

Consistent hashing is a special kind of hashing such that when a hash table is resized, only K/n keys need to be remapped on average, where K is the number of keys, and n is the number of slots.
In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped.
Consistent hashing achieves some of the same goals as Rendezvous hashing (also called HRW Hashing).

The term "consistent hashing" was introduced by Karger et al. at MIT for use in distributed caching.

Recommendations algorithms

Examples include: