LCA Mixin
A mixin that implements Constant-Time Lowest Common Ancestor Retrieval.
“With the ability to solve lowest common ancestor queries in constant time, suffix trees can be used to solve many additional string problems.” — [Gusfield1997] §9, 196
“Definition In a rooted tree
Definition In a rooted tree
- uint(x)
Convert a number into unsigned 32-bit representation.
Used only for debugging.
- nlz(x)
Get the number of leadings zeros in a 32 bit word.
>>> nlz(0) 32 >>> nlz(0x1) 31 >>> nlz(0xFF) 24 >>> nlz(0xFFFFFFFF) 0
See: http://www.hackersdelight.org/hdcodetxt/nlz.c.txt
See: [Warren2013] pages 99ff
- msb(x)
Get the position of the most dignificat bit.
Get the position of the most significant set bit counting from the right and starting from 0.
>>> msb(0xF) 3 >>> msb(0xFF) 7 >>> msb(0) -1
- h(k)
Return the h value.
“Definition For any number
, denotes the position (counting from the right) of the least-significant 1-bit in the binary representation of .” — [Gusfield1997] §8.5, 184ff“Lemma 8.5.1. For any node
(node with path number k) in , equals the height of node in .For example, node 8 (binary 1000) is at height 4, and the path from it to a leaf has four nodes (three edges).” — [Gusfield1997] §8.5, 184ff
N.B. in this implementation we start counting with 0, so you get:
>>> h(5) 0 >>> h(8) 3
- class Node
Mixin for Node to allow LCA retireval.
- __init__()
This is inlined on all subclasses to save one __init__ call per symbol.
- lca_id
Number of the node given in a depth-first traversal of the tree, starting with 1. See [Gusfield1997] Figure 8.1, 182
- I
For a node
of , let be a node in such that is maximum over all nodes in the subtree of (including itself). — [Gusfield1997] §8.5, 184ffFor any node
, node is the deepest node in the run containing node . — [Gusfield1997] Lemma 8.6.1., 187N.B. This is the id of the node
.
- A
Bit
is set to 1 if and only if node has some ancestor in that maps to height in , i.e. if and only if has an ancestor such that . — [Gusfield1997] §8.7, 188fN.B. A node is an ancestor of itself. — [Gusfield1997] §8.1, 181
- compute_A(A)
Compute A.
- compute_I_and_L(L)
Compute I and L.
- prepare_lca(counter)
Prepare the node for LCA retrievals.
- class Leaf
A mixin for leaf nodes to allow for LCA retrievals.
- __str__()
Return str(self).
- compute_A(A)
Compute A.
- compute_I_and_L(L)
Compute I and L.
- prepare_lca(counter)
Prepare the leaf node for LCA retrieval.
- lca_id
Number of the node given in a depth-first traversal of the tree, starting with 1. See [Gusfield1997] Figure 8.1, 182
- I
For a node
of , let be a node in such that is maximum over all nodes in the subtree of (including itself). — [Gusfield1997] §8.5, 184ffFor any node
, node is the deepest node in the run containing node . — [Gusfield1997] Lemma 8.6.1., 187N.B. This is the id of the node
.
- A
Bit
is set to 1 if and only if node has some ancestor in that maps to height in , i.e. if and only if has an ancestor such that . — [Gusfield1997] §8.7, 188fN.B. A node is an ancestor of itself. — [Gusfield1997] §8.1, 181
- class Internal
A mixin for internal nodes to allow for LCA retrievals.
- __str__()
Return str(self).
- compute_A(A)
Compute A.
- compute_I_and_L(L)
Compute I and L.
- prepare_lca(counter)
Prepare the internal node for LCA retrieval.
- lca_id
Number of the node given in a depth-first traversal of the tree, starting with 1. See [Gusfield1997] Figure 8.1, 182
- I
For a node
of , let be a node in such that is maximum over all nodes in the subtree of (including itself). — [Gusfield1997] §8.5, 184ffFor any node
, node is the deepest node in the run containing node . — [Gusfield1997] Lemma 8.6.1., 187N.B. This is the id of the node
.
- A
Bit
is set to 1 if and only if node has some ancestor in that maps to height in , i.e. if and only if has an ancestor such that . — [Gusfield1997] §8.7, 188fN.B. A node is an ancestor of itself. — [Gusfield1997] §8.1, 181
- class Tree(_d)
A mixin for suffix trees to allow for LCA retrievals.
- __init__(_d)
- nodemap
map from string position to leaf node
- prepare_lca()
Preprocess the tree for Lowest Common Ancestor retrieval.
[Gusfield1997] §8.7
- lca(x, y)
Return the lowest common ancestor node of nodes x and y.
>>> from suffix_tree import Tree >>> tree = Tree({"A": "xabxac", "B": "awyawxawxz"}) >>> tree.prepare_lca() >>> tree.lca(tree.nodemap["A"][1], tree.nodemap["B"][3]).lca_id 8