pygtrie¶
pygtrie is a Python library implementing a trie data structure.
Trie data structure, also known as radix or prefix tree, is a tree associating keys to values where all the descendants of a node have a common prefix (associated with that node).
The trie module contains trie.Trie
, trie.CharTrie
and trie.StringTrie
classes each implementing a mutable
mapping interface, i.e. dict
interface. As such, in most
circumstances, trie.Trie
could be used as a drop-in
replacement for a dict
, but the prefix nature of the data
structure is trie’s real strength.
The module also contains trie.PrefixSet
class which uses
a trie to store a set of prefixes such that a key is contained in the
set if it or its prefix is stored in the set.
Features¶
- A full mutable mapping implementation.
- Supports iterating over as well as deleting a subtrie.
- Supports prefix checking as well as shortest and longest prefix look-up.
- Extensible for any kind of user-defined keys.
- A PrefixSet supports “all keys starting with given prefix” logic.
- Can store any value including None.
Installation¶
To install bz2file, run:
pip install pygtrie
Or download the sources and save trie.py
file with your project.
Trie classes¶
-
class
Trie
(*args, **kwargs)[source]¶ A trie implementation with dict interface plus some extensions.
Keys used with the
trie.Trie
must be iterable, yielding hashable objects. In other words, for a given key,dict.fromkeys(key)
must be valid.In particular, strings work fine as trie keys, however when getting keys back from iterkeys() method for example, instead of strings, tuples of characters are produced. For that reason,
trie.CharTrie
ortrie.StringTrie
may be preferred when usingtrie.Trie
with string keys.-
classmethod
fromkeys
(keys, value=None)[source]¶ Creates a new trie with given keys set.
- Args:
- keys: An iterable of keys that should be set in the new trie. value: Value to associate with given keys.
- Returns:
- A new trie where each key from
keys
has been set to the given value.
-
has_node
(key)[source]¶ Returns whether given node is in the trie.
- Args:
- key: A key to look for.
- Returns:
- A bitwise or of
HAS_VALUE
andHAS_SUBTRIE
values indicating that node has a value associated with it and that it is a prefix of another existing key respectively.
-
items
(prefix=<object object>, shallow=False)[source]¶ Returns a list of
(key, value)
pairs in given subtrie.
-
iteritems
(prefix=<object object>, shallow=False)[source]¶ Yields all nodes with associated values with given prefix.
- Args:
prefix: Prefix to limit iteration to. shallow: Perform a shallow traversal, i.e. do not yield items if their
prefix has been yielded.- Yields:
(key, value)
tuples.- Raises:
- KeyError: If
prefix
does not match any node.
-
iterkeys
(prefix=<object object>, shallow=False)[source]¶ Yields all keys having associated values with given prefix.
- Args:
prefix: Prefix to limit iteration to. shallow: Perform a shallow traversal, i.e. do not yield keys if their
prefix has been yielded.- Yields:
- All the keys (with given prefix) with associated values in the trie.
- Raises:
- KeyError: If
prefix
does not match any node.
-
itervalues
(prefix=<object object>, shallow=False)[source]¶ Yields all values associated with keys with given prefix.
- Args:
prefix: Prefix to limit iteration to. shallow: Perform a shallow traversal, i.e. do not yield values if their
prefix has been yielded.- Yields:
- All the values associated with keys (with given prefix) in the trie.
- Raises:
- KeyError: If
prefix
does not match any node.
-
keys
(prefix=<object object>, shallow=False)[source]¶ Returns a list of all the keys, with given prefix, in the trie.
-
longest_prefix
(key)[source]¶ Finds the longest prefix of a key with a value.
- Args:
- key: Key to look for.
- Returns:
(k, value)
wherek
is the longest prefix ofkey
(it may equalkey
) andvalue
is a value associated with that key. If no node is found,(None, None)
is returned.
-
pop
(key, default=<object object>)[source]¶ Deletes value associated with given key and returns it.
- Args:
key: A key to look for. Must be iterable. default: If specified, value that will be returned if given key has no
value associated with it. If not specified, method will throw KeyError in such cases.- Returns:
- Removed value, if key had value associated with it, or
default
(if given). - Raises:
- ShortKeyError: If
default
has not been specified and the key has no - value associated with it but is a prefix of some key with a value.
Note that
ShortKeyError
is subclass ofKeyError
. - KeyError: If default has not been specified and key has no value
- associated with it nor is a prefix of an existing key.
- ShortKeyError: If
-
popitem
()[source]¶ Deletes a value from the trie.
- Returns:
(key, value)
tuple indicating deleted key.- Raises:
- KeyError: If the trie is empty.
-
prefixes
(key)[source]¶ Walks towards the node specified by key and yields all found values.
- Args:
- key: Key to look for.
- Yields:
(k, value)
pairs denoting keys with associated values encountered on the way towards the specified key.
-
setdefault
(key, value)[source]¶ Sets value of a given node if not set already. Returns it afterwards.
-
classmethod
-
class
CharTrie
(*args, **kwargs)[source]¶ A variant of a
trie.Trie
which accepts strings as keys.The only difference between
trie.CharTrie
andtrie.Trie
is that whentrie.CharTrie
returns keys back to the client (for instance in keys() method is called), those keys are returned as strings.
PrefixSet class¶
-
class
PrefixSet
(iterable=None, factory=<class 'trie.Trie'>, **kwargs)[source]¶ A set of prefixes.
trie.PrefixSet
works similar to a normal set except it is said to contain a key if the key or it’s prefix is stored in the set. For instance, if “foo” is added to the set, the set contains “foo” as well as “foobar”.The set supports addition of elements but does not support removal of elements. This is because there’s no obvious consistent and intuitive behaviour for element deletion.
-
add
(key)[source]¶ Adds given key to the set.
If the set already contains prefix of the key being added, this operation has no effect. If the key being added is a prefix of some existing keys in the set, those keys are deleted and replaced by a single entry for the key being added.
For example, if the set contains key “foo” adding a key “foobar” does not change anything. On the other hand, if the set contains keys “foobar” and “foobaz”, adding a key “foo” will replace those two keys with a single key “foo”.
This makes a difference when iterating over the keys or counting number of keys. Counter intuitively, adding of a key can decrease size of the set.
- Args:
- key: Key to add.
-
iter
(prefix=<object object>)[source]¶ Iterates over all keys in the set optionally starting with a prefix.
Since a key does not have to be explicitly added to the set to be an element of the set, this method does not iterate over all possible keys that the set contains, but only over the shortest set of prefixes of all the keys the set contains.
For example, if “foo” has been added to the set, the set contains also “foobar”, but this method will not iterate over “foobar”.
If
prefix
argument is given, method will iterate over keys with given prefix only. The keys yielded from the function if prefix is given does not have to be a subset (in mathematical sense) of the keys yielded when there is not prefix. This happens, if the set contains a prefix of the given prefix.For example, if only “foo” has been added to the set, iter method called with no arguments will yield “foo” only. However, when called with “foobar” argument, it will yield “foobar” only.
-