The bytestring-trie package provides an efficient implementation
of tries mapping ByteString to values. Actually, it provides two
implementations:
The old implementation (prior to version 0.3) is based on Okasaki's
big-endian patricia trees, à la IntMap. We first trie on the
elements of ByteString and then trie on the big-endian bit
representation of those elements. Patricia trees have efficient
algorithms for union and other merging operations, but they're also
quick for lookups and insertions.
The new implementation (since version 0.3) replaces the big-endian
patricia trees with Bagwell's AMTs (array mapped tries), which are
the precursor to the HAMTs used by the hashmap library. This
implementation is still a work in progress, but once completed will
replace the old implementation.
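As a rough illustration of the idea behind array mapped tries, here is a minimal sketch of the usual bitmap-plus-popcount indexing trick. It is not taken from this package's source: the name sparseIndex and the choice of a 32-bit bitmap are assumptions made only for this example. Each node keeps a bitmap recording which child slots are occupied, and stores only the occupied children in a dense array.

```haskell
import Data.Bits (bit, popCount, testBit, (.&.))
import Data.Word (Word32)

-- | Hypothetical helper (not part of bytestring-trie).
-- Given a node's occupancy bitmap and a logical slot number (0..31),
-- return the position of that slot in the dense child array,
-- or Nothing if the slot is empty.
sparseIndex :: Word32 -> Int -> Maybe Int
sparseIndex bitmap slot
  | testBit bitmap slot = Just (popCount (bitmap .&. (bit slot - 1)))
  | otherwise           = Nothing

main :: IO ()
main = do
  -- Bitmap 22 is binary 10110: slots 1, 2 and 4 are occupied.
  print (sparseIndex 22 1)  -- Just 0
  print (sparseIndex 22 4)  -- Just 2
  print (sparseIndex 22 3)  -- Nothing
```

The dense index of a slot is the number of occupied slots below it, which is why a population count over the masked bitmap is enough.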
If you are only interested in being able to associate individual
ByteStrings to values, then you may prefer the hashmap package,
which is faster for those only needing a map-like structure. This
package is intended for those who need the extra capabilities that
a trie-like structure can offer (e.g., structure sharing to reduce
memory costs for highly redundant keys, taking the submap of all
keys with a given prefix, contextual mapping, extracting the minimum
and maximum keys, etc.).
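To make the prefix capability concrete, here is a small sketch using fromList, lookup, submap and keys from Data.Trie. The keys and values are made up for illustration, and the example assumes the bytestring-trie package is installed and that OverloadedStrings is acceptable for writing ByteString literals.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.ByteString (ByteString)
import qualified Data.Trie       as T

-- | A small example trie; the contents are arbitrary sample data.
demo :: T.Trie Int
demo = T.fromList
  [ ("apple",  1)
  , ("apply",  2)
  , ("banana", 3)
  ]

-- | All keys in the trie that start with the given prefix.
keysWithPrefix :: ByteString -> T.Trie a -> [ByteString]
keysWithPrefix prefix = T.keys . T.submap prefix

main :: IO ()
main = do
  -- Ordinary map-like lookup.
  print (T.lookup "apple" demo)       -- Just 1
  -- Trie-specific: restrict to the keys sharing a prefix.
  print (keysWithPrefix "app" demo)   -- ["apple","apply"]
```

A hash-based map can answer the first query, but the second would require scanning every key.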
Note that the GitHub repository is just a clone of the Darcs repo. I'm testing out whether to switch things over to GitHub in order to use TravisCI, an official ticket tracker, and so on.
This is a simple package and should be easy to install. You should be able to use one of the following standard methods to install it.
-- With cabal-install and without the source:
$> cabal install bytestring-trie
-- With cabal-install and with the source already:
$> cd bytestring-trie
$> cabal install
-- Without cabal-install, but with the source already:
$> cd bytestring-trie
$> runhaskell Setup.hs configure --user
$> runhaskell Setup.hs build
$> runhaskell Setup.hs haddock --hyperlink-source
$> runhaskell Setup.hs copy
$> runhaskell Setup.hs register
The Haddock step is optional.
An attempt has been made to keep this library relatively portable. The old implementation is quite portable, relying only on a few basic language extensions. However, the new implementation relies on a number of GHC-specific language extensions (namely MagicHash and UnboxedTuples) in order to fully optimize things. The complete list of extensions used is:
- BangPatterns - BangPatterns are supported in GHC as far back as version 6.6.1, and are also supported by JHC and UHC. As of 2010, they were not supported by Hugs; but alas Hugs is pretty much dead now.
- CPP
- ForeignFunctionInterface
- MagicHash
- NoImplicitPrelude
- Rank2Types
- UnboxedTuples