
Fixed some English mistakes and added the files BMZ.t2t, CZECH.t2t and COMPARISON.t2t

fc_botelho 2005-01-24 18:15:50 +00:00
parent 783e633b6c
commit e5f0aef11c
4 changed files with 152 additions and 50 deletions

26
BMZ.t2t Normal file

@ -0,0 +1,26 @@
BMZ Algorithm
----------------------------------------
**History**
**The Algorithm**
**The Heuristic**
**Papers**
----------------------------------------
[Home README.html]
----------------------------------------
Enjoy!
Davi de Castro Reis
Fabiano Cupertino Botelho
%preproc(html): '^%html% ' ''
%html% <a href="http://sourceforge.net"><img src="http://sourceforge.net/sflogo.php?group_id=96251&amp;type=1" width="88" height="31" border="0" alt="SourceForge.net Logo" /></a>
Last Updated: %%date(%c)


27
COMPARISON.t2t Normal file

@ -0,0 +1,27 @@
Comparison Between BMZ And CZECH Algorithms
----------------------------------------
**Features**
**Constructing Minimal Perfect Hash Functions**
**Memory Consumption**
**Run times**
----------------------------------------
[Home README.html]
----------------------------------------
Enjoy!
Davi de Castro Reis
Fabiano Cupertino Botelho
%preproc(html): '^%html% ' ''
%html% <a href="http://sourceforge.net"><img src="http://sourceforge.net/sflogo.php?group_id=96251&amp;type=1" width="88" height="31" border="0" alt="SourceForge.net Logo" /></a>
Last Updated: %%date(%c)

24
CZECH.t2t Normal file

@ -0,0 +1,24 @@
CZECH Algorithm
----------------------------------------
**History**
**The Algorithm**
**Papers**
----------------------------------------
[Home README.html]
----------------------------------------
Enjoy!
Davi de Castro Reis
Fabiano Cupertino Botelho
%preproc(html): '^%html% ' ''
%html% <a href="http://sourceforge.net"><img src="http://sourceforge.net/sflogo.php?group_id=96251&amp;type=1" width="88" height="31" border="0" alt="SourceForge.net Logo" /></a>
Last Updated: %%date(%c)

View File

@ -1,41 +1,65 @@
-== cmph - C Minimal Perfect Hashing Library ==
+CMPH - C Minimal Perfect Hashing Library
+----------------------------------------
 **Description**
 C Minimal Perfect Hashing Library is a portable LGPLed library to create and
-work with minimal perfect hashes. The cmph library encapsulates the newest
-and more efficient algorithms in the literature in a ease-to-use,
-production-quality, fast API. The library is designed to work big entries that
-won't fit in the main memory. It has been used successfully to create hashes
-bigger than 100 million entries. Although there is a lack of similar libraries
-in the free software world, we can point out some of the "distinguishing"
+to work with minimal perfect hashing functions. The cmph library encapsulates the newest
+and more efficient algorithms (available in the literature) in an easy-to-use,
+production-quality and fast API. The library is designed to work with big entries that
+can not be fit in the main memory. It has been used successfully for constructing minimal perfect
+hashing functions for sets with more than 100 million of keys.
+Although there is a lack of similar libraries
+in the free software world, we can point out some of the distinguishable
 features of cmph:
-- Fast
-- Space-efficient with main memory usage carefully documented
-- The best modern algorithms are available (or at least scheduled for implementation :-))
-- Works with in-disk key sets through use of adapter pattern
-- Serialization of hash functions
-- Portable C code (currently works on GNU/Linux and WIN32)
-- Object oriented implementation
-- Easily extensible
-- Well encapsulated API aiming binary compatibility through releases
-- Free Software
+- Fast.
+- Space-efficient with main memory usage carefully documented.
+- The best modern algorithms are available (or at least scheduled for implementation :-)).
+- Works with in-disk key sets through of using the adapter pattern.
+- Serialization of hash functions.
+- Portable C code (currently works on GNU/Linux and WIN32).
+- Object oriented implementation.
+- Easily extensible.
+- Well encapsulated API aiming binary compatibility through releases.
+- Free Software.
+----------------------------------------
+**Supported Algorithms**
+- [BMZ Algorithm BMZ.html]. A very fast algorithm based on cyclic random graphs to construct minimal
+perfect hash functions in linear time. The resulting functions are not order preserving and
+can be stored in only 4cn bytes, where c is between 0.93 and 1.15.
+- [CZECH Algorithm CZECH.html]. An algorithm based on acyclic random graphs to construct minimal
+perfect hash functions in linear time. The resulting functions are order preserving and
+are stored in 4cn bytes, where c is greater than 2.
+[Click Here COMPARISON.html] to see a comparison of the supported algorithms.
+----------------------------------------
 **News for version 0.3**
-- New heuristics in bmz algorithm, providing hash creation with only
-(0.93 * 16 + 4)*n bytes and hash query with (0.93*4)n bytes
+- New heuristic added to the bmz algorithm permits to generate a mphf with only
+(xxx)*n bytes. The resulting function can be stored in (0.93*4)n bytes.
+[click here BMZ.html] for details.
+----------------------------------------
 **Examples**
-Using cmph is quite ease. Take a look.
+Using cmph is quite simple. Take a look.
 ```
-// Create minimal perfect hash from in-memory vector
+// Create minimal perfect hash function from in-memory vector
 #include <cmph.h>
 ...
@ -44,7 +68,7 @@ Using cmph is quite ease. Take a look.
 //Fill vector
 //...
-//Create minimal perfect hash
+//Create minimal perfect hashing function using the default(czech) algorithm.
 cmph_config_t *config = cmph_config_new(cmph_io_vector_adapter(vector, nkeys));
 cmph_t *hash = cmph_new(config);
 cmph_config_destroy(config);
@ -59,7 +83,7 @@ Using cmph is quite ease. Take a look.
 -------------------------------
 ```
-// Create minimal perfect hash from in-disk keys using BMZ algorithm
+// Create minimal perfect hash function from in-disk keys using BMZ algorithm
 #include <cmph.h>
 ...
@ -83,18 +107,18 @@ Using cmph is quite ease. Take a look.
 ```
 --------------------------------------
 **The cmph application**
 cmph is the name of both the library and the utility
 application that comes with this package. You can use the cmph
-application to create minimal perfect hashes from command line. The cmph utility
-comes with a number of flags, but it is very simple to create and query
-minimal perfect hashes:
+application for constructing minimal perfect hashing functions from the command line.
+The cmph utility
+comes with a number of flags, but it is very simple to create and to query
+minimal perfect hashing functions:
 ```
-$ # Create mph for keys in file keys_file
-$ ./cmph keys_file
+$ # Using the czech algorithm (default one) for constructing a mphf for keys in file keys_file
+$ ./cmph -g keys_file
 $ # Query id of keys in the file keys_query
 $ ./cmph -m keys_file.mph keys_query
 ```
@ -105,34 +129,35 @@ utility.
 ```
-usage: cmph [-v] [-h] [-V] [-k] [-g [-s seed] ] [-m file.mph] [-a algorithm] keysfile
+usage: cmph [-v] [-h] [-V] [-k nkeys] [-f hash_function] [-g [-c value][-s seed] ] [-m file.mph] [-a algorithm] keysfile
 Minimum perfect hashing tool
 -h print this help message
 -c c value that determines the number of vertices in the graph
 -a algorithm - valid values are
-* czech
-* bmz
+* bmz
+* czech
 -f hash function (may be used multiple times) - valid values are
-* jenkins
-* djb2
-* sdbm
-* fnv
-* glib
-* pjw
+* djb2
+* fnv
+* glib
+* jenkins
+* pjw
+* sdbm
 -V print version number and exit
 -v increase verbosity (may be used multiple times)
 -k number of keys
 -g generation mode
 -s random seed
 -m minimum perfect hash function file
 keysfile line separated file with keys
 ```
 **Downloads**
 Use the project page at sourceforge: http://sf.net/projects/cmph
 **License Stuff**
 Code is under the LGPL.