Fixed some English mistakes and added the files BMZ.t2t, CZECH.t2t and COMPARISON.t2t
parent 783e633b6c
commit e5f0aef11c
26
BMZ.t2t
Normal file
@@ -0,0 +1,26 @@
+BMZ Algorithm
+
+
+----------------------------------------
+
+**History**
+
+**The Algorithm**
+
+**The Heuristic**
+
+**Papers**
+
+----------------------------------------
+[Home README.html]
+----------------------------------------
+Enjoy!
+
+Davi de Castro Reis
+
+Fabiano Cupertino Botelho
+
+
+%preproc(html): '^%html% ' ''
+%html% <a href="http://sourceforge.net"><img src="http://sourceforge.net/sflogo.php?group_id=96251&type=1" width="88" height="31" border="0" alt="SourceForge.net Logo" /></a>
+Last Updated: %%date(%c)
27
COMPARISON.t2t
Normal file
@@ -0,0 +1,27 @@
+Comparison Between BMZ And CZECH Algorithms
+
+
+----------------------------------------
+
+**Features**
+
+**Constructing Minimal Perfect Hash Functions**
+
+**Memory Consumption**
+
+
+**Run times**
+
+----------------------------------------
+[Home README.html]
+----------------------------------------
+Enjoy!
+
+Davi de Castro Reis
+
+Fabiano Cupertino Botelho
+
+
+%preproc(html): '^%html% ' ''
+%html% <a href="http://sourceforge.net"><img src="http://sourceforge.net/sflogo.php?group_id=96251&type=1" width="88" height="31" border="0" alt="SourceForge.net Logo" /></a>
+Last Updated: %%date(%c)
24
CZECH.t2t
Normal file
@@ -0,0 +1,24 @@
+CZECH Algorithm
+
+
+----------------------------------------
+
+**History**
+
+**The Algorithm**
+
+**Papers**
+
+----------------------------------------
+[Home README.html]
+----------------------------------------
+Enjoy!
+
+Davi de Castro Reis
+
+Fabiano Cupertino Botelho
+
+
+%preproc(html): '^%html% ' ''
+%html% <a href="http://sourceforge.net"><img src="http://sourceforge.net/sflogo.php?group_id=96251&type=1" width="88" height="31" border="0" alt="SourceForge.net Logo" /></a>
+Last Updated: %%date(%c)
125
README.t2t
@@ -1,41 +1,65 @@
-== cmph - C Minimal Perfect Hashing Library ==
+CMPH - C Minimal Perfect Hashing Library


+----------------------------------------

 **Description**

 C Minimal Perfect Hashing Library is a portable LGPLed library to create and
-work with minimal perfect hashes. The cmph library encapsulates the newest
-and more efficient algorithms in the literature in a ease-to-use,
-production-quality, fast API. The library is designed to work big entries that
-won't fit in the main memory. It has been used successfully to create hashes
-bigger than 100 million entries. Although there is a lack of similar libraries
-in the free software world, we can point out some of the "distinguishing"
+to work with minimal perfect hashing functions. The cmph library encapsulates the newest
+and more efficient algorithms (available in the literature) in an easy-to-use,
+production-quality and fast API. The library is designed to work with big entries that
+can not be fit in the main memory. It has been used successfully for constructing minimal perfect
+hashing functions for sets with more than 100 million of keys.
+Although there is a lack of similar libraries
+in the free software world, we can point out some of the distinguishable
 features of cmph:

-- Fast
-- Space-efficient with main memory usage carefully documented
-- The best modern algorithms are available (or at least scheduled for implementation :-))
-- Works with in-disk key sets through use of adapter pattern
-- Serialization of hash functions
-- Portable C code (currently works on GNU/Linux and WIN32)
-- Object oriented implementation
-- Easily extensible
-- Well encapsulated API aiming binary compatibility through releases
-- Free Software
+- Fast.
+- Space-efficient with main memory usage carefully documented.
+- The best modern algorithms are available (or at least scheduled for implementation :-)).
+- Works with in-disk key sets through of using the adapter pattern.
+- Serialization of hash functions.
+- Portable C code (currently works on GNU/Linux and WIN32).
+- Object oriented implementation.
+- Easily extensible.
+- Well encapsulated API aiming binary compatibility through releases.
+- Free Software.


+----------------------------------------

+**Supported Algorithms**

+- [BMZ Algorithm BMZ.html]. A very fast algorithm based on cyclic random graphs to construct minimal
+perfect hash functions in linear time. The resulting functions are not order preserving and
+can be stored in only 4cn bytes, where c is between 0.93 and 1.15.

+- [CZECH Algorithm CZECH.html]. An algorithm based on acyclic random graphs to construct minimal
+perfect hash functions in linear time. The resulting functions are order preserving and
+are stored in 4cn bytes, where c is greater than 2.

+[Click Here COMPARISON.html] to see a comparison of the supported algorithms.


+----------------------------------------

 **News for version 0.3**

-- New heuristics in bmz algorithm, providing hash creation with only
-(0.93 * 16 + 4)*n bytes and hash query with (0.93*4)n bytes
+- New heuristic added to the bmz algorithm permits to generate a mphf with only
+(xxx)*n bytes. The resulting function can be stored in (0.93*4)n bytes.
+[click here BMZ.html] for details.


+----------------------------------------

 **Examples**

-Using cmph is quite ease. Take a look.
+Using cmph is quite simple. Take a look.


 ```
-// Create minimal perfect hash from in-memory vector
+// Create minimal perfect hash function from in-memory vector
 #include <cmph.h>
 ...

@@ -44,7 +68,7 @@ Using cmph is quite ease. Take a look.
 //Fill vector
 //...

-//Create minimal perfect hash
+//Create minimal perfect hashing function using the default(czech) algorithm.
 cmph_config_t *config = cmph_config_new(cmph_io_vector_adapter(vector, nkeys));
 cmph_t *hash = cmph_new(config);
 cmph_config_destroy(config);
@@ -59,7 +83,7 @@ Using cmph is quite ease. Take a look.
 -------------------------------

 ```
-// Create minimal perfect hash from in-disk keys using BMZ algorithm
+// Create minimal perfect hash function from in-disk keys using BMZ algorithm
 #include <cmph.h>
 ...

@@ -83,18 +107,18 @@ Using cmph is quite ease. Take a look.
 ```
 --------------------------------------


 **The cmph application**

 cmph is the name of both the library and the utility
 application that comes with this package. You can use the cmph
-application to create minimal perfect hashes from command line. The cmph utility
-comes with a number of flags, but it is very simple to create and query
-minimal perfect hashes:
+application for constructing minimal perfect hashing functions from the command line.
+The cmph utility
+comes with a number of flags, but it is very simple to create and to query
+minimal perfect hashing functions:

 ```
-$ # Create mph for keys in file keys_file
-$ ./cmph keys_file
+$ # Using the czech algorithm (default one) for constructing a mphf for keys in file keys_file
+$ ./cmph -g keys_file
 $ # Query id of keys in the file keys_query
 $ ./cmph -m keys_file.mph keys_query
 ```
@@ -105,34 +129,35 @@ utility.


 ```
-usage: cmph [-v] [-h] [-V] [-k] [-g [-s seed] ] [-m file.mph] [-a algorithm] keysfile
+usage: cmph [-v] [-h] [-V] [-k nkeys] [-f hash_function] [-g [-c value][-s seed] ] [-m file.mph] [-a algorithm] keysfile
 Minimum perfect hashing tool

 -h print this help message
 -c c value that determines the number of vertices in the graph
 -a algorithm - valid values are
-* czech
-* bmz
+* bmz
+* czech
 -f hash function (may be used multiple times) - valid values are
-* jenkins
-* djb2
-* sdbm
-* fnv
-* glib
-* pjw
+* djb2
+* fnv
+* glib
+* jenkins
+* pjw
+* sdbm
 -V print version number and exit
 -v increase verbosity (may be used multiple times)
 -k number of keys
 -g generation mode
 -s random seed
 -m minimum perfect hash function file
 keysfile line separated file with keys
 ```

 **Downloads**

 Use the project page at sourceforge: http://sf.net/projects/cmph


 **License Stuff**

 Code is under the LGPL.
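The README text added by this commit says that a BMZ function can be stored in 4cn bytes with c between 0.93 and 1.15, and a CZECH function in 4cn bytes with c greater than 2. As a rough illustration of what those figures mean for a given key count, the sketch below evaluates that formula. The helper name `mphf_storage_bytes` is hypothetical and is not part of the cmph API; it only restates the arithmetic quoted in the README.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimated size in bytes of a stored function over n keys, using the
 * 4*c*n figure quoted in the README. Hypothetical helper for illustration
 * only; it is not a cmph library function.
 */
double mphf_storage_bytes(double c, uint64_t n)
{
    assert(c > 0.0); /* c is a positive ratio chosen by the algorithm */
    return 4.0 * c * (double)n;
}
```

For n = 100 million keys, c = 0.93 gives about 372,000,000 bytes and c = 1.15 gives about 460,000,000 bytes, while c = 2 gives 800,000,000 bytes. Since the README only states that c is greater than 2 for CZECH, the last figure is a lower bound rather than an exact size.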