CMPH - C Minimal Perfect Hashing Library


%!includeconf: CONFIG.t2t

-------------------------------------------------------------------

==Description==

The C Minimal Perfect Hashing Library (cmph) is a portable, LGPL-licensed library for creating and
working with [minimal perfect hash functions concepts.html].
It encapsulates the newest and most efficient algorithms available in the literature
in an easy-to-use, production-quality, fast API. The library is designed to work with key sets
too large to fit in main memory. It has been used successfully to construct minimal perfect
hash functions for sets of more than 100 million keys.
Although there are few similar libraries
in the free software world ([gperf is a bit different gperf.html]), we can point out some
of the distinguishing features of cmph:

- Fast.
- Space-efficient with main memory usage carefully documented.
- The best modern algorithms are available (or at least scheduled for implementation :-)).
- Works with on-disk key sets through the adapter pattern.
- Serialization of hash functions.
- Portable C code (currently works on GNU/Linux and WIN32).
- Object-oriented implementation.
- Easily extensible.
- Well-encapsulated API aiming at binary compatibility across releases.
- Free Software.


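As an illustration of the adapter pattern mentioned in the list above, the sketch below
hides an in-memory vector and a newline-separated file behind a single key-source
interface, so the consuming code never needs the whole key set in memory.
All names here (key_source, vector_source, file_source, total_key_bytes) are hypothetical
and chosen for this example; they are not cmph's actual adapter types. Keys are assumed
to be shorter than the caller's buffer.

```
 #include <assert.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <string.h>

 /* Hypothetical key-source interface (illustration only, not cmph's adapter type). */
 typedef struct key_source {
     void *state;  /* backend-specific data */
     /* Copy the next key into buf (NUL-terminated); return 1, or 0 when exhausted. */
     int (*next)(struct key_source *src, char *buf, size_t bufsize);
     /* Restart iteration from the first key. */
     void (*restart)(struct key_source *src);
 } key_source;

 /* Backend 1: keys held in an in-memory vector. */
 typedef struct { const char **keys; size_t nkeys, pos; } vector_state;

 static int vector_next(key_source *src, char *buf, size_t bufsize) {
     vector_state *v = src->state;
     assert(bufsize > 0);
     if (v->pos >= v->nkeys) return 0;
     strncpy(buf, v->keys[v->pos++], bufsize - 1);
     buf[bufsize - 1] = '\0';
     return 1;
 }
 static void vector_restart(key_source *src) { ((vector_state *)src->state)->pos = 0; }

 /* Backend 2: newline-separated keys streamed from a file, one line at a time. */
 static int file_next(key_source *src, char *buf, size_t bufsize) {
     if (fgets(buf, (int)bufsize, (FILE *)src->state) == NULL) return 0;
     buf[strcspn(buf, "\n")] = '\0';  /* strip the trailing newline, if any */
     return 1;
 }
 static void file_restart(key_source *src) { rewind((FILE *)src->state); }

 key_source vector_source(vector_state *v) {
     key_source s = { v, vector_next, vector_restart };
     return s;
 }
 key_source file_source(FILE *f) {
     key_source s = { f, file_next, file_restart };
     return s;
 }

 /* Consumer: sees only the interface, so it works with both backends. */
 size_t total_key_bytes(key_source *src) {
     char buf[256];
     size_t total = 0;
     src->restart(src);
     while (src->next(src, buf, sizeof buf)) total += strlen(buf);
     return total;
 }
```

A construction algorithm written against such an interface can make several passes over
the keys by calling restart, without caring whether they live in memory or on disk.
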
----------------------------------------

==Supported Algorithms==

 
%html% - [BMZ Algorithm bmz.html].
%txt% - BMZ Algorithm.
  A very fast algorithm based on cyclic random graphs to construct minimal
  perfect hash functions in linear time. The resulting functions are not order preserving and
  can be stored in only //4cn// bytes, where //c// is between 0.93 and 1.15.  
%html% - [CHM Algorithm chm.html].
%txt% - CHM Algorithm.
  An algorithm based on acyclic random graphs to construct minimal
  perfect hash functions in linear time. The resulting functions are order preserving and
  are stored in //4cn// bytes, where //c// is greater than 2.

%html% [Click Here comparison.html] to see a comparison of the supported algorithms. 
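
As a rough worked example of the //4cn// storage figures above, the helper below
(a hypothetical function written for this page, not part of the cmph API) evaluates
the formula for a given constant //c// and number of keys //n//.

```
 #include <assert.h>

 /* Bytes needed to store a function that occupies 4*c*n bytes,
    where n is the number of keys and c is the algorithm's constant. */
 double mphf_storage_bytes(double c, unsigned long n) {
     assert(c > 0.0);
     return 4.0 * c * (double)n;
 }
```

For //n// = 100 million keys this gives between 372 MB (//c// = 0.93) and 460 MB
(//c// = 1.15) for BMZ, and more than 800 MB for CHM, since its //c// is greater than 2
(taking 1 MB as 10^6 bytes).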


----------------------------------------

==News for version 0.3==

- A new heuristic added to the BMZ algorithm makes it possible to generate an mphf using only
  //24.6n + O(1)// bytes. The resulting function can be stored in //3.72n// bytes.
%html% [click here bmz.html#heuristic] for details.


----------------------------------------

==Examples==

Using cmph is quite simple. Take a look.


```
 // Create minimal perfect hash function from in-memory vector
 #include <cmph.h>
 ...
 
 const char **vector;
 unsigned int nkeys;
 //Fill vector
 //...
 
 // Create minimal perfect hash function using the default (chm) algorithm.
 cmph_config_t *config = cmph_config_new(cmph_io_vector_adapter(vector, nkeys));
 cmph_t *hash = cmph_new(config);
 cmph_config_destroy(config);
 
 //Find key
 const char *key = "sample key";
 unsigned int id = cmph_search(hash, key);
 
 //Destroy hash
 cmph_destroy(hash);
```
-------------------------------

```
 // Create minimal perfect hash function from on-disk keys using the BMZ algorithm
 #include <cmph.h>
 ...
 
 //Open file with newline separated list of keys
 FILE *fd = fopen("keysfile_newline_separated", "r");
 //check for errors
 //...
 
 cmph_config_t *config = cmph_config_new(cmph_io_nlfile_adapter(fd));
 cmph_config_set_algo(config, CMPH_BMZ);
 cmph_t *hash = cmph_new(config);
 cmph_config_destroy(config);
 fclose(fd);
 
 //Find key
 const char *key = "sample key";
 unsigned int id = cmph_search(hash, key);
 
 //Destroy hash
 cmph_destroy(hash);
```
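
To make the role of the id returned by cmph_search more concrete: a minimal perfect hash
function maps each of the //n// keys to a distinct integer in [0, //n//), so the id can
index a plain array of values. The toy sketch below does not use cmph. It finds such a
function by brute force over the seed of a simple hash, which is only workable for a
handful of keys and is not how BMZ or CHM work. The names toy_hash, toy_id, toy_find_seed
and toy_lookup are made up for this example.

```
 #include <assert.h>
 #include <stddef.h>
 #include <stdint.h>

 /* Seeded FNV-1a-style string hash with a final mixing step
    (an illustrative choice, not one of cmph's hash functions). */
 static uint32_t toy_hash(const char *key, uint32_t seed) {
     uint32_t h = 2166136261u ^ seed;
     for (; *key != '\0'; key++) {
         h ^= (unsigned char)*key;
         h *= 16777619u;
     }
     h ^= h >> 16;
     h *= 0x85ebca6bu;
     h ^= h >> 13;
     return h;
 }

 /* Id of key under the given seed: always a value in [0, n). */
 size_t toy_id(const char *key, uint32_t seed, size_t n) {
     return (size_t)(toy_hash(key, seed) % n);
 }

 /* Brute force: try seeds until all n keys get distinct ids in [0, n).
    Returns 1 and stores the seed on success, 0 if no seed below max_seed works.
    Limited to n <= 64 keys. */
 int toy_find_seed(const char **keys, size_t n, uint32_t max_seed, uint32_t *seed_out) {
     assert(n > 0 && n <= 64);
     for (uint32_t seed = 0; seed < max_seed; seed++) {
         unsigned char used[64] = {0};
         size_t i;
         for (i = 0; i < n; i++) {
             size_t id = toy_id(keys[i], seed, n);
             if (used[id]) break;  /* collision: try the next seed */
             used[id] = 1;
         }
         if (i == n) { *seed_out = seed; return 1; }
     }
     return 0;
 }

 /* Fetch the value stored for key; values[] is laid out in id order. */
 int toy_lookup(const char *key, uint32_t seed, size_t n, const int *values) {
     return values[toy_id(key, seed, n)];
 }
```

Note that a key outside the original set still hashes to some id in [0, //n//), so an
application that may see unknown keys should store the keys next to the values and
compare them after the lookup.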
--------------------------------------

==The cmph application==

cmph is the name of both the library and the utility
application that comes with this package. You can use the cmph
application to construct minimal perfect hash functions from the command line.
The utility accepts a number of flags, but creating and querying
minimal perfect hash functions is very simple:

```
 $ # Construct an mphf for the keys in keys_file using the chm algorithm (the default)
 $ ./cmph -g keys_file
 $ # Query id of keys in the file keys_query
 $ ./cmph -m keys_file.mph keys_query
```

The additional options let you set most of the parameters you have
available through the C API. Below you can see the full help message for the 
utility.


```
 usage: cmph [-v] [-h] [-V] [-k nkeys] [-f hash_function] [-g [-c value][-s seed] ] [-m file.mph] [-a algorithm] keysfile
 Minimum perfect hashing tool

   -h     print this help message
   -c     c value that determines the number of vertices in the graph
   -a     algorithm - valid values are
           * bmz
           * chm
   -f     hash function (may be used multiple times) - valid values are
           * djb2
           * fnv
           * jenkins
           * sdbm
   -V     print version number and exit
   -v     increase verbosity (may be used multiple times)
   -k     number of keys
   -g     generation mode
   -s     random seed
   -m     minimum perfect hash function file
   keysfile       line separated file with keys
```

==Additional Documentation==

[FAQ faq.html]

==Downloads==

Use the project page at SourceForge: http://sf.net/projects/cmph


==License Stuff==

Code is under the LGPL. 
----------------------------------------

%!include: FOOTER.t2t

%!include(html): ''LOGO.t2t''
Last Updated: %%date(%c)