From 70796d938381baa25e84695cd454fd97f23ed4e7 Mon Sep 17 00:00:00 2001
From: davi
Date: Thu, 27 Jan 2005 00:04:11 +0000
Subject: [PATCH] Added gperf notes.

---
 BMZ.t2t        |  3 ---
 CHM.t2t        |  3 ---
 COMPARISON.t2t |  3 ---
 GPERF.t2t      | 40 ++++++++++++++++++++++++++++++++++++++++
 README         |  8 ++++----
 README.t2t     |  7 +++----
 gendocs        |  1 +
 7 files changed, 48 insertions(+), 17 deletions(-)
 create mode 100644 GPERF.t2t

diff --git a/BMZ.t2t b/BMZ.t2t
index eea2831..4a9e2bf 100644
--- a/BMZ.t2t
+++ b/BMZ.t2t
@@ -21,6 +21,3 @@ Enjoy!
 Davi de Castro Reis
 
 Fabiano Cupertino Botelho
-
-%!include(html): ''LOGO.html''
-Last Updated: %%date(%c)
diff --git a/CHM.t2t b/CHM.t2t
index ae1fef5..d53dab8 100644
--- a/CHM.t2t
+++ b/CHM.t2t
@@ -19,6 +19,3 @@ Enjoy!
 Davi de Castro Reis
 
 Fabiano Cupertino Botelho
-
-%!include(html): ''LOGO.html''
-Last Updated: %%date(%c)
diff --git a/COMPARISON.t2t b/COMPARISON.t2t
index bc0fbb2..b6f9226 100644
--- a/COMPARISON.t2t
+++ b/COMPARISON.t2t
@@ -22,6 +22,3 @@ Enjoy!
 Davi de Castro Reis
 
 Fabiano Cupertino Botelho
-
-%!include(html): ''LOGO.html''
-Last Updated: %%date(%c)
diff --git a/GPERF.t2t b/GPERF.t2t
new file mode 100644
index 0000000..9c3fa4b
--- /dev/null
+++ b/GPERF.t2t
@@ -0,0 +1,40 @@
+GPERF versus CMPH
+
+
+
+You might ask why use cmph if [gperf http://www.gnu.org/software/gperf/gperf.html]
+already works perfectly. Actually, gperf and cmph have different goals.
+Basically, these are the requirements for each of them:
+
+
+- GPERF
+
+  - Create very fast hash functions for **small** sets
+
+  - Create **perfect** hash functions
+
+- CMPH
+
+  - Create very fast hash functions for **very large** sets
+
+  - Create **minimal perfect** hash functions
+
+As a result, cmph can be used to create hash functions where gperf would run
+forever without finding a perfect hash function, because of the running
+time of the algorithm and the large memory usage.
+On the other hand, functions created by cmph are about 2x slower than those
+created by gperf.
+
+So, if you have large sets, or memory usage is a key restriction for you, stick
+to cmph. If you have small sets, and do not care about memory usage, go with
+gperf. The first problem is common in the information retrieval field (e.g.
+assigning ids to millions of documents), while the latter is usually found in
+the compiler programming area (detecting reserved keywords).
+
+----------------------------------------
+[Home index.html]
+----------------------------------------
+
+Davi de Castro Reis
+
+Fabiano Cupertino Botelho
diff --git a/README b/README
index 8fc9cd9..f7a02e4 100644
--- a/README
+++ b/README
@@ -1,6 +1,6 @@
 CMPH - C Minimal Perfect Hashing Library
 
-----------------------------------------
+-------------------------------------------------------------------
 
 Description
 
@@ -11,8 +11,8 @@ production-quality and fast API. The library is designed to work with big entrie
 can not fit in the main memory. It has been used successfully for constructing minimal perfect
 hashing functions for sets with more than 100 million of keys.
 Although there is a lack of similar libraries
-in the free software world, we can point out some of the distinguishable
-features of cmph:
+in the free software world (gperf is a bit different (gperf.html)), we can point out some
+of the distinguishable features of cmph:
 
 - Fast.
 - Space-efficient with main memory usage carefully documented.
@@ -162,7 +162,7 @@ Davi de Castro Reis
 
 Fabiano Cupertino Botelho
 
-Last Updated: Tue Jan 25 18:43:38 2005
+Last Updated: Wed Jan 26 22:37:36 2005
 
 
 
diff --git a/README.t2t b/README.t2t
index d03151f..8fb99cf 100644
--- a/README.t2t
+++ b/README.t2t
@@ -3,7 +3,7 @@ CMPH - C Minimal Perfect Hashing Library
 
 %!includeconf: CONFIG.t2t
 
-----------------------------------------
+-------------------------------------------------------------------
 
 **Description**
 
@@ -14,8 +14,8 @@ production-quality and fast API. The library is designed to work with big entrie
 can not fit in the main memory. It has been used successfully for constructing minimal perfect
 hashing functions for sets with more than 100 million of keys.
 Although there is a lack of similar libraries
-in the free software world, we can point out some of the distinguishable
-features of cmph:
+in the free software world ([gperf is a bit different gperf.html]), we can point out some
+of the distinguishable features of cmph:
 
 - Fast.
 - Space-efficient with main memory usage carefully documented.
@@ -28,7 +28,6 @@ features of cmph:
 - Well encapsulated API aiming binary compatibility through releases.
 - Free Software.
 
-
 ----------------------------------------
 
 **Supported Algorithms**
diff --git a/gendocs b/gendocs
index 85ead77..71021f7 100755
--- a/gendocs
+++ b/gendocs
@@ -2,4 +2,5 @@ txt2tags -t html -i README.t2t -o index.html
 txt2tags -t html -i BMZ.t2t -o bmz.html
 txt2tags -t html -i CHM.t2t -o chm.html
 txt2tags -t html -i COMPARISON.t2t -o comparison.html
+txt2tags -t html -i GPERF.t2t -o gperf.html
 txt2tags -t txt -i README.t2t -o README