Pre-release comments.

2012-06-03 04:17:14 -03:00
parent 0d7a176458
commit cc42ab3b74
4 changed files with 41 additions and 13 deletions
--- a/cxxmph/mph_index.h
+++ b/cxxmph/mph_index.h
@@ -10,16 +10,12 @@
 // This is a pretty uncommon data structure, and if your application has a real
 // use case for it, chances are that it is a real win. If all you are doing is
 // a straightforward implementation of an in-memory associative mapping data
-// structure, then it will probably be slower, since that the
-// evaluation of index() is typically slower than the total cost of running a
-// traditional hash function over a key and doing 2-3 conflict resolutions on
-// 100byte-ish strings. If you still want to do, take a look at mph_map.h
+// structure, then it will probably be slower. Take a look at mph_map.h
 // instead.
 //
 // Thesis presenting this and similar algorithms:
 // http://homepages.dcc.ufmg.br/~fbotelho/en/talks/thesis2008/thesis.pdf
 //
-//
 // Notes:
 //
 // Most users can use the SimpleMPHIndex wrapper instead of the MPHIndex which
--- a/cxxmph/mph_map.h
+++ b/cxxmph/mph_map.h
@@ -3,15 +3,25 @@
 // Implementation of the unordered associative mapping interface using a
 // minimal perfect hash function.
 //
-// This class not necessarily faster than unordered_map (or ext/hash_map).
-// Benchmark your code before using it. If you do not call rehash() before
-// starting your reads, it will be very likely slower than unordered_map.
+// Since these are header-mostly libraries, make sure you compile your code
+// with -DNDEBUG and -O3. The code requires a modern C++11 compiler.
+//
+// The container comes in 3 flavors, all in the cxxmph namespace and drop-in
+// replacements for the popular classes with the same names.
+// * dense_hash_map
+//    -> fast, uses more memory, 2.93 bits per bucket, ~50% occupation
+// * unordered_map (aliases:  hash_map, mph_map)
+//    -> middle ground, uses 2.93 bits per bucket, ~81% occupation
+// * sparse_hash_map
+//    -> slower, uses 3.6 bits per bucket, 100% occupation
+//
+// These classes are not necessarily faster than their existing counterparts.
+// Benchmark your code before using them. The larger the key, the larger the
+// number of elements inserted, and the bigger the number of failed searches,
+// the more likely these classes will outperform existing code.
 //
 // For large sets of urls (>100k), which are somewhat expensive to compare, I
-// found this class to be about 10%-30% faster than unordered_map.
-//
-// The space overhead of this map is 2.6 bits per bucket and it achieves 100%
-// occupation with a rehash call.
+// found these classes to be about 10%-50% faster than unordered_map.

 #include <algorithm>
 #include <iostream>