<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META NAME="generator" CONTENT="http://txt2tags.org">
<LINK REL="stylesheet" TYPE="text/css" HREF="DOC.css">
<TITLE>FCH Algorithm</TITLE>
</HEAD><BODY BGCOLOR="white" TEXT="black">
<CENTER>
<H1>FCH Algorithm</H1>
</CENTER>
<HR NOSHADE SIZE=1>
<H2>The Algorithm</H2>
<P>
The algorithm is presented in <A HREF="#papers">[1]</A>.
</P>
<HR NOSHADE SIZE=1>
<H2>Memory Consumption</H2>
<P>
This section details the memory needed to generate and to store minimal perfect hash functions
using the FCH algorithm. The following structures account for the memory used during
generation:
</P>
<UL>
<LI>A vector containing all the <I>n</I> keys.
<LI>Data structure to speed up the searching step:
<OL>
<LI><B>random_table</B>: a vector used to track the currently empty slots in the hash table. It stores <I>n</I> 4-byte integers and initially contains a random permutation of the <I>n</I> hash addresses. An index called filled_count maintains the invariant that every entry at or to the right of filled_count is an empty slot and every entry to its left is a filled slot.
<LI><B>hash_table</B>: a table used to check whether all collisions have been resolved. It has <I>n</I> one-byte entries.
<LI><B>map_table</B>: a vector of <I>n</I> 4-byte indexes into random_table such that, for any unfilled slot <I>x</I> in hash_table, random_table[map_table[x]] = x. Thus, given an empty slot <I>x</I> in hash_table, map_table gives its position in random_table.
<P></P>
</OL>
<LI>Other auxiliary structures
<OL>
<LI><B>sorted_indexes</B>: a vector of <I>cn/(log(n) + 1)</I> 4-byte pointers that indirectly keeps the buckets sorted in decreasing order of size.
<P></P>
<LI><B>function <I>g</I></B>: represented by a vector of <I>cn/(log(n) + 1)</I> 4-byte integers, one per bucket. It is used to spread the keys of each bucket over the hash table without collisions.
</OL>
</UL>
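<P>
The interplay between random_table, map_table and filled_count can be illustrated with a small sketch. It is not taken from the CMPH sources: the class name, the method names and the swap-based update are assumptions chosen to keep the stated invariant, namely that entries at or to the right of filled_count are empty slots and that random_table[map_table[x]] = x.
</P>

```python
import random


class EmptySlotTracker:
    """Hypothetical helper: tracks empty hash-table slots as described above."""

    def __init__(self, n, seed=0):
        rng = random.Random(seed)
        # random_table starts as a random permutation of the n hash addresses.
        self.random_table = list(range(n))
        rng.shuffle(self.random_table)
        # map_table[x] is the position of slot x inside random_table.
        self.map_table = [0] * n
        for pos, slot in enumerate(self.random_table):
            self.map_table[slot] = pos
        # Entries at index filled_count or beyond are empty slots.
        self.filled_count = 0

    def is_empty(self, slot):
        return self.map_table[slot] >= self.filled_count

    def fill(self, slot):
        """Mark a slot as filled by swapping it into the filled prefix."""
        if not self.is_empty(slot):
            raise ValueError("slot already filled")
        pos = self.map_table[slot]
        first_empty = self.filled_count
        other = self.random_table[first_empty]
        # Swap the two entries and keep map_table consistent (assumed update rule).
        self.random_table[first_empty], self.random_table[pos] = slot, other
        self.map_table[slot], self.map_table[other] = first_empty, pos
        self.filled_count += 1

    def empty_slots(self):
        return self.random_table[self.filled_count:]
```

<P>
With this layout, marking a slot as filled and testing whether a slot is empty both take constant time, at the cost of the two 4-byte vectors listed above.
</P>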
<P>
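Thus, the total memory consumption of the FCH algorithm for generating a minimal
perfect hash function (MPHF) is <I>O(n) + 9n + 8cn/(log(n) + 1)</I> bytes:
<I>O(n)</I> for the key vector, <I>9n</I> for random_table, hash_table and map_table,
and <I>8cn/(log(n) + 1)</I> for sorted_indexes and <I>g</I>.
The value of parameter <I>c</I> must be greater than or equal to 2.6.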
</P>
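<P>
As a rough worked example, the formula above can be evaluated for a given <I>n</I> and <I>c</I>. The sketch below makes two assumptions that the text does not state: the logarithm is taken in base 2, and the number of buckets is rounded up to an integer. The function names are hypothetical, and the <I>O(n)</I> term for the key vector is left out because it depends on the key lengths.
</P>

```python
import math


def fch_num_buckets(n, c=2.6):
    """Number of buckets cn/(log(n) + 1), assuming log base 2 and rounding up."""
    return math.ceil(c * n / (math.log2(n) + 1))


def fch_construction_bytes(n, c=2.6):
    """Bytes used during generation, excluding the O(n) key vector."""
    buckets = fch_num_buckets(n, c)
    # random_table (4n) + hash_table (n) + map_table (4n) = 9n
    search_structures = 4 * n + n + 4 * n
    # sorted_indexes and g: 4 bytes per bucket each = 8cn/(log(n) + 1)
    auxiliary_structures = 4 * buckets + 4 * buckets
    return search_structures + auxiliary_structures


if __name__ == "__main__":
    print(fch_num_buckets(1024))         # 243
    print(fch_construction_bytes(1024))  # 11160
```

<P>
For <I>n</I> = 1024 and <I>c</I> = 2.6 this gives 243 buckets and 9216 + 1944 = 11160 bytes, not counting the keys themselves.
</P>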
<P>
Now we present the memory needed to store the resulting function.
We only need to store the <I>g</I> function and a constant number of bytes for the seeds of the hash functions used in the resulting MPHF. Since <I>g</I> holds <I>cn/(log(n) + 1)</I> 4-byte integers, we need <I>4cn/(log(n) + 1) + O(1)</I> bytes.
</P>
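<P>
The storage cost can be estimated in the same way. The sketch below is an illustration only: it counts 4 bytes for each of the <I>cn/(log(n) + 1)</I> entries of <I>g</I>, as in the description of <I>g</I> above, assumes a base-2 logarithm with the bucket count rounded up, and takes the constant part as a parameter because the text does not give its size. The function name and its parameters are hypothetical.
</P>

```python
import math


def fch_stored_bytes(n, c=2.6, entry_bytes=4, constant_bytes=0):
    """Estimated bytes needed to store g plus a caller-supplied constant part."""
    # Assumption: log base 2, bucket count rounded up to an integer.
    buckets = math.ceil(c * n / (math.log2(n) + 1))
    return entry_bytes * buckets + constant_bytes


if __name__ == "__main__":
    n = 1024
    total = fch_stored_bytes(n)
    print(total)          # 972
    print(total * 8 / n)  # 7.59375 bits per key
```

<P>
Because the number of buckets grows as <I>n/log(n)</I>, the estimated number of bits per key shrinks slowly as <I>n</I> increases.
</P>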
<HR NOSHADE SIZE=1>
<A NAME="papers"></A>
<H2>Papers</H2>
<OL>
<LI>E.A. Fox, Q.F. Chen, and L.S. Heath. <A HREF="papers/fch92.pdf">A faster algorithm for constructing minimal perfect hash functions.</A> In Proc. 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 266-273, 1992.
</OL>
<HR NOSHADE SIZE=1>
<TABLE ALIGN="center" CELLPADDING="4">
<TR>
<TD><A HREF="index.html">Home</A></TD>
<TD><A HREF="chd.html">CHD</A></TD>
<TD><A HREF="bdz.html">BDZ</A></TD>
<TD><A HREF="bmz.html">BMZ</A></TD>
<TD><A HREF="chm.html">CHM</A></TD>
<TD><A HREF="brz.html">BRZ</A></TD>
<TD><A HREF="fch.html">FCH</A></TD>
</TR>
</TABLE>
<HR NOSHADE SIZE=1>
<P>
Enjoy!
</P>
<P>
<A HREF="mailto:davi@users.sourceforge.net">Davi de Castro Reis</A>
</P>
<P>
<A HREF="mailto:db8192@users.sourceforge.net">Djamel Belazzougui</A>
</P>
<P>
<A HREF="mailto:fc_botelho@users.sourceforge.net">Fabiano Cupertino Botelho</A>
</P>
<P>
<A HREF="mailto:nivio@dcc.ufmg.br">Nivio Ziviani</A>
</P>
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-7698683-2");
pageTracker._trackPageview();
} catch(err) {}</script>
<!-- html code generated by txt2tags 2.6 (http://txt2tags.org) -->
<!-- cmdline: txt2tags -t html -i FCH.t2t -o docs/fch.html -->
</BODY></HTML>