111 lines
4.1 KiB
HTML
111 lines
4.1 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
|
<HTML>
|
|
<HEAD>
|
|
<META NAME="generator" CONTENT="http://txt2tags.org">
|
|
<LINK REL="stylesheet" TYPE="text/css" HREF="DOC.css">
|
|
<TITLE>FCH Algorithm</TITLE>
|
|
</HEAD><BODY BGCOLOR="white" TEXT="black">
|
|
<CENTER>
|
|
<H1>FCH Algorithm</H1>
|
|
</CENTER>
|
|
|
|
|
|
<HR NOSHADE SIZE=1>
|
|
|
|
<H2>The Algorithm</H2>
|
|
|
|
<P>
|
|
The algorithm is presented in <A HREF="#papers">[1</A>].
|
|
</P>
|
|
|
|
<HR NOSHADE SIZE=1>
|
|
|
|
<H2>Memory Consumption</H2>
|
|
|
|
<P>
|
|
Now we detail the memory consumption to generate and to store minimal perfect hash functions
|
|
using the FCH algorithm. The structures responsible for memory consumption are in the
|
|
following:
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>A vector containing all the <I>n</I> keys.
|
|
<LI>Data structure to speed up the searching step:
|
|
<OL>
|
|
<LI><B>random_table</B>: is a vector used to remember currently empty slots in the hash table. It stores <I>n</I> 4 byte long integer numbers. This vector initially contains a random permutation of the <I>n</I> hash addresses. A pointer called filled_count is used to keep the invariant that any slots to the right side of filled_count (inclusive) are empty and any ones to the left are filled.
|
|
<LI><B>hash_table</B>: Table used to check whether all the collisions were resolved. It has <I>n</I> entries of one byte.
|
|
<LI><B>map_table</B>: For any unfilled slot <I>x</I> in hash_table, the map_table vector contains <I>n</I> 4 byte long pointers pointing at random_table such that random_table[map_table[x]] = x. Thus, given an empty slot x in the hash_table, we can locate its position in the random_table vector through map_table.
|
|
<P></P>
|
|
</OL>
|
|
<LI>Other auxiliary structures
|
|
<OL>
|
|
<LI><B>sorted_indexes</B>: is a vector of <I>cn/(log(n) + 1)</I> 4 byte long pointers to indirectly keep the buckets sorted by decreasing order of their sizes.
|
|
<P></P>
|
|
<LI><B>function <I>g</I></B>: is represented by a vector of <I>cn/(log(n) + 1)</I> 4 byte long integer numbers, one for each bucket. It is used to spread all the keys in a given bucket into the hash table without collisions.
|
|
</OL>
|
|
</UL>
|
|
|
|
<P>
|
|
Thus, the total memory consumption of FCH algorithm for generating a minimal
|
|
perfect hash function (MPHF) is: <I>O(n) + 9n + 8cn/(log(n) + 1)</I> bytes.
|
|
The value of parameter <I>c</I> must be greater than or equal to 2.6.
|
|
</P>
|
|
<P>
|
|
Now we present the memory consumption to store the resulting function.
|
|
We only need to store the <I>g</I> function and a constant number of bytes for the seed of the hash functions used in the resulting MPHF. Thus, we need <I>cn/(log(n) + 1) + O(1)</I> bytes.
|
|
</P>
|
|
|
|
<HR NOSHADE SIZE=1>
|
|
|
|
<A NAME="papers"></A>
|
|
<H2>Papers</H2>
|
|
|
|
<OL>
|
|
<LI>E.A. Fox, Q.F. Chen, and L.S. Heath. <A HREF="papers/fch92.pdf">A faster algorithm for constructing minimal perfect hash functions.</A> In Proc. 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 266-273, 1992.
|
|
</OL>
|
|
|
|
<HR NOSHADE SIZE=1>
|
|
|
|
<TABLE ALIGN="center" CELLPADDING="4">
|
|
<TR>
|
|
<TD><A HREF="index.html">Home</A></TD>
|
|
<TD><A HREF="chd.html">CHD</A></TD>
|
|
<TD><A HREF="bdz.html">BDZ</A></TD>
|
|
<TD><A HREF="bmz.html">BMZ</A></TD>
|
|
<TD><A HREF="chm.html">CHM</A></TD>
|
|
<TD><A HREF="brz.html">BRZ</A></TD>
|
|
<TD><A HREF="fch.html">FCH</A></TD>
|
|
</TR>
|
|
</TABLE>
|
|
|
|
<HR NOSHADE SIZE=1>
|
|
|
|
<P>
|
|
Enjoy!
|
|
</P>
|
|
<P>
|
|
<A HREF="mailto:davi@users.sourceforge.net">Davi de Castro Reis</A>
|
|
</P>
|
|
<P>
|
|
<A HREF="mailto:db8192@users.sourceforge.net">Djamel Belazzougui</A>
|
|
</P>
|
|
<P>
|
|
<A HREF="mailto:fc_botelho@users.sourceforge.net">Fabiano Cupertino Botelho</A>
|
|
</P>
|
|
<P>
|
|
<A HREF="mailto:nivio@dcc.ufmg.br">Nivio Ziviani</A>
|
|
</P>
|
|
<script type="text/javascript">
|
|
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
|
|
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
|
|
</script>
|
|
<script type="text/javascript">
|
|
try {
|
|
var pageTracker = _gat._getTracker("UA-7698683-2");
|
|
pageTracker._trackPageview();
|
|
} catch(err) {}</script>
|
|
|
|
<!-- html code generated by txt2tags 2.6 (http://txt2tags.org) -->
|
|
<!-- cmdline: txt2tags -t html -i FCH.t2t -o docs/fch.html -->
|
|
</BODY></HTML>
|