Diffstat (limited to 'doc/technical/hostmask.txt')
-rw-r--r-- | doc/technical/hostmask.txt | 136 |
1 files changed, 136 insertions, 0 deletions
diff --git a/doc/technical/hostmask.txt b/doc/technical/hostmask.txt
new file mode 100644
index 0000000..892bc93
--- /dev/null
+++ b/doc/technical/hostmask.txt
@@ -0,0 +1,136 @@
+The Hostmask and Netmask System
+Copyright(c) 2001 by Andrew Miller(A1kmm)<a1kmm@mware.virtualave.net>
+
+$Id$
+------------------------------------------------------------------------
+
+Contents ::
+============
+* Section 1: Motivation
+* Section 2: Underlying Mechanism
+  - 2.1: General Overview
+  - 2.2: IPv4 Netmasks
+  - 2.3: IPv6 Netmasks
+  - 2.4: Hostmasks
+* Section 3: Exposed Abstraction Layer
+  - 3.1: Parsing Masks
+  - 3.2: Adding Configuration Items
+  - 3.3: Initialising and Rehashing
+  - 3.4: Finding IP/Hostname Confs
+  - 3.5: Deleting Entries
+  - 3.6: Reporting Entries
+
+Section 1: Motivation
+=====================
+
+Looking up configuration hostnames and IP addresses (such as for I-Lines
+and K-Lines) needs to be implemented efficiently. It turns out that a
+hash-based algorithm like the one employed here performs very well in the
+average case, which is what we should be most concerned about. A profiling
+comparison with the mtrie code using data from a real network confirmed
+that this algorithm performs much better.
+
+
+Section 2: Underlying Mechanism
+===============================
+
+2.1: General Overview
+---------------------
+
+In short, a hash table with linked lists for buckets is used to locate
+the correct hostname/netmask entries. In order to support CIDR IPs and
+wildcard masks, only part of each key can be hashed, and a lookup has to
+rehash the key at several lengths. The means of deciding how much to hash
+differs between hostmasks and IPv4/6 netmasks.
+
+2.2: IPv4 Netmasks
+------------------
+
+In order to hash IPv4 netmasks for addition to the hash table, the mask is
+first processed into a 32-bit address and the number of bits used. All
+unused bits are set to 0.
+The mask could be in these forms:
+
+1.2.3.4     => 1.2.3.4  : 32
+1.2.3.*     => 1.2.3.0  : 24
+1.2.*.*     => 1.2.0.0  : 16
+1.2.3.64/26 => 1.2.3.64 : 26
+
+The number of whole bytes is then calculated, and only those bytes are
+hashed (e.g. 1.2.3.64/26 and 1.2.3.0/24 hash the same). When a complete
+IPv4 address is given so that a matching mask can be found, the entire
+address is hashed first and looked up in the table. Then the most
+significant three bytes are hashed, followed by the most significant two,
+the most significant one, and finally the "identity hash" bucket is
+searched (to match masks like 192/7).
+
+2.3: IPv6 Netmasks
+------------------
+
+As per the IPv4 netmasks, except that instead of rehashing with one-byte
+granularity, a 16-bit (two-byte) granularity is used, as 16 rehashes per
+lookup is considered too great a fixed cost to be justified by a (possible)
+slight reduction in hash collisions.
+
+2.4: Hostmasks
+--------------
+
+On adding a hostmask to the hash table, the part of the hostmask to the
+right of the next dot after the last wildcard character in the string is
+hashed, or, if there are no wildcards in the hostmask, the entire string
+is hashed.
+
+On searching for a hostmask match, the entire hostname is hashed, followed
+by the part of the hostname after the first dot, followed by the part
+after the second dot, and so on. Finally the "identity hash" bucket is
+checked to catch hostmasks like *test*.
+
+Section 3: Exposed Abstraction Layer
+====================================
+
+Section 3.1: Parsing Masks
+--------------------------
+
+Call "parse_netmask()" with the netmask and a pointer to an irc_inaddr
+structure to be filled in, as well as a pointer to an integer where the
+number of bits will be placed.
+
+Always check the return value. If it returns HM_HOST, the mask is
+probably a hostmask. If it returns HM_IPV4, it was an IPv4 address. If it
+returns HM_IPV6, it was an IPv6 address.
+If parse_netmask() returns HM_HOST, however, no change is made to the
+irc_inaddr structure or the number of bits.
+
+Section 3.2: Adding Configuration Items
+---------------------------------------
+
+Call "add_conf_by_address()" with the hostname or IP mask, the username,
+and the ConfItem* to associate with this mask.
+
+Section 3.3: Initialising and Rehashing
+---------------------------------------
+
+To initialise, call "init_host_hash()". This only needs to be done once
+on start-up. On rehash, to wipe out the old unwanted configuration items,
+and free them if there are no references to them, call
+"clear_out_address_conf()".
+
+Section 3.4: Finding IP/Hostname Confs
+--------------------------------------
+
+Call "find_address_conf()" with the hostname, the username, the address,
+the address family and the client-supplied password. To find a D-Line,
+call "find_dline()" with the address and address family.
+
+Section 3.5: Deleting Entries
+-----------------------------
+
+Call "delete_one_address_conf()" with the hostname and the ConfItem*.
+
+Section 3.6: Reporting Entries
+------------------------------
+
+Call "report_dlines()", "report_exemptlines()", "report_Klines()", or
+"report_Ilines()" with the client pointer to report to. Note these walk
+the hash, which is inefficient, but these are not called often enough
+to justify the memory and maintenance clock cycles needed for a more
+efficient data structure.
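
Illustrative note on section 2.2 (not part of the patch): the key truncation
and lookup order for IPv4 can be sketched in C as below. This is a minimal
sketch, assuming host-order 32-bit addresses. The names mask_ipv4,
ipv4_hash_key, ipv4_lookup_keys and the IPV4 macro are hypothetical and are
not taken from the ircd source, and the hash function itself is left out;
only the choice of which bytes would be fed to it is shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: build a host-order IPv4 address from four octets. */
#define IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical helper: keep the first `bits` bits of addr and set all
 * remaining (unused) bits to 0. */
uint32_t mask_ipv4(uint32_t addr, int bits)
{
    if (bits <= 0)
        return 0;
    if (bits >= 32)
        return addr;
    return addr & (UINT32_C(0xFFFFFFFF) << (32 - bits));
}

/* Hypothetical helper: the key that would be hashed when a netmask is
 * added. Only whole bytes take part, so /26 and /24 masks on the same
 * three leading bytes yield the same key. *nbytes receives the number
 * of bytes hashed; 0 corresponds to the "identity hash" bucket. */
uint32_t ipv4_hash_key(uint32_t addr, int bits, int *nbytes)
{
    int whole = bits / 8;

    *nbytes = whole;
    return mask_ipv4(addr, whole * 8);
}

/* Hypothetical helper: the keys tried, in order, when a complete address
 * is looked up: all four bytes, then three, two, one, and finally zero
 * bytes (the "identity hash" bucket). Returns the number of keys. */
size_t ipv4_lookup_keys(uint32_t addr, uint32_t keys[5])
{
    size_t n = 0;
    int bytes;

    for (bytes = 4; bytes >= 0; bytes--)
        keys[n++] = mask_ipv4(addr, bytes * 8);
    return n;
}
```

With these helpers, ipv4_hash_key() gives the same key for 1.2.3.64/26 and
1.2.3.0/24, and a mask such as 192/7 has no whole bytes, so it would land in
the identity bucket. A real lookup would still have to compare each candidate
entry in a bucket against the full address and bit count.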