General Kernel Hash Table Based on list_head

The struct list_head in the Linux kernel already defines a prev pointer to the predecessor and a next pointer to the successor, and the kernel provides the matching list operations. To reuse them, this article wraps list_head into a general-purpose kernel hash table, glib_htable, which resolves collisions by chaining. It provides six operations: initialization, add, lookup, delete, clear, and destroy. All operations except initialization and destruction are synchronized, so they can be used in both interrupt and process context. It differs from typical generic hash tables (such as hash_map in C++ and various generic hash tables written in C) in the following ways:

● The objects stored in glib_htable are created externally, not by the table. Each object must contain a list_head member by composition, either directly or indirectly (indirectly means through the glib_hentry described below). The UML term composition is used here to emphasize that this is not an aggregation relationship.
● The delete operation unlinks the object from the hash table; releasing the object is optional.
● The number of buckets is specified by the caller rather than chosen internally.
In summary, glib_htable links objects into the hash table through the list_head member already embedded in them. Compared with a typical hash table, each entry saves the space of one pointer.
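To make the embedding concrete, here is a minimal user-space sketch. The list_head and container_of definitions are simplified stand-ins for the kernel versions, and struct session with its fields is a hypothetical object used only for illustration.

```c
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hash entry as defined in this article. */
struct glib_hentry {
    struct list_head list;
    void *data;
};

/* Simplified container_of: recover the enclosing object from a member pointer. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stored object: it embeds the hash entry by composition. */
struct session {
    int id;
    struct glib_hentry hentry;  /* links the object into a bucket */
    char name[16];
};

/* Map a hash entry (for example one returned by a lookup) back to its owner. */
static struct session *session_from_hentry(struct glib_hentry *he)
{
    return container_of(he, struct session, hentry);
}
```

Because the links live inside the object, the table does not need a separately allocated node that points at the object, which is where the saved pointer comes from.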

Structure definition

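#include <linux/list.h>      /* struct list_head */
#include <linux/spinlock.h>  /* rwlock_t */

struct glib_hentry {
    struct list_head list;  /* links the entry into its bucket */
    void *data;             /* data associated with the object, used for hashing */
};

typedef unsigned int (*glib_htable_hashfun_t)(const void *, unsigned int);
typedef int (*glib_htable_cmpfun_t)(const void *, const void *);
typedef void (*glib_htable_cbfun_t)(struct glib_hentry *);
typedef void (*glib_htable_freefun_t)(struct glib_hentry *);

struct glib_htable {
    struct list_head *bucket;   /* array of bucket heads */
    unsigned int size;          /* number of buckets */
    unsigned int vmalloced;     /* non-zero if bucket was allocated with vmalloc */

    rwlock_t lock;              /* protects the buckets */

    glib_htable_hashfun_t hashfun;  /* maps data to a bucket index */
    glib_htable_cmpfun_t cmpfun;    /* non-zero when two data values match */
    glib_htable_cbfun_t cbfun;      /* optional, called on a successful lookup */
    glib_htable_freefun_t freefun;  /* optional, releases an object on delete */
};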
  1. glib_hentry abstracts the member embedded in the stored object and represents a hash entry. It can also be the entire object, in which case the embedded member is the object itself. The data member holds any data associated with the object and is used to compute the hash value. When the associated data is no larger than sizeof(void *), it can be cast and stored directly in data instead of storing its address.
  2. glib_htable abstracts the hash table, and size is the number of buckets. Because size may be large and the bucket array may need a large block of memory, vmalloc is tried when physically contiguous pages cannot be allocated; it returns memory backed by non-contiguous physical pages. The vmalloced member records which allocator was used: non-zero means vmalloc, zero means __get_free_pages. hashfun and cmpfun are the two key functions every hash table needs. cbfun is a callback invoked when a lookup succeeds, for example to print the entry or increase a reference count. freefun releases an object; this callback is provided so that an object can be released as it is removed from the hash table instead of having to be released by the caller, which adds flexibility.
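As an illustration of the two key callbacks and of storing small data directly in the data pointer, the sketch below handles integer keys. The function names are hypothetical and not part of glib_htable; the only assumption taken from the article is that cmpfun returns non-zero on a match, which is how the lookup helpers shown later test its result.

```c
#include <stdint.h>

/* Hypothetical hash function matching glib_htable_hashfun_t:
 * the key is an integer stored directly in the data pointer. */
static unsigned int int_key_hash(const void *data, unsigned int size)
{
    return (unsigned int)((uintptr_t)data % size);
}

/* Hypothetical comparison function matching glib_htable_cmpfun_t.
 * Returns non-zero when the two keys are equal. */
static int int_key_equal(const void *a, const void *b)
{
    return (uintptr_t)a == (uintptr_t)b;
}

/* Pack a small integer key into a void * instead of storing its address. */
static void *int_key_to_data(unsigned int key)
{
    return (void *)(uintptr_t)key;
}
```

With keys packed this way, an entry's data member would be set with int_key_to_data(key) before the entry is added, and the same packed value would be passed to the lookup functions.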

Main interface

In the interfaces below, ht is the glib_htable object.

● Initialization

int glib_htable_init(struct glib_htable *ht, unsigned int size, glib_htable_hashfun_t hashfun, glib_htable_cmpfun_t cmpfun);

size is the number of hash table buckets, hashfun is the hash function, and cmpfun is the comparison function. On success it returns 0 and the cbfun and freefun members of ht are set to NULL; on failure it returns ENOMEM. Because vmalloc may be used to allocate memory, it must not be called in interrupt context.

● Add

void glib_htable_add(struct glib_htable *ht, struct glib_hentry *he, int num);

Adds several objects under a single lock acquisition. he points to the objects' hash entries, and num is the number of objects.

● Find

struct glib_hentry* glib_htable_get(struct glib_htable *ht, const void *data);
struct glib_hentry* glib_htable_rget(struct glib_htable *ht, const void *data);
struct glib_hentry* glib_htable_cget(struct glib_htable *ht, const void *data, int(*cmp)(const struct glib_hentry*, void*), void *arg);
struct glib_hentry* glib_htable_crget(struct glib_htable *ht, const void *data, int(*cmp)(const struct glib_hentry*, void*), void *arg);
struct glib_hentry* glib_htable_cget_byidx(struct glib_htable *ht, unsigned int *bucket, int(*cmp)(const struct glib_hentry*, void*), void *arg);
struct glib_hentry* glib_htable_crget_byidx(struct glib_htable *ht, unsigned int *bucket, int(*cmp)(const struct glib_hentry*, void*), void *arg);

From top to bottom: forward lookup, reverse lookup, forward conditional lookup, reverse conditional lookup, forward conditional lookup starting from a given bucket, and reverse conditional lookup starting from a given bucket. data is the data associated with the object, cmp is a custom comparison function, and arg is the custom argument passed to cmp. bucket is a bucket index; when the lookup succeeds, it is updated to the index of the bucket that holds the object. All of these functions return NULL on failure.
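A conditional lookup needs a predicate with the cmp signature shown above. The sketch below is one possible predicate; struct user, its fields, and the function name are hypothetical, and the struct definitions are user-space stand-ins so the fragment compiles on its own.

```c
#include <string.h>

/* User-space stand-ins for the kernel and article types. */
struct list_head { struct list_head *next, *prev; };
struct glib_hentry { struct list_head list; void *data; };

/* Hypothetical stored object whose hash entry is its first member. */
struct user {
    struct glib_hentry hentry;
    char name[16];
    int active;
};

/* Hypothetical predicate for the cget/crget family: returns non-zero
 * when the entry should be accepted. arg carries the name to match. */
static int user_is_active_named(const struct glib_hentry *he, void *arg)
{
    /* The cast is valid because hentry is the first member of struct user. */
    const struct user *u = (const struct user *)he;

    return u->active && strcmp(u->name, (const char *)arg) == 0;
}
```

Assuming the interfaces above, such a predicate would be passed as the cmp argument of glib_htable_cget, with the name to match as arg.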

● Delete

void glib_htable_del(struct glib_htable *ht, struct glib_hentry *he, int num);
void glib_htable_del_bydata(struct glib_htable *ht, const void **data, int num);

The first deletes by hash entry, and the second deletes by the data associated with the object. num is the number of items to delete. If the freefun member of ht is not NULL, each deleted object is also released.

● Empty

void glib_htable_clear(struct glib_htable *ht);

Deletes all objects under a single lock acquisition. If the freefun member of ht is not NULL, the objects are also released.
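The add, delete, and clear semantics can be sketched in user space as follows. This is a simplified model, not the article's implementation: locking is omitted, the bucket count is fixed, the hash value is passed in directly, and every demo_ name is hypothetical.

```c
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };
struct glib_hentry { struct list_head list; void *data; };

/* Minimal user-space versions of the kernel list helpers. */
static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
}

#define DEMO_BUCKETS 4

/* Hypothetical trimmed-down table: no lock, fixed bucket count. */
struct demo_htable {
    struct list_head bucket[DEMO_BUCKETS];
    void (*freefun)(struct glib_hentry *);  /* may be NULL */
};

static void demo_init(struct demo_htable *ht)
{
    for (int i = 0; i < DEMO_BUCKETS; i++)
        list_init(&ht->bucket[i]);
    ht->freefun = NULL;
}

/* Add: link the entry into the bucket selected by its hash value. */
static void demo_add(struct demo_htable *ht, struct glib_hentry *he, unsigned int hash)
{
    list_add(&he->list, &ht->bucket[hash % DEMO_BUCKETS]);
}

/* Delete: always unlink, release only if a freefun was installed. */
static void demo_del(struct demo_htable *ht, struct glib_hentry *he)
{
    list_del(&he->list);
    if (ht->freefun)
        ht->freefun(he);
}

/* Clear: delete every entry of every bucket. */
static void demo_clear(struct demo_htable *ht)
{
    for (int i = 0; i < DEMO_BUCKETS; i++)
        while (ht->bucket[i].next != &ht->bucket[i])
            demo_del(ht, (struct glib_hentry *)ht->bucket[i].next);
}

static int demo_freed;  /* counts freefun calls, for illustration only */

static void demo_count_free(struct glib_hentry *he)
{
    free(he);
    demo_freed++;
}
```

Leaving freefun as NULL suits objects that are owned elsewhere, such as stack or statically allocated ones; installing a freefun hands ownership to the table at delete time.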

● Destroy

void glib_htable_free(struct glib_htable *ht);

Releases only the memory occupied by the buckets, so it should be called after glib_htable_clear. Because the memory may be released with vfree, it must not be called in interrupt context.
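The two-stage bucket allocation described for initialization, and the matching release in destroy, can be modeled in user space as below. The demo_ names and the size limit are hypothetical; malloc stands in for both __get_free_pages and vmalloc, and the negative error code follows the usual kernel convention, which is an assumption since the article only says ENOMEM.

```c
#include <errno.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical trimmed-down glib_htable. */
struct demo_htable {
    struct list_head *bucket;
    unsigned int size;
    unsigned int vmalloced;
};

#define DEMO_CONTIG_LIMIT 4096  /* hypothetical limit for the "contiguous" allocator */

/* Stand-in for __get_free_pages: fails for large requests. */
static void *demo_alloc_contig(size_t bytes)
{
    return bytes <= DEMO_CONTIG_LIMIT ? malloc(bytes) : NULL;
}

/* Stand-in for vmalloc. */
static void *demo_alloc_fallback(size_t bytes)
{
    return malloc(bytes);
}

static int demo_htable_init(struct demo_htable *ht, unsigned int size)
{
    size_t bytes = (size_t)size * sizeof(struct list_head);
    unsigned int i;

    ht->vmalloced = 0;
    ht->bucket = demo_alloc_contig(bytes);
    if (!ht->bucket) {
        /* Contiguous allocation failed: fall back and remember which path was taken. */
        ht->vmalloced = 1;
        ht->bucket = demo_alloc_fallback(bytes);
    }
    if (!ht->bucket)
        return -ENOMEM;  /* sign is an assumption (kernel convention) */

    /* Every bucket starts as an empty circular list. */
    for (i = 0; i < size; i++) {
        ht->bucket[i].next = &ht->bucket[i];
        ht->bucket[i].prev = &ht->bucket[i];
    }
    ht->size = size;
    return 0;
}

static void demo_htable_free(struct demo_htable *ht)
{
    /* Kernel code would choose vfree or free_pages based on ht->vmalloced;
     * both stand-ins here use malloc, so free covers either case. */
    free(ht->bucket);
    ht->bucket = NULL;
    ht->size = 0;
}
```

Recording the allocator in vmalloced is what lets the destroy step pick the matching release function without the caller having to remember how the buckets were obtained.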

Interface implementation

The other interfaces are relatively simple to implement, so they are not explained here. For the lookup interfaces, adding a parameter to select the traversal direction would halve the number of interfaces, but every call, especially inside a loop, would then perform an unnecessary direction check and lose performance. Forward and reverse traversal therefore each get their own interface, just like strchr and strrchr in the C library or iterator and reverse_iterator in C++ containers, which is also clearer. Apart from the traversal direction the code is identical, so three sets of macros are used to generate it and avoid duplicating code by hand.

Helper function macro generation

/* Generates an unlocked lookup helper that compares entries with ht->cmpfun. */
#define DEFINE_GLIB_HTABLE_GET_HELP(name) \
static struct glib_hentry* __glib_htable_##name(struct glib_htable *ht, unsigned int hash, const void *data) \
{ \
    struct glib_hentry *he; \
\
    glib_htable_list_##name(he, &ht->bucket[hash], list) { \
        if (ht->cmpfun(he->data, data)) { \
            if (ht->cbfun) \
                ht->cbfun(he); \
            return he; \
        } \
    } \
\
    return NULL; \
}

DEFINE_GLIB_HTABLE_GET_HELP(get)
DEFINE_GLIB_HTABLE_GET_HELP(rget)

#define DEFINE_GLIB_HTABLE_COND_GET_HELP(name) \
static struct glib_hentry* __glib_htable_c##name(struct glib_htable *ht, unsigned int hash, int(*cmp)(const struct glib_hentry*, void*), void *arg) \
{ \
    struct glib_hentry *he; \
\
    glib_htable_list_##name(he,&ht->bucket[hash],list){ \
        if(cmp(he, arg)){ \
            if(ht->cbfun) \
                ht->cbfun(he); \
            return he; \
        } \
    } \
\
    return NULL; \
}

DEFINE_GLIB_HTABLE_COND_GET_HELP(get)
DEFINE_GLIB_HTABLE_COND_GET_HELP(rget)

The generating macros are DEFINE_GLIB_HTABLE_GET_HELP and DEFINE_GLIB_HTABLE_COND_GET_HELP. They expand to four unlocked functions, __glib_htable_get, __glib_htable_rget, __glib_htable_cget, and __glib_htable_crget, on which the corresponding locked interfaces are built. glib_htable_list_get and glib_htable_list_rget are aliases for the macros list_for_each_entry and list_for_each_entry_reverse, respectively.
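The aliasing can be tried outside the kernel with the sketch below. The two iteration macros are simplified user-space versions of the kernel ones (they rely on the GCC/Clang __typeof__ extension), and first_forward and first_reverse are hypothetical helpers that only show which entry each direction visits first.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };
struct glib_hentry { struct list_head list; void *data; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified forms of the kernel iteration macros. */
#define list_for_each_entry(pos, head, member) \
    for (pos = container_of((head)->next, __typeof__(*pos), member); \
         &pos->member != (head); \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

#define list_for_each_entry_reverse(pos, head, member) \
    for (pos = container_of((head)->prev, __typeof__(*pos), member); \
         &pos->member != (head); \
         pos = container_of(pos->member.prev, __typeof__(*pos), member))

/* The aliases that glib_htable_list_##name resolves to. */
#define glib_htable_list_get  list_for_each_entry
#define glib_htable_list_rget list_for_each_entry_reverse

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Return the data of the first entry visited in forward order. */
static void *first_forward(struct list_head *bucket)
{
    struct glib_hentry *he;

    glib_htable_list_get(he, bucket, list)
        return he->data;
    return NULL;
}

/* Return the data of the first entry visited in reverse order. */
static void *first_reverse(struct list_head *bucket)
{
    struct glib_hentry *he;

    glib_htable_list_rget(he, bucket, list)
        return he->data;
    return NULL;
}
```

Because the alias is an object-like macro that names a function-like macro, the token produced by pasting name onto glib_htable_list_ is replaced and then rescanned together with its argument list, so the generator needs no direction check at run time.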

Normal find macro generation

#define DEFINE_GLIB_HTABLE_GET(name) \
struct glib_hentry* glib_htable_##name(struct glib_htable *ht, const void *data) \
{ \
    struct glib_hentry *he; \
    unsigned int h = ht->hashfun(data,ht->size); \
\
    read_lock_bh(&ht->lock); \
    he = __glib_htable_##name(ht, h, data); \
    read_unlock_bh(&ht->lock); \
\
    return he; \
}

DEFINE_GLIB_HTABLE_GET(get)
DEFINE_GLIB_HTABLE_GET(rget)

These interfaces are implemented by calling the helper functions __glib_htable_get and __glib_htable_rget. The generating macro is DEFINE_GLIB_HTABLE_GET, which expands to the glib_htable_get and glib_htable_rget interfaces.

Conditional search macro generation

#define DEFINE_GLIB_HTABLE_COND_GET(name) \
struct glib_hentry* glib_htable_c##name(struct glib_htable *ht, const void *data, int(*cmp)(const struct glib_hentry*, void*), void *arg) \
{ \
    struct glib_hentry *he;    \
    unsigned int h = ht->hashfun(data,ht->size); \
\
    read_lock_bh(&ht->lock); \
    he = __glib_htable_c##name(ht, h, cmp, arg); \
    read_unlock_bh(&ht->lock); \
\
    return he; \
}

DEFINE_GLIB_HTABLE_COND_GET(get)
DEFINE_GLIB_HTABLE_COND_GET(rget)

#define DEFINE_GLIB_HTABLE_COND_GET_BYIDX(name) \
struct glib_hentry* glib_htable_c##name##_byidx(struct glib_htable *ht, unsigned int *bucket, int(*cmp)(const struct glib_hentry*, void*), void *arg) \
{ \
    unsigned int h; \
    struct glib_hentry *he = NULL; \
\
    read_lock_bh(&ht->lock); \
\
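    /* Pre-increment keeps *bucket equal to the bucket being searched, */ \
    /* so on success *bucket is the index of the bucket holding the entry. */ \
    for (h = *bucket; h < ht->size; h = ++(*bucket)) { \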
        he = __glib_htable_c##name(ht, h, cmp, arg); \
        if(he) \
            break; \
    } \
\
    read_unlock_bh(&ht->lock); \
\
    return he; \
}

DEFINE_GLIB_HTABLE_COND_GET_BYIDX(get)
DEFINE_GLIB_HTABLE_COND_GET_BYIDX(rget)

The first macro, DEFINE_GLIB_HTABLE_COND_GET, calls the helper functions __glib_htable_cget and __glib_htable_crget and expands to the glib_htable_cget and glib_htable_crget interfaces. The second macro, DEFINE_GLIB_HTABLE_COND_GET_BYIDX, calls the same helpers once per bucket and expands to the glib_htable_cget_byidx and glib_htable_crget_byidx interfaces.
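The bucket-scanning loop of the _byidx variants can be checked in isolation with a small user-space model. This is a sketch under stated assumptions: the demo_ functions are hypothetical, and a plain array of flags stands in for the per-bucket conditional lookup. It shows one way to write the scan so that, as the interface description requires, *bucket ends up as the index of the bucket that holds the match.

```c
/* Hypothetical stand-in for a per-bucket conditional lookup:
 * bucket h "matches" when hits[h] is non-zero. */
static int demo_bucket_matches(const int *hits, unsigned int h)
{
    return hits[h] != 0;
}

/* Scan buckets starting at *bucket. On success, *bucket is the index of
 * the matching bucket; on failure, *bucket equals size. */
static int demo_find_byidx(const int *hits, unsigned int size, unsigned int *bucket)
{
    unsigned int h;

    for (h = *bucket; h < size; h = ++(*bucket)) {
        if (demo_bucket_matches(hits, h))
            return 1;
    }
    return 0;
}

/* Count matching buckets by resuming the scan after each hit. */
static unsigned int demo_count_matching_buckets(const int *hits, unsigned int size)
{
    unsigned int bucket = 0, n = 0;

    while (demo_find_byidx(hits, size, &bucket)) {
        n++;
        bucket++;  /* resume with the bucket after the one that matched */
    }
    return n;
}
```

Resuming this way moves on to the next bucket, so it finds at most one match per bucket; enumerating several matches inside one bucket would need a predicate that skips entries already seen.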

Full source download: glib_hash, which contains the files glib_htable.h and glib_htable.c.
