
LRU Cache-Like Array-Based Linked List

2023-02-20 - General Programming

In some circumstances, we need a fixed-size list that behaves like an LRU cache. Most implementations record the latest access time in each element of an array. This makes most operations (adding entries, finding the most recent entry, etc.) run in linear time, O(n), which is not the best solution.

The reason that recording the latest access time of each element takes linear time in most operations is that we must search the array linearly: the array is unordered. Instead, we can implement a new kind of data structure that works like an LRU cache, using a doubly-linked list laid over the array.

Overview

Each node of the linked list looks like:

typedef struct _lru_cache_list
{
    struct _lru_cache_list *next;
    struct _lru_cache_list *prev;
    // Data...
}lru_cache_list,*lru_cache_list_p;

lru_cache_list_p list_pool,list_head,list_tail;

This doubly-linked list is acyclic, meaning it does not work like a ring. The head node's prev must be NULL, and the tail node's next must be NULL. The head node is the most recently used node, while the tail node is the least recently used node, or an unused one.

On initialization, allocate an array as the pool that backs the cache, and link its elements into one chain from head to tail.
On finalization, you just need to free the pool.
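As a sketch of those two steps (the names lru_init, lru_final, and the int payload are assumptions for illustration, not part of the original), initialization might chain the whole pool from head to tail:

```c
#include <stdlib.h>
#include <stddef.h>

typedef struct _lru_cache_list
{
    struct _lru_cache_list *next;
    struct _lru_cache_list *prev;
    int data; // Placeholder payload.
}lru_cache_list,*lru_cache_list_p;

lru_cache_list_p list_pool,list_head,list_tail;

// Hypothetical initializer: allocate the pool and link every
// element into one doubly-linked chain; all nodes start unused.
int lru_init(size_t n)
{
    list_pool=calloc(n,sizeof(lru_cache_list));
    if(!list_pool)return -1;
    for(size_t i=0;i<n;i++)
    {
        list_pool[i].prev=i?&list_pool[i-1]:NULL;
        list_pool[i].next=(i+1<n)?&list_pool[i+1]:NULL;
    }
    list_head=&list_pool[0];
    list_tail=&list_pool[n-1];
    return 0;
}

// Hypothetical finalizer: the whole cache is one allocation.
void lru_final(void)
{
    free(list_pool);
    list_pool=list_head=list_tail=NULL;
}
```

Because the pool is a single array, finalization is a single free, no per-node cleanup is needed.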

Insertion

Following the LRU logic, the least recently used node (or an unused node) is replaced by the new entry.
In other words, the tail node is detached and becomes the new head.

void insert(...)
{
    // Pick the least recently used node: the tail.
    // (Assumes the pool holds at least two nodes.)
    lru_cache_list_p node=tail,tailx=tail->prev;
    // Detach it and make its predecessor the new tail.
    tailx->next=NULL;
    tail=tailx;
    // Link the node in front of the head.
    node->prev=NULL;
    node->next=head;
    head->prev=node;
    head=node;
    // Set the data....
}

Reference

Referencing a node means it is being used. In other words, the node just referenced must be moved to the head.

void reference(lru_cache_list_p node)
{
    // Nothing to do if it is already the most recently used.
    if(node==head)return;
    // Keep the tail pointer valid if the node is the tail.
    if(node==tail)tail=node->prev;
    // Unlink the node.
    if(node->next)node->next->prev=node->prev;
    if(node->prev)node->prev->next=node->next;
    // Place the node at the head.
    node->prev=NULL;
    node->next=head;
    // Reset the head.
    head->prev=node;
    head=node;
}

Removal

Removing a node means it is no longer used. In other words, a removed node must be moved to the tail. If you transfer the data elsewhere during the removal, this operation becomes an eviction.

void remove(lru_cache_list_p node)
{
    // Nothing to do if it is already the least recently used.
    if(node==tail)return;
    // Keep the head pointer valid if the node is the head.
    if(node==head)head=node->next;
    // Unlink the node.
    if(node->next)node->next->prev=node->prev;
    if(node->prev)node->prev->next=node->next;
    // Place the node at the tail.
    node->next=NULL;
    node->prev=tail;
    // Reset the tail.
    tail->next=node;
    tail=node;
}

Search

Unfortunately, due to the nature of the linked list and the out-of-order array, searching for a node with specific data takes linear time. However, the structure is perfect at keeping track of the age of each node without recording timestamps.
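A minimal sketch of that linear search (the name find and the int payload are assumptions): walk from the most recently used node toward the least recently used one, O(n) in the pool size.

```c
#include <stddef.h>

typedef struct _lru_cache_list
{
    struct _lru_cache_list *next;
    struct _lru_cache_list *prev;
    int data; // Placeholder payload.
}lru_cache_list,*lru_cache_list_p;

lru_cache_list_p list_head;

// Hypothetical linear search: follow next pointers from the head.
// Hot entries are found early; the worst case visits every node.
lru_cache_list_p find(int data)
{
    for(lru_cache_list_p node=list_head;node;node=node->next)
        if(node->data==data)return node;
    return NULL;
}
```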

AVL Cache-List

We may further extend the data structure of the cache list. For example, an AVL tree can be embedded in it:

typedef struct _avl_cache_list
{
    // AVL tree structure
    struct _avl_cache_list *bst_parent;
    struct _avl_cache_list *bst_left;
    struct _avl_cache_list *bst_right;
    size_t height;
    // Cache List structure
    struct _avl_cache_list *cl_next;
    struct _avl_cache_list *cl_prev;
    // Data Content
}avl_cache_list,*avl_cache_list_p;

// Allocation goes here.
avl_cache_list_p node_pool,root,head,tail;

However, due to the nature of the AVL tree, the insert operation takes logarithmic time, rather than the constant time of the original design.

Bucket Cache-List

We may extend the cache list with buckets so that searching for a piece of data takes constant time. The trade-off is that hash collisions increase the loss rate: a colliding insertion evicts a live entry.

typedef struct _lru_cache_list
{
    struct _lru_cache_list *next;
    struct _lru_cache_list *prev;
    // Data...
}lru_cache_list,*lru_cache_list_p;

lru_cache_list_p list_pool,list_head,list_tail;
lru_cache_list_p list_bucket[BUCKET_COUNT]; // BUCKET_COUNT: chosen number of buckets.

You may notice we added a list_bucket array of pointers to nodes. To search, hash the data and use the hash value as the index into this bucket array.
We can also make it n-way set associative, like a real modern hardware cache, by adding a few more bucket arrays. This reduces the loss rate but increases search time.
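A sketch of the bucket lookup (the modulo hash, the names bucket_hash and bucket_find, and the unsigned key field are assumptions for illustration): one hash, one slot probe, so the search is O(1). A collision can leave a different key in the slot, so the key must be verified.

```c
#include <stddef.h>

#define BUCKET_COUNT 8 // Assumed number of buckets.

typedef struct _lru_cache_list
{
    struct _lru_cache_list *next;
    struct _lru_cache_list *prev;
    unsigned key; // Placeholder for the hashable part of the data.
}lru_cache_list,*lru_cache_list_p;

lru_cache_list_p list_bucket[BUCKET_COUNT];

// Hypothetical hash: key modulo the bucket count.
static size_t bucket_hash(unsigned key){return key%BUCKET_COUNT;}

// Constant-time search: probe one slot, then verify the key,
// since two different keys can hash to the same bucket.
lru_cache_list_p bucket_find(unsigned key)
{
    lru_cache_list_p node=list_bucket[bucket_hash(key)];
    return (node&&node->key==key)?node:NULL;
}
```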

Insertion hashes the data being inserted and replaces the node stored in that bucket. The displaced node is then moved to the tail. This takes constant time.
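The bucket side of that insertion might look like the sketch below (bucket_replace, BUCKET_COUNT, and the unsigned key field are assumed names): the new node claims the slot, and the displaced node, if any, is returned so the caller can move it to the tail of the cache list.

```c
#include <stddef.h>

#define BUCKET_COUNT 8 // Assumed number of buckets.

typedef struct _lru_cache_list
{
    struct _lru_cache_list *next;
    struct _lru_cache_list *prev;
    unsigned key; // Placeholder for the hashable part of the data.
}lru_cache_list,*lru_cache_list_p;

lru_cache_list_p list_bucket[BUCKET_COUNT];

// Hypothetical bucket update for insertion: store the new node in
// its slot and hand back whatever node it displaced, so the caller
// can demote the displaced node to the tail.
lru_cache_list_p bucket_replace(lru_cache_list_p node)
{
    size_t i=node->key%BUCKET_COUNT;
    lru_cache_list_p old=list_bucket[i];
    list_bucket[i]=node;
    return old;
}
```

Note that the displaced node may hold a different key that merely collided with the new one; that collision is exactly the loss described above.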

The reference operation does not involve the bucket at all.
