How ChatGPT and I Built an LRU Cache That Sped Up My App 10x

Last month, I was building a data visualization dashboard for my academic mini project when I hit a wall. Every time users interacted with the charts, my app was making the same expensive API calls over and over again. Response times were creeping beyond 3 seconds, and my professor was giving me that look – you know, the one that silently judges your life choices.
WARNING
This post may contain some slightly exaggerated conversations with ChatGPT. Things are a bit more dramatic when you're sleep-deprived and frantically coding at 2 AM. But hey, that's the magic of storytelling, right? 😜
It was midnight, three days before the deadline, when I finally broke down and decided to ask ChatGPT for help. Little did I know this would lead to a bizarre 4-hour coding adventure with my new AI buddy.
The problem wasn't implementing any cache — it was figuring out what to do when the cache inevitably filled up. With limited memory on my deployment instance, I needed something smarter than a basic dictionary that would keep growing forever.

Table of Contents
- The Epic Facepalm Moment
- Why Your App Probably Needs Caching Too
- The Deceptive Challenge: Making It Fast
- Step-by-Step Implementation
- The Performance Secret Sauce
- Tech You Use Daily That Secretly Runs on LRU Caches
- Advanced LRU Variants (That Can Make You Sound Smart)
- What I Learned: Small Code, Big Impact
- The Final Metric That Made My Professor Smile
- Your Turn: Level Up This Code
The Epic Facepalm Moment
After some time Googling "best caching algorithms," I found my answer: the LRU cache — the unsung hero behind pretty much every fast application we use daily. But when I shared my discovery with ChatGPT, it had the digital equivalent of a facepalm moment.
What's wild is how simple yet brilliant the concept is:
Just kick out whatever hasn't been used in the longest time.
That's it. That's the whole magic trick. And suddenly my dashboard went from turtle-speed to cheetah-mode.
Why Your App Probably Needs Caching Too
We all take caching for granted, but it's low-key the reason the internet doesn't feel like dial-up in 2025:
- Your browser doesn't re-download images when you hit the back button
- Instagram doesn't fetch the same photos repeatedly as you scroll
- Spotify doesn't stream the same song chunks multiple times
- Redis, Memcached, and every modern database rely on it
# Example: a Fibonacci function, timed with and without caching (using timeit)
import timeit

# Without caching
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(timeit.timeit('fibonacci(10)', globals=globals()))  # 12.990135460997408

# With caching
def fibonacci_cached(n, cache={}):
    if n in cache:
        return cache[n]
    if n <= 1:
        return n
    cache[n] = fibonacci_cached(n-1, cache) + fibonacci_cached(n-2, cache)
    return cache[n]

print(timeit.timeit('fibonacci_cached(10)', globals=globals()))  # 0.09349425700202119
The speed difference isn't subtle either. We're talking memory access (~100ns) vs. network requests (~100ms) — that's a million times faster!
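By the way, Python ships this exact pattern in the standard library: functools.lru_cache, a decorator that memoizes calls and caps the cache at a fixed maxsize, evicting the least recently used entries when it fills up. Here's a quick sketch of the same Fibonacci idea using it — my own variation, not part of the original benchmark:

from functools import lru_cache

@lru_cache(maxsize=128)  # Keep at most 128 results; evict the least recently used
def fibonacci_lru(n):
    if n <= 1:
        return n
    return fibonacci_lru(n - 1) + fibonacci_lru(n - 2)

print(fibonacci_lru(10))           # 55
print(fibonacci_lru.cache_info())  # CacheInfo(hits=8, misses=11, maxsize=128, currsize=11)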
But here's the real question that stumped me: When your cache fills up, what do you throw away?
Think about your kitchen. The pans and ingredients you used yesterday are probably sitting in the front of your cabinets, while that specialty cake mold you last used three years ago is collecting dust in the back. If you need to make space, which would you move to storage first?
The LRU cache operates on the same intuition. It keeps track of the order in which items were accessed and, when space is needed, it discards the least recently used items first.
This approach works remarkably well for many real-world access patterns because of a property called temporal locality — if a piece of data was accessed recently, it's likely to be accessed again soon.
The Deceptive Challenge: Making It Fast
Here's where I got tripped up. The concept is simple, but making it fast is tricky. We need three operations to be lightning quick:
- Looking up data — O(1) time
- Marking something as "recently used" — O(1) time
- Finding and removing the oldest item — O(1) time
My first thought was a Python list to track order. Total disaster! Every time you access an item, you have to shuffle the entire list (an O(n) operation). With just 1,000 items, it was already painfully slow.
The elegant hack? Combine two data structures:
- A hash map (dictionary) for instant lookups
- A doubly linked list to track and update access order
This combo is like peanut butter and jelly — separately good, together magic.
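Side note before we roll our own: Python's collections.OrderedDict already pairs a hash map with a doubly linked list under the hood, so you can get LRU behavior in a handful of lines. Here's a minimal sketch — not the implementation we'll build below, just proof that the combo is a known trick:

from collections import OrderedDict

class TinyLRU:
    """Minimal LRU cache sketch on top of OrderedDict (hash map + linked list)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)  # Mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # Evict the least recently used item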
Let's code this bad boy from scratch!
Step-by-Step Implementation
Step 1: Create the Building Block - The Node Class
First, let's create our Node class — the building block of our linked list:
class Node:
    def __init__(self, key, value):
        self.key = key      # Store the key so we can remove it from the dictionary later
        self.value = value  # The actual data we're caching
        self.prev = None    # Pointer to the previous node
        self.next = None    # Pointer to the next node
👆 Nothing fancy here. Just a basic node with pointers in both directions.
Step 2: Build the Main LRU Cache Class
Now for the main event — our LRU Cache class with all its magic:
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity  # How many items we can store max
        self.cache = {}           # Our dictionary for O(1) lookups
        # Quick hack: create dummy head and tail nodes
        # This eliminates edge cases when adding/removing nodes
        self.head = Node(0, 0)  # Dummy head
        self.tail = Node(0, 0)  # Dummy tail
        # Connect them
        self.head.next = self.tail
        self.tail.prev = self.head

    # Internal helper: add a new node right after HEAD (most recently used)
    def _add_node(self, node):
        # Always add the new node right after head (the most recently used spot)
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node

    # Internal helper: remove a node from anywhere in the list
    def _remove_node(self, node):
        prev = node.prev
        next_node = node.next
        # Bypass the node
        prev.next = next_node
        next_node.prev = prev

    # Internal helper: mark as recently used by moving to the front
    def _move_to_head(self, node):
        self._remove_node(node)  # Take it out
        self._add_node(node)     # Put it at the front

    # Internal helper: remove the least recently used node (just before TAIL)
    def _pop_tail(self):
        res = self.tail.prev  # The LRU item is right before tail
        self._remove_node(res)
        return res

    # Public API: get a value (and mark it as recently used)
    def get(self, key):
        # Cache miss
        if key not in self.cache:
            return -1
        # Cache hit
        node = self.cache[key]
        self._move_to_head(node)  # Mark as recently used
        return node.value

    # Public API: add or update a value
    def put(self, key, value):
        # If the key exists, update its value
        if key in self.cache:
            node = self.cache[key]
            node.value = value
            self._move_to_head(node)  # Mark as recently used
        else:
            # New key, need to add it
            new_node = Node(key, value)
            self.cache[key] = new_node  # Add to the dictionary
            self._add_node(new_node)    # Add to the linked list
            # Over capacity? Evict the least recently used
            if len(self.cache) > self.capacity:
                lru = self._pop_tail()
                del self.cache[lru.key]  # Remove from the dictionary too
The beauty here is that every operation is O(1) time complexity. No searching, no shifting arrays — just pointer manipulation and dictionary lookups.
Step 3: See It In Action
Let's take our cache for a spin and watch the eviction magic happen:
# Create a tiny cache that holds just 3 items
cache = LRUCache(3)
# Fill it up
cache.put("api_result1", {"user": "john", "score": 95})
cache.put("api_result2", {"user": "sarah", "score": 98})
cache.put("api_result3", {"user": "mike", "score": 91})
# Access the first item - this marks it as recently used
print(cache.get("api_result1")) # Output: {"user": "john", "score": 95}
# Add a 4th item - this should kick out api_result2 (least recently used)
cache.put("api_result4", {"user": "lisa", "score": 99})
# Verify that api_result2 got evicted
print(cache.get("api_result2")) # Output: -1 (not in cache)
# Check other items (and change their "recently used" order)
print(cache.get("api_result1")) # Still there!
print(cache.get("api_result3")) # Still there!
# Add a 5th item - should kick out api_result4 (now the LRU)
cache.put("api_result5", {"user": "alex", "score": 93})
# Verify that api_result4 got evicted
print(cache.get("api_result4")) # Output: -1 (not in cache)
🧠 Pro Tip: In my real project, ChatGPT suggested adding hit/miss tracking:
def get(self, key):
    if key not in self.cache:
        self.misses += 1  # Track cache misses
        return -1
    self.hits += 1  # Track cache hits
    # ...rest of the method...
    # (assumes self.hits = 0 and self.misses = 0 were added in __init__)
After implementing this, I discovered my cache had a 92% hit rate! That's why my app suddenly felt so much faster.
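If you'd rather not edit get() in place, one way to package the same idea is a thin wrapper subclass. This is a sketch of my own — the InstrumentedLRUCache name and the hit_rate() helper aren't part of the original code:

class InstrumentedLRUCache(LRUCache):
    def __init__(self, capacity):
        super().__init__(capacity)
        self.hits = 0    # Lookups served straight from the cache
        self.misses = 0  # Lookups that fell through to the slow path

    def get(self, key):
        value = super().get(key)
        # Note: -1 is the miss sentinel, so this sketch can't cache a literal -1
        if value == -1:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def hit_rate(self):
        """Fraction of get() calls served from the cache."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Usage after running a workload:
# print(f"Hit rate: {cache.hit_rate():.0%}")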
The Performance Secret Sauce
Here's why this implementation makes my app lightning fast:
Operation | Time Complexity | Why It's Fast |
---|---|---|
get(key) | O(1) | Dictionary lookup + pointer manipulation |
put(key, value) | O(1) | Same as get + possible O(1) eviction |
Memory Usage | O(capacity) | Strictly limited to capacity items |
My dashboard went from 3-second refreshes to near-instant responses. The professor's eyebrow went down, and my stress levels followed.
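If you want to sanity-check the O(1) claim yourself, here's a rough micro-benchmark sketch. The numbers will vary by machine; the point is that the per-operation cost stays roughly flat as the cache grows:

import timeit

def bench(capacity, ops=10_000):
    cache = LRUCache(capacity)
    for i in range(capacity):
        cache.put(i, i)  # Pre-fill to capacity
    # Time a mix of a hit and an update/evicting put
    return timeit.timeit(
        lambda: (cache.get(capacity // 2), cache.put(capacity + 1, 0)),
        number=ops,
    ) / ops

for cap in (1_000, 10_000, 100_000):
    print(f"capacity={cap:>7}: ~{bench(cap) * 1e6:.2f} µs per get+put")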
Tech You Use Daily That Secretly Runs on LRU Caches
Once you know about LRU caches, you start seeing them everywhere:
🌐 Chrome's Memory Management
Ever wonder how Chrome decides which website data to keep? It's using an LRU variant. When you hit the back button and the page loads instantly, thank this algorithm!
💾 Every Major Database System
MySQL, Postgres, MongoDB — they all use LRU caching for buffer pool management. Those lightning-fast queries on frequently accessed data? Yep, LRU magic.
⚡️ Redis (That Thing Powering Most Modern Apps)
Redis uses an approximated LRU — it samples a handful of keys rather than tracking exact access order — for its maxmemory-policy when configured with "volatile-lru" or "allkeys-lru". That's why it can handle millions of requests without breaking a sweat. (Memcached, another popular in-memory key-value store, also leans on LRU-style eviction.)
🚀 Cloudflare's Edge Network
Those millisecond response times from Cloudflare? They're using LRU-inspired algorithms to keep hot content at their edge locations.
💻 Your Computer's RAM Management
Your OS uses LRU variations to decide which memory pages to swap to disk when RAM gets full. The most mind-blowing part? This algorithm, created in the 1960s, is still the backbone of modern computing speed.
Advanced LRU Variants (That Can Make You Sound Smart)
After my project submission, I fell down the LRU rabbit hole and discovered some cool variants:
LRU-K: The "K" Stands for "Kinda Mind-Blowing"
Instead of just considering the most recent access, LRU-K looks at the K most recent accesses. Great for spotting access patterns:
# LRU-2 example (simplified concept)
class HistoryTracker:
    def __init__(self):
        self.access_history = {}  # Key → list of the last K access times

    def record_access(self, key, timestamp):
        if key not in self.access_history:
            self.access_history[key] = []
        history = self.access_history[key]
        history.append(timestamp)
        # Keep only the last 2 accesses
        if len(history) > 2:
            history.pop(0)

    def get_second_last_access(self, key):
        history = self.access_history.get(key, [])
        if len(history) < 2:
            return float('-inf')  # Not enough history
        return history[0]
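Here's a quick usage sketch showing how the tracker's history could drive an LRU-2 eviction decision. The key names and the min()-based rule are my own illustration, not part of the tracker above:

tracker = HistoryTracker()
tracker.record_access("home_feed", timestamp=1)
tracker.record_access("settings", timestamp=2)
tracker.record_access("home_feed", timestamp=3)

# LRU-2 evicts the key whose second-most-recent access is oldest
# (keys with fewer than 2 accesses report -inf, so they go first)
candidates = ["home_feed", "settings"]
victim = min(candidates, key=tracker.get_second_last_access)
print(victim)  # settings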
This helped Netflix predict which shows users might rewatch! 🍿
TLRU: For When Stuff Has an Expiration Date
Like milk in your fridge, some cached data needs to expire (Time-aware LRU or TLRU):
import time  # Needed for time.time() below

class TLRUCache(LRUCache):
    def put(self, key, value, ttl_seconds=None):
        expiry = time.time() + ttl_seconds if ttl_seconds else None
        # TLRUNode is assumed to be a Node subclass that also stores expiry_time
        node = TLRUNode(key, value, expiry_time=expiry)
        # ...rest of implementation...

    def get(self, key):
        if key in self.cache:
            node = self.cache[key]
            # Check whether the entry has expired
            if node.expiry_time and time.time() > node.expiry_time:
                self._remove_node(node)
                del self.cache[key]
                return -1
            # Not expired, proceed normally
            self._move_to_head(node)
            return node.value
        return -1
Two-Queue: The VIP Section Approach
Some items deserve special treatment. The Two-Queue algorithm creates a "VIP area" for frequently accessed items:
class TwoQCache:
    def __init__(self, capacity):
        # 20% for new items (probation), 80% for frequently accessed (protected)
        self.probation = LRUCache(capacity // 5)
        self.protected = LRUCache(capacity - (capacity // 5))

    def put(self, key, value):
        # New items land in the probation segment first
        self.probation.put(key, value)

    def get(self, key):
        # Check the VIP (protected) section first
        value = self.protected.get(key)
        if value != -1:
            return value
        # Then check the new-arrivals (probation) section
        value = self.probation.get(key)
        if value != -1:
            # Promote to the VIP section on the second access
            # (assumes LRUCache also exposes a remove(key) helper,
            # which the basic version above doesn't have yet)
            self.probation.remove(key)
            self.protected.put(key, value)
            return value
        return -1  # Not found in either segment
This kind of two-tier caching (Two-Queue, or the closely related Segmented LRU) is part of how services like Instagram keep feeds snappy despite billions of photos!
What I Learned: Small Code, Big Impact
The most profound lesson from my LRU cache adventure wasn't just the algorithm itself, but a broader realization:
Sometimes the most elegant solutions come from combining simple tools in clever ways.
This is why I'm now obsessed with foundational data structures. My approach to problem-solving has completely changed:
- Break down the requirements — What operations need to be fast?
- Mix and match basic data structures — Can I combine a hashmap with something else?
- Measure everything — my cache hit rate jumped from 65% to 92% after tweaking capacity (see the measurement sketch below)
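Here's roughly what that measurement step can look like: a sketch that replays a synthetic key sequence (with built-in temporal locality) against the LRUCache at a few capacities. The workload and numbers are illustrative, not my project's real traffic:

import random

def measure_hit_rate(capacity, keys):
    cache, hits = LRUCache(capacity), 0
    for k in keys:
        if cache.get(k) != -1:
            hits += 1
        else:
            cache.put(k, f"value-for-{k}")  # Simulate fetching and caching on a miss
    return hits / len(keys)

# Fake workload with temporal locality: each request picks from the ~50 most recent keys
random.seed(42)
keys = [random.choice(range(max(0, i - 50), i + 1)) for i in range(5_000)]

for cap in (16, 64, 256):
    print(f"capacity={cap:>3}: hit rate {measure_hit_rate(cap, keys):.0%}")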
For a production environment, I'd add:
- Thread safety (locks or concurrent collections) — see the sketch after this list
- Statistics tracking (hit/miss rates)
- Size-aware caching (based on memory footprint)
- Smarter eviction policies
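On the thread-safety point, the simplest option is a coarse lock around every operation. This is a sketch under that assumption, not battle-tested production code:

import threading

class ThreadSafeLRUCache(LRUCache):
    """Coarse-grained locking: one lock guards both the dict and the linked list."""
    def __init__(self, capacity):
        super().__init__(capacity)
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return super().get(key)

    def put(self, key, value):
        with self._lock:
            super().put(key, value)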
But even the simple version was enough to save my project.
The Final Metric That Made My Professor Smile
The numbers don't lie. After implementing the LRU cache:
Metric | Before | After | Improvement |
---|---|---|---|
Avg API Response | 2,840ms | 280ms | 10x faster |
95th Percentile | 4,120ms | 310ms | 13x faster |
Server Load | 82% | 24% | 70% less |
My professor's exact words: "This is production quality work."
Not bad for something that took about 100 lines of code!
Your Turn: Level Up This Code
Before you go, try one of these challenges on the basic LRU cache we built:
Challenge 1: Add Cache Debugging
def debug_print_cache_state(self):
    """Print the current state of the cache from most to least recently used"""
    items = []
    current = self.head.next
    while current != self.tail:
        items.append(f"{current.key}: {current.value}")
        current = current.next
    print(f"Cache ({len(self.cache)}/{self.capacity}):", " → ".join(items))
Challenge 2: Implement TTL Expiration
def put_with_ttl(self, key, value, ttl_seconds):
    """Add a key with time-to-live in seconds"""
    # Hint: Store the expiration timestamp with the value
    # and check it during get operations
    expiry = time.time() + ttl_seconds
    # Your implementation here...
Challenge 3: Build an Analytics Dashboard
Track these metrics in your cache and display them:
- Hit/miss ratio
- Average lookup time
- Most frequently accessed keys
- Longest-held cache entries
Drop a comment with your solution or show me how you're using LRU caching in your projects! I'm @the_screenager on X.
Remember the key insight: combining simple primitives often creates the most elegant solutions to complex problems.
Now go make something fast! 🚀