How ChatGPT and I Built an LRU Cache That Sped Up My App 10x

Last month, I was building a data visualization dashboard for my academic mini project when I hit a wall. Every time users interacted with the charts, my app was making the same expensive API calls over and over again. Response times were creeping beyond 3 seconds, and my professor was giving me that look – you know, the one that silently judges your life choices.
WARNING
This post may contain some slightly exaggerated conversations with ChatGPT. Things are a bit more dramatic when you're sleep-deprived and frantically coding at 2 AM. But hey, that's the magic of storytelling, right? 😜
It was midnight, three days before the deadline, when I finally broke down and decided to ask ChatGPT for help. Little did I know this would lead to a bizarre 4-hour coding adventure with my new AI buddy.
The problem wasn't implementing any cache — it was figuring out what to do when the cache inevitably filled up. With limited memory on my deployment instance, I needed something smarter than a basic dictionary that would keep growing forever.

Table of Contents
- The Epic Facepalm Moment
- Why Your App Probably Needs Caching Too
- The Deceptive Challenge: Making It Fast
- Step-by-Step Implementation
- The Performance Secret Sauce
- Tech You Use Daily That Secretly Runs on LRU Caches
- Advanced LRU Variants (That Can Make You Sound Smart)
- What I Learned: Small Code, Big Impact
- The Final Metric That Made My Professor Smile
- Your Turn: Level Up This Code
The Epic Facepalm Moment
After some time Googling "best caching algorithms," I found my answer: the LRU cache — the unsung hero behind pretty much every fast application we use daily. But when I shared my discovery with ChatGPT, it had the digital equivalent of a facepalm moment.
What's wild is how simple yet brilliant the concept is:
Just kick out whatever hasn't been used in the longest time.
That's it. That's the whole magic trick. And suddenly my dashboard went from turtle-speed to cheetah-mode.
Why Your App Probably Needs Caching Too
We all take caching for granted, but it's low-key the reason the internet doesn't feel like dial-up in 2025:
- Your browser doesn't re-download images when you hit the back button
- Instagram doesn't fetch the same photos repeatedly as you scroll
- Spotify doesn't stream the same song chunks multiple times
- Redis, Memcached, and every modern database rely on it
# Example: a Fibonacci function, timed with and without caching (using timeit)
import timeit

# Without caching
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(timeit.timeit('fibonacci(10)', globals=globals()))  # 12.990135460997408

# With caching
def fibonacci_cached(n, cache={}):
    if n in cache:
        return cache[n]
    if n <= 1:
        return n
    cache[n] = fibonacci_cached(n-1, cache) + fibonacci_cached(n-2, cache)
    return cache[n]

print(timeit.timeit('fibonacci_cached(10)', globals=globals()))  # 0.09349425700202119
The speed difference isn't subtle either. We're talking memory access (~100ns) vs. network requests (~100ms) — that's a million times faster!
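By the way, Python ships this exact pattern in the standard library: functools.lru_cache, a decorator that memoizes calls and caps the cache at a fixed maxsize, evicting the least recently used entries when it fills up. Here's a quick sketch of the same Fibonacci idea using it — my own variation, not part of the original benchmark:

from functools import lru_cache

@lru_cache(maxsize=128)  # Keep at most 128 results; evict the least recently used
def fibonacci_lru(n):
    if n <= 1:
        return n
    return fibonacci_lru(n - 1) + fibonacci_lru(n - 2)

print(fibonacci_lru(10))           # 55
print(fibonacci_lru.cache_info())  # CacheInfo(hits=8, misses=11, maxsize=128, currsize=11)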
But here's the real question that stumped me: When your cache fills up, what do you throw away?
Think about your kitchen. The pans and ingredients you used yesterday are probably sitting in the front of your cabinets, while that specialty cake mold you last used three years ago is collecting dust in the back. If you need to make space, which would you move to storage first?
The LRU cache operates on the same intuition. It keeps track of the order in which items were accessed and, when space is needed, it discards the least recently used items first.
This approach works remarkably well for many real-world access patterns because of a property called temporal locality — if a piece of data was accessed recently, it's likely to be accessed again soon.
The Deceptive Challenge: Making It Fast
Here's where I got tripped up. The concept is simple, but making it fast is tricky. We need three operations to be lightning quick:
- Looking up data — O(1) time
- Marking something as "recently used" — O(1) time
- Finding and removing the oldest item — O(1) time
My first thought was a Python list to track order. Total disaster! Every time you access an item, you have to shuffle the entire list (an O(n) operation). With just 1,000 items, it was already painfully slow.
The elegant hack? Combine two data structures:
- A hash map (dictionary) for instant lookups
- A doubly linked list to track and update access order
This combo is like peanut butter and jelly — separately good, together magic.
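Side note before we roll our own: Python's collections.OrderedDict already pairs a hash map with a doubly linked list under the hood, so you can get LRU behavior in a handful of lines. Here's a minimal sketch — not the implementation we'll build below, just proof that the combo is a known trick:

from collections import OrderedDict

class TinyLRU:
    """Minimal LRU cache sketch on top of OrderedDict (hash map + linked list)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)  # Mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # Evict the least recently used item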
Let's code this bad boy from scratch!
Step-by-Step Implementation
Step 1: Create the Building Block - The Node Class
First, let's create our Node class — the building block of our linked list:
class Node:
    def __init__(self, key, value):
        self.key = key      # Store the key so we can remove it from the dictionary later
        self.value = value  # The actual data we're caching
        self.prev = None    # Pointer to the previous node
        self.next = None    # Pointer to the next node
👆 Nothing fancy here. Just a basic node with pointers in both directions.
Step 2: Build the Main LRU Cache Class
Now for the main event — our LRU Cache class with all its magic:
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity  # How many items we can store max
        self.cache = {}           # Our dictionary for O(1) lookups
        # Quick hack: create dummy head and tail nodes
        # This eliminates edge cases when adding/removing nodes
        self.head = Node(0, 0)  # Dummy head
        self.tail = Node(0, 0)  # Dummy tail
        # Connect them
        self.head.next = self.tail
        self.tail.prev = self.head

    # Internal helper: add a new node right after HEAD (most recently used)
    def _add_node(self, node):
        # Always add the new node right after head (the most recently used spot)
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node

    # Internal helper: remove a node from anywhere in the list
    def _remove_node(self, node):
        prev = node.prev
        next_node = node.next
        # Bypass the node
        prev.next = next_node
        next_node.prev = prev

    # Internal helper: mark as recently used by moving to the front
    def _move_to_head(self, node):
        self._remove_node(node)  # Take it out
        self._add_node(node)     # Put it at the front

    # Internal helper: remove the least recently used node (just before TAIL)
    def _pop_tail(self):
        res = self.tail.prev  # The LRU item is right before tail
        self._remove_node(res)
        return res

    # Public API: get a value (and mark it as recently used)
    def get(self, key):
        # Cache miss
        if key not in self.cache:
            return -1
        # Cache hit
        node = self.cache[key]
        self._move_to_head(node)  # Mark as recently used
        return node.value

    # Public API: add or update a value
    def put(self, key, value):
        # If the key exists, update its value
        if key in self.cache:
            node = self.cache[key]
            node.value = value
            self._move_to_head(node)  # Mark as recently used
        else:
            # New key, need to add it
            new_node = Node(key, value)
            self.cache[key] = new_node  # Add to the dictionary
            self._add_node(new_node)    # Add to the linked list
            # Over capacity? Evict the least recently used
            if len(self.cache) > self.capacity:
                lru = self._pop_tail()
                del self.cache[lru.key]  # Remove from the dictionary too
The beauty here is that every operation is O(1) time complexity. No searching, no shifting arrays — just pointer manipulation and dictionary lookups.
Step 3: See It In Action
Let's take our cache for a spin and watch the eviction magic happen:
# Create a tiny cache that holds just 3 items
cache = LRUCache(3)
# Fill it up
cache.put("api_result1", {"user": "john", "score": 95})
cache.put("api_result2", {"user": "sarah", "score": 98})
cache.put("api_result3", {"user": "mike", "score": 91})
# Access the first item - this marks it as recently used
print(cache.get("api_result1")) # Output: {"user": "john", "score": 95}
# Add a 4th item - this should kick out api_result2 (least recently used)
cache.put("api_result4", {"user": "lisa", "score": 99})
# Verify that api_result2 got evicted
print(cache.get("api_result2")) # Output: -1 (not in cache)
# Check other items (and change their "recently used" order)
print(cache.get("api_result1")) # Still there!
print(cache.get("api_result3")) # Still there!
# Add a 5th item - should kick out api_result4 (now the LRU)
cache.put("api_result5", {"user": "alex", "score": 93})
# Verify that api_result4 got evicted
print(cache.get("api_result4")) # Output: -1 (not in cache)
🧠 Pro Tip: In my real project, ChatGPT suggested adding hit/miss tracking:
def get(self, key):
    if key not in self.cache:
        self.misses += 1  # Track cache misses
        return -1
    self.hits += 1  # Track cache hits
    # ...rest of the method...
    # (assumes self.hits = 0 and self.misses = 0 were added in __init__)
After implementing this, I discovered my cache had a 92% hit rate! That's why my app suddenly felt so much faster.
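If you'd rather not edit get() in place, one way to package the same idea is a thin wrapper subclass. This is a sketch of my own — the InstrumentedLRUCache name and the hit_rate() helper aren't part of the original code:

class InstrumentedLRUCache(LRUCache):
    def __init__(self, capacity):
        super().__init__(capacity)
        self.hits = 0    # Lookups served straight from the cache
        self.misses = 0  # Lookups that fell through to the slow path

    def get(self, key):
        value = super().get(key)
        # Note: -1 is the miss sentinel, so this sketch can't cache a literal -1
        if value == -1:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def hit_rate(self):
        """Fraction of get() calls served from the cache."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Usage after running a workload:
# print(f"Hit rate: {cache.hit_rate():.0%}")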
The Performance Secret Sauce
Here's why this implementation makes my app lightning fast:
Operation | Time Complexity | Why It's Fast |
---|---|---|
get(key) | O(1) | Dictionary lookup + pointer manipulation |
put(key, value) | O(1) | Same as get + possible O(1) eviction |
Memory Usage | O(capacity) | Strictly limited to capacity items |
My dashboard went from 3-second refreshes to near-instant responses. The professor's eyebrow went down, and my stress levels followed.
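If you want to sanity-check the O(1) claim yourself, here's a rough micro-benchmark sketch. The numbers will vary by machine; the point is that the per-operation cost stays roughly flat as the cache grows:

import timeit

def bench(capacity, ops=10_000):
    cache = LRUCache(capacity)
    for i in range(capacity):
        cache.put(i, i)  # Pre-fill to capacity
    # Time a mix of a hit and an update/evicting put
    return timeit.timeit(
        lambda: (cache.get(capacity // 2), cache.put(capacity + 1, 0)),
        number=ops,
    ) / ops

for cap in (1_000, 10_000, 100_000):
    print(f"capacity={cap:>7}: ~{bench(cap) * 1e6:.2f} µs per get+put")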
Tech You Use Daily That Secretly Runs on LRU Caches
Once you know about LRU caches, you start seeing them everywhere:
🌐 Chrome's Memory Management
Ever wonder how Chrome decides which website data to keep? It's using an LRU variant. When you hit the back button and the page loads instantly, thank this algorithm!
💾 Every Major Database System
MySQL, Postgres, MongoDB — they all use LRU caching for buffer pool management. Those lightning-fast queries on frequently accessed data? Yep, LRU magic.
⚡️ Redis (That Thing Powering Most Modern Apps)
Redis uses an approximated LRU — it samples a handful of keys rather than tracking exact access order — for its maxmemory-policy when configured with "volatile-lru" or "allkeys-lru". That's why it can handle millions of requests without breaking a sweat. (Memcached, another popular in-memory key-value store, also leans on LRU-style eviction.)
🚀 Cloudflare's Edge Network
Those millisecond response times from Cloudflare? They're using LRU-inspired algorithms to keep hot content at their edge locations.
💻 Your Computer's RAM Management
Your OS uses LRU variations to decide which memory pages to swap to disk when RAM gets full. The most mind-blowing part? This algorithm, created in the 1960s, is still the backbone of modern computing speed.
Advanced LRU Variants (That Can Make You Sound Smart)
After my project submission, I fell down the LRU rabbit hole and discovered some cool variants:
LRU-K: The "K" Stands for "Kinda Mind-Blowing"
Instead of just considering the most recent access, LRU-K looks at the K most recent accesses. Great for spotting access patterns:
# LRU-2 example (simplified concept)
class HistoryTracker:
    def __init__(self):
        self.access_history = {}  # Key → list of the last K access times

    def record_access(self, key, timestamp):
        if key not in self.access_history:
            self.access_history[key] = []
        history = self.access_history[key]
        history.append(timestamp)
        # Keep only the last 2 accesses
        if len(history) > 2:
            history.pop(0)

    def get_second_last_access(self, key):
        history = self.access_history.get(key, [])
        if len(history) < 2:
            return float('-inf')  # Not enough history
        return history[0]
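Here's a quick usage sketch showing how the tracker's history could drive an LRU-2 eviction decision. The key names and the min()-based rule are my own illustration, not part of the tracker above:

tracker = HistoryTracker()
tracker.record_access("home_feed", timestamp=1)
tracker.record_access("settings", timestamp=2)
tracker.record_access("home_feed", timestamp=3)

# LRU-2 evicts the key whose second-most-recent access is oldest
# (keys with fewer than 2 accesses report -inf, so they go first)
candidates = ["home_feed", "settings"]
victim = min(candidates, key=tracker.get_second_last_access)
print(victim)  # settings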
This helped Netflix predict which shows users might rewatch! 🍿
TLRU: For When Stuff Has an Expiration Date
Like milk in your fridge, some cached data needs to expire (Time-aware LRU or TLRU):
import time  # Needed for time.time() below

class TLRUCache(LRUCache):
    def put(self, key, value, ttl_seconds=None):
        expiry = time.time() + ttl_seconds if ttl_seconds else None
        # TLRUNode is assumed to be a Node subclass that also stores expiry_time
        node = TLRUNode(key, value, expiry_time=expiry)
        # ...rest of implementation...

    def get(self, key):
        if key in self.cache:
            node = self.cache[key]
            # Check whether the entry has expired
            if node.expiry_time and time.time() > node.expiry_time:
                self._remove_node(node)
                del self.cache[key]
                return -1
            # Not expired, proceed normally
            self._move_to_head(node)
            return node.value
        return -1
Two-Queue: The VIP Section Approach
Some items deserve special treatment. The Two-Queue algorithm creates a "VIP area" for frequently accessed items:
class TwoQCache:
    def __init__(self, capacity):
        # 20% for new items (probation), 80% for frequently accessed (protected)
        self.probation = LRUCache(capacity // 5)
        self.protected = LRUCache(capacity - (capacity // 5))

    def put(self, key, value):
        # New items land in the probation segment first
        self.probation.put(key, value)

    def get(self, key):
        # Check the VIP (protected) section first
        value = self.protected.get(key)
        if value != -1:
            return value
        # Then check the new-arrivals (probation) section
        value = self.probation.get(key)
        if value != -1:
            # Promote to the VIP section on the second access
            # (assumes LRUCache also exposes a remove(key) helper,
            # which the basic version above doesn't have yet)
            self.probation.remove(key)
            self.protected.put(key, value)
            return value
        return -1  # Not found in either segment
This kind of two-tier caching (Two-Queue, or the closely related Segmented LRU) is part of how services like Instagram keep feeds snappy despite billions of photos!
What I Learned: Small Code, Big Impact
The most profound lesson from my LRU cache adventure wasn't just the algorithm itself, but a broader realization:
Sometimes the most elegant solutions come from combining simple tools in clever ways.
This is why I'm now obsessed with foundational data structures. My approach to problem-solving has completely changed:
- Break down the requirements — What operations need to be fast?
- Mix and match basic data structures — Can I combine a hashmap with something else?
- Measure everything — my cache hit rate jumped from 65% to 92% after tweaking capacity (see the measurement sketch below)
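Here's roughly what that measurement step can look like: a sketch that replays a synthetic key sequence (with built-in temporal locality) against the LRUCache at a few capacities. The workload and numbers are illustrative, not my project's real traffic:

import random

def measure_hit_rate(capacity, keys):
    cache, hits = LRUCache(capacity), 0
    for k in keys:
        if cache.get(k) != -1:
            hits += 1
        else:
            cache.put(k, f"value-for-{k}")  # Simulate fetching and caching on a miss
    return hits / len(keys)

# Fake workload with temporal locality: each request picks from the ~50 most recent keys
random.seed(42)
keys = [random.choice(range(max(0, i - 50), i + 1)) for i in range(5_000)]

for cap in (16, 64, 256):
    print(f"capacity={cap:>3}: hit rate {measure_hit_rate(cap, keys):.0%}")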
For a production environment, I'd add:
- Thread safety (locks or concurrent collections) — see the sketch after this list
- Statistics tracking (hit/miss rates)
- Size-aware caching (based on memory footprint)
- Smarter eviction policies
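On the thread-safety point, the simplest option is a coarse lock around every operation. This is a sketch under that assumption, not battle-tested production code:

import threading

class ThreadSafeLRUCache(LRUCache):
    """Coarse-grained locking: one lock guards both the dict and the linked list."""
    def __init__(self, capacity):
        super().__init__(capacity)
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return super().get(key)

    def put(self, key, value):
        with self._lock:
            super().put(key, value)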
But even the simple version was enough to save my project.
The Final Metric That Made My Professor Smile
The numbers don't lie. After implementing the LRU cache:
Metric | Before | After | Improvement |
---|---|---|---|
Avg API Response | 2,840ms | 280ms | 10x faster |
95th Percentile | 4,120ms | 310ms | 13x faster |
Server Load | 82% | 24% | 70% less |
My professor's exact words: "This is production quality work."
Not bad for something that took about 100 lines of code!
Your Turn: Level Up This Code
Before you go, try one of these challenges on the basic LRU cache we built:
Challenge 1: Add Cache Debugging
def debug_print_cache_state(self):
    """Print the current state of the cache from most to least recently used"""
    items = []
    current = self.head.next
    while current != self.tail:
        items.append(f"{current.key}: {current.value}")
        current = current.next
    print(f"Cache ({len(self.cache)}/{self.capacity}):", " → ".join(items))
Challenge 2: Implement TTL Expiration
def put_with_ttl(self, key, value, ttl_seconds):
    """Add a key with time-to-live in seconds"""
    # Hint: Store the expiration timestamp with the value
    # and check it during get operations
    expiry = time.time() + ttl_seconds
    # Your implementation here...
Challenge 3: Build an Analytics Dashboard
Track these metrics in your cache and display them:
- Hit/miss ratio
- Average lookup time
- Most frequently accessed keys
- Longest-held cache entries
Drop a comment with your solution or show me how you're using LRU caching in your projects! I'm @the_screenager on X.
Remember the key insight: combining simple primitives often creates the most elegant solutions to complex problems.
Now go make something fast! 🚀