Redis Changed How I Think About Databases
Redis is not just a cache. Sorted sets, streams, pub/sub, and HyperLogLog changed how I architect everything.
I thought Redis was a cache. You set a key, you get a key, it lives in memory, it's fast. That was my entire mental model for about three years. Then I needed a real-time leaderboard and wrote this:
```python
import redis

r = redis.Redis(host='localhost', port=6379)
r.zadd('leaderboard', {'alice': 4500, 'bob': 3200, 'charlie': 4100})
top_players = r.zrevrange('leaderboard', 0, 9, withscores=True)
print(top_players)
```
One line to add three scores. One line to get the top 10, already sorted, in under a millisecond. No ORDER BY. No index tuning. No query planner. That was the moment I realized Redis isn't a cache that happens to have data structures. It's a data structure server that happens to be useful as a cache.
That shift in thinking changed how I architect almost everything now.
Redis has been around since 2009 and it's still growing. In the Stack Overflow 2025 Developer Survey, Redis hit 28% usage among professional developers, growing 8% year over year (for comparison, Docker sits at 71.1%). That's not a technology coasting on legacy adoption. That's active, accelerating growth.
The business side is even more telling. Redis passed $300 million in annualized recurring revenue with 12,000 paying customers, including a third of the Fortune 100. When a third of the largest companies on the planet pay for something they could technically run for free, the product is doing something right.
And here's a newer stat that surprised me: 43% of developers building AI agents chose Redis for memory and data storage. Redis is showing up in AI agent architectures as the default choice for fast state management. That's a use case nobody predicted five years ago.
The reason Redis rewired my brain is that it maps directly to programming concepts I already understood. Instead of modeling everything as rows and columns and then writing SQL to transform them, Redis lets you work with the data structure that fits the problem.
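To make that concrete with a structure this post doesn't otherwise cover: plain sets. "Which users liked both post A and post B" is a set intersection, and SINTER computes it server-side in one command. A minimal sketch; the `FakeRedis` stub is something I wrote so the snippet runs without a server, and with redis-py you'd call the same `sadd`/`sinter` methods on a real connection:

```python
# Model "likes" as sets: one set per post, members are user IDs.
# SINTER then answers "who liked both?" server-side: no JOIN, no GROUP BY.

class FakeRedis:
    """Tiny in-memory stand-in for the two commands used (not a real client)."""
    def __init__(self):
        self.data = {}

    def sadd(self, key, *members):
        self.data.setdefault(key, set()).update(members)

    def sinter(self, *keys):
        sets = [self.data.get(k, set()) for k in keys]
        return set.intersection(*sets) if sets else set()

r = FakeRedis()  # with redis-py: r = redis.Redis(decode_responses=True)
r.sadd("likes:post:1", "alice", "bob", "carol")
r.sadd("likes:post:2", "bob", "carol", "dave")

both = r.sinter("likes:post:1", "likes:post:2")
print(sorted(both))  # ['bob', 'carol']
```

The point isn't the stub; it's that the problem statement ("users in both sets") and the command (`SINTER`) are the same shape.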
This is the one that got me. A sorted set is a collection of unique members, each with a floating-point score. Redis keeps them sorted by score automatically. Insertions, removals, and range queries are all O(log N).
Sorted sets are perfect for anything that involves ranking: leaderboards, priority queues, rate limiters using sliding windows, even matchmaking systems in games. Any time you'd reach for ORDER BY score DESC LIMIT 10 in SQL, a sorted set does it faster and with zero query planning overhead.
```python
# Sliding window rate limiter with sorted sets
import time

import redis

r = redis.Redis(host='localhost', port=6379)

def is_rate_limited(user_id: str, max_requests: int = 100, window_seconds: int = 60) -> bool:
    key = f"ratelimit:{user_id}"
    now = time.time()
    window_start = now - window_seconds

    pipe = r.pipeline()
    pipe.zremrangebyscore(key, 0, window_start)  # Remove old entries
    pipe.zadd(key, {f"{now}": now})              # Add current request
    pipe.zcard(key)                              # Count requests in window
    pipe.expire(key, window_seconds)             # Auto-cleanup
    results = pipe.execute()

    request_count = results[2]
    return request_count > max_requests
```
That's a production-grade sliding window rate limiter in about 15 lines. The pipeline sends all four commands in a single round trip. The sorted set automatically keeps requests ordered by timestamp. Old entries get pruned on every call. Try building this with the same performance characteristics in PostgreSQL. You can, but it's going to involve a lot more code and a lot more latency.
Redis Streams are append-only log structures with consumer groups. Think Kafka, but running on a single Redis instance with zero configuration.
Streams support consumer groups, which means multiple consumers can read from the same stream and Redis tracks what each consumer has processed. Failed messages can be claimed by other consumers. It's event sourcing without the ceremony.
```typescript
import Redis from 'ioredis'

const redis = new Redis()

// Producer: add events to a stream
async function publishEvent(stream: string, data: Record<string, string>) {
  const id = await redis.xadd(stream, '*', ...Object.entries(data).flat())
  console.log(`Published event ${id}`)
}

// Consumer: read new events
async function consumeEvents(stream: string, group: string, consumer: string) {
  // Create consumer group (ignore if exists)
  try {
    await redis.xgroup('CREATE', stream, group, '0', 'MKSTREAM')
  } catch (e) {
    // Group already exists, that's fine
  }

  while (true) {
    const results = await redis.xreadgroup(
      'GROUP', group, consumer,
      'COUNT', 10,
      'BLOCK', 5000,
      'STREAMS', stream, '>'
    )
    if (results) {
      for (const [, messages] of results) {
        for (const [id, fields] of messages) {
          console.log(`Processing ${id}:`, fields)
          await redis.xack(stream, group, id)
        }
      }
    }
  }
}
```
I've used this pattern for order processing pipelines. An API endpoint publishes an event to a stream. Three different consumer groups pick it up: one sends a confirmation email, one updates inventory, one fires analytics. Each consumer processes at its own pace. If the email service crashes, the other two keep going, and the email consumer picks up where it left off when it restarts.
Is it Kafka? No. Kafka handles millions of messages per second across distributed clusters. But for most applications doing thousands of events per second, Redis Streams are simpler to run, simpler to debug, and simpler to deploy.
Redis Pub/Sub is fire-and-forget messaging. A publisher sends a message to a channel. Every subscriber listening to that channel gets it immediately. No persistence. No acknowledgments. No replays.
That sounds limiting, and it is, intentionally. Pub/Sub is built for real-time notifications where losing a message isn't catastrophic: chat presence indicators, live scoreboards, cache invalidation across servers, WebSocket fan-out.
```python
# Publisher
import redis

r = redis.Redis(host='localhost', port=6379)
r.publish('notifications', 'user:42:logged_in')
```

```python
# Subscriber
import redis

r = redis.Redis(host='localhost', port=6379)
pubsub = r.pubsub()
pubsub.subscribe('notifications')

for message in pubsub.listen():
    if message['type'] == 'message':
        print(f"Received: {message['data'].decode()}")
```
I use this for cross-server cache invalidation. When one server updates a cached record, it publishes a message. Every other server receives it and drops its local copy. Simple, fast, and it works across any number of servers.
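The handler logic for that invalidation listener is small enough to sketch. This is an illustrative version, not lifted from a real codebase: the function is plain Python so it runs standalone, and the comment shows how it would plug into a redis-py pub/sub loop like the subscriber above.

```python
# Cross-server cache invalidation, sketched: each app server keeps a local
# in-process cache and subscribes to an "invalidate" channel. When a message
# arrives, drop that key from the local copy.

local_cache = {"user:42:profile": {"name": "Alice"}}

def handle_invalidation(message, cache):
    """Drop the named key from this server's local cache. Returns the evicted
    value, or None if there was nothing to evict (or the message wasn't a
    real publish)."""
    if message.get("type") != "message":
        return None  # ignore subscribe/unsubscribe confirmations
    key = message["data"]
    return cache.pop(key, None)

# With redis-py this plugs into the subscriber loop roughly as:
#   for msg in pubsub.listen():
#       handle_invalidation({"type": msg["type"], "data": msg["data"].decode()},
#                           local_cache)

evicted = handle_invalidation({"type": "message", "data": "user:42:profile"},
                              local_cache)
print(evicted)      # {'name': 'Alice'}
print(local_cache)  # {}
```

Filtering on `type == "message"` matters: the first thing `listen()` yields after `subscribe()` is a confirmation, not a publish.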
This one is niche but beautiful. HyperLogLog is a probabilistic data structure that counts unique items using a fixed 12 KB of memory, no matter how many items you count. The tradeoff is a standard error of about 0.81%.
```python
import redis

r = redis.Redis(host='localhost', port=6379)

# Track unique visitors
r.pfadd('unique_visitors:2025-07-07', 'user:1', 'user:2', 'user:3')
r.pfadd('unique_visitors:2025-07-07', 'user:1', 'user:4')  # user:1 is duplicate

count = r.pfcount('unique_visitors:2025-07-07')
print(f"Unique visitors: {count}")  # ~4
```
You could count 100 million unique visitors and it would still use 12 KB. That's it. Try doing SELECT COUNT(DISTINCT user_id) on a table with 100 million rows and see how your database feels about it.
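If you're curious why the memory stays fixed, the core trick fits in a short sketch. This toy version is mine, not Redis's implementation (Redis uses 14 bits of precision, a sparse encoding, and extra bias corrections), but it shows the idea: hash each item, let the first bits of the hash pick one of a fixed array of registers, record the longest run of leading zeros seen in the rest, and estimate cardinality from the registers.

```python
import hashlib
import math

class ToyHLL:
    """Toy HyperLogLog. Illustration only: Redis's real implementation uses
    p=14 (16384 registers, ~12 KB) plus sparse encoding and bias correction."""

    def __init__(self, p: int = 10):
        self.p = p            # precision: 2^p registers
        self.m = 1 << p
        self.registers = [0] * self.m  # memory is fixed up front

    def add(self, item: str) -> None:
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                       # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1   # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -reg for reg in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: linear counting on empty registers
            return int(self.m * math.log(self.m / zeros))
        return int(raw)

hll = ToyHLL()
for i in range(10_000):
    hll.add(f"user:{i}")
estimate = hll.count()
# The 1024 registers never grow; the estimate lands within a few percent
# of the true 10,000, and re-adding items changes nothing.
```

Duplicates are free by construction: adding an item you've seen before can only re-set a register to a value it already holds, which is why PFADD of `user:1` twice counted once above.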
Let me walk through the patterns I actually use in production. Not hello-world examples. Real patterns.
Yes, caching. But done right, with a proper cache-aside pattern and TTL:
```python
import redis
import json

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def get_user_profile(user_id: int) -> dict:
    cache_key = f"user:{user_id}:profile"

    # Check cache first
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)

    # Cache miss: fetch from database
    profile = fetch_from_database(user_id)  # Your DB query here

    # Store in cache with 5-minute TTL
    r.setex(cache_key, 300, json.dumps(profile))
    return profile

def invalidate_user_cache(user_id: int):
    r.delete(f"user:{user_id}:profile")
```
The thing most tutorials skip: cache invalidation. You need invalidate_user_cache called everywhere the user profile changes. Miss one spot and users see stale data. This is the "two hard things in computer science" problem and Redis doesn't solve it for you. You still have to think about it.
Session storage is where Redis replaces something clunky (server-side files, database rows) with something clean:
```typescript
import Redis from 'ioredis'
import { v4 as uuidv4 } from 'uuid'

const redis = new Redis()

interface Session {
  userId: string
  email: string
  role: string
  createdAt: string
}

async function createSession(userId: string, email: string, role: string): Promise<string> {
  const sessionId = uuidv4()
  const session: Session = {
    userId,
    email,
    role,
    createdAt: new Date().toISOString(),
  }
  // Store session with 24-hour TTL
  await redis.setex(`session:${sessionId}`, 86400, JSON.stringify(session))
  return sessionId
}

async function getSession(sessionId: string): Promise<Session | null> {
  const data = await redis.get(`session:${sessionId}`)
  return data ? JSON.parse(data) : null
}

async function destroySession(sessionId: string): Promise<void> {
  await redis.del(`session:${sessionId}`)
}
```
Why Redis and not your database? Two reasons. First, session lookups happen on every single request. That's a lot of reads. Redis handles hundreds of thousands of reads per second without breaking a sweat. Your PostgreSQL connection pool might disagree. Second, sessions are ephemeral. They expire. Redis has built-in TTL. You don't need a cron job to clean up expired sessions.
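One more thing built-in TTLs buy you, sketched below: sliding sessions, where each successful read pushes the expiry forward, so active users stay signed in while idle sessions still lapse. The `StubClient` is a stand-in I added so the logic runs without a server; with redis-py the same `get`/`expire` calls apply on a real connection.

```python
import json

SESSION_TTL = 86400  # 24 hours, matching the setex above

def get_session_sliding(client, session_id: str):
    """Read a session and refresh its TTL, so the window slides on activity."""
    raw = client.get(f"session:{session_id}")
    if raw is None:
        return None  # expired or never existed
    client.expire(f"session:{session_id}", SESSION_TTL)  # push expiry forward
    return json.loads(raw)

class StubClient:
    """Minimal in-memory stand-in for GET/SETEX/EXPIRE (not a real client)."""
    def __init__(self):
        self.data, self.ttls = {}, {}

    def setex(self, key, ttl, value):
        self.data[key], self.ttls[key] = value, ttl

    def get(self, key):
        return self.data.get(key)

    def expire(self, key, ttl):
        self.ttls[key] = ttl

client = StubClient()
client.setex("session:abc", 600, json.dumps({"userId": "42"}))
session = get_session_sliding(client, "abc")
print(session)                     # {'userId': '42'}
print(client.ttls["session:abc"])  # 86400: TTL reset to the full day
```

Doing this in a database means an `UPDATE ... SET expires_at` on every request; in Redis it's one `EXPIRE`.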
BullMQ is a Node.js job queue built entirely on Redis. Sidekiq does the same thing for Ruby. Both are battle-tested in production at scale.
```typescript
import { Queue, Worker } from 'bullmq'
import Redis from 'ioredis'

const connection = new Redis({ maxRetriesPerRequest: null })

// Create a queue
const emailQueue = new Queue('email', { connection })

// Add a job
async function sendWelcomeEmail(userId: string, email: string) {
  await emailQueue.add('welcome', {
    userId,
    email,
    template: 'welcome',
  }, {
    attempts: 3,
    backoff: { type: 'exponential', delay: 2000 },
  })
}

// Process jobs
const worker = new Worker('email', async (job) => {
  const { email, template } = job.data
  console.log(`Sending ${template} email to ${email}`)
  // Actually send the email here
  await sendEmail(email, template)
}, { connection, concurrency: 5 })

worker.on('completed', (job) => {
  console.log(`Job ${job.id} completed`)
})

worker.on('failed', (job, err) => {
  console.log(`Job ${job?.id} failed: ${err.message}`)
})
```
BullMQ gives you retries with exponential backoff, concurrency control, job priorities, rate limiting, and a dashboard (Bull Board). All backed by Redis. I've run this in production handling 50,000 jobs per hour on a single Redis instance. It's boring. Nothing breaks. That's the highest compliment I can give infrastructure.
Here's a more complete leaderboard than the three-liner I opened with:
```python
import redis
from typing import Optional

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

class Leaderboard:
    def __init__(self, name: str):
        self.key = f"leaderboard:{name}"

    def update_score(self, player: str, score: float):
        """Set or update a player's score"""
        r.zadd(self.key, {player: score})

    def increment_score(self, player: str, amount: float = 1):
        """Add to a player's existing score"""
        r.zincrby(self.key, amount, player)

    def get_rank(self, player: str) -> Optional[int]:
        """Get player's rank (1-indexed: highest score = rank 1)"""
        rank = r.zrevrank(self.key, player)
        return rank + 1 if rank is not None else None

    def get_top(self, count: int = 10) -> list:
        """Get top N players with scores"""
        return r.zrevrange(self.key, 0, count - 1, withscores=True)

    def get_around(self, player: str, count: int = 5) -> list:
        """Get players around a specific player"""
        rank = r.zrevrank(self.key, player)
        if rank is None:
            return []
        start = max(0, rank - count)
        end = rank + count
        return r.zrevrange(self.key, start, end, withscores=True)

    def total_players(self) -> int:
        return r.zcard(self.key)

# Usage
lb = Leaderboard('weekly')
lb.update_score('alice', 4500)
lb.update_score('bob', 3200)
lb.increment_score('bob', 150)  # Bob now has 3350

print(f"Bob's rank: {lb.get_rank('bob')}")
print(f"Top 10: {lb.get_top(10)}")
print(f"Around Bob: {lb.get_around('bob', 3)}")
print(f"Total players: {lb.total_players()}")
```
Every operation here is O(log N). With a million players, getting the top 10 takes the same sub-millisecond time as with 100 players. Building this with SQL would mean an indexed ORDER BY query, which is fast but not this fast, and gets slower as the table grows.
This comparison matters because people still ask "why not just use Memcached?" and because the Valkey fork has complicated the decision.
| Feature | Redis | Memcached | Valkey |
|---|---|---|---|
| Data Structures | Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLog | Strings only | Same as Redis |
| Persistence | RDB snapshots + AOF | None | RDB + AOF |
| Replication | Built-in primary/replica | None (use mcrouter) | Built-in |
| Threading | Single-threaded event loop | Multi-threaded | Multi-threaded (new) |
| Pub/Sub | Yes | No | Yes |
| Streams | Yes | No | Yes |
| Lua Scripting | Yes | No | Yes |
| License | SSPL/RSAL + AGPLv3 option | BSD | BSD (3-clause) |
| Performance | 1.5x faster than Valkey | Fast for simple gets/sets | Close to Redis |
| Cloud Default | Azure Cache | AWS ElastiCache (legacy) | AWS ElastiCache (default) |
| Usage (2025) | 28% | Declining | 2.4% |
Memcached is multi-threaded but feature-limited. It's a pure key-value cache, and it does that well. If all you need is get and set for strings, Memcached will happily use all your CPU cores for that. But the moment you need sorted sets, streams, pub/sub, or persistence, you're reaching for Redis or Valkey anyway.
Redis uses a single-threaded event loop, which sounds like a limitation but is actually a feature. No locks. No race conditions. No mutex contention. Every command executes atomically. When you run that rate limiter pipeline from earlier, you know the four commands execute in sequence without another request sneaking in between them.
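You can see the race that atomicity prevents in a few lines. This is a pure-Python illustration of the interleaving, not Redis code: two clients that each read a counter before either writes it back will lose an update, which is exactly what cannot happen to a server-side INCR, because the whole read-and-write runs as one command inside the event loop.

```python
# Simulated lost update: both "clients" GET the counter, then both SET it.
store = {"hits": 0}

def interleaved_read_modify_write() -> int:
    a = store["hits"]      # client A reads 0
    b = store["hits"]      # client B also reads 0, before A writes
    store["hits"] = a + 1  # A writes 1
    store["hits"] = b + 1  # B overwrites with 1: A's increment is lost
    return store["hits"]

def atomic_incr(key: str) -> int:
    # Stand-in for Redis INCR: one indivisible read-and-write. On the real
    # server, no other client's command can run between the read and write.
    store[key] = store.get(key, 0) + 1
    return store[key]

lost = interleaved_read_modify_write()  # 1, not 2
store["hits"] = 0
atomic_incr("hits")
safe = atomic_incr("hits")              # 2, as expected
```

This is also why the article leans on pipelines plus single commands rather than client-side read-modify-write: push the mutation to the server and the race disappears.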
In March 2024, Redis (the company, formerly Redis Labs) switched the Redis license from BSD to dual SSPL/RSAL. BSD is about as permissive as open-source licenses get. SSPL and RSAL are... not. The short version: cloud providers can no longer offer Redis-as-a-service without either paying Redis or open-sourcing their entire stack.
The community did not take this well.
The community forked Redis as Valkey, a Linux Foundation project under a BSD license, backed by AWS, Google, and Oracle. Within a year, Redis lost most of its external contributors. AWS switched ElastiCache to use Valkey as the default engine. 83% of large companies reportedly adopted or began testing Valkey.
Redis clearly felt the pressure. The original creator, Salvatore Sanfilippo, returned in November 2024. Then in May 2025, Redis 8.0 added AGPLv3 as a licensing option, an OSI-approved open-source license. Not as permissive as BSD, but open source in a way SSPL never was.
So what does this mean for you?
If you self-host or use Redis Cloud directly: the license change probably doesn't affect you. RSAL allows internal use. You can run Redis in your own infrastructure, build products on top of it, whatever. The restrictions only hit you if you try to sell Redis itself as a managed service.
If you use AWS, GCP, or Azure: your cloud provider made the choice for you. AWS uses Valkey now. Azure still uses Redis. GCP offers both.
If you're picking for a new project: honestly, both are fine. Redis is still about 1.5x faster than Valkey in benchmarks, but Valkey 8.1 is production-ready and narrowing the gap. Valkey has the BSD license and the cloud providers behind it. Redis has the faster engine and the commercial support.
Here's how I think about this choice now:
Choose Redis if:
- Raw engine performance matters and paid commercial support is a plus
- You're on Azure, where Redis is the managed default
- Your team already knows it

Choose Valkey if:
- A permissive BSD license matters to you or your legal team
- You're on AWS, where Valkey is now the default ElastiCache engine

Choose Memcached if:
- All you need is get and set for strings, and you want every CPU core doing it
For most web applications, the Redis-vs-Valkey choice comes down to which cloud you're on. If AWS, you're getting Valkey whether you ask for it or not. If Azure, you're getting Redis. If self-hosting, pick the one your team knows.
Here are the Redis patterns that show up in almost every project I build:
Cache-aside with stampede protection. When a hot cache key expires, every request hits the database simultaneously. I use Redis's SET NX (set if not exists) to acquire a lock. One request rebuilds the cache. Everyone else gets stale data or waits.
```python
import redis
import json
import time

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def get_with_lock(key: str, ttl: int, fetch_fn):
    cached = r.get(key)
    if cached:
        return json.loads(cached)

    lock_key = f"{key}:lock"
    if r.set(lock_key, "1", nx=True, ex=5):
        # We got the lock, rebuild cache
        data = fetch_fn()
        r.setex(key, ttl, json.dumps(data))
        r.delete(lock_key)
        return data
    else:
        # Someone else is rebuilding, wait briefly
        time.sleep(0.1)
        cached = r.get(key)
        return json.loads(cached) if cached else fetch_fn()
```
Distributed rate limiting across servers. The sorted set rate limiter from earlier works across any number of application servers because they all talk to the same Redis. No sticky sessions. No shared-nothing complexity. One Redis, one source of truth.
Feature flags. Store them in a Redis hash. Read is sub-millisecond. Update propagates to all servers instantly. No deploy needed.
```python
# Set a feature flag
r.hset('features', 'new_checkout', '1')
r.hset('features', 'dark_mode', '0')

# Check a feature flag (on every request, it's fast enough)
is_enabled = r.hget('features', 'new_checkout') == '1'
```
Here's my honest opinion after running Redis in production for several years: Redis is underused. Most teams treat it like a simple cache and miss 80% of what it can do.
Every time I see someone spin up a separate service for job queues, or add Kafka for a system that processes 500 events per second, or build a rate limiter using database rows, I think: Redis does all of that. Already. With fewer moving parts.
The sorted set is the most underrated data structure in all of web development. The number of times I've seen teams build complex ranking systems with SQL queries, materialized views, and background jobs, when ZADD and ZREVRANGE would have done the job in two commands, is honestly painful.
That said, I'm not saying "use Redis for everything." Don't store your primary data in Redis. Don't use it as your only database. It's memory-first, which means your data set needs to fit in RAM (or at least the hot portion does). Use PostgreSQL or MySQL for your source of truth. Use Redis for the speed layer on top.
The license situation is messy but stabilizing. The AGPLv3 addition in Redis 8.0 was the right move, even if it came too late to prevent the fork. In practice, if you're not building a competing cloud service, neither license restricts you. And if you are on AWS, Valkey at 2.4% adoption is going to grow fast now that it's the default in ElastiCache.
My actual recommendation: learn the data structures. That's the thing that transfers regardless of whether you end up on Redis or Valkey. Sorted sets, streams, pub/sub, HyperLogLog, these are the building blocks. Once you internalize them, you start seeing problems differently. You stop thinking "how do I query this from my database?" and start thinking "what data structure fits this problem?"
That mental shift is what Redis actually gave me. Not just faster reads. A different way of thinking about data.