System Design | 15 min read | Jan 12

Design URL Shortener (TinyURL) - Complete System Design Interview Guide

Sanjay Krishnamurthy, Ex-Netflix Principal Engineer

📌The Interview Question

"Design a URL shortening service like TinyURL or bit.ly"

This is one of the most common system design questions asked at Google, Amazon, Microsoft, and top startups. Here's exactly how to tackle it step-by-step.

📌Why This Question is Asked

Interviewers love this problem because it tests:

  • Your ability to design scalable distributed systems
  • Database design and sharding knowledge
  • Caching strategies
  • Understanding of trade-offs
  • Capacity estimation skills
    📌Step 1: Clarify Requirements (5 mins)

    Always ask these questions before jumping to design:

    Functional Requirements

  • Generate short URL from long URL
  • Redirect short URL to original URL
  • Custom short links (optional) - e.g., tiny.url/my-brand
  • Link expiration (optional)
  • Analytics (click count, geographic data)
  • User accounts (optional)
    Non-Functional Requirements

  • High availability: 99.99% uptime (critical for redirects)
  • Low latency: < 100ms for redirect (users hate slow redirects)
  • Durability: Once created, URL should never be lost
  • Security: Short URLs should be unpredictable (no sequential IDs)
    Back-of-envelope Estimation

    Let's estimate the scale:

  • Write volume: 100M new URLs/month ÷ ~2.6M seconds/month ≈ 40 URLs/second
  • Read volume: 10:1 read-to-write ratio = 400 redirects/second
  • Peak traffic: 5x average = 2000 redirects/second
  • URL count: 100M/month × 12 months × 5 years = 6 billion URLs
  • Storage: 6 billion URLs × ~500 bytes each = 3TB total storage
    Interview Tip: Always show your math. It demonstrates structured thinking.
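
The estimates above can be reproduced in a few lines. This is a minimal sketch, assuming a 30-day month and decimal terabytes; the class and constant names are illustrative and not from any library.

```java
final class CapacityEstimate {
    // Inputs taken from the estimates above
    static final long NEW_URLS_PER_MONTH = 100_000_000L;
    static final long SECONDS_PER_MONTH = 30L * 24 * 60 * 60; // assumes a 30-day month
    static final int READS_PER_WRITE = 10;
    static final int PEAK_FACTOR = 5;
    static final int RETENTION_YEARS = 5;
    static final long BYTES_PER_URL = 500;

    static double writesPerSecond() {
        return (double) NEW_URLS_PER_MONTH / SECONDS_PER_MONTH;
    }

    static double readsPerSecond() {
        return writesPerSecond() * READS_PER_WRITE;
    }

    static double peakReadsPerSecond() {
        return readsPerSecond() * PEAK_FACTOR;
    }

    static long totalUrls() {
        return NEW_URLS_PER_MONTH * 12 * RETENTION_YEARS;
    }

    static double storageTerabytes() {
        return totalUrls() * BYTES_PER_URL / 1e12; // decimal TB
    }

    public static void main(String[] args) {
        System.out.printf("writes/s:     %.1f%n", writesPerSecond());
        System.out.printf("reads/s:      %.1f%n", readsPerSecond());
        System.out.printf("peak reads/s: %.1f%n", peakReadsPerSecond());
        System.out.printf("total URLs:   %d%n", totalUrls());
        System.out.printf("storage (TB): %.1f%n", storageTerabytes());
    }
}
```

The exact write rate comes out near 38.6 URLs/second, which the estimate rounds up to 40; in an interview, rounding to convenient numbers like this is usually fine as long as you say so.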

    📌Step 2: API Design

    // Create short URL
    POST /api/v1/shorten
    Request:
    {
      "long_url": "https://example.com/very-long-path/with/many/segments",
      "custom_alias": "my-link",   // optional
      "expiry_date": "2025-12-31"  // optional
    }
    Response:
    {
      "short_url": "https://tiny.url/abc123",
      "expires_at": "2025-12-31T00:00:00Z",
      "created_at": "2025-01-12T10:30:00Z"
    }

    // Redirect (most important API)
    GET /{short_code}
    Response: HTTP 301/302 Redirect to original URL
    Headers: Location: https://example.com/very-long-path/with/many/segments

    // Get Analytics
    GET /api/v1/stats/{short_code}
    Response:
    {
      "total_clicks": 15000,
      "unique_visitors": 8500,
      "clicks_by_country": {"IN": 5000, "US": 3000, ...},
      "clicks_by_day": [{"date": "2025-01-01", "count": 500}, ...]
    }

    // Delete URL (for user-owned URLs)
    DELETE /api/v1/url/{short_code}

    📌Step 3: Database Schema Design

    Primary URL Table

    CREATE TABLE urls (
        id BIGINT PRIMARY KEY AUTO_INCREMENT,
        -- Binary collation keeps 'abc' and 'ABC' distinct; MySQL's default collations are case-insensitive
        short_code VARCHAR(7) CHARACTER SET ascii COLLATE ascii_bin NOT NULL UNIQUE,
        long_url TEXT NOT NULL,
        user_id BIGINT,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        expires_at TIMESTAMP NULL,
        click_count BIGINT DEFAULT 0,
        is_active BOOLEAN DEFAULT TRUE,
        -- UNIQUE on short_code already creates an index, so no separate idx_short_code is needed
        INDEX idx_user_id (user_id),
        INDEX idx_expires (expires_at)
    );

    -- For analytics (separate table for performance)
    CREATE TABLE url_analytics (
        id BIGINT PRIMARY KEY AUTO_INCREMENT,
        short_code VARCHAR(7) CHARACTER SET ascii COLLATE ascii_bin,
        clicked_at TIMESTAMP,
        user_agent TEXT,
        ip_address VARCHAR(45),
        country_code VARCHAR(2),
        referrer TEXT,
        INDEX idx_short_code_time (short_code, clicked_at)
    );

    Database Choice

  • MySQL/PostgreSQL: Good for consistency, ACID transactions
  • Cassandra: Better for write-heavy workloads, horizontal scaling
  • Recommendation: Start with MySQL, migrate to Cassandra at scale
    📌Step 4: Short Code Generation

    This is the core algorithm. You have several options:

    Option 1: Base62 Encoding

    Characters: a-z (26) + A-Z (26) + 0-9 (10) = 62 characters

  • 6 characters = 62^6 = ~56.8 billion combinations
  • 7 characters = 62^7 = 3.5 trillion combinations
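
To sanity-check these keyspace sizes against the write volume from Step 1, here is a small sketch. The class name and the constant-rate assumption (100M new URLs per month, forever) are illustrative only.

```java
final class Keyspace {
    private static final long ALPHABET_SIZE = 62; // a-z, A-Z, 0-9

    // Number of distinct codes of exactly the given length
    static long size(int length) {
        long n = 1;
        for (int i = 0; i < length; i++) {
            n = Math.multiplyExact(n, ALPHABET_SIZE); // throws on overflow instead of wrapping
        }
        return n;
    }

    // Years until every code of this length is used, assuming a constant creation rate
    static double yearsToExhaust(int length, long urlsPerMonth) {
        return (double) size(length) / (urlsPerMonth * 12.0);
    }

    public static void main(String[] args) {
        System.out.println("6 chars: " + size(6));
        System.out.println("7 chars: " + size(7));
        System.out.printf("6 chars last %.1f years at 100M/month%n",
                yearsToExhaust(6, 100_000_000L));
    }
}
```

Under these assumptions even 6 characters cover the 6 billion URLs estimated for 5 years; 7 characters leave ample headroom for growth and for schemes that waste part of the keyspace.
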
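    public class Base62Encoder {
        private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        private static final int MIN_LENGTH = 6;

        // Encodes a non-negative ID as Base62, left-padded with '0' to MIN_LENGTH
        public String encode(long id) {
            if (id < 0) {
                throw new IllegalArgumentException("id must be non-negative");
            }
            StringBuilder sb = new StringBuilder();
            while (id > 0) {
                sb.append(ALPHABET.charAt((int) (id % 62)));
                id /= 62;
            }
            // Digits are appended least-significant first, so pad at the end, then reverse
            while (sb.length() < MIN_LENGTH) {
                sb.append('0');
            }
            return sb.reverse().toString();
        }

        // Decodes a Base62 short code back to its numeric ID
        public long decode(String shortCode) {
            long id = 0;
            for (char c : shortCode.toCharArray()) {
                int digit = ALPHABET.indexOf(c);
                if (digit < 0) {
                    throw new IllegalArgumentException("invalid character: " + c);
                }
                id = id * 62 + digit;
            }
            return id;
        }
    }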
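
One caveat: encoding an auto-increment ID yields codes that are as sequential as the IDs themselves, which sits uneasily with the "unpredictable" requirement from Step 1. One possible alternative, sketched here under assumptions, is to draw random 7-character codes and retry on collision. The `RandomCodeGenerator` class is hypothetical, and the in-memory set only stands in for the UNIQUE constraint on short_code that a real system would rely on.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

final class RandomCodeGenerator {
    private static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final int CODE_LENGTH = 7;

    private final SecureRandom random = new SecureRandom();
    // Stand-in for the database's unique index on short_code
    private final Set<String> used = new HashSet<>();

    // Returns a random code that has not been handed out before
    synchronized String nextCode() {
        while (true) {
            StringBuilder sb = new StringBuilder(CODE_LENGTH);
            for (int i = 0; i < CODE_LENGTH; i++) {
                sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
            }
            String code = sb.toString();
            if (used.add(code)) { // add() returns false on a collision, so loop again
                return code;
            }
        }
    }
}
```

The trade-off is an extra uniqueness check on every write, which becomes more likely to retry as the keyspace fills up.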

    Option 2: Pre-generated Key Service

    For high-scale systems:

  1. Separate service pre-generates unique keys
  2. Stores in database/Redis
  3. Application fetches batch of keys
  4. No collision handling needed
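
The four steps above might look roughly like this in memory. Both `KeyService` and `LocalKeyPool` are hypothetical names; in a real deployment the queue would live in a database or Redis rather than in the JVM.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Holds pre-generated, unused keys and hands them out in batches
final class KeyService {
    private final ConcurrentLinkedQueue<String> unused = new ConcurrentLinkedQueue<>();

    KeyService(List<String> preGeneratedKeys) {
        unused.addAll(preGeneratedKeys);
    }

    // Removes up to batchSize keys; each key goes to exactly one caller
    List<String> fetchBatch(int batchSize) {
        List<String> batch = new ArrayList<>(batchSize);
        for (int i = 0; i < batchSize; i++) {
            String key = unused.poll();
            if (key == null) {
                break; // pool exhausted
            }
            batch.add(key);
        }
        return batch;
    }
}

// Lives on each API server; refills from the key service only when empty
final class LocalKeyPool {
    private final KeyService service;
    private final int batchSize;
    private final ArrayDeque<String> local = new ArrayDeque<>();

    LocalKeyPool(KeyService service, int batchSize) {
        this.service = service;
        this.batchSize = batchSize;
    }

    synchronized String nextKey() {
        if (local.isEmpty()) {
            local.addAll(service.fetchBatch(batchSize));
        }
        String key = local.poll();
        if (key == null) {
            throw new IllegalStateException("key service has no keys left");
        }
        return key;
    }
}
```

Because every key is removed from the shared pool before it reaches a server, two servers can never issue the same code. The cost is that keys held by a server that crashes are simply lost, which is usually acceptable given the size of the keyspace.
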
    Option 3: MD5/SHA Hash

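    // Requires Apache Commons Codec: org.apache.commons.codec.digest.DigestUtils
    public String generateShortCode(String longUrl) {
        String hash = DigestUtils.md5Hex(longUrl); // 32 lowercase hex characters
        return hash.substring(0, 7);               // keep only the first 7
    }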

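    Problem: Collisions. Truncating the hash to 7 characters means different URLs can map to the same code, and hex output uses only 16 of the 62 available characters (16^7 ≈ 268 million codes). Solution: Append a random suffix and retry.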
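
A minimal sketch of the hash-and-retry idea, using the JDK's `MessageDigest` instead of `DigestUtils` so it runs without extra dependencies. Two assumptions to note: the retry salts the input with an incrementing counter rather than a random suffix (so the same URL always resolves to the same code), and the `Map` is only a stand-in for the urls table. `HashingShortener` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;

final class HashingShortener {
    private final Map<String, String> codeToUrl; // stand-in for the urls table

    HashingShortener(Map<String, String> store) {
        this.codeToUrl = store;
    }

    // First 7 hex characters of the MD5 digest of the input
    static String md5Prefix(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.substring(0, 7);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every JVM", e);
        }
    }

    // Returns the code for longUrl, re-hashing with a salt while the code is taken by another URL
    String shorten(String longUrl) {
        for (int attempt = 0; ; attempt++) {
            String input = (attempt == 0) ? longUrl : longUrl + "#" + attempt;
            String code = md5Prefix(input);
            String existing = codeToUrl.putIfAbsent(code, longUrl);
            if (existing == null || existing.equals(longUrl)) {
                return code; // newly claimed, or this URL was shortened before
            }
            // Otherwise the code belongs to a different URL: try the next salt
        }
    }
}
```

A side effect of hashing is deduplication: shortening the same long URL twice returns the same code, which may or may not be what you want when links belong to different users.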

    📌Step 5: System Architecture

    [Architecture diagram: a Load Balancer (Nginx/AWS) in front of three Java API Servers, backed by a Redis Cluster (cache) and MySQL/Cassandra (primary DB). Highlighted properties: high availability, auto-scaling, fault tolerance.]

    Component Details

  1. Load Balancer: Round-robin distribution, health checks
  2. API Servers: Stateless, auto-scaling group (2-10 instances)
  3. Redis Cache: Cache hot URLs, 80% hit rate target
  4. Database: Master-slave replication for reads

    📌Step 6: Read Path (Critical for Performance)

    The redirect flow handles 90% of traffic:

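    import java.time.Duration;
    import java.time.Instant;
    import jakarta.servlet.http.HttpServletRequest; // javax.servlet.http on Spring Boot 2.x
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.redis.core.RedisTemplate;

    public class RedirectService {
        private static final Duration MAX_CACHE_TTL = Duration.ofHours(24);

        @Autowired private RedisTemplate<String, String> redis;
        @Autowired private UrlRepository urlRepo;
        @Autowired private AnalyticsPublisher analytics;

        // The request is passed in so analytics can record user agent, IP, and referrer
        public String getLongUrl(String shortCode, HttpServletRequest request) {
            // Step 1: Check Redis cache
            String longUrl = redis.opsForValue().get("url:" + shortCode);
            if (longUrl != null) {
                // Cache hit - publish analytics async
                analytics.publishAsync(shortCode, request);
                return longUrl;
            }
            // Step 2: Cache miss - query database
            Url url = urlRepo.findByShortCode(shortCode);
            if (url == null || !url.isActive()) {
                throw new UrlNotFoundException();
            }
            // Step 3: Check expiration
            Instant now = Instant.now();
            if (url.getExpiresAt() != null && !url.getExpiresAt().isAfter(now)) {
                throw new UrlExpiredException();
            }
            // Step 4: Update cache, capping the TTL at the link's expiry so an
            // expired link is never served from cache
            Duration ttl = MAX_CACHE_TTL;
            if (url.getExpiresAt() != null) {
                Duration untilExpiry = Duration.between(now, url.getExpiresAt());
                if (untilExpiry.compareTo(ttl) < 0) {
                    ttl = untilExpiry;
                }
            }
            if (ttl.toMillis() > 0) {
                redis.opsForValue().set("url:" + shortCode, url.getLongUrl(), ttl);
            }
            // Step 5: Publish analytics asynchronously
            analytics.publishAsync(shortCode, request);
            return url.getLongUrl();
        }
    }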
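
A cached entry should not outlive the link it points to, otherwise a link that expires in an hour could keep redirecting from cache for up to 24 hours. The sketch below isolates that TTL decision as a pure function so it can be tested without Redis. `CacheTtl` and `forLink` are hypothetical names, and the 24-hour cap mirrors the value used in the read path above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

final class CacheTtl {
    static final Duration MAX_TTL = Duration.ofHours(24);

    // Returns the TTL to cache with, or empty if the link has already expired
    // and should not be cached at all. A null expiresAt means "never expires".
    static Optional<Duration> forLink(Instant now, Instant expiresAt) {
        if (expiresAt == null) {
            return Optional.of(MAX_TTL);
        }
        Duration remaining = Duration.between(now, expiresAt);
        if (remaining.isZero() || remaining.isNegative()) {
            return Optional.empty();
        }
        // Use whichever is shorter: the cap or the time left on the link
        return Optional.of(remaining.compareTo(MAX_TTL) < 0 ? remaining : MAX_TTL);
    }
}
```

Passing `now` in as a parameter, rather than calling `Instant.now()` inside, keeps the function deterministic and easy to unit test.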

    📌Step 7: Scaling Strategies

    Database Sharding

    By short_code hash (recommended):

  • Shard = hash(short_code) % num_shards
  • Even distribution
  • Single shard lookup
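    public int getShard(String shortCode) {
        // floorMod always returns a value in [0, NUM_SHARDS); Math.abs(hashCode()) % n
        // can go negative because Math.abs(Integer.MIN_VALUE) is still negative
        return Math.floorMod(shortCode.hashCode(), NUM_SHARDS);
    }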
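
A detail worth knowing when sharding on `hashCode()`: `Math.abs(Integer.MIN_VALUE)` overflows and stays negative, so an "abs then modulo" formula can produce a negative shard index for an unlucky key. The sketch below contrasts that with `Math.floorMod`. `ShardRouter` is an illustrative name, and the 10-shard figure is arbitrary.

```java
final class ShardRouter {
    // Safe: the result is always in [0, numShards) for positive numShards
    static int shardFor(int hash, int numShards) {
        return Math.floorMod(hash, numShards);
    }

    static int shardFor(String shortCode, int numShards) {
        return shardFor(shortCode.hashCode(), numShards);
    }

    // Unsafe variant kept only for comparison
    static int naiveShardFor(int hash, int numShards) {
        return Math.abs(hash) % numShards;
    }

    public static void main(String[] args) {
        int worstCase = Integer.MIN_VALUE;
        System.out.println("naive:    " + naiveShardFor(worstCase, 10));
        System.out.println("floorMod: " + shardFor(worstCase, 10));
    }
}
```

With 10 shards the naive version returns -8 for that hash while `floorMod` returns 2, so only the latter is usable as an array or connection-pool index.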

    Cache Warming

  • Pre-load popular URLs on server startup
  • Use Redis Cluster for horizontal scaling
    CDN for Global Reach

  • Deploy in multiple regions
  • Use Cloudflare/AWS CloudFront
  • Cache 301 redirects at edge
    📌Common Interview Questions

    Q: Why Base62 instead of Base64?
    A: Standard Base64 includes + and /, which are not URL-safe. Base62 uses only alphanumeric characters.

    Q: 301 vs 302 redirect?

  • 301 (Permanent): Browsers cache it, so repeat visits skip your server. Better for SEO, but you lose analytics.
  • 302 (Temporary): Every request hits your server. Better for analytics.
  • Recommendation: Use 302 for analytics, 301 for static/permanent links.
    Q: How do you prevent abuse?

  • Rate limiting per IP/user
  • CAPTCHA for anonymous users
  • Blacklist malicious URLs
  • Require account for bulk creation
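
As one illustration of per-IP or per-user rate limiting, here is a fixed-window counter. It is a sketch under assumptions: state lives in a single JVM (a multi-server deployment would more likely keep counters in Redis), the clock is passed in for testability, and `FixedWindowRateLimiter` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

final class FixedWindowRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // clientKey -> {windowStartMillis, requestCountInWindow}
    private final Map<String, long[]> windows = new HashMap<>();

    FixedWindowRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true if the client (an IP or user ID) may make a request at nowMillis
    synchronized boolean allow(String clientKey, long nowMillis) {
        long[] window = windows.computeIfAbsent(clientKey, k -> new long[] {nowMillis, 0});
        if (nowMillis - window[0] >= windowMillis) {
            // The previous window has ended: start a fresh one
            window[0] = nowMillis;
            window[1] = 0;
        }
        if (window[1] >= maxRequests) {
            return false;
        }
        window[1]++;
        return true;
    }
}
```

Fixed windows are simple but allow short bursts around window boundaries; token bucket or sliding window schemes smooth that out at the cost of a little more bookkeeping.
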
    Q: How do you handle hash collisions?

  • Append incrementing suffix: abc123 → abc1231 → abc1232
  • Or regenerate with random salt
    Q: What happens if Redis goes down?

  • Application falls back to database
  • Higher latency but still functional
  • Redis Sentinel/Cluster for HA
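
The database fallback can be as simple as treating a cache failure like a cache miss. In this sketch the cache and database are modeled as plain functions, so the names and signatures are assumptions rather than a real Redis client API.

```java
import java.util.Optional;
import java.util.function.Function;

final class ResilientLookup {
    // Returns null on a miss; may throw if the cache is unreachable
    private final Function<String, String> cache;
    private final Function<String, Optional<String>> database;

    ResilientLookup(Function<String, String> cache,
                    Function<String, Optional<String>> database) {
        this.cache = cache;
        this.database = database;
    }

    Optional<String> resolve(String shortCode) {
        try {
            String cached = cache.apply(shortCode);
            if (cached != null) {
                return Optional.of(cached);
            }
        } catch (RuntimeException cacheFailure) {
            // Cache is down: fall through to the database instead of failing the redirect
        }
        return database.apply(shortCode);
    }
}
```

In production you would typically also log the failure and put a short timeout or circuit breaker around the cache call, so a slow Redis does not add latency to every redirect.
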
    📌Key Takeaways

  1. Start with requirements and estimations
  2. Design for read-heavy workload (10:1 ratio)
  3. Cache aggressively (80% of traffic goes to 20% of URLs)
  4. Use Base62 for short, URL-safe codes
  5. Separate analytics to avoid blocking redirects
  6. Plan for horizontal scaling from day one

    This problem tests your ability to design practical, scalable systems. Practice explaining each component's purpose and trade-offs.
