API Gateways for Java Microservices: Routing, Rate Limiting & Security

When you have one service, clients call it directly. When you have fifty, you don’t want each client to know fifty addresses, and you don’t want fifty services each re-implementing authentication, rate limiting, and TLS. An API gateway is the single front door that handles those cross-cutting concerns once, at the edge. This deep dive covers what a gateway does, how to build one with Spring Cloud Gateway, and where a gateway ends and a service mesh begins.

TL;DR: An API gateway is the single entry point in front of your microservices, centralizing routing, authentication, rate limiting, TLS termination, and request shaping. Spring Cloud Gateway (reactive, built on Project Reactor) is the Java-native choice. Pair it with Redis for distributed rate limiting and Resilience4j for edge resilience. A gateway handles north-south (client-to-system) traffic; a service mesh handles east-west (service-to-service).

flowchart LR
  Cl[Client] --> GW[API Gateway]
  GW -->|authenticate| A[Validate JWT]
  GW -->|rate limit| RL[Redis limiter]
  GW --> S1[Service A]
  GW --> S2[Service B]

The gateway centralizes auth, rate limiting, and routing in front of your services.

What an API gateway does

The gateway is where edge concerns live so individual services don’t each reinvent them:

Routing — map incoming paths/hosts to the right backend service.
Authentication — validate tokens once at the edge before traffic reaches services.
Rate limiting & throttling — protect backends from abuse and noisy neighbors.
TLS termination — handle HTTPS at the edge.
Cross-cutting shaping — CORS, request/response transformation, header injection, size limits.
Resilience — timeouts, retries, and circuit breaking at the boundary.
Observability — a single place to log, trace, and meter all inbound traffic.

The payoff is one stable address for clients and one place to enforce edge policy — instead of duplicated, drifting implementations across every service.

Spring Cloud Gateway

Spring Cloud Gateway is the Java-native gateway: reactive (built on Spring WebFlux/Project Reactor and Netty, so it handles high concurrency on few threads) and configured with routes made of predicates (match conditions) and filters (actions). Most routes can be declared entirely in YAML.

spring:
  cloud:
    gateway:
      routes:
        - id: orders
          uri: lb://orders-service        # load-balanced via service discovery
          predicates:
            - Path=/api/orders/**
          filters:
            - StripPrefix=1
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 100   # tokens/sec
                redis-rate-limiter.burstCapacity: 200
            - name: CircuitBreaker
              args:
                name: ordersCb
                fallbackUri: forward:/fallback/orders

Predicates can match path, host, method, headers, or time; filters can rewrite paths, add headers, rate-limit, circuit-break, or retry. You can also write custom filters in Java for bespoke logic (e.g. injecting a tenant header from a claim).
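
To make the predicate/filter split concrete, here is a minimal plain-Java sketch of what the orders route above does to a request path. It is not Spring Cloud Gateway code: the class RouteSketch and its methods are hypothetical names, the prefix check only approximates Path=/api/orders/**, and query strings are ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical, simplified route: a path-prefix predicate plus a StripPrefix-style filter. */
final class RouteSketch {
    final String id;
    final String pathPrefix;   // "/api/orders" stands in for Path=/api/orders/**
    final int stripPrefix;     // number of leading path segments to drop
    final String targetUri;

    RouteSketch(String id, String pathPrefix, int stripPrefix, String targetUri) {
        this.id = id;
        this.pathPrefix = pathPrefix;
        this.stripPrefix = stripPrefix;
        this.targetUri = targetUri;
    }

    /** Predicate: does the request path equal the prefix or sit underneath it? */
    boolean matches(String path) {
        return path.equals(pathPrefix) || path.startsWith(pathPrefix + "/");
    }

    /** Filter: drop the first stripPrefix segments before forwarding. */
    String rewrite(String path) {
        List<String> segments = Arrays.stream(path.split("/"))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        int from = Math.min(stripPrefix, segments.size());
        return "/" + String.join("/", segments.subList(from, segments.size()));
    }

    /** The first route whose predicate matches is selected. */
    static Optional<RouteSketch> select(List<RouteSketch> routes, String path) {
        return routes.stream().filter(r -> r.matches(path)).findFirst();
    }
}
```

Under these assumptions, a request for /api/orders/42 matches the route and is forwarded to the backend as /orders/42, which is the effect of StripPrefix=1.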

Authentication at the edge

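The gateway is the natural place to validate access tokens, so unauthenticated traffic never reaches your services. Configure it as an OAuth2 resource server (or client, for login flows) to validate the JWT’s signature, issuer, and expiry, then forward the request, often propagating the token or claims downstream. Note that setting only issuer-uri does not check the audience claim; audience validation needs extra configuration (recent Spring Boot versions expose an audiences property next to issuer-uri) or a custom validator.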

spring:
  security:
    oauth2:
      resourceserver:
        jwt:
          issuer-uri: https://login.example.com/

An important caveat: edge authentication is necessary but not sufficient under zero trust. Services should still validate tokens themselves rather than blindly trusting “it came through the gateway,” because an attacker inside the network could otherwise bypass the front door. The gateway reduces load and centralizes policy; it doesn’t excuse services from verifying.
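
The claim checks that both the gateway and each service should apply can be sketched in a few lines of plain Java. This is an illustration only: it assumes a JWT library has already verified the signature, and the class ClaimCheckSketch, its method name, and the example audience "orders-api" are hypothetical rather than part of Spring Security.

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;

/** Hypothetical claim checks, applied only AFTER a library has verified the token signature. */
final class ClaimCheckSketch {
    private final String expectedIssuer;
    private final String expectedAudience;

    ClaimCheckSketch(String expectedIssuer, String expectedAudience) {
        this.expectedIssuer = expectedIssuer;
        this.expectedAudience = expectedAudience;
    }

    /** Empty when the claims are acceptable; otherwise a short reason for rejecting the token. */
    Optional<String> rejectionReason(String issuer, List<String> audiences,
                                     Instant expiresAt, Instant now) {
        if (!expectedIssuer.equals(issuer)) {
            return Optional.of("unexpected issuer");
        }
        if (audiences == null || !audiences.contains(expectedAudience)) {
            return Optional.of("audience mismatch");
        }
        if (expiresAt == null || !now.isBefore(expiresAt)) {
            return Optional.of("token expired");
        }
        return Optional.empty();
    }
}
```

Running the same checks in a service as well as at the gateway is what keeps a request that skipped the front door from being trusted by default.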

Distributed rate limiting with Redis

Rate limiting protects backends and enforces fair use, but it only works if the limit is shared across all gateway instances — an in-memory counter per instance lets a client multiply its quota by the number of replicas. Spring Cloud Gateway’s RequestRateLimiter uses Redis and a token-bucket algorithm so the limit is global. You define who a limit applies to with a KeyResolver — per user, per API key, or per IP.

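import java.security.Principal;
import java.util.Optional;
import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.context.annotation.Bean;
@Bean
KeyResolver userKeyResolver() {
  // limit per authenticated user; fall back to the remote address, or one shared "unknown" key
  return exchange -> exchange.getPrincipal()
      .map(Principal::getName)
      .defaultIfEmpty(Optional.ofNullable(exchange.getRequest().getRemoteAddress())
          .map(address -> address.getAddress())   // null when the address is unresolved
          .map(inet -> inet.getHostAddress())
          .orElse("unknown"));
}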

Reject over-limit requests with a clear 429 Too Many Requests (the default status for RequestRateLimiter) and, where you can, a Retry-After header so well-behaved clients can back off. Consider tiered limits (for example, a paid tier gets a higher rate) keyed off the token’s claims.
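
For intuition about what replenishRate and burstCapacity mean, here is a minimal in-memory token bucket in plain Java. It is a sketch of the algorithm only, not the Redis-backed limiter: the class TokenBucketSketch and its methods are hypothetical, time is passed in explicitly to keep the example deterministic, and because the state lives in one JVM it would have exactly the per-replica problem described above if used across several gateway instances.

```java
/** In-memory token bucket; illustrates the algorithm, not the Redis-backed limiter. */
final class TokenBucketSketch {
    private final double replenishRate;   // tokens added per second
    private final double burstCapacity;   // maximum tokens the bucket can hold
    private double tokens;
    private long lastRefillNanos;

    TokenBucketSketch(double replenishRate, double burstCapacity, long nowNanos) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;      // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    /** Takes one token if available; a caller would answer 429 when this returns false. */
    synchronized boolean tryAcquire(long nowNanos) {
        refill(nowNanos);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    /** Whole seconds until one token is available, usable as a Retry-After value. */
    synchronized long retryAfterSeconds(long nowNanos) {
        refill(nowNanos);
        if (tokens >= 1.0) {
            return 0;
        }
        return (long) Math.ceil((1.0 - tokens) / replenishRate);
    }

    /** Adds tokens for the time elapsed since the last refill, capped at burstCapacity. */
    private void refill(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        if (elapsedSeconds > 0) {
            tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishRate);
            lastRefillNanos = nowNanos;
        }
    }
}
```

With a replenish rate of 100 and a burst capacity of 200, as in the route configuration earlier, a client can spend up to 200 tokens at once and is then held to a sustained 100 requests per second.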

Resilience at the boundary

The gateway is a strategic place to apply resilience patterns because it sees every inbound call. Add timeouts so a slow backend can’t tie up gateway resources, circuit breakers (Resilience4j) so a failing service fails fast into a fallback instead of cascading, and bounded retries for idempotent routes. This complements — rather than replaces — the in-service resilience covered in our circuit breakers deep dive; defense in depth at both layers is normal.
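
The fail-fast behaviour of a circuit breaker can be sketched as a small state machine. The version below is a deliberately simplified assumption: it opens after a fixed number of consecutive failures, whereas Resilience4j decides based on a failure rate over a sliding window. The class CircuitBreakerSketch and its method names are hypothetical.

```java
/** Minimal consecutive-failure circuit breaker; a teaching sketch, not Resilience4j. */
final class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before opening
    private final long openMillis;        // how long to stay open before probing again
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMillis = 0;

    CircuitBreakerSketch(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Whether a call may reach the backend now; false means "serve the fallback". */
    synchronized boolean allowRequest(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAtMillis >= openMillis) {
            state = State.HALF_OPEN;   // allow trial calls until one succeeds or fails
        }
        return state != State.OPEN;
    }

    /** A successful call resets the failure count and closes the breaker. */
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    /** A failure during a trial, or too many in a row, opens the breaker. */
    synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAtMillis = nowMillis;
        }
    }

    synchronized State state() {
        return state;
    }
}
```

While the breaker is open, the gateway answers from the fallback immediately instead of waiting on a backend that is already failing, which is what stops the failure from cascading.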

CORS, payload limits, and shaping

Centralize browser concerns and protection at the edge: configure CORS once at the gateway rather than per service; enforce maximum request body size to blunt abusive payloads; and use filters to normalize headers, strip internal ones, or inject correlation/trace IDs so everything downstream is consistent. Doing this at the gateway keeps services focused on business logic.
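
The header shaping described here can be illustrated with a small plain-Java function. It is a sketch under stated assumptions: headers are modelled as a single-valued map, and both the x-internal- prefix and the x-correlation-id header name are hypothetical conventions chosen for the example, not names required by Spring Cloud Gateway.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

/** Hypothetical edge header shaping: normalize names, strip internal headers, add a correlation ID. */
final class HeaderShapingSketch {
    static final String CORRELATION_HEADER = "x-correlation-id";   // assumed header name
    static final String INTERNAL_PREFIX = "x-internal-";           // assumed naming convention

    /** Returns a new map; the incoming map is left untouched. */
    static Map<String, String> shape(Map<String, String> incoming) {
        Map<String, String> shaped = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : incoming.entrySet()) {
            String name = header.getKey().toLowerCase(Locale.ROOT);
            if (name.startsWith(INTERNAL_PREFIX)) {
                continue;   // never forward client-supplied internal headers
            }
            shaped.put(name, header.getValue());
        }
        // keep a caller-provided correlation ID, otherwise mint one at the edge
        shaped.computeIfAbsent(CORRELATION_HEADER, key -> UUID.randomUUID().toString());
        return shaped;
    }
}
```

Because every request passes through the same function, services downstream can rely on the correlation header being present and on internal headers never originating from a client.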

Gateway vs service mesh

These are frequently confused but solve different problems, defined by traffic direction:

Aspect	API Gateway	Service Mesh
Traffic	North-south (clients → system)	East-west (service ↔ service)
Lives	At the edge	Between services (sidecars)
Concerns	Auth, rate limiting, routing, TLS termination	mTLS, retries, traffic shifting, observability
Example	Spring Cloud Gateway	Istio, Linkerd

They are complementary. A large platform commonly runs a gateway at the front door for client-facing policy and a mesh inside for transparent, language-agnostic service-to-service security and reliability. Use the gateway for what clients see; use the mesh for what services do among themselves.

Operational considerations

Don’t let it become a monolith. Keep business logic out of the gateway — it routes and enforces edge policy; it doesn’t make domain decisions.
It’s a potential single point of failure. Run multiple replicas behind a load balancer, with health checks and autoscaling.
Watch latency. Every request passes through it; keep filters lean and the gateway well-resourced.
Managed alternatives exist. Amazon API Gateway, Azure API Management, Apigee, and Kong are available as managed gateways if you’d rather not operate Spring Cloud Gateway yourself; the concepts transfer directly.

Takeaways

An API gateway gives your microservice platform one secure, observable front door — centralizing routing, authentication, rate limiting (distributed via Redis), TLS, and edge resilience so individual services don’t each reinvent them. Spring Cloud Gateway is the reactive, Java-native way to build one. Remember its boundaries: pair it with per-service token validation for zero trust, complement it with a service mesh for east-west traffic, and keep business logic out of it. Done right, the gateway is the calm, consistent entry point that makes everything behind it simpler.

Frequently asked questions

What is an API gateway and why do microservices need one?
An API gateway is a single entry point in front of your services that handles cross-cutting concerns — routing, authentication, rate limiting, TLS termination, CORS, and request shaping — so each service does not reimplement them. It gives clients one stable address and centralizes edge policy.

What is the difference between an API gateway and a service mesh?
A gateway handles north-south traffic (clients into the system) at the edge. A service mesh handles east-west traffic (service-to-service) inside the system, providing mTLS, retries, and observability transparently. They are complementary: a gateway at the front door, a mesh between services.
