API Optimization - CS Patterns

Cool, you have a functional API, but are you sure it can handle the load and that it is optimized for the kind of device calling it? Let's walk through some practical strategies to improve your API's performance.

Adapting API to Client Needs

APIs can be tailored to the end device (phones, laptops, etc.) to improve performance.

Intermediate Adaptation Layer: Detect the client type (say, from request headers) and tailor the response to that client. Ex - Netflix
Expandable Fields: Design the API so that a client can request an expanded version of an object by adding a query parameter. Ex - GET /users?expand=profile,preferences
GraphQL: A query language for APIs that lets the client request only the fields it needs.
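
To make the expandable-fields idea concrete, here is a minimal sketch of a handler for GET /users?expand=profile,preferences. The data, the RELATED lookup table, and the get_users function are hypothetical and only illustrate the pattern; a real API would load these objects from a database.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory data standing in for a database.
USERS = [{"id": 1, "name": "Ada", "profile_id": 10, "preferences_id": 20}]
RELATED = {
    "profile": {10: {"bio": "Engineer"}},
    "preferences": {20: {"theme": "dark"}},
}


def get_users(url: str) -> list:
    """Return users, embedding only the related objects named in ?expand=."""
    query = parse_qs(urlparse(url).query)
    requested = query.get("expand", [""])[0].split(",")
    # Ignore unknown field names instead of failing the request.
    expand = [field for field in requested if field in RELATED]

    result = []
    for user in USERS:
        item = {"id": user["id"], "name": user["name"]}
        for field in expand:
            item[field] = RELATED[field][user[f"{field}_id"]]
        result.append(item)
    return result
```

By default the response stays small; a client that needs the nested objects pays for them only when it asks.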

Efficient Data Fetching Patterns

Short Polling: The client requests data at regular intervals (usually < 1 minute), whether or not anything has changed.
Long Polling: The client sends a request and the server holds the connection open until it has an update or a timeout is reached (unlike short polling, which returns immediately). Also called a hanging GET.
WebSockets: A full-duplex communication protocol that runs over a single TCP connection.
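
The difference between the two polling styles can be sketched without any network code. In this simplified illustration, short_poll is the client-side loop and long_poll is what a server-side handler might do while holding a request open; both function names and the use of queue.Queue as the update source are assumptions made for the example.

```python
import queue
import time
from typing import Callable, Optional


def short_poll(fetch: Callable[[], Optional[str]],
               interval_s: float,
               max_attempts: int) -> Optional[str]:
    """Client side: ask repeatedly, sleeping between attempts."""
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    return None


def long_poll(updates: "queue.Queue[str]", timeout_s: float) -> Optional[str]:
    """Server side: hold the request until an update arrives or the timeout hits."""
    try:
        return updates.get(timeout=timeout_s)
    except queue.Empty:
        return None  # the client would typically reconnect at this point
```

Short polling spends requests on empty answers, while long polling spends an open connection on waiting.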

Frontend Rendering Methods

Server-Side Rendering (SSR): The server generates a complete HTML document and sends it to the client as the response. On initial load the page is static; it becomes interactive once the JS file has been downloaded and executed. This method makes fewer API calls from the browser, as the data comes bundled with the initial HTML.
Client-Side Rendering (CSR): The browser, rather than the server, generates the web page. Multiple API calls are made after page load, and more processing happens on the client.

Resource Hints

Resource hints are instructions that help the browser manage resources (web pages, images, and even API responses).

Preloading - Tells the browser to download a resource as early as possible because it is crucial to the current page.

<link rel="preload" href="/api/critical-data" as="fetch" crossorigin="anonymous">

Prefetching - Lets the browser fetch resources that might be needed later and store them in the browser's cache. It has a lower download priority than preload.

<link rel="prefetch" href="/api/secondary-data">

Preconnect - Helps the browser establish a connection to a domain in advance (complete the DNS lookup, open the TCP connection, and handle TLS negotiation ahead of time).

<link rel="preconnect" href="https://api.some-service.com">

API Gateway

In a microservices architecture, clients shouldn’t talk to every service directly. That’s where we use an API Gateway as a central entry point. Its primary jobs are to -

Handle routing, caching, auth, and more.
Improve performance and reduce client complexity.
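
The routing part of a gateway can be sketched as a prefix table that maps public paths to internal services. The ROUTES table, the service hostnames, and the resolve function below are hypothetical; real gateways layer auth, caching, and retries on top of this lookup.

```python
from typing import Optional

# Hypothetical mapping from public path prefixes to internal services.
ROUTES = {
    "/users": "http://users-service.internal",
    "/orders": "http://orders-service.internal",
}


def resolve(path: str) -> Optional[str]:
    """Return the upstream URL for a request path, or None if no route matches."""
    for prefix, upstream in ROUTES.items():
        # Match the prefix itself or anything below it, but not "/users-admin".
        if path == prefix or path.startswith(prefix + "/"):
            return upstream + path
    return None
```

The client only ever sees one host; which service answers is decided by the table.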

Debouncing

It is an API optimization technique that limits how often an event handler fires. Imagine you’re typing in a search box: instead of making an API call for every keystroke, debouncing waits until you’ve paused typing for a short interval before making the request.
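
As a rough illustration, a debouncer can be built from a timer that is restarted on every call, so only the last call in a burst goes through. The Debouncer class below is a hypothetical sketch using Python's threading.Timer; in a browser the same idea is usually written with setTimeout and clearTimeout.

```python
import threading
from typing import Any, Callable, Optional


class Debouncer:
    """Call func only after wait_s seconds have passed without a new call."""

    def __init__(self, func: Callable[..., Any], wait_s: float) -> None:
        self._func = func
        self._wait_s = wait_s
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def __call__(self, *args: Any, **kwargs: Any) -> None:
        with self._lock:
            # A new event arrived: drop the pending call and restart the wait.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._wait_s, self._func, args, kwargs)
            self._timer.daemon = True
            self._timer.start()
```

Wrapping a search function as `search = Debouncer(call_search_api, 0.3)` (where call_search_api is whatever issues the request) means that rapid calls such as `search("a")`, `search("ap")`, `search("api")` result in a single request for "api".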

Rate Limiting

It is a constraint mechanism that caps the number of requests a client can make in a given period, throttling (rejecting or delaying) the excess requests to protect system resources.

To keep the rate limiter itself from adding latency, its counters can be kept in a fast in-memory cache.
Common Algorithms - Fixed window counter, token bucket, leaky bucket, sliding window log
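
Of the algorithms listed, the token bucket is compact enough to sketch here. This is a minimal single-process version under assumed names (TokenBucket, allow); the injectable clock parameter exists only to make the behaviour easy to test, and a production limiter would usually keep this state per client in a shared store.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `refill_per_s`."""

    def __init__(self, capacity: int, refill_per_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Return True and spend a token if the request may proceed."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add the tokens earned since the last call, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_s)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A request that gets False back would typically be answered with HTTP 429 (Too Many Requests).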

Caching at different layers

A cache acts as a temporary memory store for frequently accessed API responses, using eviction policies like LRU (Least Recently Used), LFU (Least Frequently Used), and MRU (Most Recently Used) to manage storage.
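
To show what an eviction policy does in practice, here is a small LRU cache sketch built on collections.OrderedDict. The LRUCache class and the example keys are illustrative only; Python's functools.lru_cache offers the same policy for function results.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Keep at most `capacity` entries, evicting the least recently used one."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None  # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

With a capacity of 2, storing "/a" and "/b", reading "/a", and then storing "/c" evicts "/b", because "/a" was touched more recently.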

Caching Layers

Browser Cache → Every browser has a built-in cache that stores responses on the user's device.
CDN Cache → Geographically distributed caching servers that store content closer to users for faster delivery.
Server side →
- API Gateway Cache - Stores frequent API responses to avoid redundant processing
- Web Server Cache - Keeps commonly requested static web pages
- Database Cache - Stores results of complex, time-consuming queries
- Distributed Cache - Solutions like Redis and Memcached for scalable caching across servers

HTTP Cache Headers

When the server sends an HTTP response to the client, it also includes cache headers. The Cache-Control header defines caching policies such as -

Cacheability: no-cache, no-store
Scope: private, public
Expiration: max-age, s-maxage
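
These directives are combined into a single comma-separated header value. The helper below is a hypothetical sketch (cache_control is not a library function) that builds such a value from a scope and expiration times in seconds.

```python
from typing import Optional


def cache_control(scope: str, max_age: int, s_maxage: Optional[int] = None) -> str:
    """Build a Cache-Control header value from scope and expiration settings."""
    if scope not in ("public", "private"):
        raise ValueError("scope must be 'public' or 'private'")
    directives = [scope, f"max-age={max_age}"]
    if s_maxage is not None:
        # s-maxage applies to shared caches such as CDNs and overrides max-age there.
        directives.append(f"s-maxage={s_maxage}")
    return ", ".join(directives)
```

For example, `cache_control("public", 60, 600)` lets browsers reuse a response for one minute and shared caches for ten.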

API Monitoring

Monitoring helps track API performance and identify issues before they affect users.

Types of Monitoring -

Type	Description
Synthetic	Automated scripts that simulate user behavior from different locations
Real User	Tracking actual user interactions in production

Key Metrics -

Performance Metrics - Requests per minute (RPM), Latency (response time)
Reliability Metrics - Availability (uptime), Failure rate
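
As a rough sketch of how these metrics fall out of raw request data, the function below summarizes a list of (latency in ms, HTTP status) samples. The summarize function, the sample format, and the choice to count only 5xx responses as failures are assumptions made for illustration; monitoring tools compute these figures for you.

```python
from typing import Dict, List, Tuple


def summarize(samples: List[Tuple[float, int]], window_minutes: float) -> Dict[str, float]:
    """Compute RPM, average latency, and failure rate for one time window."""
    if not samples:
        raise ValueError("no samples in this window")
    latencies = [latency_ms for latency_ms, _ in samples]
    # Assumption: a 5xx status counts as a failure, a 4xx does not.
    failures = sum(1 for _, status in samples if status >= 500)
    return {
        "rpm": len(samples) / window_minutes,
        "avg_latency_ms": sum(latencies) / len(latencies),
        "failure_rate": failures / len(samples),
    }
```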

Regular monitoring helps maintain API health and user satisfaction. There are many tools you can use for monitoring, such as Prometheus and Datadog.
