Bassam Ismail
Writing
Engineering

Preserving the Real Client IP Through a Proxy Chain That Rewrites the Evidence

10 min read

I was looking at /locations/nearby/, a location-sensitive endpoint that needs the visitor's address to return nearby locations, and the logs were telling a clean lie. The request arrived with X-Forwarded-For and CF-Connecting-IP, but both had already been rewritten to proxy egress addresses by the time the request reached the application. The fix was not to trust those headers later with more confidence. It was to preserve the original client IP earlier, at the first layer where the evidence was still intact, by copying the first X-Forwarded-For value into X-Original-Client-IP in the hosting provider .htaccess, then passing that header through nginx to the app.

That is the practical rule I would argue for: in a proxy chain, identity is perishable data. If a downstream service needs the real visitor IP, capture it at the earliest trusted boundary and give it a new, explicit name before another proxy gets a chance to normalize, overwrite, or reinterpret it.

Why the header looked right until it mattered

The failure mode was annoying because nothing was obviously broken. The request still had IP-shaped headers. The app still saw values in the usual places. But the values were no longer the visitor. They were artifacts of the path the request had taken.

A typical mental model for client IP handling is too flat:

Browser -> App

In a real hosted stack, the path is closer to this:

visitor browser

CDN edge

hosting proxy

nginx proxy

the application

Each layer can add or rewrite headers. X-Forwarded-For is usually a comma-separated list where the leftmost value is intended to be the original client and later values are proxies. CF-Connecting-IP is a provider-specific attempt to preserve the connecting client seen at the CDN edge. Both are useful only if you understand which layer created them and which layers are allowed to change them.

By the time /locations/nearby/ reached the application, both familiar headers had become untrustworthy for the business question we were asking: where is this visitor roughly coming from? The app did not need the last proxy hop. It needed the first public client address observed before the internal proxy chain rewrote the evidence.

Capture the real visitor IP end-to-end through the proxy chain.

That line sounds simple. The hard part was deciding where the word "real" could still be defended.

Where original client IP belongs at the trusted boundary

The right place to solve this was not inside the application. Once the request reached the app, the original and the rewritten values were indistinguishable unless we brought extra context from the edge.

So we moved the responsibility earlier. The hosting provider .htaccess still had access to the incoming X-Forwarded-For value before the application received the rewritten proxy headers. There, we captured the first value and wrote it into a dedicated header:

<IfModule mod_setenvif.c>
  SetEnvIfNoCase X-Forwarded-For "^([^,[:space:]]+)" ORIG_CLIENT_IP=$1
</IfModule>
 
<IfModule mod_headers.c>
  RequestHeader set X-Original-Client-IP "%{ORIG_CLIENT_IP}e" env=ORIG_CLIENT_IP
</IfModule>

For a route-specific version, use the request URI as a guard so the behavior is scoped to the endpoint that needs it:

<IfModule mod_setenvif.c>
  SetEnvIf Request_URI "^/locations/nearby/" IS_NEARBY=1
  SetEnvIfNoCase X-Forwarded-For "^([^,[:space:]]+)" ORIG_CLIENT_IP=$1
</IfModule>
 
<IfModule mod_headers.c>
  RequestHeader set X-Original-Client-IP "%{ORIG_CLIENT_IP}e" env=ORIG_CLIENT_IP
</IfModule>

The second snippet still sets the header whenever ORIG_CLIENT_IP exists. In our deployment, the route guard belonged with the deployed .htaccess context and the endpoint-specific traffic path. If your application has mixed traffic under the same document root, test the scope instead of assuming the guard is enough.

The nginx side should then pass the preserved value without recomputing it:

location /locations/nearby/ {
    proxy_set_header X-Original-Client-IP $http_x_original_client_ip;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $host;
    proxy_pass http://app_backend;
}

And the app should prefer the explicit preserved header, with a controlled fallback only if the header is absent:

def get_client_ip(request):
    preserved = request.headers.get("X-Original-Client-IP")
    if preserved:
        return preserved.strip()
 
    forwarded = request.headers.get("X-Forwarded-For", "")
    if forwarded:
        return forwarded.split(",", 1)[0].strip()
 
    return request.META.get("REMOTE_ADDR", "")

That fallback is not a security policy. It is an availability policy. If your endpoint makes fraud, rate-limit, payment, or access-control decisions, you should validate the sender and the trusted proxy list before accepting any client-supplied header.

Important

A header named X-Original-Client-IP is only as trustworthy as the layer that sets it and the layers allowed to forward it. Strip inbound copies at the edge if untrusted clients can send the same header.

The mechanism is simple, the trust model is not

The mechanical part is a string copy. The architectural part is deciding which hop owns truth.

Before the fix, the application had to infer identity from headers that had already been altered:

visitor

edge adds headers

host rewrites headers

nginx forwards request

app reads proxy IP

After the fix, the first useful value is promoted into a purpose-specific header before the chain mutates it:

visitor

edge receives IP

host captures first XFF

X-Original-Client-IP

nginx preserves header

app uses preserved IP

I like the phrase "promoted" here because it describes the intent better than "copied." We did not copy X-Forwarded-For because we love headers. We promoted one value from an overloaded transport header into a domain-specific contract. X-Forwarded-For describes a path. X-Original-Client-IP describes the value the application needs.

That distinction matters in reviews. If someone later changes a CDN rule, an nginx template, or a hosting proxy behavior, they can ask a specific question: does this preserve X-Original-Client-IP for /locations/nearby/? Without that named contract, every future deploy reopens the argument from first principles. That is the same reason the data contract mattered in Building Press, Part 2: The data model is the spine: the useful part is not just where data lives, but what the rest of the system is allowed to assume about it.

What we rejected

The tempting fix was to keep adjusting application logic until it found a plausible IP. For example, we could have tried a priority list:

OptionWhy it was temptingWhy I rejected it
Prefer CF-Connecting-IPIt often represents the client at the CDN edgeIt was already rewritten in this path
Parse X-Forwarded-For in the applicationThe app already receives the headerThe meaningful first value had been lost before the app saw it
Use REMOTE_ADDRIt is available without custom plumbingIt identifies the immediate proxy, not the visitor
Add app heuristicsIt avoids infrastructure changesIt turns an architecture bug into a guessing problem

The last one is the trap. Heuristics feel productive because they produce code. They also hide the failing contract. If a proxy chain changes tomorrow, the app becomes a forensic parser of whatever happened upstream rather than a consumer of a documented interface.

I also rejected making the new header a broad platform convention in the first pass. That might be the right long-term shape, but the immediate need was /locations/nearby/. Scoping the change kept the blast radius small, gave us a concrete test path, and made review easier. The broader convention can come after the first documented path proves itself.

The repeatable debugging model

The useful artifact was not only the config. It was the documented request path. Header bugs get slippery because each layer can show a locally correct view. The CDN can say it sent the right value. The hosting layer can say it forwarded a valid request. nginx can say it passed what it received. The app can say the header was present. All of those can be true while the user-facing behavior is wrong.

I now use a three-column model for this class of issue:

LayerQuestionEvidence
Earliest trusted boundaryWhere is the visitor IP first visible?Raw request headers or edge logs
Mutation pointsWhich systems can rewrite or append headers?CDN, host, reverse proxy config
Application contractWhich header should the app trust?A named, documented header

For this incident, the answer was straightforward once written down:

Earliest trusted boundary: the hosting provider .htaccess
Mutation points: hosting proxy chain and nginx forwarding
Application contract: X-Original-Client-IP
Consumer path: /locations/nearby/

A quick smoke test can validate the behavior from a controlled environment:

curl -sS -H 'X-Forwarded-For: 203.0.113.10, 198.51.100.20' \
  https://your staging environment URL/locations/nearby/ \
  -o /tmp/nearby-response.json

For lower-level verification, log the preserved header at the proxy boundary during a short non-production test window:

log_format client_ip_debug '$remote_addr "$http_x_forwarded_for" "$http_x_original_client_ip"';
access_log /var/log/nginx/client-ip-debug.log client_ip_debug;

Then inspect only enough lines to prove the contract:

tail -n 50 /var/log/nginx/client-ip-debug.log

The expected shape is not that every IP matches. It is that the first address from the original X-Forwarded-For value is now visible as X-Original-Client-IP when the request reaches the next hop.

Deep-dive: Why the first X-Forwarded-For value matters

X-Forwarded-For is conventionally built as a comma-separated chain. A client with IP 203.0.113.10 reaches proxy A, which forwards to proxy B, which forwards to the app. The header often grows like this:

X-Forwarded-For: 203.0.113.10, 198.51.100.20, 192.0.2.30

The leftmost value is the original client in that convention. The rightmost values are later hops. This is useful only when the proxies are trusted and append rather than replace. If an untrusted client can send the initial header, or a middle layer overwrites the chain, the convention stops being enough. That is why the capture point matters as much as the parsing rule.

Solution summary

For /locations/nearby/, the application was receiving rewritten proxy addresses in X-Forwarded-For and CF-Connecting-IP, so location lookup could not reliably use those headers. The fix was to capture the first X-Forwarded-For value in the hosting provider .htaccess, store it as X-Original-Client-IP, forward that header through nginx, and make the app read the preserved header first. The important design choice was moving original client IP preservation to the earliest trusted boundary instead of adding downstream heuristics.

The remaining sharp edges

This approach still carries costs. It adds another header contract that future infrastructure changes must preserve. It requires the team to know which layer is allowed to set the header. It also needs defensive handling for spoofing: if a public client can inject X-Original-Client-IP before the trusted boundary, the boundary should clear that inbound value before setting its own.

There is also a documentation burden. The Slack note to document the whole process was not administrative cleanup. It was part of the fix. Header identity problems are path-dependent. Without a written path, the next person sees only a pile of familiar names and has to rediscover which one lies at which hop. That review burden is familiar from Building Press, Part 4: Review is where the human gates the irreversible: once a change can affect production behavior, the human-readable contract is part of the work.

The broader point is that infrastructure metadata has a shelf life. Some data should not be carried through a system under its original overloaded name. When a downstream service depends on a value for product behavior, preserve it early, name it for the contract it serves, and make the proxy chain boring enough that the app does not need to become an investigator.