Engineering

Preserving the Real Client IP Through a Proxy Chain That Rewrites the Evidence

Q: Why is X-Forwarded-For unreliable behind a proxy chain?

Any proxy in the path can overwrite or append to X-Forwarded-For. Once a later hop rewrites it, the leftmost value becomes an internal egress IP. The syntax still looks valid, so the application is confidently wrong.

Q: How do I stop clients from spoofing the preserved header?

Strip any incoming X-Original-Client-IP at the outer edge, or overwrite it at the first trusted boundary. If arbitrary clients can set it directly, the header forges an origin rather than preserving one.

June 20, 202610 min read

I was staring at a request to /api/nearby/ that claimed the visitor lived inside our proxy chain. The feature needed the visitor's actual network address so a local business could be ranked with the right regional context, but proxy chain client IP preservation had already failed by the time the request reached the application. Both X-Forwarded-For and CF-Connecting-IP had stopped describing the browser. They described the last machine polite enough to forward the request. The fix was to capture the first X-Forwarded-For value earlier in the platform .htaccess, write it into X-Original-Client-IP, and pass that header unchanged through nginx into the app.

That outcome matters more than the particular header name. In a layered edge stack, identity is a perishable thing. If you wait until the application to ask where the request came from, you may be interrogating a witness after three people have rewritten the statement. The right move is to preserve the first trustworthy observation at the earliest boundary you control, then treat every later hop as transport.

TL;DR

When a request crosses a CDN, a proxy chain, and nginx or Apache, X-Forwarded-For and CF-Connecting-IP get rewritten to internal egress IPs, so the leftmost value is no longer the real client. The application layer cannot recover a value that was already overwritten upstream. The fix: capture the original IP at the earliest trusted boundary into a purpose-built header (X-Original-Client-IP), then have every downstream hop transport it rather than re-derive it. Strip incoming copies at the edge, or any client can set that header and forge its own origin.

Where proxy chain client IP preservation failed

The broken path looked normal enough to be irritating. A browser hit a CDN and proxy chain, then the request moved through the platform, nginx, and finally the application. Each hop had a defensible reason to touch forwarding headers. None of the individual systems looked obviously malicious or broken. Together, they made the application confidently wrong.

The usual mental shortcut is that X-Forwarded-For is a breadcrumb trail: client first, then proxy one, proxy two, and so on. That is only useful if every participant appends and preserves in the same way, and if the application sees the original chain. In this case, it did not. Somewhere before the backend application, X-Forwarded-For and CF-Connecting-IP were being rewritten to proxy egress IPs. The app was not seeing the visitor. It was seeing infrastructure wearing a name tag.

The first wrong assumption was mine: I expected the application layer to be the place where we could solve this. That is a comfortable bias for application engineers. We like fixes we can unit test. But there was no clever Django helper that could recover a value already replaced upstream. Once the evidence is gone, application code can only choose which wrong value to believe.

The mechanism, not the folklore

X-Forwarded-For is a comma-separated list. The leftmost value is conventionally the original client IP. A proxy that appends its own address produces something like this:

X-Forwarded-For: 203.0.113.44, 10.0.0.12, 10.0.0.28

If the request is still intact when it reaches a boundary you control, the first value is the one you want to preserve. If a later proxy overwrites the header, that leftmost value may become an internal egress address. At that point the syntax still looks valid, which is a small cruelty. Logs with plausible headers are worse than logs with missing headers because they encourage tidy explanations.

Step 1: Capture the original IP in Apache `.htaccess`

The repair was deliberately boring. In the platform .htaccess, capture the first value of X-Forwarded-For for the affected route and write it into a new header:

<IfModule mod_setenvif.c>
  SetEnvIfNoCase Request_URI "^/api/nearby/" IS_NEARBY_ROUTE=1
  SetEnvIfNoCase X-Forwarded-For "^([0-9a-fA-F:.]+)" ORIGINAL_CLIENT_IP=$1
</IfModule>
 
<IfModule mod_headers.c>
  RequestHeader set X-Original-Client-IP "%{ORIGINAL_CLIENT_IP}e" env=IS_NEARBY_ROUTE
</IfModule>

Step 2: Forward it unchanged in nginx

Then nginx forwards that header without trying to reinterpret it:

location /api/nearby/ {
    proxy_set_header X-Original-Client-IP $http_x_original_client_ip;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_pass http://app_upstream;
}

Step 3: Resolve it explicitly in the app

In the application, the lookup becomes explicit. Prefer the preserved header only when the request came from the known proxy path. Fall back to REMOTE_ADDR for local and non-proxied cases:

def client_ip_from_request(request):
    original = request.META.get("HTTP_X_ORIGINAL_CLIENT_IP", "").strip()
    if original:
        return original.split(",", 1)[0].strip()
 
    forwarded = request.META.get("HTTP_X_FORWARDED_FOR", "").strip()
    if forwarded:
        return forwarded.split(",", 1)[0].strip()
 
    return request.META.get("REMOTE_ADDR", "")

That code is not the security boundary by itself. The security boundary is the infrastructure rule that only trusted ingress can set or preserve X-Original-Client-IP. If arbitrary clients can send that header directly to the app, you have built a spoofing feature with a polite name.

Important

A preserved IP header is only useful when the app trusts the hop that set it. Strip incoming copies at the outer edge or overwrite them at the first trusted boundary.

The preserve-then-transport model

The reusable model is preserve-then-transport. At each hop, ask one question: is this the earliest boundary that still has the real value? If yes, freeze it into a purpose-built field. After that, downstream systems should transport it, not derive it again.

This is not limited to IP addresses. The same model applies to request IDs, authenticated user context, experiment assignments, region hints, and any other value that can be degraded by later infrastructure. Systems tend to accrete proxies because each one solves a local problem: caching, routing, TLS termination, WAF checks, blue-green deploys. The compound effect is that no single layer owns the truth unless you name the ownership.

For this incident, ownership landed in the platform .htaccess because that was the earliest practical place we controlled and could deploy. I would rather have had a cleaner edge configuration with a formal contract that stripped untrusted incoming headers, set the preserved value once, and documented all downstream expectations. Reality offered .htaccess, Ansible templates for non-production nginx, and contrib/production for production. Glamour was unavailable, so we used the tool that could sit in the right place.

What we rejected

We considered making the application smarter about X-Forwarded-For. That failed the basic mechanism test. The app could parse a list, but it could not know whether the list still contained the visitor. Adding clever parsing after the rewrite point would make the code look more responsible while leaving the behavior wrong.

We also considered relying on CF-Connecting-IP. That header is useful when it reaches you intact from the CDN boundary. Here it had the same problem as X-Forwarded-For: by the time the backend application saw the request, it had already been rewritten somewhere in the chain. A header with a famous name is still just a header.

Another option was changing every proxy layer to preserve and append perfectly. That is architecturally cleaner and operationally more expensive. It requires coordination across systems with different ownership, deploy mechanisms, and rollback risks. For a route-specific production issue, the smaller fix was to capture the value at the first controlled point where it was still correct, then document the path so the next person did not have to rediscover it from logs and suspicion.

Option	Why it looked attractive	Why we rejected it
Parse `X-Forwarded-For` in the app	Low code change, easy tests	The header was already rewritten
Trust `CF-Connecting-IP`	Familiar CDN convention	It was no longer trustworthy at the backend application
Rework every proxy	Cleaner long-term contract	More coordination and deployment risk
Capture in `.htaccess`	Earliest controlled boundary	Carries Apache config sharp edges

The chosen approach still has costs. Route-specific header rules can drift when paths change. Apache environment variable syntax is not where most application engineers spend their happiest afternoons. There is also a real maintenance obligation: production and non-production templates must keep forwarding behavior aligned, or the staging environment becomes a rehearsal for a different play.

How we proved it

The test was to send a known forwarding chain through a non-production environment and inspect what the application received. The exact command was mundane, which is what you want when debugging identity plumbing:

curl -sS \
  -H 'X-Forwarded-For: 203.0.113.44, 10.0.0.12' \
  -H 'CF-Connecting-IP: 203.0.113.44' \
  'https://staging.example.com/api/nearby/' \
  -o /dev/null \
  -D -

On the server side, log the relevant values together for one request path:

log_format client_ip_debug '$remote_addr '
                           'xff="$http_x_forwarded_for" '
                           'orig="$http_x_original_client_ip" '
                           'uri="$request_uri"';
 
access_log /var/log/nginx/client_ip_debug.log client_ip_debug;

The expected result is boring and specific:

orig="203.0.113.44"

The app should use 203.0.113.44 for nearby-shop logic, even if REMOTE_ADDR is a proxy and X-Forwarded-For grows downstream. The preserved value is not evidence that the visitor is good, local, authenticated, or entitled to anything. It is only evidence of the client IP observed at the trusted boundary. That distinction matters. Location-ish features can tolerate some ambiguity. Access control should demand a higher bar.

Deep-dive: A safer app-side parser

The app parser should be small and suspicious. It should reject empty strings, trim whitespace, and avoid treating a comma-separated spoof as multiple authoritative values. In Python, the standard library can at least validate shape:

from ipaddress import ip_address
 
 
def parse_single_ip(value):
    candidate = value.split(",", 1)[0].strip()
    if not candidate:
        return ""
    try:
        return str(ip_address(candidate))
    except ValueError:
        return ""

This does not prove the value is truthful. It only prevents malformed input from becoming a quiet dependency.

FAQ

Why is X-Forwarded-For unreliable behind a proxy chain?

Any proxy in the path can overwrite or append to X-Forwarded-For. Once a later hop rewrites it, the leftmost value becomes an internal egress IP. The syntax still looks valid, so the application is confidently wrong.

Can I just trust CF-Connecting-IP instead?

Only if the backend sits directly behind Cloudflare. In a multi-hop proxy chain the value is no longer trustworthy by the time it reaches the application.

Where should I capture the real client IP?

At the earliest boundary that still holds the true value. Freeze it into a dedicated header such as X-Original-Client-IP, and have downstream systems transport it rather than re-deriving it.

How do I stop clients from spoofing the preserved header?

Strip any incoming X-Original-Client-IP at the outer edge, or overwrite it at the first trusted boundary. If arbitrary clients can set it directly, the header forges an origin rather than preserving one.

Solution summary for original client IP preservation

For the /api/nearby/ route, capture the visitor IP at the earliest trusted boundary before forwarding headers are rewritten. In the platform .htaccess, read the first value from X-Forwarded-For and set X-Original-Client-IP. In nginx, forward X-Original-Client-IP unchanged to the app backend. In the application, prefer HTTP_X_ORIGINAL_CLIENT_IP only for requests from the trusted proxy path, then fall back to conventional headers for lower-stakes or local cases. Verify with curl using a known X-Forwarded-For chain and log $http_x_original_client_ip beside $http_x_forwarded_for and $remote_addr.

The documentation mattered almost as much as the config. The team documentation was updated to capture the full process That was the right instinct. Without a written request path, proxy bugs become oral tradition with packet captures.

A proxy chain is not a neutral pipe. Each hop can rewrite the headers it touches, so the original value survives only if some layer is told to preserve it. Capture the fact while it is still there, name it plainly, and make every later hop carry it with as little imagination as possible.