A forward proxy sits between users and the internet, acting as an intermediary for outbound requests. When users want to access external websites, their traffic goes through the forward proxy first. This setup makes it possible to monitor and control internet usage, cache common requests, and mask users’ original IP addresses.

It is the opposite of a Reverse Proxy.

Private forward proxy to mask IP addresses

This can be used so that all servers on a local network access the internet through a common IP we control. It is useful when accessing external services which use IP whitelists. It is not relevant when the servers are already behind a NAT router, so it is not a setup I use in my homelab, but I do use it elsewhere.

# ...

http {
    # ...

    # A resolver is required because proxy_pass below uses variables
    resolver 1.1.1.1 1.0.0.1 valid=300s;
    resolver_timeout 5s;

    # "https" if the request line targets an https:// URL, "http" otherwise
    map $request $forward_scheme {
        "~ https://" "https";
        default      "http";
    }

    log_format fwd
        '[$time_local] $proxy_add_x_forwarded_for "$http_user_agent" - status=$status sent=$bytes_sent - "$request"';

    server {
        listen 8080;

        access_log /var/log/nginx/fwd-proxy.access.log fwd;
        error_log /var/log/nginx/fwd-proxy.error.log notice;

        allow 172.16.0.0/16; # also restricted in the env's firewall
        deny all;

        location / {
            # Rebuild the full target URL from its parts
            proxy_pass $forward_scheme://$http_host$request_uri;

            proxy_set_header Host $http_host;

            proxy_ssl_server_name on;
            proxy_ssl_protocols TLSv1.2 TLSv1.3;
            proxy_ssl_verify off;

            proxy_connect_timeout 20s;
            proxy_send_timeout 20s;
            proxy_read_timeout 20s;
        }
    }
}

Scheme and proxy_pass

Here, clients connect to the proxy over plain HTTP. This is acceptable since the proxy is on our own local network. It also means the built-in Nginx variable $scheme will always be “http”, as it’s inferred from the client’s connection.

However, we want to support both HTTP and HTTPS targets. $scheme therefore cannot be used, and we have to parse the target’s address to define $forward_scheme (an arbitrary name), which must be “https” if the target contains “https://” and “http” otherwise. This is what the map at the beginning does.

It’s unfortunate that Nginx doesn’t offer a variable containing the full URL and only the full URL, e.g. https://a.b.com:1234/xyx?m=n. $request contains the full URL but also more, e.g. POST https://a.b.com:1234/xyx?m=n HTTP/1.1. So the full URL needs to be reconstructed for proxy_pass using $forward_scheme. The other elements are $http_host (a.b.com:1234) and $request_uri (/xyx?m=n).
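To make the reconstruction concrete, the sketch below shows how the three parts line up for the example request, followed by a slightly stricter variant of the map. The anchored regex is an assumption of mine and not part of the setup above: it only matches when the scheme directly follows the method at the start of the request line, rather than “ https://” anywhere in it. It would replace the existing map in the http block.

```nginx
# Request line received by the proxy ($request):
#   POST https://a.b.com:1234/xyx?m=n HTTP/1.1
#
#   $forward_scheme -> https          (derived from $request by the map)
#   $http_host      -> a.b.com:1234   (Host header sent by the client)
#   $request_uri    -> /xyx?m=n       (path and query string)

# Hypothetical stricter variant: anchor on "<METHOD> https://" at the
# start of the request line.
map $request $forward_scheme {
    "~^[A-Z]+ https://" "https";
    default             "http";
}
```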

Variables

Default variables $host and $uri are processed and potentially modified versions of $http_host and $request_uri, respectively.

Other useful variables include $arg_{name} (where $arg_m is n in the above example), $content_type, $request_method.

$http_{name} are the values of arbitrary HTTP headers. For example, $http_user_agent is the User-Agent header (“Mozilla/5.0 …”).

List of Nginx http module variables: https://nginx.org/en/docs/http/ngx_http_core_module.html.
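A simple way to see how these variables differ is to log them side by side. This is only a sketch: the format name vars_debug and the log file path are hypothetical, and the extra access_log line sits next to the existing one in the server block.

```nginx
# In the http block: log raw and processed values next to each other.
log_format vars_debug
    'host=$host http_host=$http_host '
    'uri=$uri request_uri=$request_uri '
    'method=$request_method content_type=$content_type arg_m=$arg_m';

# In the server block: hypothetical path for this extra log.
access_log /var/log/nginx/fwd-proxy.vars.log vars_debug;
```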

SSL / TLS

proxy_ssl_verify

In this scenario, the proxy acts as a normal client, initiating the HTTPS connection to the target itself, like any HTTPS client would.

proxy_ssl_verify is off by default. Turning it on would be a nice security improvement. Without it, the proxy won’t verify the validity of the target’s SSL certificate, which could expose us to MITM attacks between the proxy and the target.

The following two options then become necessary:

  • proxy_ssl_trusted_certificate /path/to/ca-certificates.crt;
    • Points to a file containing trusted CA certificates in PEM format
    • Typically points to the system’s CA bundle (like /etc/ssl/certs/ca-certificates.crt on Debian/Ubuntu)
    • Only needs public certificates (unlike ssl_certificate which needs private keys)
  • proxy_ssl_verify_depth 2;
    • Controls how many intermediate certificates are allowed in the chain
    • Default is 1, meaning: target certificate + 1 intermediate
    • Most public sites need at most depth 2

I’ve had issues making this work though and haven’t tested further. The target’s SSL certificates were incorrectly rejected by the proxy. Possibly my CA bundle was not up-to-date.
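For reference, an untested sketch of the location block with verification enabled could look like this. The CA bundle path is the Debian/Ubuntu one mentioned above and is an assumption for any other system.

```nginx
location / {
    proxy_pass $forward_scheme://$http_host$request_uri;
    proxy_set_header Host $http_host;

    # Send SNI so targets hosting several sites present the right certificate.
    proxy_ssl_server_name on;

    # Verify the target's certificate against the system CA bundle
    # (path assumed for Debian/Ubuntu; adjust for other distributions).
    proxy_ssl_verify on;
    proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
    proxy_ssl_verify_depth 2;
}
```

When a certificate is rejected, the error log configured for the proxy is the first place to look for the reason.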

proxy_connect

Another approach for HTTPS connections through forward proxies is to use proxy_connect. The TLS connection to the target is then negotiated by the client itself through a tunnel, and the proxy has no access whatsoever to the data exchanged between the two. In other words, this ensures end-to-end encryption between the client and the target.

In the current setup, the proxy could log everything that goes through it, and it is fine since we own and trust the proxy. proxy_connect is mostly useful for public forward proxies.

It relies on the HTTP CONNECT method, with which the client asks the proxy to establish a TCP tunnel to the target. The tunnel is independent of TLS and is set up before any HTTP/HTTPS traffic flows. The proxy then just forwards raw TCP bytes.

Stock Nginx builds usually do not include proxy_connect; Nginx needs to be explicitly compiled with this module.
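A minimal sketch of such a server block is shown below. It assumes Nginx was built with the third-party ngx_http_proxy_connect_module; the two proxy_connect directives come from that module as I understand it, so check its README for the version in use. Access restrictions are left out of this sketch.

```nginx
# Exchange on the wire, before any TLS handshake:
#   client -> proxy:  CONNECT a.b.com:443 HTTP/1.1
#   proxy  -> client: a 2xx response, after which raw TCP bytes are relayed

server {
    listen 8080;

    # The proxy still needs to resolve the target's hostname.
    resolver 1.1.1.1 1.0.0.1 valid=300s;

    # Directives from the third-party module (assumption, see above):
    # enable CONNECT handling and only allow tunnels to port 443.
    proxy_connect;
    proxy_connect_allow 443;
}
```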

Headers and IPs

Proxies often set headers such as X-Real-IP, X-Forwarded-For, and X-Forwarded-Proto.

In our case, we do not want to set those, because we do not want or need to expose our internal network’s architecture. As far as the target is concerned, the request comes from the proxy. The point of the setup discussed here was to only expose / whitelist the proxy’s IP.

  • X-Forwarded-Proto tells the target server what protocol (http/https) the client originally used.
  • X-Real-IP is set to a single IP address that represents the original client. If there are multiple proxies in the chain, it will typically be set by the first proxy and won’t change.
  • X-Forwarded-For is a comma-separated list of IPs showing the chain of forwarding. Each proxy appends the IP it received the request from, building a trail of all proxies involved: X-Forwarded-For: client_ip, proxy1_ip, proxy2_ip.
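Nginx does not add these headers unless told to, but it does pass along request headers sent by the client. A sketch of the location block, assuming we also want to drop any client-supplied values, relies on the fact that a header set to an empty string is not passed to the proxied server:

```nginx
location / {
    proxy_pass $forward_scheme://$http_host$request_uri;
    proxy_set_header Host $http_host;

    # An empty value means the header is not passed to the target at all,
    # even if the client included it in its request.
    proxy_set_header X-Real-IP "";
    proxy_set_header X-Forwarded-For "";
    proxy_set_header X-Forwarded-Proto "";
}
```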

As for the Host header, it is essential for virtual hosting: it tells the target server which website we’re trying to access when multiple sites are hosted on the same IP. Without an explicit directive, Nginx sets Host to $proxy_host, the host part of the proxy_pass target, so it would likely work without setting this header. However, it’s considered best practice to set Host explicitly to ensure consistent behavior.

IP Spoofing

One simple IP spoofing technique consists of sending a fake X-Real-IP header to the target. It is therefore an application-layer security concern.

curl https://httpbin.io/anything -H "X-Real-IP: 1.1.1.1"
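On the receiving side, a target running Nginx can limit this by only trusting X-Real-IP when the connection comes from a known proxy. The sketch below assumes the target was built with ngx_http_realip_module; the proxy address is hypothetical and taken from the documentation range.

```nginx
# On the target server, not on the forward proxy.
server {
    listen 80;

    # Only honor X-Real-IP when the connection comes from this trusted
    # proxy address (hypothetical). For any other peer the header is
    # ignored and $remote_addr stays the real connection address.
    set_real_ip_from 203.0.113.10;
    real_ip_header X-Real-IP;
}
```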