When I was figuring out how to enable our team at the newspaper to work in a more distributed fashion while maintaining network security, I looked at a variety of VPN and proxy options. In the end I settled on a simple proxy server setup using the squid open source proxy software.
I needed a solution that would work not only for our staff, where I could have some significant control over the setup of their work computing devices, but also for our contractors, where they would be using devices that were not under our control and presumably used for a mix of other things. I didn’t want anything locked in to a hardware vendor or dependent on a specific physical office location having connectivity. I wanted something fast to set up for each individual user, and that wouldn’t require installation of special software, activating/monitoring connections, or worrying about variations in network rules about what VPN connections were allowed.
Here’s how our squid setup works:
- We have a proxy auto-configuration file (PAC) that we serve publicly.
- We set up a squid username/password for every user that needs one.
- Users tell their device the URL of the PAC file.
- Their device uses our squid proxy for the hosts/IPs we specify in the PAC file, and otherwise uses their default network configuration.
- The first time their device attempts to connect via the proxy, they’re prompted to enter their username/password, and that typically is saved in their OS settings forever.
Using this approach we can route network traffic bound for one of our internal services/servers through our proxy instead of sending it directly over the public Internet. That in turn allows us to limit connection attempts on those services to trusted internal hosts (including the proxy), creating a kind of simple private network.
(Hat tip to the Automattic systems team for this concept, which I saw successfully deployed for a much larger user base when I worked there.)
We use this setup for internal web application servers, file sharing, phone system endpoints, and restricting access to some third-party services that allow IP-based access rules. It’s working on laptops running various OSes, mobile devices and other devices.
Once it’s set up on a user’s system, they don’t have to think about it ever again. A user has to trust that we will not proxy traffic that they don’t want proxied, but in theory they can examine the PAC file at any time to see how things are routed.
I’m sure it’s not as powerful or flexible as some of the modern VPN solutions out there, but it’s been dead simple to set up and maintain. We run it on a $4/month Digital Ocean droplet with 1 CPU, 512 MB of RAM and 0.5 TB/month of bandwidth. It’s been humming along smoothly for several years now.
I mostly followed this guide from Digital Ocean on how to set up a squid proxy server on Ubuntu. I used nginx instead of Apache for web server software.
For firewall ports, we allow HTTPS/443 for serving the PAC file, HTTP/80 for Let’s Encrypt certificate validation, and port 3128 for proxy access.
We use a Digital Ocean reserved IP address for the droplet so we could move it to another droplet or region if needed. To make sure the droplet’s outbound network traffic also originates from the reserved IP (needed for consistency in firewall rules), we followed these instructions from Digital Ocean.
We use fail2ban to block IP addresses that are trying too hard to use our proxy without proper authentication. There are a lot of them!
Our monitoring tool checks that the proxy is available for new connections, that the PAC file is being served, that the PAC file contents are what we expect them to be, that unauthenticated proxy requests are rejected, and that authenticated requests are successfully routed through the proxy. I also have a simple monitor in place, a shell script with a sqlite datastore, that alerts when someone authenticates to the proxy from an IP address that we’ve never seen before. And I receive a regular report of which users/IPs connect to the proxy the most and how much data they transfer.
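The new-IP alert and usage report could be sketched in JavaScript roughly like this. To be clear, this is not the shell script we run. It's an illustration that assumes squid's default native access.log format (timestamp, elapsed time, client IP, result/status, bytes, method, URL, user, hierarchy, content type). The function names, the sample log lines, and the in-memory `seenIps` set (standing in for the sqlite datastore) are all hypothetical.

```javascript
// Parse one line of squid's native access.log format (assumed default logformat).
function parseLine( line ) {
  const f = line.trim().split( /\s+/ );
  if ( f.length < 10 ) {
    return null; // not a line in the expected format
  }
  return {
    time: Number( f[0] ),
    clientIp: f[2],
    status: f[3],          // e.g. TCP_TUNNEL/200 or TCP_DENIED/407
    bytes: Number( f[4] ),
    user: f[7],            // "-" when the request was not authenticated
  };
}

// Collect alerts for authenticated requests from never-seen IPs,
// plus a running total of bytes transferred per user.
function summarize( lines, seenIps = new Set() ) {
  const alerts = [];
  const bytesByUser = {};
  for ( const line of lines ) {
    const entry = parseLine( line );
    if ( ! entry || entry.user === '-' ) {
      continue; // skip malformed lines and unauthenticated requests
    }
    if ( ! seenIps.has( entry.clientIp ) ) {
      seenIps.add( entry.clientIp );
      alerts.push( { user: entry.user, ip: entry.clientIp } );
    }
    bytesByUser[ entry.user ] = ( bytesByUser[ entry.user ] || 0 ) + entry.bytes;
  }
  return { alerts, bytesByUser };
}

// Made-up sample lines using documentation IP ranges.
const sample = [
  '1700000000.123    250 203.0.113.5 TCP_TUNNEL/200 5120 CONNECT private.host.example.com:443 alice HIER_DIRECT/198.51.100.7 -',
  '1700000010.456      0 198.51.100.9 TCP_DENIED/407 4100 CONNECT private.host.example.com:443 - HIER_NONE/- text/html',
  '1700000020.789    300 203.0.113.5 TCP_TUNNEL/200 2048 CONNECT private.host.example.com:443 alice HIER_DIRECT/198.51.100.7 -',
];
const report = summarize( sample );
console.log( report.alerts );
console.log( report.bytesByUser );
```

In a real monitor the set of seen IPs would need to persist between runs, which is the job the sqlite datastore does for us.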
Password management is handled old-school:
$ htpasswd /etc/squid/passwords [email protected]
I’ll automate that “some day.”
The proxy auto-configuration file looks roughly like this:
function FindProxyForURL( url, host ) {
    if ( isPlainHostName( host ) ) {
        return 'DIRECT';
    }

    // Do not proxy requests to where the PAC file is hosted
    if ( dnsDomainIs( host, 'our.proxy.hostname.com' ) ) {
        return 'DIRECT';
    }

    if ( url.substring(0, 4) == "ssh:" ) {
        return 'DIRECT';
    }

    if ( dnsDomainIs( host, '.local' ) || dnsDomainIs( host, '.test' ) ) {
        return 'DIRECT';
    }

    // Do not proxy connections when on a trusted company office subnet
    if ( isInNet( myIpAddress(), '192.168.100.0', '255.255.255.0' ) ) {
        return 'DIRECT';
    }

    // Proxy connections to a restricted/internal host
    if ( 'private.host.example.com' == host ) {
        return 'PROXY 12.34.56.78:3128';
    }

    // Do not proxy connections to common private network hosts
    var resolved_ip = dnsResolve( host );
    if (
        isInNet( resolved_ip, '10.0.0.0', '255.0.0.0' )
        || isInNet( resolved_ip, '172.16.0.0', '255.240.0.0' )
        || isInNet( resolved_ip, '192.168.0.0', '255.255.0.0' )
        || isInNet( resolved_ip, '127.0.0.0', '255.0.0.0' )
    ) {
        return 'DIRECT';
    }

    return 'DIRECT';
}
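One way to smoke-test routing rules like these before publishing a PAC file is to run them under Node with stand-ins for the helper functions that browsers normally provide. The sketch below is a rough illustration, not something from our setup. The stubbed helpers are simplified approximations, the `fakeDns` table and `clientIp` value are hypothetical, and the PAC function is a trimmed copy of the one above.

```javascript
// Simplified stand-ins for the helpers a browser provides to PAC scripts.
// They approximate the real behavior closely enough for a smoke test.
const fakeDns = {
  // Hypothetical DNS answers used only by this harness.
  'private.host.example.com': '203.0.113.10',
  'intranet.example.com': '10.1.2.3',
  'www.example.org': '198.51.100.7',
};
const clientIp = '198.51.100.50'; // hypothetical address outside the office subnet

// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
function ipToInt( ip ) {
  return ip.split( '.' ).reduce( ( acc, octet ) => ( ( acc << 8 ) | Number( octet ) ) >>> 0, 0 );
}
function isInNet( ip, pattern, mask ) {
  if ( ! ip ) {
    return false; // unresolved hosts never match a subnet
  }
  const m = ipToInt( mask );
  return ( ( ipToInt( ip ) & m ) >>> 0 ) === ( ( ipToInt( pattern ) & m ) >>> 0 );
}
function isPlainHostName( host ) { return ! host.includes( '.' ); }
function dnsDomainIs( host, domain ) { return host.endsWith( domain ); }
function dnsResolve( host ) { return fakeDns[ host ] || null; }
function myIpAddress() { return clientIp; }

// Trimmed copy of the PAC logic, enough to exercise the main branches.
function FindProxyForURL( url, host ) {
  if ( isPlainHostName( host ) ) {
    return 'DIRECT';
  }
  if ( isInNet( myIpAddress(), '192.168.100.0', '255.255.255.0' ) ) {
    return 'DIRECT';
  }
  if ( 'private.host.example.com' == host ) {
    return 'PROXY 12.34.56.78:3128';
  }
  const resolved_ip = dnsResolve( host );
  if ( isInNet( resolved_ip, '10.0.0.0', '255.0.0.0' ) ) {
    return 'DIRECT';
  }
  return 'DIRECT';
}

// Print the routing decision for a few representative requests.
const cases = [
  [ 'https://private.host.example.com/', 'private.host.example.com' ],
  [ 'https://intranet.example.com/', 'intranet.example.com' ],
  [ 'https://www.example.org/', 'www.example.org' ],
  [ 'http://printer/', 'printer' ],
];
for ( const [ url, host ] of cases ) {
  console.log( host, '->', FindProxyForURL( url, host ) );
}
```

A harness like this only checks the rule logic. Actual browsers and operating systems differ in how they implement helpers such as `myIpAddress()` and `dnsResolve()`, so it's no substitute for testing on real devices.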
I don’t like that this publicly available PAC file exposes some information about our network configuration and identifies some hosts that could be potential targets for attack. For now it’s a reasonable tradeoff.
One annoying thing is that some OSes do not support system-wide or multi-network PAC file configuration, and so each time one of those users connects to a new wireless network, ethernet connection, etc. their proxy configuration has to be set anew. I think we’re down to just one device in that situation, but maybe it would be a problem for a larger team with a more diverse range of devices.
Having this in place has simplified a lot of network security and maintenance things for me – it’s great!