What's the difference between SSH and Squid when using them as proxies?

I've been using Squid on one of my servers as a transparent proxy for a very long time (years).

Basically from the client I was creating an SSH tunnel between my client and the ssh + Squid server doing this:

ssh -T -N -x -C -L3128:127.0.0.1:3128 cedric@xxx

and then I'd launch my web browser (chromium) giving it a proxy-server at 127.0.0.1 on port 3128.

This worked fine for a very long time, but then I started having login issues on many websites (forums, Stack Exchange using Google as the login authority, etc.). Some kept working fine (like GMail and all the GMail services). I don't know what caused the login problems: I think it's due to some Squid configuration problem introduced when I upgraded the server's OS and software.

So I considered setting up a VPN but then by reading an article called "SSH as a poor man's VPN", I realized that I could use SSH and simply do this, from the client:

ssh -D 5222 cedric@xxx -N

And then configure my browser to use 127.0.0.1 as a SOCKS host on port 5222.

And now everything works fine: all the login problems seem to be solved and I don't even need Squid anymore on the server.

However, I don't understand how it works. From the various "what is my ip" websites, I see the address of my server (which is what I want). Also, these sites do not seem to detect that there's an SSH "tunnel" going on.

Basically my question is: what is, technically, the difference between using Squid as a transparent proxy and using "SSH -D … -N" and using a VPN?

And from the viewpoint of the websites that I'm visiting, is there any difference?

Also I'd like to know if there's any way that a website I'm visiting would be able to detect my real IP when using "SSH -D … -N"? Is this technically feasible or is that information simply not available to the browser?

Best Answer

1) Squid is a specialized proxy for http and https traffic. Because it understands this protocol, it can offer advanced features like caching, filtering, rewrite rules and DNS resolution. All web browsers know how to talk to http proxies, as proxy handling is part of the http spec.

Additionally, Squid can act as a transparent proxy, where you don't configure your web browser to use it but a firewall redirects the traffic to the proxy. The browser doesn't know it's using a proxy, which is what "transparent" means in this scenario. Your old setup was not transparent in this sense, because you explicitly configured the browser to use the proxy.
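To make "the browser knows how to talk to an http proxy" a bit more concrete, here is a minimal Python sketch of the first bytes a client sends to an explicitly configured http proxy such as Squid. The function name proxy_request is made up for illustration, and real browsers send many more headers. For plain http the request line carries the absolute URL; for https the client asks the proxy to open a raw tunnel with CONNECT and then runs TLS inside it.

```python
from urllib.parse import urlsplit


def proxy_request(url):
    """Return the opening bytes a client would send to an HTTP proxy for `url`.

    Hypothetical helper for illustration only; real clients add more headers.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme == "https":
        port = parts.port or 443
        # HTTPS: the proxy only sees the host and port, then relays opaque bytes.
        return (
            f"CONNECT {host}:{port} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n\r\n"
        ).encode("ascii")
    # Plain HTTP: the full URL goes in the request line so the proxy knows
    # which server to contact (and can cache or filter the response).
    return (
        f"GET {url} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n\r\n"
    ).encode("ascii")


if __name__ == "__main__":
    print(proxy_request("http://example.com/index.html"))
    print(proxy_request("https://example.com/"))
```

With a transparent proxy the client sends an ordinary request as if it were talking to the web server directly, and the redirection happens in the firewall.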

2) ssh -D acts as a SOCKS proxy, which is protocol-agnostic. It takes the traffic arriving on the SOCKS port at the local end of the tunnel, sends the payload unmodified and unchecked to the remote end, and from there forwards it to the real destination. For a SOCKS proxy to work, the client program has to support it explicitly, because at the start of each connection the client must tell the SOCKS proxy where to connect. I don't know if all web browsers support it, but in principle it works with many protocols, not just http.
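As a rough illustration of what "the client has to support SOCKS" means, the following Python sketch builds the two messages a SOCKS5 client sends before any application data, following the message layout in RFC 1928. The function names are hypothetical, and the sketch assumes no authentication and an ASCII hostname; it only constructs the bytes and does not open a connection.

```python
import struct


def socks5_greeting():
    """Version 5, one auth method offered, method 0x00 = no authentication."""
    return bytes([0x05, 0x01, 0x00])


def socks5_connect_request(host, port):
    """Build a SOCKS5 CONNECT request that names the destination by hostname.

    Layout: VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name),
    one length byte, the name, then the port as 2 bytes big-endian.
    """
    name = host.encode("ascii")  # assumption: plain ASCII hostname
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1..255 bytes")
    if not 0 < port <= 65535:
        raise ValueError("port out of range")
    return bytes([0x05, 0x01, 0x00, 0x03, len(name)]) + name + struct.pack("!H", port)


if __name__ == "__main__":
    print(socks5_greeting().hex())
    print(socks5_connect_request("example.com", 80).hex())
```

After the proxy replies with success, the client just sends its normal protocol (http, TLS, anything else) over the same connection. Because the destination can be given as a hostname, the name lookup can happen on the proxy side rather than on the client.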

3) Your old command line ssh -T -N -x -C -L3128:127.0.0.1:3128 ... only opened one simple tunnel, where all traffic coming in on local port 3128 was sent unmodified to the predefined destination 127.0.0.1:3128 (as seen from the server). You can use this with any client, since the client doesn't need to change what it sends, but you only get one fixed destination.
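The "fixed destination" idea can be sketched in a few lines of Python. This is not how ssh implements -L (ssh carries the bytes encrypted over the SSH connection and the server end makes the final connection); it is only a plain, unencrypted relay, with made-up function names, that shows the key property: whatever arrives on the listening port goes to one predefined target, and the client never says where it wants to go.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, target):
    """Connect to the single fixed target and relay in both directions."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()


def start_forwarder(listen_port, target):
    """Listen on 127.0.0.1:listen_port and return the listening socket."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", listen_port))
    server.listen()

    def accept_loop():
        while True:
            try:
                client, _ = server.accept()
            except OSError:
                break  # listening socket was closed
            threading.Thread(
                target=handle, args=(client, target), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server
```

Calling start_forwarder(3128, ("127.0.0.1", 8080)), for example, would relay everything arriving on local port 3128 to port 8080; the port numbers here are arbitrary. Compare this with the SOCKS case, where each connection begins with the client naming its destination.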

In addition to proxying, ssh always uses an encrypted tunnel, so that your data cannot be snooped on while traveling through the tunnel.

Your old setup combined method 1 and 3, but in principle they could work alone, depending on circumstances.

4) With a VPN you also create an encrypted tunnel from your local host to your "proxy" server. In this case it's not limited to one client program: your operating system sends all non-local network traffic to your proxy, which then forwards it to the destination. To set this up you need root/administrator permissions, so it might not be as easy to configure.

In all four cases, data is first sent to a remote host, which then creates the real connection to the destination web server and sends your request. The response first comes back to the remote host, which looks up which client the connection belongs to in its internal state and then sends it back to the original client. Websites that show your IP only see the remote host and don't know about your tunnel/proxy.

Edit:

If a website really wants to know your IP, there might be some possibilities via javascript or plugins within a web page, which execute code within your browser, perhaps using some bugs. If you absolutely need to hide your real IP address you have to disable javascript and plugins.

Edit2:

Added VPN
