Ssh – What’s the difference between SSH and Squid when using them as proxies


I've been using Squid on one of my servers as a transparent proxy for a very long time (years).

Basically from the client I was creating an SSH tunnel between my client and the ssh + Squid server doing this:

ssh -T -N -x -C -L3128:127.0.0.1:3128 cedric@xxx

and then I'd launch my web browser (chromium) giving it a proxy-server at 127.0.0.1 on port 3128.

This worked fine for a very long time, but then I started having login issues on many websites (forums, stackexchange using Google as the login authority, etc.). Some kept working fine (like GMail and all the GMail services). I don't know what caused the login problems: I think it's due to some Squid configuration problem introduced when I upgraded the server's OS and software.

So I considered setting up a VPN but then by reading an article called "SSH as a poor man's VPN", I realized that I could use SSH and simply do this, from the client:

ssh -D 5222 cedric@xxx -N

And then configure my browser to use 127.0.0.1 as a SOCKS host on port 5222.

And now everything works fine: all the login problems seem to be solved and I don't even need Squid anymore on the server.

However, I don't understand how it works. From the various "what is my ip" websites, I see the address of my server (which is what I want). Also, these sites do not seem to detect that there's an SSH "tunnel" going on.

Basically my question is: what is, technically, the difference between using Squid as a transparent proxy and using "SSH -D … -N" and using a VPN?

And from the viewpoint of the websites that I'm visiting, is there any difference?

Also I'd like to know if there's any way that a website I'm visiting would be able to detect my real IP when using "SSH -D … -N"? Is this technically feasible or is that information simply not available to the browser?

Best Answer

1) Squid is a specialized proxy for HTTP and HTTPS traffic. Because it understands these protocols, it can offer advanced features such as caching, filtering, rewrite rules, and DNS resolution. All web browsers know how to handle HTTP proxies, as proxy support is part of the protocol spec.

Additionally Squid can act as a transparent proxy where you don't configure your web browser to use it but a firewall redirects the traffic to the proxy. The browser doesn't know it's using a proxy which is the meaning of transparent in this scenario. Your old setup was not transparent in this sense.

2) ssh -D acts as a SOCKS proxy, which is protocol agnostic. It takes the traffic arriving on the SOCKS port on the local side of the tunnel, extracts the payload, sends it unmodified and unchecked to the remote side, and from there forwards it to the real destination. For a SOCKS proxy to work, the client program has to support it explicitly: at the start of each connection, the client has to tell the SOCKS proxy which destination to connect to. I don't know if all web browsers support it, but in principle it works with many protocols, not just HTTP.
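To make the "client has to support it" part concrete, here is a minimal Python sketch of the bytes a SOCKS5 client sends before any payload, following the message layout in RFC 1928. The function names are made up for this illustration, and it only covers the no-authentication, domain-name case; it is not taken from any browser or from ssh itself.

```python
import struct


def socks5_greeting() -> bytes:
    """First message: version 5, one auth method offered, 0x00 = no authentication."""
    return b"\x05\x01\x00"


def socks5_connect_request(host: str, port: int) -> bytes:
    """Second message: ask the proxy to open a TCP connection to host:port.

    The host is sent as a name (address type 3), so the proxy side
    resolves it rather than the client.
    """
    name = host.encode("ascii")
    if not 1 <= len(name) <= 255:
        raise ValueError("host name must be 1..255 bytes")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name),
    # then a length byte, the name, and the port in network byte order.
    return b"\x05\x01\x00\x03" + bytes([len(name)]) + name + struct.pack("!H", port)


if __name__ == "__main__":
    print(socks5_connect_request("example.com", 443).hex())
```

Once the proxy has accepted this request, the client sends its normal protocol (HTTP, TLS, or anything else over TCP) on the same connection.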

3) Your old command line ssh -T -N -x -C -L3128:127.0.0.1:3128 ... only opened one simple tunnel, in which all traffic coming in on local port 3128 was sent unmodified to the predefined destination 127.0.0.1:3128 (as seen from the server). You can use this with any client, since the client does not have to change what it sends, but you get only one fixed destination.
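As a rough illustration of this fixed-destination idea, a plain TCP forwarder can be sketched in Python. This is not how ssh implements -L: ssh additionally encrypts the data and carries it to the remote host before connecting to the destination. The names start_forwarder and pipe are invented for this sketch, and it does no cleanup beyond passing end-of-stream along.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src reports end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # one side went away; stop relaying
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream on
        except OSError:
            pass


def start_forwarder(dest_host, dest_port, listen_port=0):
    """Listen on 127.0.0.1 and relay every connection to one fixed destination.

    Returns the local port in use (listen_port=0 lets the OS pick one).
    """
    server = socket.create_server(("127.0.0.1", listen_port))

    def accept_loop():
        while True:
            client, _ = server.accept()
            try:
                upstream = socket.create_connection((dest_host, dest_port))
            except OSError:
                client.close()
                continue
            # One thread per direction; the payload is never inspected or changed.
            threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
            threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server.getsockname()[1]
```

The client never says where it wants to go; the destination is fixed when the forwarder is started, which is why any program can use such a tunnel.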

In addition to proxying, ssh always uses an encrypted tunnel, so that your data cannot be snooped on while traveling through the tunnel.

Your old setup combined method 1 and 3, but in principle they could work alone, depending on circumstances.

4) With VPN you also create an encrypted tunnel from your local host to your "proxy" server. In this case it's not limited to one client program, but your operating system would send all non-local network traffic to your proxy which then forwards it to the destination. To manage this you need root/administrator permission, so it might not be as easy to set up.

In all four cases, data is first sent to a remote host which then creates the real connection to the destination web server and sends your request. The answer first gets back to the remote host which looks up the real destination in some internal data and then sends it back to the original client. Websites that show your IP only see the remote host and don't know about your tunnel/proxy.

Edit:

If a website really wants to know your IP, there might be some possibilities via javascript or plugins within a web page, which execute code within your browser, perhaps using some bugs. If you absolutely need to hide your real IP address you have to disable javascript and plugins.

Edit2:

Added VPN

Related Solutions

How to make netcat use an existing HTTP proxy

Netcat is not a specialized HTTP client. Connecting through a proxy server for Netcat thus means creating a TCP connection through the server, which is why it expects a SOCKS or HTTPS proxy with the -x argument, specified by -X:

 -X proxy_protocol
         Requests that nc should use the specified protocol when talking
         to the proxy server.  Supported protocols are “4” (SOCKS v.4),
         “5” (SOCKS v.5) and “connect” (HTTPS proxy).  If the protocol is
         not specified, SOCKS version 5 is used.

connect specifies a method for creating SSL (HTTPS) connections through a proxy server. Since the proxy is not the other end point and the connection is endpoint-wise encrypted, a CONNECT request allows you to tunnel a point-to-point connection through an HTTP Proxy (if it is allowed). (I might be glossing over details here, but it's not the important point anyway; details on "HTTP CONNECT tunneling" here)

So, to connect to your webserver using a proxy, you'll have to do what the web browser would do - talk to the proxy:

$ nc squid-proxy 3128
GET http://webserver/sample HTTP/1.0

(That question has similarities to this one; I don't know if proxychain is of use here.)

Addendum: A browser using an ordinary HTTP proxy, e.g. Squid (as I know it), does more or less what the example illustrates, as Netcat can show you. After starting nc -l 8080, I configured Firefox to use 127.0.0.1 port 8080 as proxy and tried to open Google; this is what was output (minus a cookie):

$ nc -l 8080
GET http://google.com/ HTTP/1.1
Host: google.com
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
DNT: 1
Proxy-Connection: keep-alive

By behaving this way, too, you can use Netcat to access an HTTP server through the HTTP proxy. Now, what should happen if you try to access an HTTPS webserver? The browser surely should not reveal the traffic to anyone in the middle, so a direct connection is needed; and this is where CONNECT comes into play. When I again start nc -l 8080 and try to access, say, https://google.com with the proxy set to 127.0.0.1:8080, this is what comes out:

CONNECT google.com:443 HTTP/1.1
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2
Proxy-Connection: keep-alive
Host: google.com

You see, the CONNECT request asks the proxy server for a direct connection to google.com, port 443 (https). Now, what does this request do?

$ nc -X connect -x 127.0.0.1:8080 google.com 443

The output from the nc -l 8080 instance:

CONNECT google.com:443 HTTP/1.0

So it uses the same way to create a direct connection. However, as this can of course be exploited for almost anything (using for example corkscrew), CONNECT requests are usually restricted to the obvious ports only.
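For reference, the client side of this exchange can be sketched in Python. The helper names below are made up for illustration; the request format mirrors the captures above, and treating any 2xx status as "tunnel established" follows the HTTP specification's description of CONNECT.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build the request a client sends to an HTTP proxy to open a tunnel."""
    target = f"{host}:{port}"
    lines = [f"CONNECT {target} HTTP/1.1", f"Host: {target}", "", ""]
    # Header lines end with CRLF; an empty line ends the request.
    return "\r\n".join(lines).encode("ascii")


def tunnel_established(status_line: str) -> bool:
    """Return True if the proxy's status line reports success (any 2xx code)."""
    parts = status_line.split(None, 2)
    return (
        len(parts) >= 2
        and parts[0].startswith("HTTP/")
        and parts[1].startswith("2")
    )


if __name__ == "__main__":
    print(build_connect_request("google.com", 443).decode("ascii"))
    print(tunnel_established("HTTP/1.1 200 Connection established"))
```

After a successful reply, the proxy just relays bytes in both directions, so the client can start its TLS handshake with the real server over the same connection.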

NAT with transparent proxy

Assuming that you want only to redirect traffic coming from eth0, not from localhost, and that your transparent proxy is running on your NAT server, you can do this for HTTP traffic with:

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-ports 3128

As HTTPS uses end-to-end encryption, you cannot log specific requests without modifications on the client side - otherwise man-in-the-middle attacks would be easy.
