Ssh – Why doesn't wget work through an ssh tunnel? What does the proxy prevent the ssh client from doing?

Tags: http, proxy, ssh, ssh-tunneling, wget

To test things I'm trying to establish a local ssh tunnel from my laptop to a site via ssh-server, and download a page of the site (or view it in a browser).

The tunnel is made like this:

$ ssh -L 9999:www.gnu.org:80 ssh-server

I check that the tunnel works with the nc program and the ~# escape sequence in the ssh session to the server. I check that I'm allowed to make HTTP requests from ssh-server by running wget and lynx on the server; they both run without errors.

But when I run wget --no-proxy localhost:9999 on the laptop, I get a 403 error.

I can do the same thing using ssh ssh-server 'wget -O - http://www.gnu.org/' >> whatever. But why doesn't the tunnel work?

So I want to clear up what is going on and what kind of things the proxy does not allow.

My guess is that the proxy specifically prevents the ssh client from making HTTP requests. Is that so?

If so, how does the proxy distinguish the ssh client from other programs? Can it tell a request sent by the ssh client apart from a request sent by another program?

And what are common ways of "masking the ssh client as another program" (or other ways to get past the proxy)?

PS
It would be awesome if someone wrote the address of a free ssh-server open for testing ssh-tunneling and other stuff in the comments. (Usually ssh-servers don't allow tunneling for free.)

Best Answer

You haven't set up (or tried to use) an HTTP proxy, nor an ssh tunnel. Instead, you used port forwarding over ssh.

Forwarding a TCP port does not work for HTTP on its own. Visiting an HTTP URL uses the domain of the URL at two different points: (1) to find the IP address to send messages to, and (2) for the Host header in the HTTP message. The Host header lets one IP address serve websites for multiple domains.
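As a rough illustration of the second point, the sketch below prints the request a simple client would put on the wire for a given URL. request_for is a hypothetical helper written for this example, not part of any tool, and it only handles plain http://host[:port]/path URLs.

```shell
# Hypothetical helper: print the HTTP/1.1 request a simple client would send.
# Simplification: only understands http://host[:port]/path URLs.
request_for() {
  rest=${1#http://}        # drop the scheme
  authority=${rest%%/*}    # host[:port]: used to pick the server AND copied into Host
  case "$rest" in
    */*) path=/${rest#*/} ;;
    *)   path=/ ;;
  esac
  printf 'GET %s HTTP/1.1\r\nHost: %s\r\n\r\n' "$path" "$authority"
}

request_for http://www.gnu.org/        # Host: www.gnu.org
request_for http://localhost:9999/     # Host: localhost:9999
```

The forwarded port changes where the bytes go, but the Host line still carries whatever name was in the URL.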

So when you visit http://localhost:9999/, the HTTP message includes a header line Host: localhost:9999. The GNU webserver doesn't serve a website called localhost:9999, and denies access (403).
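If you want to keep the plain port forward, one possible workaround is to send the request to the local port but override the Host header by hand. This is a minimal sketch, assuming the forward from the question (local port 9999 to www.gnu.org:80) is already running; fetch_via_forward is just an illustrative name.

```shell
# Assumes this is already running in another terminal (from the question):
#   ssh -L 9999:www.gnu.org:80 ssh-server
fetch_via_forward() {
  # Connect to the local end of the forward, but name the real site in Host.
  wget --no-proxy -O - --header='Host: www.gnu.org' http://localhost:9999/
}
```

Calling fetch_via_forward > page.html would then save the response. If the site answers with a redirect to another URL (for example to HTTPS), wget will follow it directly rather than through the forward, so this only helps for sites that answer on the forwarded port itself.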

(The 403 is spec-legal. In theory, 403 is a bit unkind and you should prefer 400 with a message. Personally I used 403 on my trivial DynDNS site. Not for security as per the spec, but because FORBIDDEN is such a nice strong signal for troubleshooting. Hopefully it's even strong enough to dispel the impression that my web server had intercepted theirs (in case of an out-of-date DNS cache, for example)).

The convenient approach is to use the SSH "dynamic port forwarding" option -D, which sets up a SOCKS proxy. Unfortunately wget doesn't have an option for a SOCKS proxy (curl does though).
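A minimal sketch of the SOCKS route with curl, under some assumptions: ssh-server is the host from the question, 1080 is an arbitrary free local port, and the two function names are made up for this example.

```shell
# Assumptions: ssh-server is the host from the question; 1080 is any free local port.
start_socks_proxy() {
  # -D opens a local SOCKS listener; -N runs no remote command;
  # -f puts ssh in the background after authentication.
  ssh -f -N -D 1080 ssh-server
}

fetch_via_socks() {
  # --socks5-hostname lets the proxy side resolve the name,
  # so the DNS lookup also happens on ssh-server.
  curl --socks5-hostname localhost:1080 "$1"
}
```

After start_socks_proxy, something like fetch_via_socks http://www.gnu.org/ should send the request with the real host name in the URL, so the Host header is correct without any manual override.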
