How to make netcat use an existing HTTP proxy

Tags: netcat, proxy, squid

I can access a web page just fine by directly hitting my web server as follows:

$ echo "GET /sample" | nc web-server 80
This is contents of /sample...
$

Now, I would like netcat to go via a Squid HTTP proxy (listening on port 3128), much like I can configure my Firefox browser via its proxy preferences and have it go via an HTTP proxy.

I tried the following, but it did not work:

$ echo "GET /sample" | nc -x squid-proxy:3128 web-server 80
    <Seemed to be blocked FOREVER on input, so I killed it.>
<Ctrl-C>
$

Note: I'm using RHEL 5.3 version of netcat that has the following options:

$ nc --help
nc: invalid option -- -
usage: nc [-46DdhklnrStUuvzC] [-i interval] [-p source_port]
  [-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_version]
  [-x proxy_address[:port]] [hostname] [port[s]]

Excerpt from the man page of nc:

 EXAMPLES
    <snip>
 Connect to port 42 of host.example.com via an HTTP proxy at 10.2.3.4, port 8080. 
 This example could also be used by ssh(1); see the ProxyCommand directive in
 ssh_config(5) for more information.
       $ nc -x10.2.3.4:8080 -Xconnect host.example.com 42

Now, because mine is not an ssh/SSL use case, I'm not sure how to use the -x / -X options, or even whether I should be using them at all!

If there's more than one way to achieve the above goal (namely, routing netcat traffic via an HTTP proxy), then I would greatly appreciate if you could share them all.

Many thanks in advance.

Best Answer

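Netcat is not a specialized HTTP client. For Netcat, connecting through a proxy server means creating a raw TCP connection through that server. That is why the -x argument expects either a SOCKS proxy or an HTTP proxy that supports CONNECT (the man page calls this an "HTTPS proxy"), and the -X argument selects which protocol nc speaks to it: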

 -X proxy_protocol
         Requests that nc should use the specified protocol when talking
         to the proxy server.  Supported protocols are “4” (SOCKS v.4),
         “5” (SOCKS v.5) and “connect” (HTTPS proxy).  If the protocol is
         not specified, SOCKS version 5 is used.

Your command passed -x without -X, so nc presumably spoke SOCKS 5 to Squid, which would explain why it hung. connect refers to the HTTP CONNECT method, which is how SSL (HTTPS) connections are made through a proxy server. Because the proxy is not the other endpoint and the connection is encrypted end to end, a CONNECT request asks the HTTP proxy to tunnel a point-to-point connection (if the proxy allows it). (I might be glossing over details here, but it's not the important point anyway; details on "HTTP CONNECT tunneling" here)

So, to connect to your webserver using a proxy, you'll have to do what the web browser would do - talk to the proxy:

$ nc squid-proxy 3128
GET http://web-server/sample HTTP/1.0

(That question has similarities to this one; I don't know if proxychains is of use here.)
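
The interactive session above can also be scripted. The following is only a minimal sketch: proxy_get_request is a hypothetical helper name, and the host names web-server and squid-proxy are assumed from the question. It prints a request whose request line carries the full URL, which is the form an HTTP proxy expects, followed by the empty line that ends the request.

```shell
#!/bin/sh
# Print a minimal HTTP/1.0 request in the absolute-URL form that an HTTP
# proxy expects. proxy_get_request is a hypothetical helper name.
proxy_get_request() {
    # $1 = full URL of the resource, e.g. http://web-server/sample
    printf 'GET %s HTTP/1.0\r\n\r\n' "$1"
}

# Usage (host names assumed from the question; needs a reachable Squid):
#   proxy_get_request http://web-server/sample | nc squid-proxy 3128
```

Note that nc is pointed at the proxy itself here, with no -x or -X at all; the proxy learns the real destination from the URL in the request line.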

Addendum: A browser using an ordinary HTTP proxy, e.g. Squid (as I know it), does more or less what the example illustrates, as Netcat can show you. After starting the nc listener shown below, I configured Firefox to use 127.0.0.1 port 8080 as its proxy and tried to open google; this is the output (minus a cookie):

$ nc -l 8080
GET http://google.com/ HTTP/1.1
Host: google.com
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
DNT: 1
Proxy-Connection: keep-alive

By behaving the same way, you can use Netcat to access an HTTP server through the HTTP proxy. Now, what should happen if you try to access an HTTPS web server? The browser surely should not reveal the traffic to anyone in the middle, so a direct connection is needed, and this is where CONNECT comes into play. When I again start nc -l 8080 and try to access, say, https://google.com with the proxy set to 127.0.0.1:8080, this is what comes out:

CONNECT google.com:443 HTTP/1.1
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2
Proxy-Connection: keep-alive
Host: google.com

You see, the CONNECT request asks the proxy for a direct connection to google.com, port 443 (https). Now, what does the following command send?

$ nc -X connect -x 127.0.0.1:8080 google.com 443

The output from the nc -l 8080 instance:

CONNECT google.com:443 HTTP/1.0

So nc uses the same mechanism to create a direct connection. However, as this can of course be exploited for almost anything (using for example corkscrew), CONNECT requests are usually restricted to the obvious ports only.
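
For completeness, the same tunnel can in principle carry the plain HTTP request from the question. This is a sketch under assumptions, not a tested recipe: origin_get_request is a hypothetical helper name, the host names come from the question, and it only works if the Squid configuration permits CONNECT to port 80, which, as noted above, is often not the case.

```shell
#!/bin/sh
# Print a minimal HTTP/1.0 request in origin form (path only), as sent to
# the web server itself. origin_get_request is a hypothetical helper name.
origin_get_request() {
    # $1 = path on the server, $2 = host name for the Host header
    printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "$1" "$2"
}

# Usage (assumes the proxy allows CONNECT to port 80):
#   origin_get_request /sample web-server | nc -X connect -x squid-proxy:3128 web-server 80
```

Because the tunnel ends at the web server rather than at the proxy, the request line carries only the path here, unlike a request sent to the proxy directly.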
