Netcat is not a specialized HTTP client. For Netcat, connecting through a proxy server therefore means creating a TCP connection through that server, which is why it expects a SOCKS or HTTPS proxy, given with the -x argument and spoken to in the protocol selected by -X:
-X proxy_protocol
Requests that nc should use the specified protocol when talking
to the proxy server. Supported protocols are “4” (SOCKS v.4),
“5” (SOCKS v.5) and “connect” (HTTPS proxy). If the protocol is
not specified, SOCKS version 5 is used.
connect specifies a method for creating SSL (HTTPS) connections through a proxy server. Since the proxy is not the other endpoint and the connection is encrypted end to end, a CONNECT request lets you tunnel a point-to-point connection through an HTTP proxy (if the proxy allows it). (I might be glossing over details here, but they are not the important point anyway; look up "HTTP CONNECT tunneling" for the details.)
So, to connect to your webserver using a proxy, you'll have to do what the web browser would do - talk to the proxy:
$ nc squid-proxy 3128
GET http://webserver/sample HTTP/1.0
(That question has similarities to this one; I don't know if proxychains is of use here.)
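The same exchange with the proxy can be sketched in Python. This is only an illustration of the request shape, not a full HTTP client; the function names are made up for the example, and squid-proxy:3128 and the URL are simply the placeholders from the nc example above.

```python
import socket


def build_proxy_get(url, http_version="HTTP/1.0"):
    """Return the bytes a client sends to an HTTP proxy for a plain GET."""
    # A request to a proxy carries the absolute URL on the request line;
    # a request sent directly to the web server carries only the path.
    return f"GET {url} {http_version}\r\n\r\n".encode("ascii")


def fetch_via_proxy(proxy_host, proxy_port, url, timeout=10.0):
    """Send the GET to the proxy and return the raw response bytes."""
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(build_proxy_get(url))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the peer closed the connection: response complete
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch_via_proxy("squid-proxy", 3128, "http://webserver/sample") would return status line, headers and body as one byte string, which is what nc prints to the terminal.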
Addendum
A browser using an ordinary HTTP proxy, e.g. Squid (as I know it), does more or less what the example illustrates, as Netcat can show you. After starting nc as a listener, I configured Firefox to use 127.0.0.1 port 8080 as its proxy and tried to open Google; this is the output (minus a cookie):
$ nc -l 8080
GET http://google.com/ HTTP/1.1
Host: google.com
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
DNT: 1
Proxy-Connection: keep-alive
By sending the same kind of request yourself, you can use Netcat to access an HTTP server through the HTTP proxy.
Now, what should happen if you try to access an HTTPS webserver? The browser surely should not reveal the traffic to anyone in the middle, so a direct connection is needed, and this is where CONNECT comes into play. When I start nc -l 8080 again and try to access, say, https://google.com with the proxy set to 127.0.0.1:8080, this is what comes out:
CONNECT google.com:443 HTTP/1.1
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2
Proxy-Connection: keep-alive
Host: google.com
You see, the CONNECT request asks the server for a direct connection to google.com, port 443 (https). Now, what does the following command do?
$ nc -X connect -x 127.0.0.1:8080 google.com 443
The output from the nc -l 8080 instance:
CONNECT google.com:443 HTTP/1.0
So it creates the direct connection in the same way. However, as this can of course be exploited for almost anything (using corkscrew, for example), CONNECT requests are usually restricted to the obvious ports only.
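The CONNECT exchange shown above can be sketched in Python as well. This is a rough illustration under assumptions, not how nc is implemented: the function names are invented for the example, the request is the bare request line that the listener printed, and any 2xx reply from the proxy is treated as success.

```python
import socket


def build_connect_request(host, port):
    """Return the request that asks a proxy for a tunnel to host:port."""
    return f"CONNECT {host}:{port} HTTP/1.0\r\n\r\n".encode("ascii")


def parse_status_code(response_head):
    """Extract the numeric status from the proxy's reply."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1].decode("ascii"))


def open_tunnel(proxy_host, proxy_port, host, port, timeout=10.0):
    """Return a socket that is tunneled through the proxy to host:port."""
    sock = socket.create_connection((proxy_host, proxy_port), timeout=timeout)
    try:
        sock.sendall(build_connect_request(host, port))
        head = b""
        # Read one byte at a time so no tunneled data is consumed by accident.
        while not head.endswith(b"\r\n\r\n"):
            data = sock.recv(1)
            if not data:
                raise ConnectionError("proxy closed the connection")
            head += data
        status = parse_status_code(head)
        if not 200 <= status < 300:
            raise ConnectionError(f"proxy refused CONNECT with status {status}")
    except Exception:
        sock.close()
        raise
    return sock
```

Once open_tunnel returns, everything written to the socket goes to the target unchanged, so a TLS handshake (for example via ssl.SSLContext.wrap_socket) can run over it without the proxy seeing the plaintext.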
Simple core command-line tools like nc and socat do not seem able to handle the HTTP-specific things going on (chunks, transfer encodings, etc.). As a result, they may behave unexpectedly compared with a real web server. So my first thought is to share the quickest way I know of setting up a tiny web server and making it do just what you want: dump every request it receives.
The shortest I could come up with using Python Tornado:
#!/usr/bin/env python
import pprint

import tornado.ioloop
import tornado.web


class MyDumpHandler(tornado.web.RequestHandler):
    def post(self):
        # Dump the parsed request object, then the raw body
        pprint.pprint(self.request)
        pprint.pprint(self.request.body)


if __name__ == "__main__":
    # Route every path to the dump handler and listen on port 8080
    tornado.web.Application([(r"/.*", MyDumpHandler)]).listen(8080)
    tornado.ioloop.IOLoop.current().start()
Replace the pprint lines to output only the specific fields you need, for example self.request.body or self.request.headers. The example above listens on port 8080, on all interfaces.
Alternatives to this are plenty. web.py, Bottle, etc.
(I'm quite Python oriented, sorry)
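One more alternative needs nothing beyond the standard library. The following is a minimal sketch using Python 3's http.server; DumpHandler, format_dump and serve are names made up for this example, and it only reads bodies announced with Content-Length, so chunked request bodies are not decoded.

```python
import http.server


def format_dump(requestline, headers, body):
    """Render one request as text: request line, headers, blank line, body."""
    lines = [requestline]
    lines += [f"{name}: {value}" for name, value in headers]
    lines += ["", repr(body)]
    return "\n".join(lines)


class DumpHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly as many bytes as the client announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(format_dump(self.requestline, self.headers.items(), body))
        # Answer with an empty 200 so the client does not hang.
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()


def serve(port=8080):
    """Listen on all interfaces until interrupted."""
    http.server.HTTPServer(("", port), DumpHandler).serve_forever()
```

Calling serve() starts the listener on port 8080; every POST is then printed to the terminal with its request line, headers and body.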
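If you don't like its way of outputting, just run it anyway and try tcpdump like this:
tcpdump -i lo 'tcp[32:4] = 0x504f5354'
to see a real raw dump of all HTTP POST requests (add -A to print the payload as ASCII). 0x504f5354 is "POST" in ASCII, and the offset 32 assumes a 32-byte TCP header, i.e. 20 bytes plus the timestamp option. Alternatively, just run Wireshark.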
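The constant in a filter of this kind is the ASCII bytes of the token, read as a big-endian integer; tcpdump compares 1, 2 or 4 bytes at a time, so the token has to be exactly four bytes for a tcp[offset:4] test. A small sketch for working out the value for other methods (tcpdump_word is a hypothetical helper, not part of any tool):

```python
def tcpdump_word(token):
    """Return the hex constant for a 4-byte ASCII token in a tcp[offset:4] test."""
    data = token.encode("ascii")
    if len(data) != 4:
        raise ValueError("a tcp[offset:4] comparison needs exactly 4 bytes")
    return hex(int.from_bytes(data, "big"))


print(tcpdump_word("POST"))  # 0x504f5354
print(tcpdump_word("GET "))  # 0x47455420 (note the trailing space)
```

Shorter method names such as GET are padded with the space that follows them on the request line.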
Best Answer
Both Perl and Python (and probably Ruby as well) have simple kits that you can use to quickly build simple HTTP proxies.
In Perl, use HTTP::Proxy. Here's the 3-line example from the documentation. Add filters to filter, log or rewrite requests or responses; see the documentation for examples.
In Python, use SimpleHTTPServer (http.server in Python 3). Here's some sample code lightly adapted from effbot. Adapt the do_GET method (or others) to filter, log or rewrite requests or responses.
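To give an idea of the shape such a proxy takes, here is a minimal sketch for Python 3. It is not the effbot code; it is written from scratch for illustration, and LoggingProxy, is_proxyable and serve are invented names. It handles plain http:// GET requests only, buffers the whole response in memory, and lets urllib follow redirects, which a real proxy would not do.

```python
import http.server
import urllib.error
import urllib.request


def is_proxyable(target):
    """Accept only absolute http:// URLs, the form a proxy client sends."""
    return target.startswith("http://")


class LoggingProxy(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if not is_proxyable(self.path):
            self.send_error(400, "expected an absolute http:// URL")
            return
        # Log step: extend this to filter or rewrite the request.
        print("GET", self.path)
        # An empty ProxyHandler makes urllib fetch directly instead of
        # going through a proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        try:
            upstream = opener.open(self.path, timeout=10)
            status = upstream.status
        except urllib.error.HTTPError as err:
            upstream, status = err, err.code  # 4xx/5xx replies still carry a body
        except OSError:  # URLError and socket errors
            self.send_error(502, "upstream fetch failed")
            return
        try:
            body = upstream.read()
            content_type = upstream.headers.get(
                "Content-Type", "application/octet-stream")
        finally:
            upstream.close()
        # Relay step: rewrite `body` here to modify the response.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the proxy on localhost until interrupted."""
    http.server.HTTPServer(("127.0.0.1", port), LoggingProxy).serve_forever()
```

After calling serve(), pointing a browser's HTTP proxy setting at 127.0.0.1:8080 should make each plain HTTP page request show up as a "GET http://..." line on the terminal.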