Server replies to TCP SYN packets with delay

tcp

I have the following network topology:
[Figure: workstation and server network topology]

When the workstation connects to the HTTPS service on the server, the server usually sends its SYN+ACK packet after a delay of about 60 seconds. A packet capture taken on the server is shown below:

10:15:21.310878 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38256 > 10.10.10.16.443: S 3411497795:3411497795(0) win 29200 <mss 1460,sackOK,timestamp 2503046494 0,nop,wscale 7>
10:15:23.102826 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38244 > 10.10.10.16.443: S 3008273869:3008273869(0) win 29200 <mss 1460,sackOK,timestamp 2503046942 0,nop,wscale 7>
10:15:23.326801 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38256 > 10.10.10.16.443: S 3411497795:3411497795(0) win 29200 <mss 1460,sackOK,timestamp 2503046998 0,nop,wscale 7>
10:15:27.230802 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38244 > 10.10.10.16.443: S 3008273869:3008273869(0) win 29200 <mss 1460,sackOK,timestamp 2503047974 0,nop,wscale 7>
10:15:27.486804 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38256 > 10.10.10.16.443: S 3411497795:3411497795(0) win 29200 <mss 1460,sackOK,timestamp 2503048038 0,nop,wscale 7>
10:15:35.422853 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38244 > 10.10.10.16.443: S 3008273869:3008273869(0) win 29200 <mss 1460,sackOK,timestamp 2503050022 0,nop,wscale 7>
10:15:35.678797 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38256 > 10.10.10.16.443: S 3411497795:3411497795(0) win 29200 <mss 1460,sackOK,timestamp 2503050086 0,nop,wscale 7>
10:15:51.550815 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38244 > 10.10.10.16.443: S 3008273869:3008273869(0) win 29200 <mss 1460,sackOK,timestamp 2503054054 0,nop,wscale 7>
10:15:51.806784 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38256 > 10.10.10.16.443: S 3411497795:3411497795(0) win 29200 <mss 1460,sackOK,timestamp 2503054118 0,nop,wscale 7>
10:16:24.062769 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38256 > 10.10.10.16.443: S 3411497795:3411497795(0) win 29200 <mss 1460,sackOK,timestamp 2503062182 0,nop,wscale 7>
10:16:24.062832 00:11:25:8c:7a:1a > 1c:87:2c:5a:43:e2, ethertype IPv4 (0x0800), length 74: 10.10.10.16.443 > 10.10.10.160.38256: S 561747608:561747608(0) ack 3411497796 win 5792 <mss 1460,sackOK,timestamp 3558683637 2503062182,nop,wscale 2>
10:16:24.062843 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 74: 10.10.10.160.38244 > 10.10.10.16.443: S 3008273869:3008273869(0) win 29200 <mss 1460,sackOK,timestamp 2503062182 0,nop,wscale 7>
10:16:24.062860 00:11:25:8c:7a:1a > 1c:87:2c:5a:43:e2, ethertype IPv4 (0x0800), length 74: 10.10.10.16.443 > 10.10.10.160.38244: S 562554685:562554685(0) ack 3008273870 win 5792 <mss 1460,sackOK,timestamp 3558683637 2503062182,nop,wscale 2>
10:16:24.063075 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 66: 10.10.10.160.38256 > 10.10.10.16.443: . ack 1 win 229 <nop,nop,timestamp 2503062182 3558683637>
10:16:24.063116 00:19:e2:9e:df:f0 > 00:11:25:8c:7a:1a, ethertype IPv4 (0x0800), length 66: 10.10.10.160.38244 > 10.10.10.16.443: . ack 1 win 229 <nop,nop,timestamp 2503062182 3558683637>

To rule out any ARP-related issues, I installed a static ARP entry for the workstation on the server:

# ip neigh show 10.10.10.160                               
10.10.10.160 dev eth0 lladdr 1c:87:2c:5a:43:e2 PERMANENT                      
# 

I am also able to ping 10.10.10.160 from 10.10.10.16 at all times. For example, I had while :; do ping -c 1 -I 10.10.10.16 10.10.10.160 &>/dev/null || date; sleep 2; done running on the server all day and not a single ping failed.

Finally, when I compare in Wireshark the TCP SYN packet sent by the client at 10:15:51.806784 (which does not receive a SYN+ACK from the server) with the one sent at 10:16:24.062769 (which does receive a SYN+ACK), they are identical apart from the checksums.

Also, the server-side firewall is configured so that the first rule of the INPUT chain logs TCP SYN packets from 10.10.10.160 (iptables -I INPUT -s 10.10.10.160 -d 10.10.10.16 -p tcp --syn --dport 443 -j LOG) and the second rule accepts all traffic from 10.10.10.160. For example, lines like the following are logged to the kernel ring buffer:

IN=eth0 OUT= MAC=00:11:25:8c:7a:1a:00:19:e2:9e:df:f0:08:00 SRC=10.10.10.160 DST=10.10.10.16 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=65477 DF PROTO=TCP SPT=40066 DPT=443 WINDOW=29200 RES=0x00 SYN URGP=0

As I already said, these packets are accepted by the next rule. This should rule out any tc/netfilter-related issues.

Other clients (for example 10.10.10.170) work fine.

What could cause such behavior?

Best Answer

I see one major issue here: the replies from your server don't take the same path as the packets going to it.

Your workstation is using your router 10.10.10.190 to reach the server through its 10.10.10.16/32 address (/32? your drawing also says /28) instead of using the server's 10.10.10.148 address, which is on the same LAN segment as the WS.

However, the TCP packets going from the server to the WS don't use the router since the server can reach the WS directly.
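A quick way to check this on the server is to ask the kernel which route it would choose for such a reply. This is only a diagnostic sketch using the addresses from the question; the output depends on the server's actual routing table.

```shell
# Run on the server. Ask the kernel which route it would pick for a packet
# sourced from 10.10.10.16 and destined for the workstation.
ip route get 10.10.10.160 from 10.10.10.16

# Show which link-layer address the server uses for the workstation.
ip neigh show 10.10.10.160
```

If the first command prints no "via 10.10.10.190" hop, the server is sending its replies straight to the workstation rather than back through the router.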

Why does it matter?

The consequence is that your router doesn't see the replies from your server and has a wrong idea of the connection state (although the server has replied with a SYN+ACK, from the router's viewpoint the connection is still at the initial SYN).

Like most of today's routers, it probably blocks any subsequent TCP packets going from the WS to the server until it sees that SYN+ACK from the server (which will never happen).

Thus, your actual problem is probably not that your server waits 60 seconds before sending that SYN+ACK but that your router blocks the TCP traffic from your WS to the server after the initial SYN.

Why that traffic dump, then?

If my theory is correct, the traffic dump you posted in your question is deceiving because we don't have the full dump:

  • the server does not reply to the later SYN requests because it has already replied to the first one and treats the rest as duplicates
  • what you see at 10:16:24.062832 and at 10:16:24.062860 is probably the server sending its SYN+ACK reply again after a certain delay without receiving anything from the WS
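The asymmetric path can also be checked against any capture taken with link-level headers (tcpdump -e), by comparing the MAC addresses seen in each IP direction. The helper below is a minimal sketch: the function name summarize_macs is made up for illustration, and it assumes the line layout shown in the question's capture.

```shell
# Summarize "tcpdump -e -n" text output read from stdin as one line per
# IP direction and MAC pair, so an asymmetric path shows up as differing
# peer MAC addresses. Hypothetical helper, not part of tcpdump.
summarize_macs() {
    awk '$3 == ">" {
        src_mac = $2
        dst_mac = $4
        sub(/,$/, "", dst_mac)                 # drop trailing comma
        # Find the IP-level "addr.port > addr.port:" triple.
        for (i = 5; i < NF; i++) {
            if ($i == ">") {
                src_ip = $(i - 1)
                dst_ip = $(i + 1)
                sub(/\.[0-9]+$/, "", src_ip)   # strip source port
                sub(/\.[0-9]+:$/, "", dst_ip)  # strip destination port and colon
                print src_ip " -> " dst_ip ": " src_mac " -> " dst_mac
                break
            }
        }
    }' | LC_ALL=C sort -u
}
```

Fed with the capture from the question, this should report that packets from 10.10.10.160 arrive from 00:19:e2:9e:df:f0 (the router), while the server's packets to 10.10.10.160 go to 1c:87:2c:5a:43:e2 (the workstation's own MAC from the static ARP entry), which is consistent with the theory above.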

How to fix that?

You have several options:

  • Reach the server directly through its 10.10.10.148 IP address (not a fix, actually)
  • Remove that 10.10.10.148 IP address from the server (not an option I guess)
  • Disable the firewall's connection tracking on the router (not an option I guess, and not desirable anyway)
  • Put the router's MAC address 00:19:e2:9e:df:f0 in the server's ARP table for 10.10.10.160 (an ugly hack IMHO, and you will end up with a similar issue when reaching the server directly through its 10.10.10.148 IP address, since the SYN packets won't use the router but the server's replies will)
  • Use source-based routing (policy routing) to tell the server to use the router when the source address of the outgoing packets is 10.10.10.16, whatever the destination address

Of course, the options that are actually not real options are given so that you can experiment and validate my theory. Source-based routing is what you should do.
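For reference, source-based routing on a Linux server could look roughly like the following. This is a sketch under assumptions that are not in the question: routing table number 100 is unused, eth0 is the interface that holds 10.10.10.16, and the router 10.10.10.190 is directly reachable through it. Adapt these to the real topology before using them.

```shell
# Run as root on the server.
# Assumptions (not from the original post): table 100 is free, eth0 carries
# 10.10.10.16, and 10.10.10.190 is on-link for that interface.

# Dedicated routing table: send everything via the router.
ip route add default via 10.10.10.190 dev eth0 table 100

# Use that table for any packet whose source address is 10.10.10.16.
ip rule add from 10.10.10.16 lookup 100

# Verify the result.
ip rule show
ip route show table 100
```

These commands do not survive a reboot; making them persistent depends on the distribution's network configuration, which the question does not specify.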
