Trace 1 starts out as a typical connection, SYN,SYN-ACK,ACK (frames 2595, 2596, and 2603), followed by data (frames 2607, 2678 and 2680), followed by FIN, ACK, FIN-ACK, and ACK (frames, 2711, 2722, 2724 and 2726). That is a lot of overhead for 182 bytes of data but that is another story.
Host 172.16.1.14 then sends a new SYN (frame 3227) to start a new connection but it uses the same local port number as the previous connection, 1287. I know it's a new connection because the sequence number of the new SYN is 3272303794 while the last sequence number from the previous connection was 3566092853. The server however, ACKs the old sequence number using its sequence number from the previous connection 2132478809. It obviously thinks the old connection is still active.
Host 172.16.1.14 tries again (frame 4898) with the same result (4901) and then one more time (frame 8470) but this time the server sends a SYN-ACK with a new sequence number (101055931) and ACKs the client's new sequence number.
No. Time Source Destination Protocol Info 2595 383.186090 172.16.1.14 172.16.1.110 1287 > 38094 [SYN] Seq=3566092683 Win=65535 Len=0 MSS=1460 2596 383.186648 172.16.1.110 172.16.1.14 8094 > 1287 [SYN, ACK] Seq=2132478615 Ack=3566092684 Win=8192 Len=0 MSS=1460 2603 383.229113 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3566092684 Ack=2132478616 Win=65535 Len=0 2607 383.232835 172.16.1.110 172.16.1.14 38094 > 1287 [PSH, ACK] Seq=2132478616 Ack=3566092853 Win=8192 Len=30 2678 383.426600 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3566092853 Ack=2132478646 Win=65505 Len=0 2680 383.426698 172.16.1.110 172.16.1.14 38094 > 1287 [PSH, ACK] Seq=2132478646 Ack=3566092853 Win=8192 Len=162 2711 383.486675 172.16.1.110 172.16.1.14 38094 > 1287 [FIN, ACK] Seq=2132478808 Ack=3566092853 Win=8192 Len=0 2722 383.529177 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3566092853 Ack=2132478809 Win=65343 Len=0 2724 383.529765 172.16.1.14 172.16.1.110 1287 > 38094 [FIN, ACK] Seq=3566092853 Ack=2132478809 Win=65343 Len=0 2726 383.529912 172.16.1.110 172.16.1.14 38094 > 1287 [ACK] Seq=2132478809 Ack=3566092854 Win=8192 Len=0 3227 437.570535 172.16.1.14 172.16.1.110 [TCP Port numbers reused] 1287 > 38094 [SYN] Seq=3272303794 Win=65535 Len=0 MSS=1460 3228 54.040850 172.16.1.110 172.16.1.14 [TCP ACKed lost segment] 38094 > 1287 [ACK] Seq=2132478809 Ack=3566092854 Win=8192 Len=0 4898 440.436689 172.16.1.14 172.16.1.110 1287 > 38094 [SYN] Seq=3272303794 Win=65535 Len=0 MSS=1460 4901 440.436866 172.16.1.110 172.16.1.14 [TCP Dup ACK 283228#1] 38094 > 1287 [ACK] Seq=2132478809 Ack=3566092854 Win=8192 Len=0 8470 446.463857 172.16.1.14 172.16.1.110 1287 > 38094 [SYN] Seq=3272303794 Win=65535 Len=0 MSS=1460 8471 446.464413 172.16.1.110 172.16.1.14 38094 > 1287 [SYN, ACK] Seq=101055931 Ack=3272303795 Win=8192 Len=0 MSS=1460 8474 446.474257 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3272303795 Ack=101055932 Win=65535 Len=0 |
Trace 2 gives a hint as to what the server is doing. It is the same trace but with the time reset at the point where the client sends the FIN-ACK and completes closing the original connection. We can see that the SYN request in frame 3227 occurs 54 seconds after the previous connection was closed and the second request is sent 56.9 seconds after the close. The third request - the one that worked was sent almost 63 seconds after the close.
No. Time Source Destination Protocol Info 2595 383.186090 172.16.1.14 172.16.1.110 1287 > 38094 [SYN] Seq=3566092683 Win=65535 Len=0 MSS=1460 2596 383.186648 172.16.1.110 172.16.1.14 38094 > 1287 [SYN, ACK] Seq=2132478615 Ack=3566092684 Win=8192 Len=0 MSS=1460 2603 383.229113 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3566092684 Ack=2132478616 Win=65535 Len=0 2604 383.229148 172.16.1.14 172.16.1.110 1287 > 38094 [PSH, ACK] Seq=3566092684 Ack=2132478616 Win=65535 Len=169 2607 383.232835 172.16.1.110 172.16.1.14 38094 > 1287 [PSH, ACK] Seq=2132478616 Ack=3566092853 Win=8192 Len=30 2678 383.426600 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3566092853 Ack=2132478646 Win=65505 Len=0 2680 383.426698 172.16.1.110 172.16.1.14 38094 > 1287 [PSH, ACK] Seq=2132478646 Ack=3566092853 Win=8192 Len=162 2711 383.486675 172.16.1.110 172.16.1.14 38094 > 1287 [FIN, ACK] Seq=2132478808 Ack=3566092853 Win=8192 Len=0 2722 383.529177 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3566092853 Ack=2132478809 Win=65343 Len=0 2724 *REF* 172.16.1.14 172.16.1.110 1287 > 38094 [FIN, ACK] Seq=3566092853 Ack=2132478809 Win=65343 Len=0 2726 0.000147 172.16.1.110 172.16.1.14 38094 > 1287 [ACK] Seq=2132478809 Ack=3566092854 Win=8192 Len=0 3227 54.040770 172.16.1.14 172.16.1.110 [TCP Port numbers reused] 1287 > 38094 [SYN] Seq=3272303794 Win=65535 Len=0 MSS=1460 3228 54.040850 172.16.1.110 172.16.1.14 [TCP ACKed lost segment] 38094 > 1287 [ACK] Seq=2132478809 Ack=3566092854 Win=8192 Len=0 4898 56.906924 172.16.1.14 172.16.1.110 1287 > 38094 [SYN] Seq=3272303794 Win=65535 Len=0 MSS=1460 4901 56.907101 172.16.1.110 172.16.1.14 [TCP Dup ACK 283228#1] 38094 > 1287 [ACK] Seq=2132478809 Ack=3566092854 Win=8192 Len=0 8470 62.934092 172.16.1.14 172.16.1.110 1287 > 38094 [SYN] Seq=3272303794 Win=65535 Len=0 MSS=1460 8471 62.934648 172.16.1.110 172.16.1.14 38094 > 1287 [SYN, ACK] Seq=101055931 Ack=3272303795 Win=8192 Len=0 MSS=1460 8474 62.944492 172.16.1.14 172.16.1.110 1287 > 38094 [ACK] Seq=3272303795 Ack=101055932 Win=65535 Len=0 |
When a connection is closed, the socket that sent the initial FIN enters the TIME_WAIT state for 2*MSL (mean segment lifetime). In this case the server has set the MSL value to 30 so it will take 60 seconds for the server to remove the socket from its state tables and allow the client to reuse what amounts to the same connection (same local and remote IP addresses and port numbers).
What the client effectively saw was a delay of 9 seconds for the request to start. The worst case scenario would be to try to start another transaction immediately after the previous one finishes. In that case the user would see the full 1 minute delay caused by the MSL timer and also some addition delay because it is doubtful that the client's retransmission timer would cause the SYN segment to be retransmitted at exactly the right time.
The only solutions are to reduce the server's MSL timer or to configure the appliance not to try to reuse the local port numbers. In this case the appliance was reconfigured.