Socket Connection Failure
Understand the causes behind socket connection failures in distributed systems, including TCP retransmissions and firewall packet drops. Learn how connection pooling, timeout settings, and Oracle's dead connection detection help maintain system stability. This lesson guides you through diagnosing and mitigating these common networking issues to keep your applications responsive.
The failing connections
After the point at which the connection failed, any attempt to read from or write to the socket on either end did not result in a TCP reset or an error due to a half-open socket. Instead, the TCP/IP stack sent the packet, waited for an ACK, didn’t get one, and retransmitted. The faithful stack tried and tried to reestablish contact, and that firewall just kept dropping the packets on the floor, without so much as an “ICMP destination unreachable” message.
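Because the stack reports no error while it retransmits, a blocking read simply waits. One way to limit that wait from the application side is a socket timeout. The following is a minimal sketch, not the approach used in the system described here: the helper name recv_with_deadline is hypothetical, and a silent peer on a local socket pair stands in for the packet-dropping firewall.

```python
import socket


def recv_with_deadline(sock, nbytes, timeout_s):
    """Return received bytes, or None if nothing arrives within timeout_s.

    Hypothetical helper for illustration. The timeout only limits how long
    the application waits; the kernel may keep retransmitting underneath.
    """
    sock.settimeout(timeout_s)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


if __name__ == "__main__":
    # A connected local pair; side b never answers, like a black-holed peer.
    a, b = socket.socketpair()
    try:
        if recv_with_deadline(a, 16, 0.5) is None:
            print("no data within the deadline; treat the connection as suspect")
    finally:
        a.close()
        b.close()
```

A caller that gets None back can close the socket and open a fresh connection instead of waiting for the stack to give up on its own. Writes are harder to bound this way, since a send can appear to succeed as soon as the data is copied into the kernel's buffer.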
Long blockages
My Linux system, running on a 2.6 series kernel, has its tcp_retries2 set to the default value of 15, which results in a twenty-minute timeout before the TCP/IP stack will inform the socket library that the connection is broken. The HP-UX servers we were using at the time had a 30-minute ...
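To see why a retry count of 15 turns into a wait measured in minutes, the backoff can be worked through as arithmetic. This sketch assumes the model given in the Linux kernel documentation for tcp_retries2: the retransmission timeout starts at a 200 ms floor, doubles after every unanswered attempt, and is capped at 120 seconds. The function name and its parameters are illustrative, not a kernel interface.

```python
def retransmit_budget_seconds(retries=15, rto_min=0.2, rto_max=120.0):
    """Sum the waits for `retries` retransmissions plus the final timeout.

    Assumed model: the retransmission timeout (RTO) starts at rto_min,
    doubles after each unanswered attempt, and never exceeds rto_max.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


if __name__ == "__main__":
    seconds = retransmit_budget_seconds()
    print(f"{seconds:.1f} s, about {seconds / 60:.1f} minutes")
```

With the defaults this comes to 924.6 seconds, a little over fifteen minutes. That figure is a lower bound: a real connection starts from an RTO derived from its measured round-trip time, so the effective wait is longer, which is in line with the twenty-minute figure above.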