Connection Abort Before accept Returns
What different implementations do when a connection is aborted before accept returns -
- Berkeley-derived implementations handle the aborted connection completely within the kernel, so the server process never sees it.
- SVR4 implementations return EPROTO (protocol error) from accept.
- POSIX specifies that the error must be ECONNABORTED instead. EPROTO is also returned for fatal protocol-related events, so a server that sees EPROTO cannot tell whether it should call accept again. With ECONNABORTED, the server can ignore the error and call accept again.
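The retry logic described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: should_retry_accept and accept_retry are hypothetical helper names, and treating EINTR and EPROTO as retryable is an assumption (EPROTO is only retried here to cover SVR4-style systems, at the cost of also retrying on a genuinely fatal protocol error).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if an accept() error only means
 * "this one connection went away, call accept again". */
int should_retry_accept(int err)
{
    if (err == ECONNABORTED)   /* POSIX: connection aborted before accept */
        return 1;
    if (err == EINTR)          /* assumption: also retry if interrupted */
        return 1;
#ifdef EPROTO
    if (err == EPROTO)         /* assumption: SVR4-style report of an abort */
        return 1;
#endif
    return 0;
}

/* Hypothetical wrapper: call accept() until it succeeds or fails
 * with an error that is not considered retryable. */
int accept_retry(int listenfd)
{
    for (;;) {
        int connfd = accept(listenfd, NULL, NULL);
        if (connfd >= 0 || !should_retry_accept(errno))
            return connfd;     /* success, or a real error with errno set */
    }
}
```

A server would call accept_retry(listenfd) in place of accept in its main loop and treat a negative return as a real error.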
Termination of Server Process
Sequence of steps when child process terminates early -
- Server child and client are connected and can send each other messages.
- Server child is killed using its process ID. All open descriptors of the child are closed. A FIN is sent to the client, which sends back an ACK. This completes the first half of TCP connection termination.
- SIGCHLD is sent to the server parent and handled correctly.
- Client TCP receives the FIN and sends an ACK, but the client process is blocked in the call to fgets, so it does not notice.
- When we type another line for fgets, str_cli calls writen and client TCP sends the data to the server. TCP allows this because receipt of a FIN by client TCP only indicates that the server process has closed its end of the connection and won't send any more data. When server TCP receives the data from the client, it responds with an RST, because the server process that had the socket open has terminated.
- The client process will not see the RST, because it calls readline immediately after the call to writen, and readline returns 0 (EOF) immediately because of the FIN received earlier. The client is not expecting EOF at this point, so it quits with the error message "server terminated prematurely".
- When the client terminates, all of its open descriptors are closed. The outcome depends on timing: if readline is called before the RST is received, the result is an unexpected EOF in the client. If the RST arrives first, readline returns the error ECONNRESET (connection reset by peer).
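The two outcomes in the last step (unexpected EOF versus ECONNRESET) can be told apart from the return value of the read call and errno. A minimal sketch, assuming the client saves errno right after the read; the enum and the function classify_reply are hypothetical names, not part of str_cli:

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical classification of what a read on the socket reported. */
enum reply_status {
    REPLY_OK,      /* n > 0: data arrived */
    REPLY_EOF,     /* n == 0: FIN was seen, "server terminated prematurely" */
    REPLY_RESET,   /* n < 0 with ECONNRESET: RST arrived before the read */
    REPLY_ERROR    /* n < 0 with any other error */
};

/* n is the return value of the read call, err is errno saved after it. */
enum reply_status classify_reply(ssize_t n, int err)
{
    if (n > 0)
        return REPLY_OK;
    if (n == 0)
        return REPLY_EOF;
    if (err == ECONNRESET)
        return REPLY_RESET;
    return REPLY_ERROR;
}
```

The client would treat REPLY_EOF and REPLY_RESET the same way (the server is gone), but logging them separately shows which side of the race occurred.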
SIGPIPE Signal
Problem - What happens if the client ignores the error return from readline and writes more data to the server? For example, the client may need to perform 2 writes before reading anything back, with the first write eliciting the RST.
When a process writes to a socket that has received an RST, the SIGPIPE signal is sent to the process. The default action of SIGPIPE is to terminate the process, so the process must catch or ignore the signal if it does not want to be terminated. If the process catches the signal and returns from the handler, or ignores the signal, the write operation returns EPIPE.
SIGPIPE cannot be obtained on the first write: the first write elicits the RST, and only the second write elicits the signal. It is okay to write to a socket that has received a FIN, but not to one that has received an RST.
Solution - If multiple sockets are in use, the signal does not tell us which socket encountered the error. So we set the disposition of SIGPIPE to SIG_IGN, check for the error EPIPE from write, and terminate.
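A minimal sketch of the SIG_IGN plus EPIPE approach. To keep it self-contained and free of timing issues it uses a Unix domain socketpair rather than a TCP connection, which is an assumption of this example: here the very first write after the peer closes already fails with EPIPE, whereas on TCP the first write elicits the RST and only a later write fails. The function name errno_after_peer_close is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the errno produced by writing to a socket whose peer is gone,
 * 0 if the write unexpectedly succeeded, or -1 if setup failed. */
int errno_after_peer_close(void)
{
    int sv[2];
    ssize_t n;
    int err;

    /* Ignore SIGPIPE so a failed write returns EPIPE instead of
     * terminating the process. */
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;

    /* Assumption: a Unix domain socketpair stands in for the connection. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    close(sv[1]);                 /* the "server" end goes away */

    n = write(sv[0], "x", 1);     /* write to the dead connection */
    err = (n < 0) ? errno : 0;    /* save errno before close can change it */

    close(sv[0]);
    return err;
}
```

Because the error comes back from the write call on a specific descriptor, the program knows exactly which socket failed, which the signal alone would not tell it.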
Crashing of Server Host
Problem - Server host has crashed, or an intermediate router is down
- Server host crashes.
- Client sends a line and then blocks in the call to readline.
- Client TCP keeps retransmitting the data segment, trying to receive an ACK from the server. Berkeley-derived implementations retransmit 12 times, waiting for around 9 minutes before giving up.
If the server host crashed and there were no responses at all to the client's data segments, the error is ETIMEDOUT.
If some intermediate router determined that the server host was unreachable and responded with an ICMP destination unreachable message, the error is either EHOSTUNREACH or ENETUNREACH.
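The three errors above can be mapped to their causes when reporting a failed read. A minimal sketch, assuming errno was saved after the failing call; unreachable_reason is a hypothetical helper and the message strings are only illustrative.

```c
#include <errno.h>

/* Hypothetical helper: explain why the server could not be reached. */
const char *unreachable_reason(int err)
{
    switch (err) {
    case ETIMEDOUT:
        /* retransmissions were never answered */
        return "no response from server host";
    case EHOSTUNREACH:
        /* ICMP destination unreachable, reported for the host */
        return "router reported host unreachable";
    case ENETUNREACH:
        /* ICMP destination unreachable, reported for the network */
        return "router reported network unreachable";
    default:
        return "other error";
    }
}
```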
Crashing and Rebooting of Server Host
- Start server and client and ensure the connection is established by sending messages (using str_cli and str_echo).
- Server host crashes and reboots.
- Type a line of input to client, which is sent as TCP data segment to server host.
- When server host reboots after crashing, its TCP loses all information about previous connections. Therefore, server TCP responds with RST.
- Client is blocked in the call to readline when RST is received, causing readline to return error ECONNRESET.
Shutdown of Server Host
When a Linux system is shut down, the init process normally sends SIGTERM to all processes (we can catch this signal), waits some fixed amount of time (say 5-20 seconds), then sends SIGKILL to any processes still running. This gives all running processes a short amount of time to clean up and terminate. If we don't catch SIGTERM, its default action terminates our server. If we catch or ignore it but don't exit, the server is terminated by SIGKILL. In either case, all open descriptors are closed and the same sequence of steps follows as in Termination of Server Process.