How well do you really know it? — Best practice from the Alibaba tech team
Ren Xijun is a member of Alibaba’s middleware technology team. Recently, he encountered a problem with a client-side communication server that was constantly throwing an exception. But to his dismay, despite scouring the Internet for information and making repeated attempts to locate the cause, he could not find anything to help explain the two queues or how to observe their metrics. Undeterred, he took it upon himself to get to the bottom of the issue. He wrote this article to record how he identified and resolved the issue.
Process of a TCP three-way handshake
667399 times the listen queue of a socket overflowed
After checking three times, I found that the value was increasing continuously. It was clear that the accept queue on the server had overflowed.
It was then possible to see how the OS deals with the overflow.
# cat /proc/sys/net/ipv4/tcp_abort_on_overflow
0
With tcp_abort_on_overflow set to 0, if the accept queue is full at the third step of the three-way handshake, the server simply discards the ACK packet sent by the client; as far as the server is concerned, the connection has not been established.
To prove that the exception was related to the complete connection queue, I first changed tcp_abort_on_overflow to 1. If the complete connection queue was full in the third step, the server would send a reset packet to the client, indicating that it should end both the handshake process and the connection. (The connection was in fact not established on the server side.)
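For reference, this is one way to flip that switch for a test (it needs root privileges, and a value set with sysctl -w only lasts until the next reboot):

# sysctl -w net.ipv4.tcp_abort_on_overflow=1
net.ipv4.tcp_abort_on_overflow = 1
# cat /proc/sys/net/ipv4/tcp_abort_on_overflow
1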
I then proceeded with the test, finding that there were a number of “connection reset by peer” exceptions in the client. We came to the conclusion that the complete connection queue’s overflow was in turn causing the client error, which helped us quickly identify key parts of the problem.
The development team looked at the Java source code and found that the default value of the backlog of the socket was 50 (this value controls the size of the complete connection queue and will be detailed later). I increased the value and ran it again, and after over 12 hours of stress testing, I noticed that the error wasn’t showing up anymore and that the overflow also wasn’t increasing.
So, it's as simple as that. After the TCP three-way handshake completes, the connection is placed in the complete connection queue, and only once it is in this queue can it move from the Listen state to being returned by accept. The default backlog is 50, which overflows easily. If the queue overflows, then at the third step of the handshake the server ignores the ACK packet sent by the client and repeats the second step (sending the SYN-ACK packet to the client) at regular intervals. If the connection never makes it into the queue, the client ends up with an exception.
But although we had solved the problem, I still wasn’t satisfied. I wanted to use this whole encounter as a learning experience, so I looked into the problem further.
[Figure: the SYN queue (incomplete connection queue) and the accept queue (complete connection queue) in the TCP three-way handshake]
As shown above, there are two queues: a SYN queue (or incomplete connection queue) and an accept queue (or complete connection queue). In the three-way handshake, after receiving a SYN packet from the client, the server places the connection information in the SYN queue and sends a SYN-ACK packet back to the client. When the server then receives the ACK packet from the client, it removes the connection information from the SYN queue and puts it into the accept queue if the accept queue isn't full; otherwise it acts as tcp_abort_on_overflow instructs. If the accept queue is full and tcp_abort_on_overflow is 0, the server sends the SYN-ACK packet to the client again after a certain period of time (in other words, it repeats the second step of the handshake). If the client's connection timeout is short, this retransmission easily leads to a client exception. In our OS, the second step is retried twice by default (five times for CentOS).
net.ipv4.tcp_synack_retries = 2
netstat -s
[root@server ~]# netstat -s | egrep "listen|LISTEN"
667399 times the listen queue of a socket overflowed
667399 SYNs to LISTEN sockets ignored
Here, for example, 667399 indicates the number of times that the accept queue overflowed. Execute this command every few seconds, and if the number increases, the accept queue must be full.
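If you do not want to re-run it by hand, something like watch can do the polling for you (the -d flag highlights whatever changed between two runs):

# watch -d 'netstat -s | egrep "listen|LISTEN"'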
ss command
[root@server ~]# ss -lnt
Recv-Q Send-Q Local Address:Port  Peer Address:Port
0      50     *:3306              *:*
Here, the Send-Q value in the second column is 50, indicating that the accept queue of the listening port in the third column (3306) can hold at most 50 connections. The first column, Recv-Q, shows how much of the accept queue is currently in use.
The size of the accept queue depends on min(backlog, somaxconn). The backlog is passed in when the socket is created, and somaxconn is an OS-level system parameter.
At this point, we can connect this back to our application code. For example, when creating a ServerSocket in Java, you can pass in the backlog value.
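As a quick sketch of how the two values interact (the 128 and 1024 below are only illustrative; check your own system): if the application asks for a backlog of 1024 but somaxconn is still 128, the effective accept queue is only 128, so the OS-level limit usually has to be raised together with the application backlog:

# cat /proc/sys/net/core/somaxconn
128
# sysctl -w net.core.somaxconn=1024
net.core.somaxconn = 1024

After both values are raised, ss -lnt should show the larger limit in the Send-Q column for the listening port.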
The size of the SYN queue depends on max(64, /proc/sys/net/ipv4/tcp_max_syn_backlog), and the exact formula may differ between OS versions.
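As a rough way to relate this to a live system (the 2048 shown is only an example value), you can read the parameter and count how many connections are currently sitting in the SYN queue, i.e. in the SYN_RECV state:

# cat /proc/sys/net/ipv4/tcp_max_syn_backlog
2048
# netstat -ant | grep -c SYN_RECV
0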
netstat command
Send-Q and Recv-Q can also be shown via the netstat command, just as with the ss command. However, if the connection is not in the Listen state, Recv-Q has a different meaning: it is the number of bytes that have been received but not yet read by the process, while Send-Q is the number of bytes in the send queue that have not been acknowledged by the remote host.

$ netstat -tn
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        State
tcp        0      0 100.81.180.187:8182    10.183.199.10:15260    SYN_RECV
tcp        0      0 100.81.180.187:43511   10.137.67.18:19796     TIME_WAIT
tcp        0      0 100.81.180.187:2376    100.81.183.84:42459    ESTABLISHED
It is important to note that the Recv-Q data shown by netstat -tn has nothing to do with the accept queue or SYN queue. This must be emphasized here so as not to confuse it with the Recv-Q data shown by ss -lnt.
For example, if netstat -t shows a large amount of data accumulated in Recv-Q, it generally means the CPU cannot keep up with processing the received data.
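A quick way to spot such sockets is to filter on the Recv-Q column (a small sketch; the column positions assume the default netstat output shown above):

$ netstat -tn | awk 'NR > 2 && $2 > 0'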
The following capture shows a listening socket on port 3306 whose accept queue is overflowing:

Fri May 5 13:50:23 CST 2017
Recv-Q Send-Q Local Address:Port  Peer Address:Port
11     10     *:3306              *:*
Here we can see that the accept queue of the service on port 3306 can hold at most 10 connections, but there are currently 11 connections in it, so there must be connections that cannot get into the queue and are overflowing. At the same time, the overflow counter from netstat -s was indeed increasing continuously.
As another example, this listening socket on port 8080 has an accept queue of at most 100:

# ss -lnt
Recv-Q Send-Q Local Address:Port  Peer Address:Port
0      100    *:8080              *:*
In Nginx, the default value of the backlog is 511.
$ sudo ss -lnt
State   Recv-Q Send-Q Local Address:Port  Peer Address:Port
LISTEN  0      511    *:8085              *:*
LISTEN  0      511    *:8085              *:*
Nginx runs in multi-process mode, which is why 8085 appears more than once: multiple worker processes listen on the same port, both to avoid context switching and to improve performance.
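If you want to confirm which processes own those listening sockets, ss can also print the owning process with -p (the PIDs shown below are of course illustrative):

$ sudo ss -lntp | grep 8085
LISTEN  0  511  *:8085  *:*  users:(("nginx",pid=6248,fd=6))
LISTEN  0  511  *:8085  *:*  users:(("nginx",pid=6249,fd=6))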