Troubleshooting the “SYNs to LISTEN sockets dropped” message from netstat

The problem:

An elevated number of dropped TCP connection attempts to a listening socket on a remote host.

The symptom:

“SYNs to LISTEN sockets dropped” increments at a high rate:


    root@smtp-out-n01:~# netstat -s | grep -i listen
        2608 SYNs to LISTEN sockets dropped
    root@smtp-out-n01:~#
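
The same counter can also be read without netstat. As a minimal sketch, assuming a Linux host where /proc/net/netstat lists the TcpExt counters as one line of names followed by one line of values, the helper below picks a counter by name. The function name tcpext_counter is made up for this example.

```shell
#!/bin/sh
# tcpext_counter NAME: read /proc/net/netstat-style text on stdin and print
# the value of the TcpExt counter NAME (for example ListenDrops).
# Each group is stored as two lines, "TcpExt: <names...>" followed by
# "TcpExt: <values...>", so the names are remembered and matched by column.
tcpext_counter() {
    awk -v name="$1" '
        $1 == "TcpExt:" && !seen { for (i = 2; i <= NF; i++) col[i] = $i; seen = 1; next }
        $1 == "TcpExt:" &&  seen { for (i = 2; i <= NF; i++) if (col[i] == name) print $i; exit }
    '
}

# Usage on a live system:
#   tcpext_counter ListenDrops < /proc/net/netstat
```

Reading the file directly avoids depending on the wording of the netstat message, which can differ between net-tools versions.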

Obtaining the baseline:

  1. Starting from a lower level, let’s check the size of the transmit queue on the network interface and make sure there aren’t any collisions:
    
    root@smtp-out-n01:~# ifconfig eth0 | grep txqueuelen
              collisions:0 txqueuelen:1000
    root@smtp-out-n01:~#
    

  3. Next, let’s check to see if the interface is dropping packets due to the transmit queue:
    
    root@smtp-out-n01:~# tc -s qdisc show dev eth0 | grep dropped
     Sent 17873576470 bytes 21407282 pkt (dropped 0, overlimits 0 requeues 12223)
     Sent 2830505875 bytes 2906590 pkt (dropped 0, overlimits 0 requeues 1874)
     Sent 1498561593 bytes 2255912 pkt (dropped 0, overlimits 0 requeues 1391)
     Sent 3102757206 bytes 2651357 pkt (dropped 0, overlimits 0 requeues 1121)
     Sent 5034092949 bytes 2946821 pkt (dropped 0, overlimits 0 requeues 2729)
     Sent 1231897711 bytes 2718582 pkt (dropped 0, overlimits 0 requeues 1506)
     Sent 1743081970 bytes 2229000 pkt (dropped 0, overlimits 0 requeues 1851)
     Sent 1435231757 bytes 2717978 pkt (dropped 0, overlimits 0 requeues 1015)
     Sent 997447409 bytes 2981042 pkt (dropped 0, overlimits 0 requeues 736)
    root@smtp-out-n01:~#
    
    

  5. Next, check for IP reassembly failures, which would point to fragmentation problems:
    
    root@smtp-out-n01:~# cat /proc/net/snmp | grep '^Ip:' | cut -f17 -d' '
    ReasmFails
    0
    root@smtp-out-n01:~#
    

  7. Moving up the stack, print the Accept Queue sizes for the listening service. Recv-Q shows the number of sockets in the Accept Queue, and Send-Q shows the backlog parameter:
    
    root@smtp-out-n01:~# ss -plnt sport = :2319|cat && ss -plnt sport = :2320|cat
    State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
    LISTEN     0      65535       :::2319                    :::*                   users:(("service",pid=3646,fd=47))
    State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
    LISTEN     0      65535       :::2320                    :::*                   users:(("service",pid=3646,fd=48))
    root@smtp-out-n01:~#
    
    

  9. The accept queue is essentially empty. Let’s check how many connections are in the SYN-RECV state on the listening ports in question (each count includes the one header line that ss prints):
    
    root@smtp-out-n01:~# ss -n state syn-recv sport = :2319 | wc -l; ss -n state syn-recv sport = :2320 | wc -l
    5
    1
    root@smtp-out-n01:~#
    

  11. Connections are moving to ESTABLISHED pretty quickly. Let’s make sure we have enough file descriptors available (the current number of allocated file handles, the number of unused but allocated file handles, the system-wide maximum):
    
    root@smtp-out-n01:~# sysctl fs.file-nr
    fs.file-nr = 6240	0	3247209
    root@smtp-out-n01:~#
    

  13. Check how many connections are sitting in TIME-WAIT and how many are established in total:
    
    root@smtp-out-n01:~# ss -n state time-wait | wc -l
    86
    root@smtp-out-n01:~# ss -n state established | wc -l
    1441
    root@smtp-out-n01:~#
    

  15. No concerns here, based on the total number of connections. Let’s check the rate of NEW connections per second:
    
    root@smtp-out-n01:~# modprobe ip_conntrack
    root@smtp-out-n01:~# conntrack -E -e NEW | pv -l -i 1 -r > /dev/null
    [ 180 /s]
    ^C
    root@smtp-out-n01:~#
    

  17. The current rate is 180 NEW connections per second. Observing the rate on a single node over a 24-hour period, we peak at about 250 connections per second. CPU and memory utilization show a mostly idle system, even during peak send times:
    
    root@smtp-out-n01:~# free -m
                  total        used        free      shared  buff/cache   available
    Mem:          31711        7779       17838         296        6092       23083
    Swap:             0           0           0
    root@smtp-out-n01:~# w
     18:12:36 up  2:21,  1 user,  load average: 0.61, 0.71, 0.70
    root@smtp-out-n01:~#
    

  19. Finally, checking the counter for dropped SYN packets shows an ever-increasing number, at a rate of about 20 per second:
    
    root@smtp-out-n01:~# netstat -s | grep -i listen
        2608 SYNs to LISTEN sockets dropped
    root@smtp-out-n01:~# nstat -az | grep -i listen
    TcpExtListenOverflows           0                  0.0
    TcpExtListenDrops               2608               0.0
    TcpExtTCPFastOpenListenOverflow 0                  0.0
    root@smtp-out-n01:~#
    

  21. SYN packets are usually dropped because either the SYN queue or the accept queue is full, yet none of the diagnostics above show that: both queues are nearly empty and TcpExtListenOverflows is 0. For better visibility, let’s install some kernel hooks with SystemTap that print details for each connection request that arrives while the accept queue is over its limit. This should help identify applications that periodically hang and fail to accept() connections fast enough:
    
    root@smtp-out-n01:~# cat acceptq.stp
    
    probe begin {
        printf("time (us)       \tacceptq\tqmax\tlocal addr\tremote_addr\n")
    }
    
    # For an incoming SYN the remote peer is the packet's source address.
    function skb_get_remote_v4addr:string(skb:long)
    {
        return format_ipaddr(__ip_skb_saddr(__get_skb_iphdr(skb)), 2 /* AF_INET */)
    }
    
    function skb_get_remote_v6addr:string(skb:long)
    {
        ipv6_hdr = &@cast(__get_skb_iphdr(skb), "ipv6hdr")
        return format_ipaddr(&ipv6_hdr->saddr, 10 /* AF_INET6 */)
    }
    
    function skb_get_remote_port:long(skb:long)
    {
        return __tcp_skb_sport(__get_skb_tcphdr(skb))
    }
    
    probe kernel.function("tcp_v4_conn_request") {
        if ($sk->sk_ack_backlog > $sk->sk_max_ack_backlog) {
            printf("%d\t%d\t%d\t%s:%d\t%s:%d\n",
                gettimeofday_us(),
                $sk->sk_ack_backlog,
                $sk->sk_max_ack_backlog,
                inet_get_ip_source($sk),
                inet_get_local_port($sk),
                skb_get_remote_v4addr($skb),
                skb_get_remote_port($skb));
        }
    }
    
    probe kernel.function("tcp_v6_conn_request") {
        if ($sk->sk_ack_backlog > $sk->sk_max_ack_backlog) {
            printf("%d\t%d\t%d\t[%s]:%d\t[%s]:%d\n",
                gettimeofday_us(),
                $sk->sk_ack_backlog,
                $sk->sk_max_ack_backlog,
                inet_get_ip_source($sk),
                inet_get_local_port($sk),
                skb_get_remote_v6addr($skb),
                skb_get_remote_port($skb));
        }
    }
    
    root@smtp-out-n01:~# wget http://launchpadlibrarian.net/483914277/linux-image-4.4.0-1110-aws-dbgsym_4.4.0-1110.121_amd64.ddeb && dpkg --install linux-image-4.4.0-1110-aws-dbgsym_4.4.0-1110.121_amd64.ddeb
    
    root@smtp-out-n01:~# stap -v acceptq.stp
    Pass 1: parsed user script and 110 library script(s) using 107992virt/43612res/6352shr/37332data kb, in 80usr/20sys/99real ms.
    
    Pass 2: analyzed script: 6 probe(s), 28 function(s), 5 embed(s), 3 global(s) using 258404virt/195296res/7684shr/187744data kb, in 1530usr/420sys/1960real ms.
    Pass 3: using cached /root/.systemtap/cache/4a/stap_4ae7ddea0627fb050d53ced46ad3b670_24938.c
    Pass 4: using cached /root/.systemtap/cache/4a/stap_4ae7ddea0627fb050d53ced46ad3b670_24938.ko
    Pass 5: starting run.
    time (us)       	acceptq	qmax	local addr	remote_addr
    
    

  23. Unfortunately, running the kernel hook for 24 hours did not yield any results. The SYN and accept queues were nearly empty, yet the “SYNs to LISTEN sockets dropped” counter kept growing.

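
The two counters from step 19 can be read together. As far as I can tell, the kernel counts every accept queue overflow as a listen drop too, so the difference between TcpExtListenDrops and TcpExtListenOverflows approximates the drops that have some other cause. The sketch below makes that assumption explicit. The function name unexplained_listen_drops is made up for this example, and the input is expected in the `nstat -az` layout shown above.

```shell
#!/bin/sh
# unexplained_listen_drops: read `nstat -az` output on stdin and print
# "<drops> <overflows> <drops - overflows>".
# Assumption: every accept queue overflow is also counted as a listen drop,
# so the difference approximates drops that have some other cause.
unexplained_listen_drops() {
    awk '
        $1 == "TcpExtListenDrops"     { drops = $2 }
        $1 == "TcpExtListenOverflows" { overflows = $2 }
        END { printf "%d %d %d\n", drops, overflows, drops - overflows }
    '
}

# Usage on a live system:
#   nstat -az | unexplained_listen_drops
```

With the numbers from step 19 (2608 drops, 0 overflows) all of the drops fall into the "other cause" bucket, which matches the empty SystemTap output.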
Tuning the kernel for better network performance:

After each incremental change, I measured the rate of SYN errors and checked the SYN and Accept queue utilizations.
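
A minimal sketch of how such a rate can be measured, assuming the counter is sampled twice with a known interval in between. The function name counter_rate and the 60-second window are choices made for this example, not part of the original procedure.

```shell
#!/bin/sh
# counter_rate BEFORE AFTER SECONDS: print the per-second increase of a
# monotonically increasing counter, rounded to one decimal place.
counter_rate() {
    LC_ALL=C awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (b - a) / s }'
}

# Usage on a live system (nstat -az prints absolute counter values):
#   before=$(nstat -az | awk '$1 == "TcpExtListenDrops" { print $2 }')
#   sleep 60
#   after=$(nstat -az | awk '$1 == "TcpExtListenDrops" { print $2 }')
#   counter_rate "$before" "$after" 60
```

A longer window smooths out bursts, which matters once the drop rate falls to a few packets per several minutes.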

  1. Increased the backlog for incoming packets. net.core.netdev_max_backlog sets the maximum number of packets queued on the input side when an interface receives packets faster than the kernel can process them:
    
    root@smtp-out-n01:~# sysctl -w net.core.netdev_max_backlog=3000000
    net.core.netdev_max_backlog = 3000000
    root@smtp-out-n01:~#
    

  3. Increased the overall TCP memory limits, in pages (the level below which TCP does not regulate its memory use, the threshold at which TCP starts to moderate memory consumption, and the maximum number of pages TCP may allocate):
    
    root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_mem='758316 1011092 1516632'
    net.ipv4.tcp_mem = 758316	1011092	1516632
    root@smtp-out-n01:~#
    

  5. Increased the absolute maximum size of socket receive and send buffers, in bytes. Applications cannot request more than these values:
    
    root@smtp-out-n01:~# sysctl -w net.core.rmem_max=67108864
    net.core.rmem_max = 67108864
    root@smtp-out-n01:~# sysctl -w net.core.wmem_max=67108864
    net.core.wmem_max = 67108864
    root@smtp-out-n01:~#
    

  7. Increased the TCP socket receive and send buffer sizes (min, default and max, in bytes):
    
    root@smtp-out-n01:~# sysctl net.ipv4.tcp_rmem='715827867 1073741800 2147483600'
    net.ipv4.tcp_rmem = 715827867	1073741800	2147483600
    root@smtp-out-n01:~# sysctl net.ipv4.tcp_wmem='715827867 1073741800 2147483600'
    net.ipv4.tcp_wmem = 715827867	1073741800	2147483600
    root@smtp-out-n01:~#
    
    

  9. Ensured TCP window scaling is enabled:
    
    root@smtp-out-n01:~# sysctl net.ipv4.tcp_window_scaling
    net.ipv4.tcp_window_scaling = 1
    root@smtp-out-n01:~#
    

  11. Set the number of times an initial SYN is retransmitted for an active (outgoing) TCP connection attempt. With the default of 6, the final timeout for an active TCP connection attempt happens after 127 seconds:
    
    root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_syn_retries=6
    net.ipv4.tcp_syn_retries = 6
    root@smtp-out-n01:~#
    

  13. Arguably most importantly, I increased net.core.somaxconn, the upper limit on the backlog an application can pass to listen(); larger requested values are silently capped to it. The backlog bounds the accept queue, and the kernel drops incoming SYN packets for a listener whose accept queue is full:
    
    root@smtp-out-n01:~# sysctl -w net.core.somaxconn=1000000
    net.core.somaxconn = 1000000
    root@smtp-out-n01:~#
    

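  15. That huge number was accepted (the default varies by kernel version, from 128 to 4096), but the listening sockets still report a backlog of 65535. The backlog is fixed when the application calls listen(), so a listener that is already running (same PID as before) most likely keeps the value it started with: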
    
    root@smtp-out-n01:~# ss -plnt sport = :2319|cat && ss -plnt sport = :2320|cat
    State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
    LISTEN     0      65535       :::2319                    :::*                   users:(("service",pid=3646,fd=47))
    State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
    LISTEN     0      65535       :::2320                    :::*                   users:(("service",pid=3646,fd=48))
    root@smtp-out-n01:~#
    

  17. Increased the Listener queue length for unacknowledged SYN_RECV connection attempts. A SYN_RECV request socket consumes about 304 bytes of memory:
    
    root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_max_syn_backlog=7064090
    net.ipv4.tcp_max_syn_backlog = 7064090
    root@smtp-out-n01:~#
    

  19. Checking to see how many connections are in SYN_RECV state after the above change:
    
    root@smtp-out-n01:~# netstat -antup | grep SYN_RECV | egrep "2319|2320" | wc -l
    3
    root@smtp-out-n01:~#
    

  21. Set the number of times a SYN-ACK is retransmitted for a passive (incoming) TCP connection attempt. With the default of 5, the final timeout for a passive TCP connection happens after 63 seconds:
    
    root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_synack_retries=5
    net.ipv4.tcp_synack_retries = 5
    root@smtp-out-n01:~#
    

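  23. Finally, disabling fast recycling of TIME_WAIT sockets (at the expense of more connections in TIME_WAIT and about 120MB of extra memory usage) yielded the best result: dropped SYN packets went down to about 3 per 15 minutes! A likely explanation is that with net.ipv4.tcp_tw_recycle enabled the kernel checks TCP timestamps per remote address and silently drops SYNs whose timestamps appear to go backwards, which is common when many clients share one address behind NAT: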
    
    root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_tw_recycle=0
    net.ipv4.tcp_tw_recycle = 0
    root@smtp-out-n01:~#
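
The 127-second and 63-second figures quoted in steps 11 and 21 can be reproduced with a little arithmetic. The sketch below assumes the retransmission timer starts at 1 second and doubles after every retransmission, and it ignores any upper bound the kernel places on the timer. The function name handshake_timeout is made up for this example.

```shell
#!/bin/sh
# handshake_timeout RETRIES [INITIAL_RTO]: total seconds before a handshake
# attempt is abandoned, assuming the retransmission timer starts at
# INITIAL_RTO (default 1 second) and doubles after every retransmission.
# The original packet and each of the RETRIES retransmissions wait one
# timer period, so RETRIES + 1 periods are summed.
handshake_timeout() {
    retries=$1
    rto=${2:-1}
    total=0
    i=0
    while [ "$i" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        i=$((i + 1))
    done
    echo "$total"
}

handshake_timeout 6   # tcp_syn_retries=6:    1+2+4+8+16+32+64 = 127
handshake_timeout 5   # tcp_synack_retries=5: 1+2+4+8+16+32    = 63
```

Lowering either value shortens how long a stuck handshake lingers, at the cost of giving up sooner on lossy paths.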