TCP是无感知的虚拟连接,中间断开两端不会立刻得到通知。一般在使用长连接的环境下,需要心跳保活机制可以勉强感知其存活。业务层面有心跳机制,TCP协议也提供了心跳保活机制。
长连接的环境下,人们一般使用业务层面或上层应用层协议(诸如MQTT,SOCKET.IO等)里面定义和使用。一旦有热数据需要传递,若此时连接已经被中介设备断开,应用程序没有及时感知的话,那么就会导致在一个无效的数据链路层面发送业务数据,结果就是发送失败。
无论是因为客户端意外断电、死机、崩溃、重启,还是中间路由网络无故断开、NAT超时等,服务器端要做到快速感知失败,减少无效链接操作。
下面协议解读,基于RFC1122#TCP Keep-Alives。
以下环境是在Linux服务器上进行。应用程序若想使用,需要设置SO_KEEPALIVE套接口选项才能够生效。
发送频率tcp_keepalive_intvl乘以发送次数tcp_keepalive_probes,就得到了从开始探测到放弃探测确定连接断开的时间
若设置,服务器在客户端连接空闲的时候,每90秒发送一次保活探测包到客户端,若没有及时收到客户端的TCP Keepalive ACK确认,将继续等待15秒*2=30秒。总之可以在90s+30s=120秒(两分钟)时间内可检测到连接失效与否。
以下改动,需要写入到/etc/sysctl.conf文件:
net.ipv4.tcp_keepalive_time=90
net.ipv4.tcp_keepalive_intvl=15
net.ipv4.tcp_keepalive_probes=2
保存退出,然后执行sysctl -p
生效。可通过 sysctl -a | grep keepalive
命令检测一下是否已经生效。
针对已经设置SO_KEEPALIVE的套接字,应用程序不用重启,内核直接生效。
只需要在服务器端一方设置即可,客户端完全不用设置,比如基于netty 4服务器程序:
ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)
.channel(NioServerSocketChannel.class)
.option(ChannelOption.SO_BACKLOG, 100)
.childOption(ChannelOption.SO_KEEPALIVE, true)
.handler(new LoggingHandler(LogLevel.INFO))
.childHandler(new ChannelInitializer() {
@Override
public void initChannel(SocketChannel ch) throws Exception {
ch.pipeline().addLast(
new EchoServerHandler());
}
});
// Start the server.
ChannelFuture f = b.bind(port).sync();
// Wait until the server socket is closed.
f.channel().closeFuture().sync();
Java程序只能做到设置SO_KEEPALIVE选项,至于TCP_KEEPCNT,TCP_KEEPIDLE,TCP_KEEPINTVL等参数配置,只能依赖于sysctl配置,系统进行读取。
下面代码摘取自libkeepalive源码,C语言可以设置更为详细的TCP内核参数。
int socket(int domain, int type, int protocol)
{
int (*libc_socket)(int, int, int);
int s, optval;
char *env;
*(void **)(&libc;_socket) = dlsym(RTLD_NEXT, "socket");
if(dlerror()) {
errno = EACCES;
return -1;
}
if((s = (*libc_socket)(domain, type, protocol)) != -1) {
if((domain == PF_INET) && (type == SOCK_STREAM)) {
if(!(env = getenv("KEEPALIVE")) || strcasecmp(env, "off")) {
optval = 1;
} else {
optval = 0;
}
if(!(env = getenv("KEEPALIVE")) || strcasecmp(env, "skip")) {
setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval;, sizeof(optval));
}
#ifdef TCP_KEEPCNT
if((env = getenv("KEEPCNT")) && ((optval = atoi(env)) >= 0)) {
setsockopt(s, SOL_TCP, TCP_KEEPCNT, &optval;, sizeof(optval));
}
#endif
#ifdef TCP_KEEPIDLE
if((env = getenv("KEEPIDLE")) && ((optval = atoi(env)) >= 0)) {
setsockopt(s, SOL_TCP, TCP_KEEPIDLE, &optval;, sizeof(optval));
}
#endif
#ifdef TCP_KEEPINTVL
if((env = getenv("KEEPINTVL")) && ((optval = atoi(env)) >= 0)) {
setsockopt(s, SOL_TCP, TCP_KEEPINTVL, &optval;, sizeof(optval));
}
#endif
}
}
return s;
}
完全可以借助于第三方工具libkeepalive,通过LD_PRELOAD方式实现。比如
LD_PRELOAD=/the/path/libkeepalive.so java -jar /your/path/yourapp.jar &
这个工具还有一个比较方便的地方,可以直接在程序运行前指定TCP保活详细参数,可以省去配置sysctl.conf的麻烦:
LD_PRELOAD=/the/path/libkeepalive.so \
> KEEPCNT=20 \
> KEEPIDLE=180 \
> KEEPINTVL=60 \
> java -jar /your/path/yourapp.jar &
针对较老很久不更新的程序,可以尝试一下嘛。
参数和定义
#define MAX_TCP_KEEPIDLE 32767
#define MAX_TCP_KEEPINTVL 32767
#define MAX_TCP_KEEPCNT 127
#define MAX_TCP_SYNCNT 127
#define TCP_KEEPIDLE 4 /* Start keeplives after this period */
#define TCP_KEEPINTVL 5 /* Interval between keepalives */
#define TCP_KEEPCNT 6 /* Number of keepalives before death */
net/ipv4/Tcp.c,可以找到对应关系:
case TCP_KEEPIDLE:
val = (tp->keepalive_time ? : sysctl_tcp_keepalive_time) / HZ;
break;
case TCP_KEEPINTVL:
val = (tp->keepalive_intvl ? : sysctl_tcp_keepalive_intvl) / HZ;
break;
case TCP_KEEPCNT:
val = tp->keepalive_probes ? : sysctl_tcp_keepalive_probes;
break;
初始化:
case TCP_KEEPIDLE:
if (val < 1 || val > MAX_TCP_KEEPIDLE)
err = -EINVAL;
else {
tp->keepalive_time = val * HZ;
if (sock_flag(sk, SOCK_KEEPOPEN) &&
!((1 << sk->sk_state) &
(TCPF_CLOSE | TCPF_LISTEN))) {
__u32 elapsed = tcp_time_stamp - tp->rcv_tstamp;
if (tp->keepalive_time > elapsed)
elapsed = tp->keepalive_time - elapsed;
else
elapsed = 0;
inet_csk_reset_keepalive_timer(sk, elapsed);
}
}
break;
case TCP_KEEPINTVL:
if (val < 1 || val > MAX_TCP_KEEPINTVL)
err = -EINVAL;
else
tp->keepalive_intvl = val * HZ;
break;
case TCP_KEEPCNT:
if (val < 1 || val > MAX_TCP_KEEPCNT)
err = -EINVAL;
else
tp->keepalive_probes = val;
break;
这里可以找到大部分处理逻辑,net/ipv4/Tcp_timer.c:
static void tcp_keepalive_timer (unsigned long data)
{
struct sock *sk = (struct sock *) data;
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
__u32 elapsed;
/* Only process if socket is not in use. */
bh_lock_sock(sk);
if (sock_owned_by_user(sk)) {
/* Try again later. */
inet_csk_reset_keepalive_timer (sk, HZ/20);
goto out;
}
if (sk->sk_state == TCP_LISTEN) {
tcp_synack_timer(sk);
goto out;
}
// 关闭状态的处理
if (sk->sk_state == TCP_FIN_WAIT2 && sock_flag(sk, SOCK_DEAD)) {
if (tp->linger2 >= 0) {
const int tmo = tcp_fin_time(sk) - TCP_TIMEWAIT_LEN;
if (tmo > 0) {
tcp_time_wait(sk, TCP_FIN_WAIT2, tmo);
goto out;
}
}
tcp_send_active_reset(sk, GFP_ATOMIC);
goto death;
}
if (!sock_flag(sk, SOCK_KEEPOPEN) || sk->sk_state == TCP_CLOSE)
goto out;
elapsed = keepalive_time_when(tp);
/* It is alive without keepalive 8) */
if (tp->packets_out || sk->sk_send_head)
goto resched;
elapsed = tcp_time_stamp - tp->rcv_tstamp;
if (elapsed >= keepalive_time_when(tp)) {
if ((!tp->keepalive_probes && icsk->icsk_probes_out >= sysctl_tcp_keepalive_probes) ||
(tp->keepalive_probes && icsk->icsk_probes_out >= tp->keepalive_probes)) {
tcp_send_active_reset(sk, GFP_ATOMIC);
tcp_write_err(sk); // 向上层应用汇报连接异常
goto out;
}
if (tcp_write_wakeup(sk) <= 0) {
icsk->icsk_probes_out++; // 这里仅仅是计数,并没有再次发送保活探测包
elapsed = keepalive_intvl_when(tp);
} else {
/* If keepalive was lost due to local congestion,
* try harder.
*/
elapsed = TCP_RESOURCE_PROBE_INTERVAL;
}
} else {
/* It is tp->rcv_tstamp + keepalive_time_when(tp) */
elapsed = keepalive_time_when(tp) - elapsed;
}
TCP_CHECK_TIMER(sk);
sk_stream_mem_reclaim(sk);
resched:
inet_csk_reset_keepalive_timer (sk, elapsed);
goto out;
death:
tcp_done(sk);
out:
bh_unlock_sock(sk);
sock_put(sk);
}
keepalive_intvl_when 函数定义:
static inline int keepalive_intvl_when(const struct tcp_sock *tp)
{
return tp->keepalive_intvl ? : sysctl_tcp_keepalive_intvl;
}
启用TCP Keepalive的应用程序,一般可以捕获到下面几种类型错误
java.io.IOException: Connection timed out
java.io.IOException: No route to host
java.io.IOException: Connection reset by peer
虽然说没有固定的模式可遵循,那么有以下原则可以参考: