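A web application I deployed recently left a large number of TCP connections in the time_wait state on the server, tying up TCP ports. It took me several days to track down the cause.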
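I had reached a conclusion earlier: HTTP keep-alive is a sliding, application-layer renewal that reuses a TCP connection. If the client and the server keep renewing it steadily, it becomes a long-lived connection in the full sense.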
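Earlier posts on this topic:
Everything about [HTTP persistent connections], laid out for you
Does HTTP/1.1 Keep-Alive really count as a long-lived connection?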
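Today practically every mainstream HTTP library, client or server, enables HTTP keep-alive by default and uses the Connection header of the request and response to negotiate connection reuse.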
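To make the negotiation concrete, here is a minimal sketch of the HTTP/1.x rule as I understand it: HTTP/1.1 is persistent unless either side sends "Connection: close", while HTTP/1.0 is persistent only when "Connection: keep-alive" is sent. The helper names hasToken and wantsKeepAlive are my own for illustration; they are not net/http APIs.

```go
package main

import (
	"fmt"
	"strings"
)

// hasToken reports whether a comma-separated header value contains token,
// compared case-insensitively.
func hasToken(headerValue, token string) bool {
	for _, part := range strings.Split(headerValue, ",") {
		if strings.EqualFold(strings.TrimSpace(part), token) {
			return true
		}
	}
	return false
}

// wantsKeepAlive is a hypothetical helper (not part of net/http) that applies
// the HTTP/1.x rule to one message: "close" always wins, HTTP/1.1 defaults to
// persistent, and HTTP/1.0 needs an explicit "keep-alive".
func wantsKeepAlive(protoMajor, protoMinor int, connectionHeader string) bool {
	if hasToken(connectionHeader, "close") {
		return false
	}
	if protoMajor == 1 && protoMinor >= 1 {
		return true
	}
	return hasToken(connectionHeader, "keep-alive")
}

func main() {
	fmt.Println("HTTP/1.1, no header:        ", wantsKeepAlive(1, 1, ""))
	fmt.Println("HTTP/1.1, Connection: close:", wantsKeepAlive(1, 1, "close"))
	fmt.Println("HTTP/1.0, keep-alive:       ", wantsKeepAlive(1, 0, "Keep-Alive"))
	fmt.Println("HTTP/1.0, no header:        ", wantsKeepAlive(1, 0, ""))
}
```

Reuse only happens when both sides agree, so a single "Connection: close" from either end is enough to fall back to a short connection.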
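01
Short connections caused by unconventional behavior
I have a project where, for historical reasons, the client disables keep-alive while the server leaves it enabled by default. The negotiation to reuse the connection therefore fails, and the client opens a new TCP connection for every request. In other words, it falls back to short connections.
The client forcibly disables keep-alive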
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

func main() {
	// DisableKeepAlives makes the client send "Connection: close"
	// and use a fresh TCP connection for every request.
	tr := http.Transport{
		DisableKeepAlives: true,
	}
	client := &http.Client{
		Timeout:   10 * time.Second,
		Transport: &tr,
	}
	for {
		requestWithClose(client)
		time.Sleep(time.Second * 1)
	}
}

func requestWithClose(client *http.Client) {
	resp, err := client.Get("http://10.100.219.9:8081")
	if err != nil {
		fmt.Printf("error occurred while fetching page, error: %s\n", err.Error())
		return
	}
	defer resp.Body.Close()
	c, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("Couldn't parse response body. %+v", err)
	}
	fmt.Println(string(c))
}
The web server leaves keep-alive enabled by default
package main

import (
	"fmt"
	"log"
	"net/http"
)

// RemoteAddr reveals whether the client is reusing a persistent connection:
// a reused connection keeps the same source port across requests.
func IndexHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Println("receive a request from:", r.RemoteAddr, r.Header)
	w.Write([]byte("ok"))
}

func main() {
	fmt.Printf("Starting server at port 8081\n")
	// net/http enables persistent connections by default
	if err := http.ListenAndServe(":8081", http.HandlerFunc(IndexHandler)); err != nil {
		log.Fatal(err)
	}
}
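The server log confirms these are short connections: every request arrives from a different client port and carries Connection: close.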
receive a request from: 10.22.34.48:54722 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54724 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54726 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54728 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54731 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54733 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54734 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54738 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54740 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54741 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54743 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54744 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.34.48:54746 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
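The pattern in the log above can be reproduced locally without my servers. The following is a minimal sketch, assuming a throwaway httptest server in place of the real one; countConnections is a name I made up, and it counts accepted connections through the server's ConnState hook.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countConnections starts a throwaway local server, sends n sequential GET
// requests, and returns how many TCP connections the server accepted.
func countConnections(n int, disableKeepAlives bool) (int64, error) {
	var accepted int64

	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			w.Write([]byte("ok"))
		}))
	// ConnState fires with StateNew once per accepted connection.
	srv.Config.ConnState = func(c net.Conn, state http.ConnState) {
		if state == http.StateNew {
			atomic.AddInt64(&accepted, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	tr := &http.Transport{DisableKeepAlives: disableKeepAlives}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, err
		}
		// Drain and close the body so the connection can be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return atomic.LoadInt64(&accepted), nil
}

func main() {
	for _, disabled := range []bool{false, true} {
		n, err := countConnections(5, disabled)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Printf("DisableKeepAlives=%v: 5 requests used %d connection(s)\n", disabled, n)
	}
}
```

With keep-alive left on, the five sequential requests should share one connection; with it disabled, each request should get its own, which matches the changing ports in the log.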
02
Who closes the connection first?
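I took it for granted that the client was the side that actively closes the connection, and reality proved me wrong.
One day an alert fired because the server had more than 300 connections in time_wait, which told me it is the server that actively terminates the connections.
In a regular TCP four-way close, the side that closes first enters the time_wait state and waits 2MSL before releasing the socket it occupies.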
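For anyone who wants to check the time_wait count without a monitoring system, here is a small sketch. It assumes a Linux host, where /proc/net/tcp lists IPv4 sockets and prints state code 06 for TIME_WAIT in the "st" column; countTimeWait is my own helper name.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// timeWaitState is the hex code the Linux kernel prints for TIME_WAIT
// in the "st" column of /proc/net/tcp.
const timeWaitState = "06"

// countTimeWait counts TIME_WAIT rows in text formatted like /proc/net/tcp.
func countTimeWait(procNetTCP string) int {
	count := 0
	for i, line := range strings.Split(procNetTCP, "\n") {
		if i == 0 {
			continue // skip the column header
		}
		fields := strings.Fields(line)
		// fields: sl, local_address, rem_address, st, ...
		if len(fields) > 3 && fields[3] == timeWaitState {
			count++
		}
	}
	return count
}

func main() {
	// Assumption: Linux only, IPv4 only (IPv6 sockets live in /proc/net/tcp6).
	data, err := os.ReadFile("/proc/net/tcp")
	if err != nil {
		fmt.Println("cannot read /proc/net/tcp:", err)
		return
	}
	fmt.Println("time_wait sockets:", countTimeWait(string(data)))
}
```

Running it on both the client machine and the server shows which side is piling up time_wait sockets, and therefore which side closes first.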
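Below is the TCP connection information captured with tcpdump on the server.
Red boxes 2 and 3 show the following: the server sends the TCP FIN first, and the client replies with an ACK to confirm it received the server's close notification. The client then sends its own FIN to say it is ready to close, and the server finally sends an ACK and enters the time_wait state, waiting 2MSL before closing the socket.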
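Note that red box 1 shows both ends of the TCP connection closing simultaneously [1]. In that case time_wait is left on both the client and the server, but this happens less often.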
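The two close sequences described above can be traced with a toy state machine. This is only a sketch of the teardown part of the TCP state diagram from RFC 793, seen from one endpoint; the function names nextState and run and the event strings are mine, and real stacks handle more cases (for example a FIN and ACK arriving in one segment).

```go
package main

import "fmt"

// nextState models only the connection-teardown part of the TCP state
// diagram (RFC 793). Unknown transitions leave the state unchanged.
func nextState(state, event string) string {
	switch state + "/" + event {
	case "ESTABLISHED/close": // we send FIN first: active close
		return "FIN_WAIT_1"
	case "FIN_WAIT_1/recv ACK":
		return "FIN_WAIT_2"
	case "FIN_WAIT_1/recv FIN": // the peer's FIN crossed ours: simultaneous close
		return "CLOSING"
	case "FIN_WAIT_2/recv FIN":
		return "TIME_WAIT"
	case "CLOSING/recv ACK":
		return "TIME_WAIT"
	case "ESTABLISHED/recv FIN": // the peer closed first: passive close
		return "CLOSE_WAIT"
	case "CLOSE_WAIT/close":
		return "LAST_ACK"
	case "LAST_ACK/recv ACK":
		return "CLOSED"
	}
	return state
}

// run starts from ESTABLISHED and applies the events in order.
func run(events ...string) string {
	state := "ESTABLISHED"
	for _, e := range events {
		state = nextState(state, e)
	}
	return state
}

func main() {
	fmt.Println("active closer ends in:      ", run("close", "recv ACK", "recv FIN"))
	fmt.Println("passive closer ends in:     ", run("recv FIN", "close", "recv ACK"))
	fmt.Println("simultaneous closer ends in:", run("close", "recv FIN", "recv ACK"))
}
```

Only the endpoint that sends its FIN before seeing the peer's FIN passes through time_wait, which is why a simultaneous close leaves the state on both machines.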
03
Talk is cheap, show me the source
In this case the server is the one that actively closes, so let's dig through the source of Go's HTTP server.
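•http.ListenAndServe(":8081")
•server.ListenAndServe()
•srv.Serve(ln)
•go c.serve(connCtx), which handles each accepted connection in its own goroutine
A simplified version of the code that serves requests on a connection: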
func (c *conn) serve(ctx context.Context) {
	c.remoteAddr = c.rwc.RemoteAddr().String()
	ctx = context.WithValue(ctx, LocalAddrContextKey, c.rwc.LocalAddr())
	defer func() {
		if !c.hijacked() {
			c.close() // when the goroutine serving this conn exits, it actively closes the underlying TCP connection
			c.setState(c.rwc, StateClosed, runHooks)
		}
	}()
	......
	// HTTP/1.x from here on.
	ctx, cancelCtx := context.WithCancel(ctx)
	c.cancelCtx = cancelCtx
	defer cancelCtx()
	c.r = &connReader{conn: c}
	c.bufr = newBufioReader(c.r)
	c.bufw = newBufioWriterSize(checkConnErrorWriter{c}, 4<<10)
	for {
		w, err := c.readRequest(ctx)
		.....
		serverHandler{c.server}.ServeHTTP(w, w.req)
		w.cancelCtx()
		if c.hijacked() {
			return
		}
		w.finishRequest()
		if !w.shouldReuseConnection() {
			if w.requestBodyLimitHit || w.closedRequestBodyEarly() {
				c.closeWriteAndWait()
			}
			return
		}
		c.setState(c.rwc, StateIdle, runHooks)
		c.curReq.Store((*response)(nil))
		if !w.conn.server.doKeepAlives() {
			// We're in shutdown mode. We might've replied
			// to the user without "Connection: close" and
			// they might think they can send another
			// request, but such is life with HTTP/1.1.
			return
		}
		if d := c.server.idleTimeout(); d != 0 {
			c.rwc.SetReadDeadline(time.Now().Add(d))
			if _, err := c.bufr.Peek(4); err != nil {
				return
			}
		}
		c.rwc.SetReadDeadline(time.Time{})
	}
}
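The parts to focus on:
① The for loop means the server tries to reuse this conn to handle the requests that keep arriving.
② w.shouldReuseConnection() = false means the client's Connection: close header was read and closeAfterReply was set to true. The code returns out of the for loop and the goroutine is about to end; before it ends, the deferred function runs and closes the connection with c.close().
③ If w.shouldReuseConnection() = true, the connection state is set to idle and the for loop continues, reusing the connection to handle subsequent requests.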
......
// Close the connection.
func (c *conn) close() {
	c.finalFlush()
	c.rwc.Close()
}
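The behavior traced through the source can also be observed from the outside. The sketch below assumes a local httptest server standing in for my real one: the client sends one raw HTTP/1.1 request and then only reads, never closing its own side, so seeing EOF means the server sent its FIN first. The names serverClosesFirst and probe are hypothetical helpers, and the one-second read deadline is an arbitrary choice.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// serverClosesFirst sends one raw request and then only reads. EOF means the
// server closed first; a read timeout means it kept the connection open.
func serverClosesFirst(addr string, sendClose bool) (bool, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return false, err
	}
	defer conn.Close()

	req := "GET / HTTP/1.1\r\nHost: " + addr + "\r\n"
	if sendClose {
		req += "Connection: close\r\n"
	}
	req += "\r\n"
	if _, err := io.WriteString(conn, req); err != nil {
		return false, err
	}

	conn.SetReadDeadline(time.Now().Add(time.Second))
	_, err = io.ReadAll(conn) // returns a nil error only when it hits EOF
	if err == nil {
		return true, nil
	}
	var ne net.Error
	if errors.As(err, &ne) && ne.Timeout() {
		return false, nil
	}
	return false, err
}

// probe runs serverClosesFirst against a throwaway local net/http server.
func probe(sendClose bool) (bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			w.Write([]byte("ok"))
		}))
	defer srv.Close()
	return serverClosesFirst(srv.Listener.Addr().String(), sendClose)
}

func main() {
	for _, sendClose := range []bool{true, false} {
		closed, err := probe(sendClose)
		if err != nil {
			fmt.Println("probe failed:", err)
			return
		}
		fmt.Printf("Connection: close sent=%v, server closed first=%v\n", sendClose, closed)
	}
}
```

When the request carries Connection: close, the server should answer and then close on its own, which is the c.close() call in the deferred function; without the header the connection should stay open until the read deadline expires.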
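04
What I learned
1. The textbook TCP four-way close
2. The effect of short connections: the side that closes first accumulates time_wait sockets on its machine, and each one waits 2MSL before the socket is released
3. A source-level look at how Go's HTTP keep-alive reuses TCP connections
4. How to capture packets with tcpdump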
References
[1] Simultaneous close of both TCP ends: https://blog.csdn.net/q1007729991/article/details/69950255