Garbled response body when logging with OpenResty

Background

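We recently launched a feature in which OpenResty writes request logs via syslog and Logstash then ships them to ES. The problem did not show up during testing or at launch. While routinely reviewing logs, we found that Logstash was emitting error entries with the message "Error parsing json", and each such error meant the corresponding request log entry was lost.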

Investigation

1. We searched syslog for the entries that triggered "Error parsing json". One such entry looked like this:

{"request": {},"api": {},"upstream_uri": "","response": {"body": " \b }Ϋٰ ȿ¢³H K>-¤ZŨw c¸±H½񨻴¥𼰮ѝ:h٥lQ¶܊ 𩥹\/𢦫A骩£𐵽I§Heƣ J¥ª y\bYHɬ 晲̼.^¢~&Ԗ< Ŝu0004³P v߯𜱽2򣹩9 §𛳰004YRL0Üse񛳰018yׂ򉛵000f ÿ D\b\\ì蛵0006ƞ󛳰018Ġ`OEѐ𶛵001d㐵 y´§ ꨜu0017~~И雿쮺]-¨ 򛛲LH󿶌kl ࢇcL\n{¦ G~׮gy Keą±؜u0002L3\bG@¨#U¾ :Ŧ ,QL¹(=»{ӓ{mm¶[\/7!&c?ժ łcH vxXLu Ǚ¹_ǃ̢򣹽g>U¶سL-Pò𤦡¾Мu001c2¸\f¿OnGŧ⠑矸 I0k̾lЇ¶.龧d0븳 q 򶪰 K7d\t׬ō ^A±%ͨ G¥J]a˜u0016 ƹ�g 擁E5®4[*-¨£\f傜u0012T©+̖៊8r¬iEivn\r»噠 ±ቃྊ;󳮰07¨;_n% ","headers": {"content-type": "text\/plain;charset=UTF-8","date": "Wed, 02 Jan 2019 05:34:43 GMT","connection": "close","x-ratelimit-limit-second": "700","vary": "Accept-Encoding","content-encoding": "gzip","via": "kong\/0.14.0","x-kong-proxy-latency": "4","x-ratelimit-remaining-second": "699","transfer-encoding": "chunked","x-kong-upstream-latency": "2","x-kong-upstream-status": "200","server": "nginx"},"status": 200,"size": "1012"},"started_at": 1546407283066}

As you can see, response.body is garbled. response.body holds the content of the response. Running this JSON through a JSON validator also reports errors.
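One detail visible in the log entry itself is the "content-encoding": "gzip" response header. As a purely diagnostic aid (it is not part of the original investigation), a recorded body can be checked for the gzip magic bytes 0x1f 0x8b defined in RFC 1952. The helper name looks_like_gzip below is hypothetical, and this is only a minimal sketch in plain Lua, assuming the body is available as a raw byte string:

```lua
-- Hypothetical helper (not from the original code): returns true when
-- `body` starts with the two gzip magic bytes 0x1f 0x8b (RFC 1952).
local function looks_like_gzip(body)
    if type(body) ~= "string" or #body < 2 then
        return false
    end
    local b1, b2 = string.byte(body, 1, 2)
    return b1 == 0x1f and b2 == 0x8b
end

-- Example inputs: a gzip-style header prefix and an ordinary JSON text.
local gzip_prefix = string.char(0x1f, 0x8b, 0x08)
local json_text = '{"status": 200}'

print(looks_like_gzip(gzip_prefix))
print(looks_like_gzip(json_text))
```

If such a check returns true for a logged body, the bytes being written are compressed data rather than text, which would be consistent with a JSON parser rejecting the line.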

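2. We called the interface ourselves and got a normal response, yet what had been logged was garbled, so we concluded the problem was most likely in how OpenResty records the log. We currently write the log in OpenResty's log phase and handle chunked encoding specially (the body is not recorded if it is larger than 1 KB). The logging code is as follows: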

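-- Cap on the buffered body; assumed to be 1 KB, per the description above.
local max_body_size = 1024

function body_filter()
    local headers = ngx.resp.get_headers()
    local content_type = headers['content-type']
    -- The source condition read "headers['content-type'] and then"; the clause after "and" is missing.
    if content_type then
        -- Plain substring search (4th argument true), no Lua pattern matching.
        if string.find(content_type, "application/json", 1, true) or string.find(content_type, "text/plain", 1, true) then
            local chunk = ngx.arg[1] or ""
            if string.len(ngx.ctx.response_temp or "") < max_body_size then
                -- Append the current chunk and expose the buffer to the log phase.
                ngx.ctx.response_temp = (ngx.ctx.response_temp or "") .. chunk
                ngx.ctx.response_body = ngx.ctx.response_temp
            else
                -- Body is too large: record nothing.
                ngx.ctx.response_body = ""
            end
        end
    end
end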
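The buffering logic of the filter above can be exercised outside nginx to see how the size limit behaves. The sketch below is an assumption-laden stand-in, not the production code: ctx is a plain table playing the role of ngx.ctx, the chunk argument plays the role of ngx.arg[1], and the names accumulate and MAX_BODY_SIZE are hypothetical.

```lua
-- Assumed limit: "1k" from the description of the logging code.
local MAX_BODY_SIZE = 1024

-- Mirrors the filter's branch: append while the buffer is still under
-- the limit, otherwise blank out the body that will be logged.
local function accumulate(ctx, chunk, max_body_size)
    local buffered = ctx.response_temp or ""
    if #buffered < max_body_size then
        ctx.response_temp = buffered .. (chunk or "")
        ctx.response_body = ctx.response_temp
    else
        ctx.response_body = ""
    end
end

local ctx = {}
accumulate(ctx, string.rep("a", 600), MAX_BODY_SIZE)  -- buffer: 600 bytes
accumulate(ctx, string.rep("b", 600), MAX_BODY_SIZE)  -- buffer: 1200 bytes
local body_after_two_chunks = #ctx.response_body       -- 1200
accumulate(ctx, "c", MAX_BODY_SIZE)                    -- over limit: body cleared

print(body_after_two_chunks, #ctx.response_temp, ctx.response_body == "")
```

Because the length is checked before the chunk is appended, the buffer can exceed the limit by up to one chunk, and the logged body only becomes empty on the next chunk that arrives after the limit has been passed. Note also that the logic appends whatever bytes it is handed; it does not look at how those bytes are encoded.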

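3. We wanted to add extra logging in the test environment and call the production interface from there, but the production interface is IP-restricted and cannot be reached from the test environment, so we dropped this approach.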

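4. We asked the interface owner to copy the production data to the test environment and called the interface there, but the logged body was normal, with no garbling.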

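5. Since the problem could not be reproduced, it was hard to get any further in the test environment. With no other option left, we fell back on the weapon of last resort: packet capture.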

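6. We captured packets on the production server with tcpdump -Xvvenn -iany -w /tmp/20181228.pcap net [ip] and net [ip] and port [port], then downloaded the pcap file and analyzed it in Wireshark to locate the problematic request, as shown in the figure below: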