Problem overview
The Kafka broker process dies at irregular intervals with ERROR Failed to clean up log for __consumer_offsets-30 in dir /tmp/kafka-logs due to IOException (kafka.server.LogDirFailureChannel). The full error is:
[2020-12-07 16:12:36,803] ERROR Failed to clean up log for __consumer_offsets-7 in dir /tmp/kafka-logs due to IOException (kafka.server.LogDirFailureChannel)
java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-7/00000000000000000000.log
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
	at java.nio.file.Files.move(Files.java:1395)
	at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:913)
	at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:227)
	at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:495)
	at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2230)
	at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2230)
	at scala.collection.immutable.List.foreach(List.scala:333)
	at kafka.log.Log.deleteSegmentFiles(Log.scala:2230)
	at kafka.log.Log.$anonfun$replaceSegments$6(Log.scala:2300)
	at kafka.log.Log.$anonfun$replaceSegments$6$adapted(Log.scala:2295)
	at scala.collection.immutable.List.foreach(List.scala:333)
	at kafka.log.Log.replaceSegments(Log.scala:2295)
	at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:606)
	at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:531)
	at kafka.log.Cleaner.doClean(LogCleaner.scala:530)
	at kafka.log.Cleaner.clean(LogCleaner.scala:504)
	at kafka.log.LogCleaner$CleanerThread.cleanLog(LogCleaner.scala:373)
	at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:345)
	at kafka.log.LogCleaner$CleanerThread.tryCleanFilthiestLog(LogCleaner.scala:325)
	at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:314)
	at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
	Suppressed: java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-7/00000000000000000000.log -> /tmp/kafka-logs/__consumer_offsets-7/00000000000000000000.log.deleted
		at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
		at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
		at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
		at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
		at java.nio.file.Files.move(Files.java:1395)
		at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:910)
		... 19 more
Problem analysis
The exception says the segment file no longer exists. Linux periodically cleans up old files under /tmp, and this broker's log directory was placed at /tmp/kafka-logs, so the cleanup job deleted Kafka's segment files out from under it. When the log cleaner later tried to rename the missing segment it hit the IOException above, and the broker marked the log directory as failed and shut down. Checking the broker configuration confirms the log directory lives under /tmp:
grep log.dirs /opt/kafka_2.12-2.3.0/config/server.properties
log.dirs=/tmp/kafka-logs
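On CentOS 7 (the systemd case discussed below) the retention rule that triggers this can be inspected directly. The 10d age shown in the comment is what a stock install ships with and may differ on a tuned system:

```shell
# Show the active clean-up rules for /tmp (comment lines stripped).
grep -v '^#' /usr/lib/tmpfiles.d/tmp.conf
# A stock CentOS 7 system carries a rule of the form
#   v /tmp 1777 root root 10d
# i.e. anything under /tmp untouched for 10 days is removed.
```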
Solutions
Option 1: change the log directory to one outside /tmp, then restart Kafka
log.dirs=/opt/kafka_2.12-2.3.0/kafka-logs/
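The edit can be sketched as a one-line sed on server.properties. It is shown here against a sample file so it can be tried safely; on the broker you would apply the same sed to /opt/kafka_2.12-2.3.0/config/server.properties, and the sample broker.id value is made up for the demo:

```shell
# Sample file standing in for config/server.properties on the broker.
cat > server.properties.sample <<'EOF'
broker.id=0
log.dirs=/tmp/kafka-logs
EOF

# Point log.dirs at a directory that is not subject to /tmp clean-up.
sed -i 's|^log.dirs=.*|log.dirs=/opt/kafka_2.12-2.3.0/kafka-logs|' server.properties.sample

grep '^log.dirs' server.properties.sample
# → log.dirs=/opt/kafka_2.12-2.3.0/kafka-logs
```

Stop the broker before editing, and if the old segment files still exist, move them into the new directory (e.g. mv /tmp/kafka-logs/* /opt/kafka_2.12-2.3.0/kafka-logs/) before restarting, so existing topic data and consumer offsets are preserved.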
Option 2: add the Kafka log directory to the clean-up exclusion list
CentOS 7: clean-up of /tmp is handled by systemd (systemd-tmpfiles), configured under /usr/lib/tmpfiles.d. Edit tmp.conf and add an exclusion for the Kafka log directory. Note that case matters in tmpfiles.d: lowercase x excludes the path and everything under it from age-based clean-up, while uppercase X protects only the directory itself and not the segment files inside it, so use lowercase x here:
# keep the Kafka log files from being deleted
x /tmp/kafka-logs
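One way to apply this so the change also survives systemd package updates is a drop-in under /etc/tmpfiles.d, which takes precedence over the copy in /usr/lib/tmpfiles.d. A sketch, to be run as root:

```shell
# /etc/tmpfiles.d/tmp.conf overrides the file shipped in /usr/lib/tmpfiles.d,
# so local edits are not lost when the systemd package is upgraded.
cp /usr/lib/tmpfiles.d/tmp.conf /etc/tmpfiles.d/tmp.conf

# Lowercase x excludes the directory and its contents from age-based clean-up.
echo 'x /tmp/kafka-logs' >> /etc/tmpfiles.d/tmp.conf
```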
CentOS 6: /tmp is cleaned by tmpwatch, scheduled daily by cron; the script is /etc/cron.daily/tmpwatch. tmpwatch does not read tmpfiles.d configuration, so the exclusion goes on the tmpwatch command line instead: add -x /tmp/kafka-logs (the --exclude option) to the invocation in that script that cleans /tmp.
# keep the Kafka log files from being deleted
-x /tmp/kafka-logs
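For reference, the edited /etc/cron.daily/tmpwatch would look roughly like this. This is a sketch: the exact flag list, existing -x entries, and ages vary between CentOS 6 minor releases, so add the one -x entry to whatever invocation is already in the script rather than copying this verbatim:

```shell
#!/bin/sh
# Sketch of /etc/cron.daily/tmpwatch with the Kafka exclusion added.
flags=-umc
# -x / --exclude tells tmpwatch to skip the given path entirely.
/usr/sbin/tmpwatch "$flags" -x /tmp/.X11-unix -x /tmp/.ICE-unix \
    -x /tmp/kafka-logs 10d /tmp
```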