First, prepare a log file, app.log, containing roughly 60,000 lines of data.
from loguru import logger
import os


def sample1():
    log_path = get_log_path()
    # readlines() loads every line of the file into a list
    with open(log_path, "r") as f:
        list_logs = f.readlines()
    logger.info("length of app.logs: {}".format(len(list_logs)))


# get project path
def get_project_path():
    return os.path.dirname(os.path.dirname(os.path.dirname(__file__)))


# get log path
def get_log_path():
    return os.path.join(get_project_path(), "logs", "app.log")


if __name__ == "__main__":
    sample1()
Output:
(.venv) [gateman@manjaro-x13 python_common_import]$ /home/gateman/Projects/python/python_common_import/.venv/bin/python /home/gateman/Projects/python/python_common_import/src/generator/gen_sample6.py
2024-05-13 01:16:19.932 | INFO | __main__:sample1:9 - length of app.logs: 62285
Copying app.log to output.log the ordinary way
We modify the file and add a function, sample2(), to do this.
from loguru import logger
import os
from src.decorator.sum_info import sum_info


@sum_info
def sample2():
    log_path = get_log_path()
    # read the whole file into a list first
    with open(log_path, "r") as f:
        list_logs = f.readlines()
    output_path = get_output_path()
    # then write the list out line by line
    with open(output_path, "w") as f:
        for i in list_logs:
            f.write(i)
    logger.info("moved logs to output.log")


# get project path
def get_project_path():
    return os.path.dirname(os.path.dirname(os.path.dirname(__file__)))


# get log path
def get_log_path():
    return os.path.join(get_project_path(), "logs", "app.log")


# get output path
def get_output_path():
    return os.path.join(get_project_path(), "logs", "output.log")


if __name__ == "__main__":
    sample2()
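This function uses f.readlines() to read the entire file into a list in one go,
then loops over that list and writes each line to another file.
Let's look at the memory usage: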
(.venv) [gateman@manjaro-x13 python_common_import]$ /home/gateman/Projects/python/python_common_import/.venv/bin/python /home/gateman/Projects/python/python_common_import/src/generator/gen_sample5.py
2024-05-13 01:43:55.288 | INFO | src.decorator.print_time:wrapper:10 - Start time of sample2 is 2024-05-13 01:43:55
2024-05-13 01:43:55.343 | INFO | __main__:sample2:16 - moved logs to output.log
2024-05-13 01:43:55.351 | INFO | src.decorator.print_mem:wrapper:14 - Current memory usage is 0.000866MB; Peak was 9.868371MB
2024-05-13 01:43:55.352 | INFO | src.decorator.print_time:wrapper:13 - End time of sample2 is 2024-05-13 01:43:55
2024-05-13 01:43:55.352 | INFO | src.decorator.print_time:wrapper:14 - Time used of sample2 is 0.06403422355651855 seconds
The peak memory is just over 9 MB (about 9.87 MB), because the whole file is read into memory at once.
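The sum_info decorator used above is not listed in this post. Judging from the log lines, it reports elapsed time plus current and peak memory. Below is a minimal sketch of a decorator that takes similar measurements, assuming it is built on Python's standard tracemalloc module. The name measure, the helper build_list and the message format are hypothetical and are not the author's code.

```python
import functools
import time
import tracemalloc


def measure(func):
    """Hypothetical stand-in for sum_info: report elapsed time and peak memory."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracemalloc.start()              # begin tracking Python allocations
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            current, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            mb = 1024 * 1024
            print(f"Current memory usage is {current / mb:.6f}MB; "
                  f"Peak was {peak / mb:.6f}MB")
            print(f"Time used of {func.__name__} is {elapsed} seconds")

    return wrapper


@measure
def build_list(n):
    # Holds n integers in memory at once, so the peak grows with n.
    return list(range(n))


result = build_list(100_000)
```

tracemalloc only counts memory allocated through Python's allocator while tracing is active, so the reported peak reflects the decorated call rather than the whole process.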
Using an iterator
We modify the file again and add a function, sample3(), to do this.
from loguru import logger
import os
from src.decorator.sum_info import sum_info


@sum_info
def sample3():
    log_path = get_log_path()
    output_path = get_output_path()
    count = 0
    with open(log_path, "r") as f:
        # note: "a" appends to output.log; sample2 used "w", which overwrites
        with open(output_path, "a") as f2:
            # iterate over the file object directly, one line at a time
            for i in f:
                f2.write(i)
                count += 1
    logger.info("moved {} logs to output.log".format(count))


# get project path
def get_project_path():
    return os.path.dirname(os.path.dirname(os.path.dirname(__file__)))


# get log path
def get_log_path():
    return os.path.join(get_project_path(), "logs", "app.log")


# get output path
def get_output_path():
    return os.path.join(get_project_path(), "logs", "output.log")


if __name__ == "__main__":
    sample3()
Because f is actually a TextIOWrapper, which is an iterable that yields one line at a time,
we can iterate over it directly with for ... in.
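To make the point concrete, here is a small sketch showing that a text file object is its own iterator, so next() and a for loop both pull lines from it lazily. The temporary file and its three lines are made up for illustration.

```python
import os
import tempfile

# Write a tiny throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("line 1\nline 2\nline 3\n")
    path = tmp.name

with open(path, "r") as f:
    same_object = iter(f) is f       # a file object is its own iterator
    first = next(f)                  # reads only the next line
    rest = [line for line in f]      # the loop continues where next() stopped

os.remove(path)

print(type(f).__name__, same_object, repr(first), rest)
```

Since iteration consumes the file as it goes, a second loop over the same open file would yield nothing unless the file is rewound with f.seek(0).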
Memory usage of this approach:
(.venv) [gateman@manjaro-x13 python_common_import]$ /home/gateman/Projects/python/python_common_import/.venv/bin/python /home/gateman/Projects/python/python_common_import/src/generator/gen_sample7.py
2024-05-13 01:50:33.133 | INFO | src.decorator.print_time:wrapper:10 - Start time of sample3 is 2024-05-13 01:50:33
2024-05-13 01:50:33.229 | INFO | __main__:sample3:16 - moved 62320 logs to output.log
2024-05-13 01:50:33.229 | INFO | src.decorator.print_mem:wrapper:14 - Current memory usage is 0.00086MB; Peak was 0.041176MB
2024-05-13 01:50:33.230 | INFO | src.decorator.print_time:wrapper:13 - End time of sample3 is 2024-05-13 01:50:33
2024-05-13 01:50:33.230 | INFO | src.decorator.print_time:wrapper:14 - Time used of sample3 is 0.09714841842651367 seconds
The peak is only about 0.04 MB, roughly 240 times lower than with readlines().
That is a huge memory saving!
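The comparison can be reproduced without the project files. The following is a rough sketch, assuming tracemalloc as the measuring tool and a generated temporary file in place of app.log. The helper names peak_bytes, read_all and read_streaming are hypothetical, and the exact numbers will differ from the logs above.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(func, path):
    """Run func(path) under tracemalloc and return the peak traced size in bytes."""
    tracemalloc.start()
    func(path)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def read_all(path):
    with open(path, "r") as f:
        lines = f.readlines()        # every line is held in one list
    return len(lines)


def read_streaming(path):
    count = 0
    with open(path, "r") as f:
        for _ in f:                  # only the current line and a buffer are alive
            count += 1
    return count


# Hypothetical test data: 50,000 short lines in a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    for i in range(50_000):
        tmp.write(f"2024-05-13 INFO sample log line number {i}\n")
    path = tmp.name

peak_all = peak_bytes(read_all, path)
peak_stream = peak_bytes(read_streaming, path)
os.remove(path)

print(f"readlines peak: {peak_all / 1024 / 1024:.3f} MB")
print(f"streaming peak: {peak_stream / 1024 / 1024:.3f} MB")
```

The readlines() peak grows with the size of the file, while the streaming peak stays roughly constant, which is why the gap widens for larger logs.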