Machine-Learning-Based RUL Prediction for C-MAPSS Turbofan Engines

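The experiments use turbofan engine degradation data simulated with NASA's Commercial Modular Aero-Propulsion System Simulation (C-MAPSS). The data include parameters for components such as the fan, turbines, and compressors. The C-MAPSS datasets can simulate altitudes from sea level to 42,000 ft, speeds from Mach 0 to Mach 0.9, and throttle lever angles from 60 to 100. In each run, a specified fault is introduced at some point in time and persists through the remaining cycles, so the moment at which the fault appears can be determined. For this reason the dataset is widely used as a benchmark for predicting the remaining useful life (RUL) of turbofan engines.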

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import linear_model
from sklearn.metrics import mean_squared_error, r2_score,mean_absolute_percentage_error
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import GradientBoostingRegressor
import mplcyberpunk as cyberpunk

Importing Data

index_names = ['id', 'cycle']
setting_names = ['setting_1', 'setting_2', 'setting_3']
sensor_list=[ "(Fan inlet temperature) (◦R)",
"(LPC outlet temperature) (◦R)",
"(HPC outlet temperature) (◦R)",
"(LPT outlet temperature) (◦R)",
"(Fan inlet Pressure) (psia)",
"(bypass-duct pressure) (psia)",
"(HPC outlet pressure) (psia)",
"(Physical fan speed) (rpm)",
"(Physical core speed) (rpm)",
"(Engine pressure ratio(P50/P2)",
"(HPC outlet Static pressure) (psia)",
"(Ratio of fuel flow to Ps30) (pps/psia)",
"(Corrected fan speed) (rpm)",
"(Corrected core speed) (rpm)",
"(Bypass Ratio) ",
"(Burner fuel-air ratio)",
"(Bleed Enthalpy)",
"(Required fan speed)",
"(Required fan conversion speed)",
"(High-pressure turbines Cool air flow)",
"(Low-pressure turbines Cool air flow)" ]
columns = index_names + setting_names + sensor_list
#print(len(columns))
# Raw strings avoid the invalid-escape warning for '\s'; the files have no header row
train = pd.read_csv('train_FD001.txt',sep=r'\s+',header=None,names=columns)
test = pd.read_csv('test_FD001.txt',sep=r'\s+',header=None,index_col=False,names=columns)
test_res = pd.read_csv('RUL_FD001.txt',sep=r"\s+",header=None)

Data Cleaning

df_train=train.copy()
df_test=test.copy()
print("Shape of dataset",df_train.shape)
print("Null values in dataset",df_train.isnull().sum())

Shape of dataset (20631, 26)
Null values in dataset id 0
cycle 0
setting_1 0
setting_2 0
setting_3 0
(Fan inlet temperature) (◦R) 0
(LPC outlet temperature) (◦R) 0
(HPC outlet temperature) (◦R) 0
(LPT outlet temperature) (◦R) 0
(Fan inlet Pressure) (psia) 0
(bypass-duct pressure) (psia) 0
(HPC outlet pressure) (psia) 0
(Physical fan speed) (rpm) 0
(Physical core speed) (rpm) 0
(Engine pressure ratio(P50/P2) 0
(HPC outlet Static pressure) (psia) 0
(Ratio of fuel flow to Ps30) (pps/psia) 0
(Corrected fan speed) (rpm) 0
(Corrected core speed) (rpm) 0
(Bypass Ratio) 0
(Burner fuel-air ratio) 0
(Bleed Enthalpy) 0
(Required fan speed) 0
(Required fan conversion speed) 0
(High-pressure turbines Cool air flow) 0
(Low-pressure turbines Cool air flow) 0
dtype: int64

x = df_train[index_names].groupby('id').max()
#print(x)
plt.figure(figsize=(20,70))
ax= x['cycle'].plot(kind='barh',width=0.8)
plt.title('Engines LifeTime',size=40)
plt.xlabel('Cycles',size=30)
plt.xticks(size=15)
plt.ylabel('Engine_ID',size=30)
plt.yticks(size=15)
plt.grid(True)
plt.tight_layout()
plt.show()

sns.set_theme(style="whitegrid")
sns.displot(x['cycle'], kde=True, bins=20, height=6, aspect=2, color='blue')
plt.gca().lines[0].set_color('lime')
cyberpunk.make_lines_glow()  
plt.xlabel('Max Cycle')
plt.show()

Data Filtering and Treatment

# RUL of a row = last observed cycle of that engine minus the current cycle
max_cycle = df_train.groupby('id')['cycle'].transform('max')
df_train['RUL'] = (max_cycle - df_train['cycle']).astype(int)
train['RUL'] = df_train['RUL']
train

rem_col = list()
# Flag every column whose correlation with RUL is ~0 or NaN (constant columns give NaN)
for i in range(df_train.shape[1]-1):
    cor = df_train["RUL"].corr(df_train.iloc[:,i])
    if -0.001<=cor<=0.001 or np.isnan(cor):
        rem_col.append(df_train.columns[i])
print(rem_col)

['setting_3', '(Fan inlet temperature) (◦R)', '(Fan inlet Pressure) (psia)', '(Engine pressure ratio(P50/P2)', '(Burner fuel-air ratio)', '(Required fan speed)', '(Required fan conversion speed)']

plt.figure(figsize=(18,18))
sns.set_style("whitegrid", {"axes.facecolor": ".0"})
df_cluster2 = df_train.corr()
plot_kws={"s": 1}
sns.heatmap(train.corr(),cmap='RdYlBu',annot=True,linecolor='lightgrey').set_facecolor('white')

Since some columns have almost zero (or undefined) correlation with RUL, we remove them from the dataset before training the models.

df_train = df_train.drop(columns=rem_col)
df_train
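
The filtering step above can be sketched as a small reusable helper. This is a minimal illustration on made-up data, not the FD001 files; the function name `low_corr_columns`, the `tol` parameter, and the toy column names are assumptions introduced here. It shows why a column that never changes is flagged: its variance is zero, so its correlation with RUL is NaN.

```python
import numpy as np
import pandas as pd

def low_corr_columns(df, target, tol=0.001):
    """Return feature columns whose correlation with `target` is ~0 or undefined."""
    flagged = []
    for col in df.columns:
        if col == target:
            continue
        cor = df[target].corr(df[col])
        # A constant column has zero variance, so its correlation is NaN
        if np.isnan(cor) or abs(cor) <= tol:
            flagged.append(col)
    return flagged

# Toy data (hypothetical values, not taken from FD001)
toy = pd.DataFrame({
    "RUL":       [4, 3, 2, 1, 0],
    "const":     [518.67] * 5,               # never changes
    "degrading": [1.0, 1.5, 2.1, 2.9, 4.0],  # rises as RUL falls
})
flagged = low_corr_columns(toy, "RUL")
print(flagged)
```

Only the constant column is flagged here; the degrading signal is strongly (negatively) correlated with RUL and is kept.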

Heatmap after removing unwanted columns

plt.figure(figsize=(18,18))
sns.set_style("whitegrid", {"axes.facecolor": ".0"})
df_cluster2 = df_train.corr()
plot_kws={"s": 1}
sns.heatmap(df_train.corr(),cmap='RdYlBu',annot=True,linecolor='lightgrey').set_facecolor('white')

Heatmap of the dataset showing only correlations whose absolute value is at least 0.9.

threshold = 0.90
plt.figure(figsize=(10,10))
sns.set_style("whitegrid", {"axes.facecolor": ".0"})
df_cluster2 = df_train.corr()
mask = df_cluster2.where((abs(df_cluster2) >= threshold)).isna()
plot_kws={"s": 1}
sns.heatmap(df_cluster2,cmap='RdYlBu',annot=True,mask=mask,linewidths=0.2,linecolor='lightgrey').set_facecolor('white')

rem_col_new = []
y =[]
# For every pair (i, j) with |corr| >= 0.9, mark the later column j for removal
for i in range(df_train.shape[1]):
    for j in range(i+1,df_train.shape[1]):
        corr = df_train.iloc[:,i].corr(df_train.iloc[:,j])
        if abs(corr)>=0.9:
            rem_col_new.append(df_train.columns[j])
rem_col_new

df_train = df_train.drop(columns=rem_col_new)
df_train
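
As a hedged alternative to the nested loop above, the same idea can be expressed with the upper triangle of the correlation matrix, which checks each pair once and never lists a column twice. This is a sketch on made-up data; the helper name `highly_correlated` and the toy columns are assumptions. Unlike the loop above, it leaves the target column out of the comparison.

```python
import numpy as np
import pandas as pd

def highly_correlated(df, target, threshold=0.9):
    """Return features whose |corr| with an earlier feature is >= threshold."""
    corr = df.drop(columns=[target]).corr().abs()
    # Keep only the strict upper triangle so each pair is checked once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    return [col for col in upper.columns if (upper[col] >= threshold).any()]

# Toy data (hypothetical values, not taken from FD001)
toy = pd.DataFrame({
    "a":   [1.0, 2.0, 3.0, 4.0, 5.0],
    "b":   [2.1, 3.9, 6.2, 8.0, 9.9],   # nearly 2 * a
    "c":   [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to a
    "RUL": [4, 3, 2, 1, 0],
})
to_drop = highly_correlated(toy, "RUL")
print(to_drop)
```

In this toy frame only `b` is returned, because it duplicates the information already carried by `a`.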

Columns that have only 2 unique values, with a frequency ratio above 0.95, are removed from the dataset before training the models.

uniq = list(df_train.nunique())
rem_column = []
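for i in range(len(uniq)):
    if uniq[i]==2:
        x_un = df_train.iloc[:,i].unique()
        # x1 and x2 are the shares of rows taking each of the two values
        x1 = ((df_train[df_train.columns[i]]==x_un[0]).sum())/df_train.shape[0]
        x2 = ((df_train[df_train.columns[i]]==x_un[1]).sum())/df_train.shape[0]
        # x1/x2 and x2/x1 are reciprocals, so one of them is always >= 1:
        # as written, this test holds for every two-valued column
        if x1/x2>0.95 or x2/x1>0.95:
            rem_column.append(df_train.columns[i])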
rem_column

df_train = df_train.drop(columns=rem_column)
df_train
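
One possible reading of the rule above is that a two-valued column should be dropped only when its most common value covers at least 95% of the rows. The sketch below implements that reading on made-up data; the helper name `near_constant_binary`, the `min_share` parameter, and the toy columns are assumptions, and this is not necessarily the test the original code performs.

```python
import pandas as pd

def near_constant_binary(df, min_share=0.95):
    """Return two-valued columns whose most common value covers >= min_share of rows."""
    flagged = []
    for col in df.columns:
        # value_counts sorts by frequency, so the first entry is the dominant value
        shares = df[col].value_counts(normalize=True)
        if len(shares) == 2 and shares.iloc[0] >= min_share:
            flagged.append(col)
    return flagged

# Toy data (hypothetical values, not taken from FD001)
toy = pd.DataFrame({
    "mostly_one": [21.61] * 39 + [21.60],  # 97.5% / 2.5%
    "balanced":   [0, 1] * 20,             # 50% / 50%
    "many":       list(range(40)),         # more than two values
})
dropped = near_constant_binary(toy)
print(dropped)
```

Under this reading a balanced binary column is kept, while a column that is almost always the same value is dropped.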

from sklearn.metrics import mean_squared_error, r2_score

def error(test_res, y_pred):
    # Print MSE, RMSE and R2 for one model and return [R2, RMSE]
    mse = mean_squared_error(test_res, y_pred)
    rmse = np.sqrt(mse)
    r2 = r2_score(test_res, y_pred)
    print(f"Mean squared error: {mse}")
    print(f"Root mean squared error: {rmse}")
    print(f"R-squared score: {r2}")
    return [r2,rmse]

r_2_score = []
rmse = []
Method = []

Linear Regression without dropping features/columns.

drop_col = ['RUL']
X_train=train.drop(columns=drop_col).copy()
Y_train = df_train["RUL"]
reg = linear_model.LinearRegression()
reg.fit(X_train, Y_train)

LinearRegression()

X_test=test.copy()
ans = reg.predict(X_test)
df_test["Pred_RUL_LR_wrf"] = ans
rem = ["Pred_RUL_LR_wrf"]

y_pred_LR_wrf = list(df_test.groupby(['id'])['Pred_RUL_LR_wrf'].min())
error_LR = error(test_res, y_pred_LR_wrf)
r_2_score.append(error_LR[0])
rmse.append(error_LR[1])
Method.append("LR_wrf")

Mean squared error: 894.8305578921604
Root mean squared error: 29.91371855674517
R-squared score: 0.48181926539666886
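
The evaluation above reduces the per-cycle predictions of each test engine to a single number by taking the minimum. Since `RUL_FD001.txt` gives the true RUL at the last recorded cycle of each test engine, another option is to use the prediction made at that last cycle. The sketch below compares the two on made-up predictions; the column name `Pred_RUL` and all values are hypothetical.

```python
import pandas as pd

# Toy per-cycle predictions for two engines (hypothetical values)
pred = pd.DataFrame({
    "id":       [1, 1, 1, 2, 2, 2],
    "cycle":    [1, 2, 3, 1, 2, 3],
    "Pred_RUL": [120.0, 95.0, 101.0, 60.0, 58.0, 55.0],
})

# Approach used in this post: smallest prediction over the engine's history
by_min = pred.groupby("id")["Pred_RUL"].min()

# Alternative: prediction at the last observed cycle of each engine
last_rows = pred.sort_values(["id", "cycle"]).groupby("id").tail(1)
by_last = last_rows.set_index("id")["Pred_RUL"]

print(by_min.tolist())   # [95.0, 55.0]
print(by_last.tolist())  # [101.0, 55.0]
```

The two agree whenever the predictions decrease monotonically (engine 2) and differ when they do not (engine 1).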

Applying Linear Regression after removing features.

x_train=df_train.drop(columns=['RUL']).copy()
y_train = df_train["RUL"]
reg = linear_model.LinearRegression()
reg.fit(x_train, y_train)

y_test = df_test.drop(columns = rem+rem_col+rem_col_new+rem_column).copy()
lin_pre = reg.predict(y_test)
df_test["Pred_RUL_LR_arf"] = lin_pre

y_pred_LR_arf = list(df_test.groupby(['id'])['Pred_RUL_LR_arf'].min())
f = error(test_res, y_pred_LR_arf)
r_2_score.append(f[0])
rmse.append(f[1])
Method.append("LR_arf")
rem.append("Pred_RUL_LR_arf")

Mean squared error: 890.8276472071307
Root mean squared error: 29.846735955664073
R-squared score: 0.48413728100423403

Random Forest Regression before removing any features.

rf = RandomForestRegressor(max_features = "log2")
rf.fit(X_train, Y_train)
ans = rf.predict(X_test)
df_test["Pred_RUL_RF_wrf"] = ans
rem.append("Pred_RUL_RF_wrf")
y_pred_RF_wrf = list(df_test.groupby(['id'])['Pred_RUL_RF_wrf'].min())
f = error(test_res, y_pred_RF_wrf)
r_2_score.append(f[0])
rmse.append(f[1])
Method.append("RF_wrf")

Mean squared error: 576.1797359999999
Root mean squared error: 24.00374420793556
R-squared score: 0.6663443863972125

Applying Random Forest Regression after removing features

rf = RandomForestRegressor(max_features = "log2")
rf.fit(x_train, y_train)
pre_rf = rf.predict(y_test)
df_test["Pred_RUL_RF_arf"] = pre_rf
y_pred_RF_arf = list(df_test.groupby(['id'])['Pred_RUL_RF_arf'].min())
f = error(test_res, y_pred_RF_arf)
r_2_score.append(f[0])
rmse.append(f[1])
Method.append("RF_arf")
rem.append("Pred_RUL_RF_arf")

Mean squared error: 651.5577000000001
Root mean squared error: 25.525628297849988
R-squared score: 0.6226943250376287

KNN before removing features

model=KNeighborsRegressor(n_neighbors = 24)
model.fit(X_train, Y_train)
ans = model.predict(X_test)
df_test["Pred_RUL_KNN_wrf"] = ans
rem.append("Pred_RUL_KNN_wrf")
y_pred_KNN_wrf = list(df_test.groupby(['id'])['Pred_RUL_KNN_wrf'].min())
f = error(test_res, y_pred_KNN_wrf)
r_2_score.append(f[0])
rmse.append(f[1])
Method.append("KNN_wrf")

Mean squared error: 969.0081597222222
Root mean squared error: 31.128895896292597
R-squared score: 0.4388643127875884

KNN after removing features

model = KNeighborsRegressor(n_neighbors = 78)
model.fit(x_train, y_train)
pre_KNN = model.predict(y_test)
df_test["Pred_RUL_KNN_arf"] = pre_KNN
y_pred_KNN_arf = list(df_test.groupby(['id'])['Pred_RUL_KNN_arf'].min())
f = error(test_res, y_pred_KNN_arf)
r_2_score.append(f[0])
rmse.append(f[1])
Method.append("KNN_arf")
rem.append("Pred_RUL_KNN_arf")

Mean squared error: 940.5797731755424
Root mean squared error: 30.66887303399886
R-squared score: 0.45532669451385177

Gradient Boosting Regression before removing features

model=GradientBoostingRegressor(loss = "absolute_error",criterion = "squared_error",max_features = "sqrt")
model.fit(X_train, Y_train)
ans = model.predict(X_test)
df_test["Pred_RUL_GB_wrf"] = ans
rem.append("Pred_RUL_GB_wrf")
y_pred_GB_wrf = list(df_test.groupby(['id'])['Pred_RUL_GB_wrf'].min())
f = error(test_res, y_pred_GB_wrf)
r_2_score.append(f[0])
rmse.append(f[1])
Method.append("GB_wrf")

Mean squared error: 549.1271065841099
Root mean squared error: 23.433461259150555
R-squared score: 0.6820100912170148

Gradient Boosting Regression after removing features

model=GradientBoostingRegressor(loss = "absolute_error",criterion = "squared_error",max_features = "log2")
model.fit(x_train, y_train)
pre_GB = model.predict(y_test)
df_test["Pred_RUL_GBR_arf"] = pre_GB
y_pred_GB_arf = list(df_test.groupby(['id'])['Pred_RUL_GBR_arf'].min())
f = error(test_res, y_pred_GB_arf)
r_2_score.append(f[0])
rmse.append(f[1])
Method.append("GBR_arf")
rem.append("Pred_RUL_GBR_arf")

Mean squared error: 571.646562441639
Root mean squared error: 23.909131361085436
R-squared score: 0.6689694679658271

Deciding the final regression method for prediction

method = ["LR","RF","KNN","GB"]
X_axis = np.arange(len(method))
# Scores were appended in the order wrf, arf, wrf, arf, ... so split by parity
r2_score_wr = r_2_score[::2]
r2_score_r = r_2_score[1::2]
plt.figure(figsize = (11,8))
plt.bar(X_axis - 0.1, r2_score_wr, 0.2, label='Without removing features')
plt.bar(X_axis + 0.1, r2_score_r, 0.2, label='After removing features')
plt.xticks(X_axis, method)
plt.xlabel("Methods")
plt.ylabel("R2-score")
plt.title("R2-score of applied method")
plt.legend(facecolor='white')
# Write each score above its bar
for i in range(len(X_axis)):
    plt.text(X_axis[i] - 0.1, r2_score_wr[i] + 0.01, "{:.2f}".format(r2_score_wr[i]), ha='center', va='bottom')
    plt.text(X_axis[i] + 0.1, r2_score_r[i] + 0.01, "{:.2f}".format(r2_score_r[i]), ha='center', va='bottom')
plt.gca().set_facecolor('white')
plt.show()

maxi = r_2_score.index(max(r_2_score))
print("Maximum R2-score",max(r_2_score))
print("Best Regression method for this dataset is:",Method[maxi])
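
The comparison can also be laid out as a ranked table. The sketch below copies the R2 and RMSE values printed earlier in this post (rounded to four decimals) into a DataFrame and sorts by R2; the variable names `results`, `ranked`, and `best` are introduced here for illustration. Because the random forest and gradient boosting models are fitted without a fixed `random_state`, a rerun would likely give slightly different numbers.

```python
import pandas as pd

# R2 and RMSE copied from the outputs printed in this post (rounded to 4 decimals)
results = pd.DataFrame({
    "Method": ["LR_wrf", "LR_arf", "RF_wrf", "RF_arf",
               "KNN_wrf", "KNN_arf", "GB_wrf", "GBR_arf"],
    "R2":   [0.4818, 0.4841, 0.6663, 0.6227, 0.4389, 0.4553, 0.6820, 0.6690],
    "RMSE": [29.9137, 29.8467, 24.0037, 25.5256, 31.1289, 30.6689, 23.4335, 23.9091],
})

# Highest R2 first
ranked = results.sort_values("R2", ascending=False).reset_index(drop=True)
print(ranked)

best = ranked.loc[0, "Method"]
print("Best method by R2:", best)
```

With these numbers, gradient boosting without feature removal (`GB_wrf`) ranks first on R2 and also has the lowest RMSE.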

# Engine IDs 1..N for the x-axis
x = list(range(1, len(test_res)+1))
plt.figure(figsize = (16,8))
plt.xlabel('Engine ID')
plt.ylabel('RUL')
plt.plot(x[30:40], test_res.iloc[30:40,0], label='Actual RUL',linestyle='dotted',marker='o')
plt.plot(x[30:40], y_pred_LR_wrf[30:40], label='Linear Regression',marker='o')
plt.plot(x[30:40], y_pred_RF_wrf[30:40], label='Random Forest',marker='o')
plt.plot(x[30:40], y_pred_KNN_wrf[30:40], label='KNN',marker='o')
plt.plot(x[30:40], y_pred_GB_wrf[30:40], label='Gradient Boosting',marker='o')
plt.legend(facecolor='white')
plt.gca().set_facecolor('white')
plt.show()

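About the author: Doctor of Engineering and reviewer for journals including Mechanical Systems and Signal Processing, Proceedings of the CSEE (中国电机工程学报), and Control and Decision (控制与决策). Areas of expertise: modern signal processing, machine learning, deep learning, digital twins, time-series analysis, equipment defect detection, equipment anomaly detection, and intelligent fault diagnosis and prognostics and health management (PHM).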
