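I'm trying to figure out how to predict values with LASSO regression without using the .predict function that Sklearn provides. This is basically just to broaden my understanding of how LASSO works internally. I asked a question on Cross Validated about how LASSO regression works, and one of the comments mentioned that the predict function works the same way as in linear regression. Because of that, I wanted to try doing this with my own function.
My own function has worked before, but here I keep getting a different output. In this example, Sklearn's prediction is 4.33 and my own function gives 6.18. What am I missing? Am I not inverse transforming the prediction correctly at the end?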
import pandas as pd
from sklearn.preprocessing import RobustScaler
from sklearn.linear_model import Lasso
import numpy as np

df = pd.DataFrame({'Y':[5, -10, 10, .5, 2.5, 15], 'X1':[1., -2., 2., .1, .5, 3], 'X2':[1, 1, 2, 1, 1, 1], 'X3':[6, 6, 6, 5, 6, 4], 'X4':[6, 5, 4, 3, 2, 1]})
X = df[['X1','X2','X3','X4']]
y = df[['Y']]

# Scaling
transformer_x = RobustScaler().fit(X)
transformer_y = RobustScaler().fit(y)
X_scal = transformer_x.transform(X)
y_scal = transformer_y.transform(y)

# LASSO
lasso = Lasso()
lasso = lasso.fit(X_scal, y_scal)

# LASSO info
print('Score: ', lasso.score(X_scal,y_scal))
print('Raw Intercept: ', lasso.intercept_.round(2)[0])
intercept = transformer_y.inverse_transform([lasso.intercept_])[0][0]
print('Unscaled Intercept: ', intercept)
print('\nCoefficients Used: ')
coeff_array = lasso.coef_
inverse_coeff_array = transformer_x.inverse_transform(lasso.coef_.reshape(1,-1))[0]
for i, j, k in zip(X.columns, coeff_array, inverse_coeff_array):
    if j != 0:
        print(i, j.round(2), k.round(2))

# Predictions
example = [[3,1,1,1]]
pred = lasso.predict(example)
pred_scal = transformer_y.inverse_transform(pred.reshape(-1, 1))
print('\nRaw Prediction where X1 = 3: ', pred[0])
print('Unscaled Prediction where X1 = 3: ', pred_scal[0][0])

# Predictions without using the .predict function
def lasso_predict_value_(X1,X2,X3,X4):
    print('intercept: ', intercept)
    print('coef: ', inverse_coeff_array[0])
    print('X1: ', X1)
    preds = intercept + inverse_coeff_array[0]*X1
    print('Your predicted value is: ', preds)

lasso_predict_value_(3,1,1,1)
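The trained Lasso model has no information about whether a given data point is scaled or not. So your manual prediction method should not take the scaling into account.
If I remove your processing of the model coefficients, we get the same result as the sklearn model: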
example = [[3,1,1,1]]
lasso.predict(example)
# array([0.07533937])

# Predictions without using the .predict function
def lasso_predict_value_(X1,X2,X3,X4):
    x_test = np.array([X1,X2, X3, X4])
    preds = lasso.intercept_ + sum(x_test*lasso.coef_)
    print('Your predicted value is: ', preds)

lasso_predict_value_(3,1,1,1)
# Your predicted value is: [0.07533937]
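If what you want is a prediction in the original units of Y from an unscaled row, the scaling has to be applied to the data rather than to the coefficients. The following is a minimal sketch under that assumption: the input row goes through the same RobustScaler as the training data, the linear formula runs in scaled space, and only the result is inverse transformed. The helper names `sklearn_predict`, `manual_predict` and `original_unit_params` are made up for illustration. The last helper rewrites the fitted model in original units using the scalers' `center_` and `scale_` attributes, which is roughly what the "unscaled" intercept and coefficients in the question were aiming for.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import RobustScaler

# Same toy data as in the question.
df = pd.DataFrame({
    'Y':  [5, -10, 10, .5, 2.5, 15],
    'X1': [1., -2., 2., .1, .5, 3],
    'X2': [1, 1, 2, 1, 1, 1],
    'X3': [6, 6, 6, 5, 6, 4],
    'X4': [6, 5, 4, 3, 2, 1],
})
X = df[['X1', 'X2', 'X3', 'X4']]
y = df[['Y']]

transformer_x = RobustScaler().fit(X)
transformer_y = RobustScaler().fit(y)
lasso = Lasso().fit(transformer_x.transform(X), transformer_y.transform(y))


def sklearn_predict(raw_row):
    """Reference: scale the row, call .predict, then unscale the result."""
    raw = pd.DataFrame([raw_row], columns=X.columns)
    pred = lasso.predict(transformer_x.transform(raw))
    return transformer_y.inverse_transform(pred.reshape(-1, 1))[0, 0]


def manual_predict(raw_row):
    """Hypothetical helper: the same computation without .predict."""
    raw = pd.DataFrame([raw_row], columns=X.columns)
    x_scal = transformer_x.transform(raw)[0]      # scaled inputs, shape (4,)
    b0 = np.ravel(lasso.intercept_)[0]            # intercept in scaled space
    b = np.ravel(lasso.coef_)                     # coefficients in scaled space
    y_scal = b0 + np.dot(x_scal, b)               # plain linear predictor
    return transformer_y.inverse_transform(np.array([[y_scal]]))[0, 0]


def original_unit_params():
    """Hypothetical helper: express the fitted model in original units.

    RobustScaler computes (value - center_) / scale_, so substituting that
    into the scaled-space model and solving for Y gives the formulas below.
    """
    b0 = np.ravel(lasso.intercept_)[0]
    b = np.ravel(lasso.coef_)
    cx, sx = transformer_x.center_, transformer_x.scale_
    cy, sy = transformer_y.center_[0], transformer_y.scale_[0]
    coef = sy * b / sx
    intercept = cy + sy * (b0 - np.sum(b * cx / sx))
    return intercept, coef


row = [3, 1, 1, 1]
intercept_orig, coef_orig = original_unit_params()
print('sklearn .predict:     ', sklearn_predict(row))
print('manual, scaled space: ', manual_predict(row))
print('manual, original units:', intercept_orig + np.dot(coef_orig, row))
```

All three printed values should agree. Note that they will not match the 4.33 from the question, because that figure came from passing the unscaled example straight to a model that was fitted on scaled data.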