- Official homepage: MLflow
- Official documentation: MLflow: A Tool for Managing the Machine Learning Lifecycle
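0. Introduction
MLflow is an open-source platform designed to help machine learning practitioners and teams handle the complexity of the machine learning process. It covers the full lifecycle of a machine learning project so that every stage stays manageable, traceable, and reproducible.
MLflow currently provides these key components:
- MLflow AI Gateway: interact with state-of-the-art LLMs through a secure, simple API.
- MLflow LLM Evaluate: simplify the evaluation of LLMs and prompts.
- MLflow Tracking: record and query experiments (code, data, configuration, and results).
- MLflow Projects: package data science code in a format that can be re-run on any platform.
- MLflow Models: deploy machine learning models to different serving environments.
- Model Registry: store, annotate, discover, and manage models in a central repository.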
1. Install MLflow
pip install mlflow
2. Start the Tracking UI
mlflow server --host 127.0.0.1 --port 8080
Any free local port will do.
Open http://localhost:8080 in a browser (use whichever port you passed to --port):
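3. Create an experiment
An experiment here plays the role of a project: keeping work in separate experiments makes it easier to manage and browse.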
from mlflow import MlflowClient

client = MlflowClient(tracking_uri="http://127.0.0.1:8080")

all_experiments = client.search_experiments()
default_experiment = [
    {"name": experiment.name, "lifecycle_stage": experiment.lifecycle_stage}
    for experiment in all_experiments
    if experiment.name == "Default"
][0]

# Provide an Experiment description that will appear in the UI
experiment_description = (
    "This is the grocery forecasting project. "
    "This experiment contains the produce models for apples."
)

# Provide searchable tags that define characteristics of the Runs that
# will be in this Experiment
experiment_tags = {
    "project_name": "grocery-forecasting",
    "store_dept": "produce",
    "team": "stores-ml",
    "project_quarter": "Q3-2023",
    "mlflow.note.content": experiment_description,
}

# Create the Experiment, providing a unique name
produce_apples_experiment = client.create_experiment(
    name="Apple_Models", tags=experiment_tags
)
The newly created experiment now shows up in the Tracking UI.
4. Prepare the model
Below is a logistic regression model.
import mlflow
from mlflow.models import infer_signature

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the Iris dataset
X, y = datasets.load_iris(return_X_y=True)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Define the model hyperparameters
params = {
    "solver": "lbfgs",
    "max_iter": 1000,
    "multi_class": "auto",
    "random_state": 8888,
}

# Train the model
lr = LogisticRegression(**params)
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)
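The script imports precision_score, recall_score and f1_score but only computes accuracy. As a sketch of how the other metrics could be computed for this three-class problem, the snippet below passes average="macro". That averaging mode is my choice for illustration, not something the original specifies; the default average="binary" raises an error on multiclass labels.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same solver settings as above; "multi_class" is left out here because
# newer scikit-learn releases deprecate that parameter.
lr = LogisticRegression(solver="lbfgs", max_iter=1000, random_state=8888)
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)

# Macro averaging: compute each metric per class, then take the plain mean
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, average="macro"),
    "recall": recall_score(y_test, y_pred, average="macro"),
    "f1": f1_score(y_test, y_pred, average="macro"),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A dictionary like this can be handed to mlflow.log_metrics in one call once a run is open.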
5. Log the model
Next we add the MLflow code that records the model information.
Section 4 left us with a trained logistic regression model. The full script, with the MLflow logging code appended:
import mlflow
from mlflow.models import infer_signature

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the Iris dataset
X, y = datasets.load_iris(return_X_y=True)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Define the model hyperparameters
params = {
    "solver": "lbfgs",
    "max_iter": 1000,
    "multi_class": "auto",
    "random_state": 8888,
}

# Train the model
lr = LogisticRegression(**params)
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)

# Part 2: MLflow logging

# Set our tracking server uri for logging
mlflow.set_tracking_uri(uri="http://127.0.0.1:8080")

# Create a new MLflow Experiment
mlflow.set_experiment("MLflow Quickstart")

# Start an MLflow run
with mlflow.start_run():
    # Log the hyperparameters
    mlflow.log_params(params)

    # Log the accuracy metric
    mlflow.log_metric("accuracy", accuracy)

    # Set a tag that we can use to remind ourselves what this run was for
    mlflow.set_tag("Training Info", "Basic LR model for iris data")

    # Infer the model signature
    signature = infer_signature(X_train, lr.predict(X_train))

    # Log the model
    model_info = mlflow.sklearn.log_model(
        sk_model=lr,
        artifact_path="iris_model",
        signature=signature,
        input_example=X_train,
        registered_model_name="tracking-quickstart",
    )
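You could also put the training code and other logic inside the start_run block, but the official documentation advises against it: if the training or other code raises an error, the run that was already opened is left empty or invalid, and you then have to clean it up by hand in the UI.
The connection here is configured with mlflow.set_tracking_uri(uri="http://127.0.0.1:8080"). The alternative is client = MlflowClient(tracking_uri="http://127.0.0.1:8080"). The client approach is more flexible, because one script can talk to several tracking servers; set_tracking_uri suits scripts that only ever use a single tracking server.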
6. Load and call the model
import mlflow
from mlflow.models import infer_signature

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the Iris dataset
X, y = datasets.load_iris(return_X_y=True)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Define the model hyperparameters
params = {
    "solver": "lbfgs",
    "max_iter": 1000,
    "multi_class": "auto",
    "random_state": 8888,
}

# Train the model
lr = LogisticRegression(**params)
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)

# Set our tracking server uri for logging
mlflow.set_tracking_uri(uri="http://127.0.0.1:8080")

# Create a new MLflow Experiment
mlflow.set_experiment("MLflow Quickstart")

# Start an MLflow run
with mlflow.start_run():
    # Log the hyperparameters
    mlflow.log_params(params)

    # Log the accuracy metric
    mlflow.log_metric("accuracy", accuracy)

    # Set a tag that we can use to remind ourselves what this run was for
    mlflow.set_tag("Training Info", "Basic LR model for iris data")

    # Infer the model signature
    signature = infer_signature(X_train, lr.predict(X_train))

    # Log the model
    model_info = mlflow.sklearn.log_model(
        sk_model=lr,
        artifact_path="iris_model",
        signature=signature,
        input_example=X_train,
        registered_model_name="tracking-quickstart",
    )

print(model_info.model_uri)

# Load the model back for predictions as a generic Python Function model
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

predictions = loaded_model.predict(X_test)

iris_feature_names = datasets.load_iris().feature_names

result = pd.DataFrame(X_test, columns=iris_feature_names)
result["actual_class"] = y_test
result["predicted_class"] = predictions

print(result[:4])
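7. Serve the model (Serving)
MLflow models can be served from Docker or Kubernetes containers. We start with the simplest option, a standalone local server.
First set the environment variable MLFLOW_TRACKING_URI to the address of your local mlflow server; in this tutorial that is http://127.0.0.1:8080.
Then make sure a model has actually been logged in MLflow, and run the command below (it needs Flask; if it is missing, run pip install flask):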
mlflow models serve -m models:/{model_name}/{version} --no-conda -p 5001 -h 0.0.0.0
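The mlflow models serve command deploys an MLflow model locally. It starts a Flask server that exposes a REST API for model predictions. The options above specify the model location, the port, the host, and whether to use a conda environment:
- -m or --model-uri: the model URI; it can point to the local filesystem, S3, Azure ML, and so on.
- -p or --port: the server port; the default is 5000.
- -h or --host: the server host; the default is 127.0.0.1.
- --no-conda: when given, the model runs in the current Python environment instead of a conda environment. In recent MLflow releases this flag has been replaced by --env-manager local, so use that if --no-conda is rejected.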
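The {model_name} and {version} placeholders and the environment variable can also be filled in from a script. This is only a sketch: build_serve_command and build_serve_env are hypothetical helper names, the model name comes from the logging script above, and version 1 is an assumption about what the registry contains.

```python
import os


def build_serve_command(model_name: str, version: int, port: int = 5001,
                        host: str = "0.0.0.0") -> list:
    """Return the argv list for `mlflow models serve` with the options above."""
    return [
        "mlflow", "models", "serve",
        "-m", f"models:/{model_name}/{version}",
        "--no-conda",  # newer MLflow releases expect "--env-manager", "local"
        "-p", str(port),
        "-h", host,
    ]


def build_serve_env(tracking_uri: str = "http://127.0.0.1:8080") -> dict:
    """Copy the current environment and add MLFLOW_TRACKING_URI to the copy."""
    env = dict(os.environ)
    env["MLFLOW_TRACKING_URI"] = tracking_uri
    return env


command = build_serve_command("tracking-quickstart", 1)
print(" ".join(command))
# prints: mlflow models serve -m models:/tracking-quickstart/1 --no-conda -p 5001 -h 0.0.0.0
```

To actually launch the server, pass both values to subprocess.run(command, env=build_serve_env(), check=True).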
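Once the command is running, the server's startup output tells you that it started successfully.
A model deployed locally with mlflow models serve exposes a single API you need to call: /invocations. It accepts the model's input data and returns the predictions. You do not define this API yourself; MLflow generates it automatically.
You can test it with curl or Postman: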
curl -X POST -H "Content-Type: application/json" --data '{"inputs": [[5.1, 3.5, 1.4, 0.2]]}' http://localhost:5001/invocations
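A request of this kind can also be sent from Python. The sketch below uses only the standard library and assumes the server is listening on port 5001 and accepts the "inputs" payload key used by MLflow 2.x scoring servers. build_invocation_request and predict are illustrative helper names, and the sample row is an arbitrary four-feature Iris measurement.

```python
import json
import urllib.request


def build_invocation_request(base_url: str, rows: list) -> urllib.request.Request:
    """Build a POST request for the /invocations endpoint."""
    body = json.dumps({"inputs": rows}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, rows: list, timeout: float = 10.0):
    """Send the rows to the model server and return the decoded JSON reply."""
    request = build_invocation_request(base_url, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs the model server to be running):
# print(predict("http://localhost:5001", [[5.1, 3.5, 1.4, 0.2]]))
```

Splitting request construction from sending keeps the payload easy to inspect before anything goes over the network.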
To check whether the model service is running normally, call its /ping endpoint.
A 200 response means the model is running normally; any other status code indicates a problem.
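The check can be scripted as well. This is a minimal standard-library sketch; the helper name model_server_is_healthy and the two-second timeout are assumptions for illustration.

```python
import urllib.request


def model_server_is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True only if GET <base_url>/ping answers with HTTP 200."""
    url = base_url.rstrip("/") + "/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Connection refused, timeouts and HTTP error statuses all land here:
        # urllib.error.URLError and HTTPError both derive from OSError.
        return False


# Example call (True only while the model server is up):
# print(model_server_is_healthy("http://localhost:5001"))
```

Returning False instead of raising makes the helper easy to use in a retry loop while the server is still starting.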