Introduction to MLflow for MLOps Part 1: Anaconda Environment

After following along with the demos in this three-part blog you will be able to:

  • Understand how you and your Data Science teams can improve your MLOps practices using MLflow
  • Use all the Components of MLflow (Tracking, Projects, Models, Registry)
  • Use MLflow in an Anaconda Environment
  • Use MLflow in a Docker Environment (including running an IDE inside of a container)
  • Use a Postgres Backend Store and Minio Artifact Store for Easy Collaboration

The instructions and demos below assume you are using a Mac OSX operating system. Other operating systems can be used with minor modifications.

Table of Contents:

Part 1: Anaconda Environment

  1. What is MLflow and Why Should You Use It?

  2. Using MLflow with a Conda Environment

1. What is MLflow and Why Should You Use It?

Basic Concepts

MLflow is an MLOps tool that can be used to increase the efficiency of machine learning experimentation and productionalization. MLflow is organized into four components (Tracking, Projects, Models, and Registry). You can use each of these components on their own but they are designed to work well together. MLflow is designed to work with any machine learning library, determine most things about your code by convention, and require minimal changes to integrate into an existing codebase. It aims to take any codebase written in its format and make it reproducible and reusable by multiple data scientists. MLflow lets you train, reuse, and deploy models with any library and package them into reproducible steps that other data scientists can use as a “black box”, without even having to know which library you are using.

Productivity Challenges in Machine Learning

It is difficult to keep track of experiments

If you are just working with a script or notebook, how do you tell which data, code, and parameters went into getting a particular model result?

It is difficult to reproduce code

Even if you have meticulously tracked the code versions and parameters, you need to capture the whole environment (e.g. library dependencies) to get the same result. This is especially challenging if you want another data scientist to use your code, or if you want to run the same code at scale on another platform (e.g. in the cloud).

There’s no standard way to package and deploy models

Every data science team comes up with its own approach for each ML library it uses, and the link between a model and the code and parameters that produced it is often lost.

There is no central store to manage models (their version and stage transitions)

A data science team creates many models. In the absence of a central place to collaborate and manage model lifecycle, data science teams face challenges in how they manage models and stages.

MLflow Components

MLflow Tracking

This is an API and UI for logging parameters, code versions, metrics, and artifacts when running your machine learning code and later for visualizing results. You can use MLflow Tracking in any environment (e.g. script or notebook) to log results to local files or to a server, then compare multiple runs. Teams can use MLflow tracking to compare results from different users.

MLflow Projects

MLflow Projects are a standard format for packaging reusable data science code. Each project is simply a directory with code, and uses a descriptor file to specify its dependencies and how to run the code. For example, a project can contain a conda.yaml for specifying a Python Anaconda environment.

MLflow Models

MLflow Models offer a convention for packaging machine learning models in multiple flavors, and a variety of tools to help deploy them. Each model is saved as a directory containing arbitrary files and a descriptor file that lists several “flavors” the model can be used in. For example, a TensorFlow model can be loaded as a TensorFlow DAG, or as a Python function to apply to input data.

MLflow Registry

MLflow Registry offers a centralized model store, set of APIs, and UI, to collaboratively manage the full lifecycle of an MLflow model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, stage transitions (for example from staging to production or archiving), and annotations.
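
The Registry is covered in a later part of this series (it requires a database-backed backend store, such as the Postgres store set up in Part 3), but as a rough sketch of the API, registering a logged model and moving it through stages looks something like the following; the registered model name and the run ID are placeholders.

import mlflow
from mlflow.tracking import MlflowClient

# Register the model artifact logged by a run (requires a database-backed tracking server)
result = mlflow.register_model(
    model_uri="runs:/<run_id>/model",  # placeholder run ID from your own tracking server
    name="ElasticnetWineModel"         # illustrative registered model name
)

# Move that version through lifecycle stages
client = MlflowClient()
client.transition_model_version_stage(
    name="ElasticnetWineModel",
    version=result.version,
    stage="Staging"
)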

Scalability and Big Data

An individual MLflow run can execute on a distributed cluster. You can launch runs on the distributed infrastructure of your choice and report results to a tracking server to compare them.

MLflow supports launching multiple runs in parallel with different parameters, for example for hyperparameter tuning. You can use the Projects API to start multiple runs and the tracking API to track them.
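
As a rough, hypothetical sketch, launching several runs of the wine-quality project shown later in this tutorial with different hyperparameters could look like this, using mlflow.run with synchronous=False so the runs execute in parallel.

import mlflow

# Launch one non-blocking run per hyperparameter value
submitted = []
for alpha in [0.1, 0.5, 1.0]:
    run = mlflow.run(
        ".",                                           # folder containing the MLproject file
        parameters={"alpha": alpha, "l1_ratio": 0.5},
        synchronous=False,                             # return immediately instead of blocking
    )
    submitted.append(run)

# Block until every run has finished, then compare them in the tracking UI
for run in submitted:
    run.wait()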

MLflow Projects can take input from, and write output to, distributed storage systems such as AWS S3. This means that you can write projects that build large datasets, such as featurizing a 100TB file.

MLflow Model Registry offers large organizations a central hub to collaboratively manage a complete model lifecycle. Many data science teams within an organization develop hundreds of models, each model with its experiments, runs, versions, artifacts, and stage transitions.

Example Use Cases

Individual Data Scientists

Individual data scientists can use MLflow Tracking to track experiments locally on their machine, organize code in projects for future reuse, and output models that production engineers can then deploy using MLflow’s deployment tools.

Data Science Teams

Data science teams can deploy an MLflow Tracking server to log and compare results across multiple users working on the same problem (and experimenting with different models). Anyone can download and run another team member’s model.

Large Organizations

Large organizations can share projects, models, and results. Any team can run another team’s code using MLflow Projects, so organizations can package useful training and data preparation steps that another team can use, or compare results from many teams on the same task. Engineering teams can easily move workflows from R&D to staging to production.

Production Engineers

Production engineers can deploy models from diverse ML libraries in the same way, store the models as files in a management system of their choice, and track which run a model came from.

Researchers and Open Source Developers

Researchers and open source developers can publish code to GitHub in the MLflow project format, making it easy for anyone to run their code by pointing the mlflow run command directly to GitHub.

ML Library Developers

ML library developers can output models in the MLflow Model format to have them automatically support deployment using MLflow’s built-in tools. Deployment tool developers (for example, a cloud vendor building a serving platform) can automatically support a large variety of models.

2. Using MLflow with a Conda Environment

In this section we cover how to use the various features of MLflow with an Anaconda environment.

Setting up for the Tutorial

  1. Make sure you have Anaconda installed
  2. Install a tool for installing programs (I use Homebrew)

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

3. Install Git

brew install git

4. Clone the repository

git clone https://github.com/Noodle-ai/mlflow_part1_condaEnv.git

5. Create a conda environment from the conda.yaml file and activate

conda env create --file conda.yaml
conda activate mlflow_demos

If, instead of using the conda.yaml to set up your environment, you want to create an environment from scratch, use the following commands to create your own conda.yaml.

conda create --name mlflow_demos python=3.8.3
conda activate mlflow_demos
conda install -c anaconda jupyter=1.0.0
conda install -c conda-forge mlflow=1.8.0
conda install scikit-learn=0.22.1
conda install -c anaconda psycopg2=2.8.5
conda install -c anaconda boto3=1.14.12
conda env export --name mlflow_demos > conda.yaml

Examples

Open experiment.ipynb and follow along. The notebook contains examples demonstrating how to use MLflow Tracking and MLflow Models. It also contains descriptions of how to use MLflow Projects.

Using the Tracking API

The MLflow Tracking API lets you log metrics and artifacts (files from your data science code) in order to track a history of your runs.

The code below logs a run with one parameter (param1), one metric (foo) with three values (1,2,3), and an artifact (a text file containing “Hello world!”).

import mlflow

mlflow.start_run()

# Log a parameter (key-value pair)
mlflow.log_param("param1", 5)

# Log a metric; metrics can be updated throughout the run
mlflow.log_metric("foo", 1)
mlflow.log_metric("foo", 2)
mlflow.log_metric("foo", 3)

# Log an artifact (output file)
with open("output.txt", "w") as f:
    f.write("Hello world!")
mlflow.log_artifact("output.txt")

mlflow.end_run()

Viewing the Tracking UI

By default, wherever you run your program, the tracking API writes data into a local ./mlruns directory. You can then run MLflow’s Tracking UI.

Activate the MLflow Tracking UI by typing the following into the terminal. You must be in the same folder as mlruns.

mlflow ui

View the tracking UI by visiting the URL returned by the previous command.

Click on the run to see more details.

Click on the metric to see more details.

Example Incorporating MLflow Tracking, MLflow Models, and MLflow Projects

In this example MLflow Tracking is used to keep track of different hyperparameters, performance metrics, and artifacts of a linear regression model. MLflow Models is used to store the pickled trained model instance, a file describing the environment the model instance was created in, and a descriptor file that lists several “flavors” the model can be used in. MLflow Projects is used to package the training code. And lastly MLflow Models is used to deploy the model to a simple HTTP server.

This tutorial uses a dataset to predict the quality of wine based on quantitative features like the wine’s “fixed acidity”, “pH”, “residual sugar”, and so on. The dataset is from UCI’s machine learning repository.

Training the Model

First, we train the linear regression model that takes two hyperparameters: alpha and l1_ratio.

This example uses the familiar pandas, numpy, and sklearn APIs to create a simple machine learning model. The MLflow Tracking APIs log information about each training run like hyperparameters (alpha and l1_ratio) used to train the model, and metrics (root mean square error, mean absolute error, and r2) used to evaluate the model. The example also serializes the model in a format that MLflow knows how to deploy.

Each time you run the example, MLflow logs information about your experiment runs in the mlruns directory.

There is a script containing the training code called train.py. You can run the example through the .py script using the following command.

python train.py <alpha> <l1_ratio>

The training code is also available as a function in the notebook. You can use the notebook to run the training (the train() function shown below).

# Wine Quality Sample
def train(in_alpha, in_l1_ratio):
    import pandas as pd
    import numpy as np
    from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import ElasticNet
    import mlflow
    import mlflow.sklearn

    def eval_metrics(actual, pred):
        rmse = np.sqrt(mean_squared_error(actual, pred))
        mae = mean_absolute_error(actual, pred)
        r2 = r2_score(actual, pred)
        return rmse, mae, r2

    np.random.seed(40)

    # Read the wine-quality csv file from the URL
    csv_url = (
        'http://archive.ics.uci.edu/ml/machine-learning-databases/'
        'wine-quality/winequality-red.csv'
    )
    data = pd.read_csv(csv_url, sep=';')

    # Split the data into training and test sets (0.75, 0.25) split
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    # Set default value if no alpha is provided
    if in_alpha is None:
        alpha = 0.5
    else:
        alpha = float(in_alpha)

    # Set default value if no l1_ratio is provided
    if in_l1_ratio is None:
        l1_ratio = 0.5
    else:
        l1_ratio = float(in_l1_ratio)

    # Useful for multiple runs
    with mlflow.start_run():
        # Execute ElasticNet
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        # Evaluate Metrics
        predicted_qualities = lr.predict(test_x)
        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        # Print out metrics
        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        # Log parameters, metrics, and model to MLflow
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)
        mlflow.sklearn.log_model(lr, "model")

Comparing the Models

Use the MLflow UI (as described above) to compare the models that you have produced.

You can use the search feature to quickly filter out many models. For example, the query (metrics.rmse < 0.8) returns all of the models with root mean square error less than 0.8. For more complex manipulations, you can download this table as a CSV and use your favorite data munging software to analyze it.
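
The same filter can also be applied from code. As a small sketch (assuming the default experiment, which has ID "0"), mlflow.search_runs returns the matching runs as a pandas DataFrame:

import mlflow

# Fetch all runs in the default experiment whose RMSE is below 0.8
runs = mlflow.search_runs(
    experiment_ids=["0"],
    filter_string="metrics.rmse < 0.8",
)
print(runs[["run_id", "params.alpha", "params.l1_ratio", "metrics.rmse"]])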

Loading a Saved Model

After a model has been saved using MLflow Models within MLflow Tracking you can easily load the model in a variety of flavors (python_function, sklearn, etc.). We need to choose a model from the mlruns folder for the model path.

model_path = './mlruns/0/<run_id>/artifacts/model'
mlflow.<model_flavor>.load_model(model_path)
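
As a concrete sketch, loading the model back in its sklearn flavor or the generic python_function flavor looks like the following; fill in the run ID from your own mlruns folder, and note that test_x here stands for any pandas DataFrame of wine features:

import mlflow.pyfunc
import mlflow.sklearn

model_path = './mlruns/0/<run_id>/artifacts/model'

# sklearn flavor: returns the original ElasticNet estimator
sk_model = mlflow.sklearn.load_model(model_path)

# python_function flavor: a generic wrapper whose predict() takes a pandas DataFrame
pyfunc_model = mlflow.pyfunc.load_model(model_path)
predictions = pyfunc_model.predict(test_x)  # test_x: a pandas DataFrame of wine features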

Packaging the Training Code in a Conda Env with MLflow Projects

Now that you have your training code, you can package it so that other data scientists can easily reuse the training script, or so that you can run the training remotely.

You do this by using MLflow Projects to specify the dependencies and entry points to your code. The MLproject file specifies that the project has its dependencies located in a Conda environment (defined by conda.yaml) and has one entry point (train.py) that takes two parameters: alpha and l1_ratio.
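
As a rough sketch of what that descriptor looks like (the exact file ships with the linked repository; the project name and default values here are illustrative), the MLproject file contains something like the following.

name: mlflow_demos

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"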

To run this project, use mlflow run on the folder containing the MLproject file.

mlflow run . -P alpha=1.0 -P l1_ratio=1.0

After running this command, MLflow runs your training code in a new Conda environment with the dependencies specified in conda.yaml.

If a repository has an MLproject file you can also run a project directly from GitHub. This tutorial lives in the https://github.com/Noodle-ai/mlflow_part1_condaEnv repository, which you can run with the following command. The "#" symbol can be used to move into a subdirectory of the repo. The "--version" argument can be used to run code from a different branch.

mlflow run https://github.com/Noodle-ai/mlflow_part1_condaEnv -P alpha=1.0 -P l1_ratio=0.8

Serving the Model

Now that you have packaged your model using the MLproject convention and have identified the best model, it is time to deploy the model using MLflow Models. An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools — for example, real-time serving through a REST API or batch inference on Apache Spark.

In the example training code above, after training the linear regression model, a function in MLflow saved the model as an artifact within the run.

mlflow.sklearn.log_model(lr, "model")

To view this artifact, you can use the UI again. When you click a date in the list of experiment runs you’ll see this page.

At the bottom, you can see the call to mlflow.sklearn.log_model produced three files in ./mlruns/0/<run_id>/artifacts/model. The first file, MLmodel, is a metadata file that tells MLflow how to load the model. The second file is a conda.yaml that contains the model dependencies from the Conda environment. The third file, model.pkl, is a serialized version of the linear regression model that you trained.
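
The MLmodel descriptor itself is a short YAML file. Its exact fields vary with the MLflow and scikit-learn versions, but for this run it looks roughly like the following sketch.

artifact_path: model
flavors:
  python_function:
    env: conda.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
    python_version: 3.8.3
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 0.22.1
run_id: <run_id>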

In this example, you can use this MLmodel format with MLflow to deploy a local REST server that can serve predictions.

To deploy the server, run the following command.

mlflow models serve -m ./mlruns/0/<run_id>/artifacts/model -p 1234

Note: The version of Python used to create the model must be the same as the one running mlflow models serve. If this is not the case, you may see errors such as UnicodeDecodeError: 'ascii' codec can't decode byte 0x9f in position 1: ordinal not in range(128) or ValueError: unsupported pickle protocol: %d.

Once you have deployed the server, you can pass it some sample data and see the predictions. The following example uses curl to send a JSON-serialized pandas DataFrame with the split orientation to the model server. For more information about the input data formats accepted by the model server, see the MLflow deployment tools documentation.

curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["alcohol", "chlorides", "citric acid", "density", "fixed acidity", "free sulfur dioxide", "pH", "residual sugar", "sulphates", "total sulfur dioxide", "volatile acidity"],"data":[[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]]}' http://127.0.0.1:1234/invocations

The server should respond with output similar to:

[3.7783608837127516]
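
If you prefer to call the endpoint from Python rather than curl, a roughly equivalent request (assuming the requests library is installed in your environment) looks like this:

import json
import requests

url = "http://127.0.0.1:1234/invocations"
headers = {"Content-Type": "application/json; format=pandas-split"}

# One row of wine features in pandas "split" orientation, matching the curl example above
payload = {
    "columns": ["alcohol", "chlorides", "citric acid", "density", "fixed acidity",
                "free sulfur dioxide", "pH", "residual sugar", "sulphates",
                "total sulfur dioxide", "volatile acidity"],
    "data": [[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]],
}

response = requests.post(url, data=json.dumps(payload), headers=headers)
print(response.json())  # e.g. [3.7783608837127516]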

Translated from: https://medium.com/noodle-labs-the-future-of-ai/introduction-to-mlflow-for-mlops-part-1-anaconda-environment-1fd9e299226f
