You're living in 1985 if you don't use Docker for your data science projects

One of the hardest concepts for new programmers to grasp is the 'environment'. An environment is, roughly speaking, the system you write and run your code in: the operating system, the language interpreter, and the installed libraries. In principle it sounds simple, but later in your career you come to understand just how difficult an environment is to maintain.

The reason is that libraries, IDEs, and even Python itself go through updates and version changes. Sometimes you update one library and a separate piece of code breaks, so you have to go back and fix it.

Moreover, if several projects are being developed at the same time, their dependencies can conflict. This is when things get really ugly: code in one project fails purely because of what another project needs.

Also, say you want to share a project with a teammate working on a different OS, or ship a project you built on your Mac to a production server running a different OS. Would you have to reconfigure your code? Yes, you probably would.

To mitigate these issues, containers were proposed as a way to keep projects, and the environments they live in, separate from one another. A container is basically a place where an environment can run, isolated from everything else on the system. Once you define what is in your container, it becomes much easier to recreate the environment, and even to share the project with teammates.

Requirements

To get started, we need to install a few things:

  • Windows or macOS: Install Docker Desktop

  • Linux: Install Docker and then Docker Compose

Containerise a Python service

Let's imagine we're creating a Flask service in a file called server.py, with the following contents:

from flask import Flask

server = Flask(__name__)


@server.route("/")
def hello():
    return "Hello World!"


if __name__ == "__main__":
    server.run(host='0.0.0.0')

As mentioned above, we need to keep a record of the dependencies our code relies on. For this we can create a requirements.txt file containing the following requirement:

Flask==1.1.1
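
To make the idea of "recording dependencies" concrete, here is a minimal sketch of how pinned requirements could be compared against what is actually installed. The helper names (parse_pins, installed_version, find_mismatches) are hypothetical and only handle exact `name==version` pins; this is not how pip itself resolves requirements.

```python
from importlib import metadata


def parse_pins(text):
    """Return {package_name: version} for lines of the form 'name==version'."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # assumption: this sketch ignores anything but exact pins
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (pinned, installed)} for every pin the environment misses."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


pins = parse_pins("Flask==1.1.1\n")
print(pins)                   # the pins recorded in requirements.txt
print(find_mismatches(pins))  # result depends on the current environment
```

An empty result from find_mismatches means the current environment matches the pins; anything else is exactly the kind of drift the article is warning about.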

So our package has the following structure:

app
├─── requirements.txt
└─── src
     └─── server.py

The structure is pretty logical (the source is kept in a separate directory). To execute our Python program, all that is left to do is install a Python interpreter and run it.

We could run the program locally, but suppose we are juggling 15 projects. It makes sense to run it in a container to avoid conflicts with any of the others.

Let's move on to containerisation.

Dockerfile

To run the Python code, we package the application as a Docker image and then run a container based on that image. The steps are as follows:

  1. Create a Dockerfile that contains the instructions needed to build the image

  2. Build the image with the Docker builder

  3. Run docker run <image>, which creates a container running the app
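
Steps 2 and 3 can be scripted. The sketch below only assembles and prints the two Docker commands rather than executing them, so it is safe to run on a machine without Docker. The function names are hypothetical, and the `--rm` and `-p 5000:5000` flags are assumptions that are not part of the original walkthrough (5000 is Flask's default port).

```python
import shlex


def build_command(tag, context="."):
    """Command that builds an image from the Dockerfile found in `context`."""
    return ["docker", "build", "-t", tag, context]


def run_command(image, host_port=5000, container_port=5000):
    """Command that starts a container and publishes the service port.

    Assumption: --rm (remove the container on exit) and -p (publish a port)
    are conveniences added for this sketch.
    """
    return ["docker", "run", "--rm", "-p", f"{host_port}:{container_port}", image]


# Print the commands instead of executing them.
for cmd in (build_command("myimage"), run_command("myimage")):
    print(shlex.join(cmd))
```

On a machine with Docker installed, each list could be passed to subprocess.run(cmd, check=True) to actually build the image and start the container.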

Analysis of a Dockerfile

A Dockerfile is a file that contains the instructions for assembling a Docker image (saved here as myimage):

# set base image (host OS)
FROM python:3.8

# set the working directory in the container
WORKDIR /code

# copy the dependencies file to the working directory
COPY requirements.txt .

# install dependencies
RUN pip install -r requirements.txt

# copy the content of the local src directory to the working directory
COPY src/ .

# command to run on container start
CMD [ "python", "./server.py" ]

A Dockerfile is processed instruction by instruction: for each instruction, the builder generates an image layer and stacks it on top of the previous layers.

In the output of the build command we can also see the Dockerfile instructions being executed as steps.

$ docker build -t myimage .
Sending build context to Docker daemon 6.144kB
Step 1/6 : FROM python:3.8
3.8.3-alpine: Pulling from library/python

Status: Downloaded newer image for python:3.8.3-alpine
---> 8ecf5a48c789
Step 2/6 : WORKDIR /code
---> Running in 9313cd5d834d
Removing intermediate container 9313cd5d834d
---> c852f099c2f9
Step 3/6 : COPY requirements.txt .
---> 2c375052ccd6
Step 4/6 : RUN pip install -r requirements.txt
---> Running in 3ee13f767d05

Removing intermediate container 3ee13f767d05
---> 8dd7f46dddf0
Step 5/6 : COPY ./src .
---> 6ab2d97e4aa1
Step 6/6 : CMD python server.py
---> Running in fbbbb21349be
Removing intermediate container fbbbb21349be
---> 27084556702b
Successfully built 70a92e92f3b5
Successfully tagged myimage:latest

Then, we can see that the image is in the local image store:

$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
myimage      latest   70a92e92f3b5   8 seconds ago   991MB

During development, we may need to rebuild the image for our Python service many times, and we want each rebuild to take as little time as possible.
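
One reason rebuilds can be fast is that Docker caches layers and reuses them while nothing they depend on has changed. The sketch below is a deliberately simplified model of that behaviour, not Docker's actual implementation. It assumes each layer's cache key depends on the previous key, the instruction text, and the contents of any copied files. The function names and the file contents are hypothetical. It illustrates why the Dockerfile above copies requirements.txt before the source code.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction (simplified model of layer caching)."""
    keys = []
    previous = ""
    for text, copied_files in instructions:
        digest = hashlib.sha256()
        digest.update(previous.encode())  # a layer depends on everything below it
        digest.update(text.encode())
        for name in copied_files:
            digest.update(file_contents[name].encode())
        previous = digest.hexdigest()
        keys.append(previous)
    return keys


def rebuilt_steps(old_keys, new_keys):
    """Return the 1-based steps whose key changed and so must be rebuilt."""
    return [i + 1 for i, (a, b) in enumerate(zip(old_keys, new_keys)) if a != b]


# Each entry: (instruction text, files whose contents feed into the layer).
DOCKERFILE = [
    ("FROM python:3.8", []),
    ("WORKDIR /code", []),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY src/ .", ["src/server.py"]),
    ('CMD [ "python", "./server.py" ]', []),
]

files = {"requirements.txt": "Flask==1.1.1\n", "src/server.py": "print('v1')\n"}
before = layer_keys(DOCKERFILE, files)

# Editing only the source file leaves steps 1-4 cached, including pip install.
files["src/server.py"] = "print('v2')\n"
print(rebuilt_steps(before, layer_keys(DOCKERFILE, files)))  # [5, 6]

# Changing the dependencies invalidates pip install and everything after it.
files["requirements.txt"] = "Flask==1.1.2\n"
print(rebuilt_steps(before, layer_keys(DOCKERFILE, files)))  # [3, 4, 5, 6]
```

In this model, an ordinary code edit never re-runs the slow dependency installation, which is the step that tends to dominate build time.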

Note: Docker and virtualenv solve similar problems, but they are not the same. Virtualenv only lets you switch between sets of Python dependencies; you are still stuck with your host OS. With Docker you can swap out the entire OS and install and run Python on any of them (think Ubuntu, Debian, Alpine, even Windows Server Core). So if you work in a team and want to future-proof your technology, use Docker. If that doesn't matter to you, venv is fine, but remember it's not future-proof. Please reference this if you still want more information.
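
The "stuck with your host OS" point can be seen directly with the standard library's venv module. The sketch below creates a throwaway environment (without pip, to keep it fast) and reads the pyvenv.cfg file that venv writes. The create_env helper is hypothetical and exists only for illustration.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(target):
    """Create a virtual environment without pip and return its parsed pyvenv.cfg."""
    venv.create(target, with_pip=False)
    config = {}
    for line in (Path(target) / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


with tempfile.TemporaryDirectory() as tmp:
    config = create_env(Path(tmp) / "env")
    # 'home' points back at the host's Python installation ...
    print("interpreter home:", config["home"])
    # ... and the platform is still whatever the host runs.
    print("platform:", sys.platform)
```

The environment isolates installed packages, but the interpreter and operating system underneath are still the host's. A container image also pins the operating system's userland, although it still shares the host's kernel.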

There you have it! We've shown how to containerise a Python service. Hopefully this process will make things a lot easier and give your project a longer shelf life, as it will be less likely to develop bugs when dependencies change.

Thanks for reading, and please let me know if you have any questions!

Translated from: https://towardsdatascience.com/youre-living-in-1985-if-you-don-t-use-docker-for-your-data-science-projects-858264db0082
