Using GCP AI Platform Notebooks as Reproducible Data Science Environments

By: Edward Krueger and Douglas Franklin.

In this article, we cover how to set up a cloud computing instance to run Python with or without Jupyter Notebook. We then show how to connect that instance to GitHub for a smooth cloud workflow.

We utilize cloud computing instances to get flexible Python and Jupyter environments while maintaining the reproducibility of enterprise data science platforms.

These AI Platform Notebooks come configured with many data science and analytics packages, including NumPy, Pandas, Scikit-learn, and TensorFlow. Typically, we would discourage the use of bloated virtual machines. However, package bloat on our analytics machine is less of a problem because we only save the result (model, data, report) for later use. Since we need only this result and the few packages required to run our model, we can disregard the numerous other packages on the VM.

For example, in this Medium article, we push an NLP model to the cloud without having to worry about dependencies.

Note that AI Platform Notebooks have all of the client packages for GCP services installed and are already authenticated, which allows easy access to anything within the same GCP project. Additionally, this platform gives us access not just to Jupyter Notebooks, but also to a Python console and a CLI where we can run Bash commands.
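
To see which of the pre-installed packages are actually present on a given instance, a quick check from the CLI could look like the sketch below. It assumes python3 and pip are available on the PATH; check_packages is a hypothetical helper name used only for illustration, not part of any GCP tooling.

```shell
# Report whether each named Python package is installed in the current environment.
# Usage: check_packages <package> [<package> ...]
check_packages() {
  for pkg in "$@"; do
    # pip show exits with a non-zero status when the package is not found.
    if python3 -m pip show "$pkg" >/dev/null 2>&1; then
      echo "$pkg: installed"
    else
      echo "$pkg: missing"
    fi
  done
}

# Check the packages named in this article.
check_packages numpy pandas scikit-learn tensorflow
```

Any package reported as missing can then be installed with pip before running a notebook that depends on it.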

Getting a GCP account

Google’s AI Platform Notebooks offer a JupyterLab and Python environment for data scientists and machine learning developers to experiment, develop, and deploy models into production. Users can create instances running JupyterLab that come pre-installed with common packages.

Before we can set up an AI Platform Notebook, we have to set up an account and billing. Don’t worry: new users get $300 in free credits!

Visit GCP AI Platform and click ‘go to console.’

Be sure to click ‘Enable API’ below to access notebooks.

Figure: Enable API

Once we have billing set up, we can start a project.

Starting up your first GCP AI Platform Notebook Instance

Now we need to select the hardware we want our virtual machine to run on. Be sure to set up the cheapest machine possible if you are testing this out!

Once we have the API enabled, the popup selections will change to those seen below. Click ‘Go to instances page’ to get started.

Figure: Click GO TO INSTANCES PAGE

The instances page might ask you to select ‘Enable API’ again; be sure to do so. Then click on the ‘New Instance’ button and select ‘Python 2 and 3.’

Figure: Notebook Instances

This will open up an options menu where you’ll input the region you’d like to use. Note that different regions can have different pricing. Once you have a region selected, you will want to click ‘Customize’ and select the machine with the least RAM to have the lowest cost. In our case, it is the ‘n1-standard-1’ VM with 3.75GB of RAM.

This instance will only generate compute fees while it is running (its persistent disk is still billed while it is paused) and can be easily paused at any time! If needed, you can swap out hardware with the dropdown menus seen below while the instance is paused.

Figure: Selecting a low-cost machine
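
Pausing an instance and swapping its hardware can also be scripted from a terminal rather than done through the console menus. The sketch below assumes the notebook instance can be managed as an ordinary Compute Engine VM through the gcloud CLI, which this article does not state; the helper names, the instance name my-notebook, and the zone us-central1-a are hypothetical placeholders.

```shell
# Assumption: the notebook instance is manageable as a regular Compute Engine VM.

# Stop (pause) an instance.
# Usage: pause_instance <instance-name> <zone>
pause_instance() {
  gcloud compute instances stop "$1" --zone "$2"
}

# Change the machine type; the instance must be stopped first.
# Usage: resize_instance <instance-name> <zone> <machine-type>
resize_instance() {
  gcloud compute instances set-machine-type "$1" --zone "$2" --machine-type "$3"
}

# Start the instance again.
# Usage: resume_instance <instance-name> <zone>
resume_instance() {
  gcloud compute instances start "$1" --zone "$2"
}
```

For example, running pause_instance my-notebook us-central1-a, then resize_instance my-notebook us-central1-a n1-standard-1, then resume_instance my-notebook us-central1-a would move the placeholder instance onto the low-cost machine type chosen above.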

Now we can use SSH to connect our VM to GitHub to allow us to push and pull to our repositories with ease.

Setting Up SSH

Be aware you will only have to do this once per instance.

Connecting to GitHub with SSH

1. Generate an SSH key by running ssh-keygen and accepting the defaults by pressing the Enter key at each prompt. This command generates a private key at ~/.ssh/id_rsa and a public key at ~/.ssh/id_rsa.pub; the public key is what you’ll enter into GitHub.

Figure: ssh-keygen

2. Copy your public key to your clipboard. One way to do this is to run cat ~/.ssh/id_rsa.pub, which prints the public key in your console, and then copy it with the mouse and keyboard.

Figure: Using cat to get our key

3. Go to github.com and sign in.

4. Click your profile image in the top right and then click “Settings.”

5. On the left-hand side, click “SSH and GPG keys.”

6. On the top right, click “New SSH key.”

7. Set the “Title” to whatever you like; a descriptive name will help you identify which computer this key authorizes. Paste the copied key into the “Key” field and press “Add SSH key.”

8. Go back to the terminal on your VM and run eval "$(ssh-agent -s)" to start your SSH authentication agent.

Figure: Steps 8 and 9, adding our SSH key

9. Run ssh-add to add your private key to the agent so that it can authenticate you against the public key you added to GitHub.

10. Set your git configuration so that your commits are attributed to you by running git config --global user.email you@email.com and git config --global user.name username, where the email and username are those attached to your GitHub account.
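
The terminal side of these steps (1, 2, 8, 9, and 10) can be collected into one helper, sketched here under a few assumptions: the key lives at the default ~/.ssh/id_rsa path with an empty passphrase, as in step 1, and setup_github_ssh is a hypothetical function name. The -N and -f flags stand in for pressing Enter at each ssh-keygen prompt. Steps 3 to 7 still happen in the browser.

```shell
# Generate a key if needed, load it into an agent, and set the git identity.
# Usage: setup_github_ssh <email> <username>
setup_github_ssh() {
  key="$HOME/.ssh/id_rsa"
  mkdir -p -m 700 "$HOME/.ssh"

  # Step 1: create an RSA key pair with an empty passphrase, unless one exists.
  if [ ! -f "$key" ]; then
    ssh-keygen -t rsa -N "" -f "$key" -q
  fi

  # Step 8: start the agent and load its environment variables into this shell.
  eval "$(ssh-agent -s)"

  # Step 9: hand the private key to the agent.
  ssh-add "$key"

  # Step 10: set the identity git records on each commit.
  git config --global user.email "$1"
  git config --global user.name "$2"

  # Step 2: print the public key so it can be pasted into GitHub.
  echo "Paste this public key into GitHub:"
  cat "$key.pub"
}
```

On the VM this might be invoked as setup_github_ssh you@email.com username, using the email and username attached to your GitHub account.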

Now you can git clone any repository you have access to directly onto the VM, make changes to the code, and push them back to the repository!
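
The clone, edit, and push cycle might be wrapped in two small helpers, sketched below. The function names clone_repo and commit_and_push are hypothetical, as is the repository URL in the usage note; only standard git commands are used.

```shell
# Clone a repository, or fast-forward the existing clone if it is already present.
# Usage: clone_repo <repo-url> <target-dir>
clone_repo() {
  if [ -d "$2/.git" ]; then
    git -C "$2" pull --ff-only
  else
    git clone "$1" "$2"
  fi
}

# Stage every change, commit it, and push the current branch to origin.
# Usage: commit_and_push <repo-dir> <commit-message>
commit_and_push() {
  git -C "$1" add --all &&
  git -C "$1" commit -m "$2" &&
  git -C "$1" push origin HEAD
}
```

A session on the VM could then look like clone_repo git@github.com:your-username/your-repo.git ~/your-repo, followed by some edits and commit_and_push ~/your-repo "Update analysis notebook".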

Conclusion

We’ve discussed how to set up a cloud computing instance to run Python, Bash, and Jupyter Notebooks and how to connect that instance to GitHub for an easy and secure cloud workflow.

This workflow is great because it is so reproducible! Teams using VMs like this will encounter fewer ‘it works on my machine’ bugs. Using SSH to connect the cloud VM to our remote repositories provides a secure connection that protects your data. Additionally, if you want to run code on expensive hardware, you don’t have to buy that hardware! Instead, run what you need and pause your instance to save costs.

We hope this guide has been helpful and that your coding skills are leveling up with us!

Translated from: https://towardsdatascience.com/using-gcp-ai-platform-notebooks-as-reproducible-data-science-environments-964cba32737
