Finding Relevant Job Skills via API in Python

The World of Job Skills

So you want to figure out where your skills fit into today’s job market. Maybe you’re just curious to see a comprehensive constellation of job skills, clean and standardized. Or you need a taxonomy of skills for a resume-parsing project. Well, the EMSI Skills API is one possible tool for the job!

In this tutorial, I’ll walk you through some boilerplate code you can use to access a few key endpoints from the API: a global list of skills, skill extraction from a document, skill lookup by name, and lastly finding related skills by skill ID. Let’s get started.

Setup

Getting started is as easy as signing up for the API’s free access. You’ll get authentication credentials emailed to you once you complete that process.

Import Statements

We’ll use a few packages here, so let’s import those first:

(source)

All of these are pretty standard. I’m using json_normalize, a pandas function that offers an easy way to convert JSON into pandas DataFrames, which are nicer to read.
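As a rough illustration of what json_normalize does, here is a minimal sketch. The records are made-up stand-ins for a nested JSON response, not real API output, and `pd.json_normalize` is the top-level spelling available in pandas 1.0 and later (older code imports it from `pandas.io.json`).

```python
import pandas as pd

# Hypothetical records shaped like a nested JSON response (IDs are made up).
records = [
    {"id": "A1", "name": "Python", "type": {"id": "T1", "name": "Hard Skill"}},
    {"id": "B2", "name": "Teamwork", "type": {"id": "T2", "name": "Soft Skill"}},
]

# Nested dictionaries are flattened into dotted column names such as "type.name".
df = pd.json_normalize(records)
print(df)
```

Each nested key ends up as its own column, which is what makes the result easier to scan than raw JSON.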

Authenticating Your Connection

The first part of accessing the API is simply using the credentials in that signup email to establish a connection and get an access token. I ran the following in a cell in a Jupyter Notebook with Python.

(source)

Sidenote: if my code blocks (like the one above) are cut off, please follow the source link in their caption to read the full code!

This code results in an authentication JSON object, where one of the keys is the access_token. Here I’ve explicitly accessed the value of that key and assigned it to a variable of the same name for later use.
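For readers without the original listing, the token step might look roughly like the sketch below. It is not the author's code: the auth URL, the scope name, and the form field names are assumptions based on a typical OAuth 2.0 client-credentials flow, so check them against your signup email and the API documentation. The helpers only build the request and read the response, so they run without network access.

```python
# Assumed values; verify them against the signup email and the API docs.
AUTH_URL = "https://auth.emsicloud.com/connect/token"
DEFAULT_SCOPE = "emsi_open"


def build_token_request(client_id, client_secret, scope=DEFAULT_SCOPE):
    """Return keyword arguments for an OAuth 2.0 client-credentials POST."""
    return {
        "url": AUTH_URL,
        "data": {
            "client_id": client_id,
            "client_secret": client_secret,
            "grant_type": "client_credentials",
            "scope": scope,
        },
        "headers": {"Content-Type": "application/x-www-form-urlencoded"},
    }


def extract_access_token(auth_json):
    """Pull the access_token value out of the authentication JSON object."""
    return auth_json["access_token"]


# Usage (needs network access and the third-party requests package):
#   import requests
#   auth_json = requests.post(**build_token_request("my-id", "my-secret")).json()
#   access_token = extract_access_token(auth_json)
```

Keeping the credentials as function arguments, rather than hard-coding them in a notebook cell, also makes it easier to load them from environment variables later.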

The “Hello, World!” of EMSI’s Skills API

EMSI has multiple APIs, but we’ll be focused on the Skills API in this tutorial. To get started, we’re just going to use that access token to pull the full list of skills available to us.

Pull the Global List of Job Skills

I wrote a simple function to pull the skills list and write it to a Pandas DataFrame for nicer formatting and readability.

(source)

I set the URL to the skills list endpoint, combined the access token with the syntax the API requires for authorization, and used the requests library to get the data. This results in the following global list of skills:

Figure: DataFrame of the global skills list (columns: id, name, type id, type name)

You can see here there are both hard and soft skills, each skill has a unique ID, and each skill is standardized and proper cased. Each skill type has a type ID as well. There are nearly 30,000 skills listed here!
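The request that produces this list might be assembled along these lines. This is a sketch under assumptions rather than the original function: the endpoint URL, the `Bearer` prefix, and the `"data"` key mentioned in the usage comment are my guesses at the API's conventions and should be checked against its documentation.

```python
# Assumed endpoint; verify against the API documentation.
SKILLS_URL = "https://emsiservices.com/skills/versions/latest/skills"


def bearer_header(access_token):
    """Combine the token with the 'Bearer ' prefix used by OAuth 2.0 APIs."""
    return {"Authorization": f"Bearer {access_token}"}


def build_skills_list_request(access_token):
    """Return keyword arguments for a GET against the skills list endpoint."""
    return {"url": SKILLS_URL, "headers": bearer_header(access_token)}


# Usage (needs network access plus the third-party requests and pandas packages):
#   import pandas as pd
#   import requests
#   response = requests.get(**build_skills_list_request(access_token))
#   skills_df = pd.json_normalize(response.json()["data"])  # "data" key is assumed
```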

Extract the Skills That Appear in a Given Document

Say instead you have a document (a resume or job description, for example), and you want to find the relevant skills that the resume holder has or the job poster wants. The following function will prompt you for a text input. Paste the text in there and set a confidence threshold between 0 and 1 (I usually use 0.4 to see a longer list of skills), and voilà: skills extracted!

(source)

I typed “python and such” as a simple example, which, to no surprise, returned this skill extraction with a confidence of 100% (1.0):

Figure: DataFrame of skills extracted from a sample document that simply said “python and such”
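A minimal sketch of the extraction step follows, assuming an endpoint URL, a `confidenceThreshold` request field, and a response shaped like `{"data": [{"skill": {...}, "confidence": ...}]}`. None of these details come from the article, so treat them as placeholders to verify. Unlike the function described above, this version takes the text as an argument instead of prompting for it.

```python
# Assumed endpoint; verify against the API documentation.
EXTRACT_URL = "https://emsiservices.com/skills/versions/latest/extract"


def build_extract_request(access_token, text, confidence_threshold=0.4):
    """Return keyword arguments for a POST that extracts skills from text."""
    if not 0 <= confidence_threshold <= 1:
        raise ValueError("confidence_threshold must be between 0 and 1")
    return {
        "url": EXTRACT_URL,
        # Field names in this JSON body are assumptions.
        "json": {"text": text, "confidenceThreshold": confidence_threshold},
        "headers": {"Authorization": f"Bearer {access_token}"},
    }


def summarize_extraction(response_json):
    """Return (skill name, confidence) pairs, highest confidence first."""
    pairs = [
        (item["skill"]["name"], item["confidence"])
        for item in response_json["data"]
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Usage (needs network access and the third-party requests package):
#   import requests
#   request_kwargs = build_extract_request(access_token, input("Paste text: "))
#   print(summarize_extraction(requests.post(**request_kwargs).json()))
```

A lower threshold returns more candidate skills at the cost of more false positives, which is why 0.4 gives a longer list than a stricter value would.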

This is all well and good. But what if you want to find how a skill is referred to in this taxonomy? Well, there’s an endpoint that finds related skills by ID, but we need to know the ID first! Let’s find that now.

Look Up a Skill by Name to Find Its ID

The following code uses the pandas str.contains method to find skills whose names contain the substring passed as an argument to the function.

(source)

As you can see, using the str.contains(name_substring) method finds all skills that have the word Python in their names. This lets us see the full range of possibilities and select the IDs of the skills we want to find related skills for. The DataFrame returned by the above function is shown below:

Figure: DataFrame of skills whose names contain “Python”
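The substring lookup can be sketched as follows. The small DataFrame is a hypothetical stand-in for the global skills list, with made-up IDs, and the `case=False` and `regex=False` options are my own choices rather than something the article specifies.

```python
import pandas as pd


def find_skills_by_name(skills_df, name_substring):
    """Return the rows whose name contains name_substring, ignoring case."""
    mask = skills_df["name"].str.contains(name_substring, case=False, regex=False)
    return skills_df[mask]


# Hypothetical stand-in for the global skills DataFrame; the IDs are made up.
skills_df = pd.DataFrame(
    {
        "id": ["ID001", "ID002", "ID003"],
        "name": [
            "Python (Programming Language)",
            "Pandas (Python Package)",
            "Teamwork",
        ],
    }
)

matches = find_skills_by_name(skills_df, "python")
print(matches)
```

Passing `regex=False` treats the argument as a literal string, so a search term containing characters like `(` or `+` does not get interpreted as a regular expression.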

There is a lot of granularity here! As an example, let’s next find skills related to Pandas and Python by grabbing their IDs and passing them into the next block of code.

Find Related Skills to a Skill

We have the IDs for our skills of interest. Now we want to find skills related to them. I’ve added the IDs of the skills in question to the payload in the code and as comments at the top of the following code block. If you want to add more, pay close attention to the formatting of payload: the double quotes inside it are escaped, and there are other nuances, such as the space needed before the closing }.

(source)

We saw in the previous output of skills involving the word Python that there were many options. I chose to find skills related to Python and Pandas. The resultant DataFrame is shown below:

Figure: DataFrame of skills related to Python and Pandas

These are great results! They essentially show us other Python packages, including NumPy, which almost always accompanies Pandas in our import statements in data science!
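One way to sidestep the hand-escaped payload string is to let `json.dumps` build it from a list of IDs. The sketch below is an alternative to the approach described above, not the original code: the endpoint URL and the `"ids"` field name are assumptions, and the IDs in the usage comment are placeholders.

```python
import json

# Assumed endpoint; verify against the API documentation.
RELATED_URL = "https://emsiservices.com/skills/versions/latest/related"


def build_related_request(access_token, skill_ids):
    """Return keyword arguments for a POST asking for skills related to skill_ids."""
    # json.dumps handles the quoting, so no manual escaping is needed.
    payload = json.dumps({"ids": list(skill_ids)})  # "ids" is an assumed field name
    return {
        "url": RELATED_URL,
        "data": payload,
        "headers": {
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    }


# Usage (needs network access and the third-party requests package):
#   import requests
#   request_kwargs = build_related_request(access_token, ["ID001", "ID002"])
#   related_json = requests.post(**request_kwargs).json()
```

Because the IDs arrive as an ordinary list, adding another skill is a matter of appending to that list instead of editing a string by hand.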

Conclusion and Future Work

Thanks for reading this quick tutorial on the EMSI Skills API. I hope you found it useful for whatever your use case may be. If you want to see this developed in a specific further direction, please leave me a comment below! There are many more interesting datasets from EMSI as well that are worth checking out, including those with information on the labor markets, job postings, and much more.

For the next steps, I can re-engineer the related skills code block so that it’s a function, taking in a list of skill IDs as keyword arguments and adding them into the payload. Right now it’s a little finicky and not standardized. I’d like to engineer this into a module, where a connection is a class, and utilization of each endpoint is a method with more robust attributes and arguments. That would certainly save many lines of code.

But till next time — happy coding!

Riley

Translated from: https://towardsdatascience.com/finding-relevant-job-skills-via-api-in-python-ced56cbb3493
