Data Science Project
In this article, I would like to showcase what might be my simplest data science project ever.
I have spent hours training much more complex models in the past, and struggled to find the right parameters to create machine learning pipelines.
Despite its simplicity, if I could only display one project on my resume, it would be this one.
Let me explain why.
Does the package determine the value of the gift?
As a child, I would always get excited about holidays because I could get gifts. (Just humour me here; I do have a point, I promise.) One year, my aunt gave me a beautiful dress, perhaps more beautiful than any other gift I received that day.
Here's the thing, though: I didn't even want to open it. She had shabbily wrapped it in newspaper, and the gift seemed to have lost half its value before I even saw what was inside.
To answer the question above: no. The package by no means determines the value of the gift.
However, it can greatly influence your expectation of what's inside and can change the way you perceive it.
The machine learning models you spend weeks training are great. Demonstrate that. Don't let them die in your Jupyter Notebook.
Recruiters have hundreds of resumes to read. It is almost impossible for them to read through all your code on GitHub and understand all your projects.
To stand out, you need to do something slightly different. Create an interface they can interact with, such as a live dashboard they can play around with.
Even if it's not the best dashboard or interface out there, it will create interest, because you created something they can actually use.
I wanted to do exactly that, which is why I came up with this portfolio project. In the next few sections, I will explain what I did without going too far into the technical detail.
Aim
I aimed to display skills in the following areas:
- Data Collection
- Data Wrangling
- Data Visualization
- Machine Learning
- Web Development
To do so, I created the following components in my project:
- Front-End Interface
- Movie Dashboard
- Movie Recommender System
I will explain and demonstrate each component in detail.
Note: If you don't want to read through the entire article and just want to see the final product, scroll down to the 'Links' section.
Front-End Interface
In the past, I would create projects and let the code sit in my GitHub repository. I would write an occasional article on Medium explaining the project.
Here, I took a different approach.
I created a web page and explained the different components in my project. I wrote briefly about how users can interact with the systems I created, and put up links to my code and Medium article.
The entire project can be understood and accessed through just one page, which makes it much easier for people to engage with.
You can check the site out here (view it on a laptop or PC for a better UI experience).
Movie Dashboard
Next, I created a movie dashboard with Tableau.
The steps involved:
Data Collection
I had to collect data from a variety of sources. I also wanted to visualize the Bechdel scores of these movies (a measure of female representation in Hollywood), so I used an API to get that data.
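As a rough illustration of the API step, the sketch below builds a request URL for one movie and pulls a Bechdel score out of a JSON response. The endpoint `https://example.com/bechdel`, the `imdb_id` parameter, and the `title` and `rating` fields are placeholders chosen for this example, not the API actually used in this project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: a placeholder, not the API used in this project.
BASE_URL = "https://example.com/bechdel"


def build_url(imdb_id):
    """Build the request URL for one movie (parameter name is assumed)."""
    return BASE_URL + "?" + urlencode({"imdb_id": imdb_id})


def parse_score(payload):
    """Extract title and score from a JSON response body (field names are assumed)."""
    record = json.loads(payload)
    return {"title": record["title"], "bechdel": int(record["rating"])}


def fetch_score(imdb_id):
    """Call the API over HTTP and return the parsed record."""
    with urlopen(build_url(imdb_id), timeout=10) as response:
        return parse_score(response.read().decode("utf-8"))
```

Keeping the parsing separate from the HTTP call makes it easy to check the parsing logic against a saved response without hitting the network.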
Data Wrangling
I cleaned the data and merged the datasets together. Once I was done, I could finally visualize it!
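The cleaning and merging step might look something like the sketch below. It uses plain Python on two tiny made-up tables. The column names (`title`, `year`, `rating`, `bechdel`) and the sample rows are illustrative assumptions, not the project's real data.

```python
def clean_title(title):
    """Normalise a title so the same movie matches across sources."""
    return " ".join(title.strip().lower().split())


def merge_on_title(movies, scores):
    """Left-join Bechdel scores onto movie rows, matching on cleaned title and year."""
    lookup = {(clean_title(s["title"]), s["year"]): s["bechdel"] for s in scores}
    merged = []
    for row in movies:
        key = (clean_title(row["title"]), row["year"])
        # Movies without a score keep None so the gap stays visible later.
        merged.append({**row, "title": key[0], "bechdel": lookup.get(key)})
    return merged


# Made-up sample rows, for illustration only.
movies = [
    {"title": "  Movie A ", "year": 2001, "rating": 7.9},
    {"title": "Movie B", "year": 2010, "rating": 6.4},
]
scores = [{"title": "movie a", "year": 2001, "bechdel": 3}]

merged = merge_on_title(movies, scores)
```

With pandas, the same join is a single `DataFrame.merge` call with `how="left"`; the dictionary version is used here only to keep the example free of dependencies.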
Data Visualization
Surprisingly, this took up a huge portion of my time compared to other parts of this project.
I spent two days trying to create a visually appealing dashboard.
I first built one as a Python Dash app. I wasn't too satisfied with the layout, so I tried creating a Shiny web app in R instead.
It turned out better than my Dash app, and I loved the functionality. However, I simply didn't find the design appealing.
Finally, I decided to use Tableau. This only took me about an hour to create. If you want to get started with Tableau, you can read this tutorial I created.
You can view my dashboard here (view it on a laptop or PC for a better UI experience).
Recommender System
Finally, machine learning!
I created a simple recommendation system with the same data I used for the dashboard and deployed it with a Dash app.
Just enter a movie name, and the back-end recommendation system generates movie suggestions for you.
I actually wrote this recommendation system when I was just starting to learn machine learning.
I found the code in my Jupyter Notebook, and decided to clean it up a bit to create this simple application.
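The article does not say which algorithm the recommender uses, so the following is only a minimal sketch of one common approach: content-based filtering, which ranks movies by the cosine similarity of their genre tags. The catalogue, the genre tags, and the function names are all made up for illustration.

```python
import math

# Made-up catalogue: movie title -> set of genre tags (illustrative only).
CATALOGUE = {
    "Space Saga": {"sci-fi", "adventure", "action"},
    "Star Quest": {"sci-fi", "adventure"},
    "Love in Paris": {"romance", "drama"},
    "Robot Uprising": {"sci-fi", "action", "thriller"},
}


def cosine_similarity(a, b):
    """Cosine similarity between two tag sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(title, top_n=2):
    """Return the top_n titles most similar to the given movie."""
    if title not in CATALOGUE:
        return []
    target = CATALOGUE[title]
    scored = [
        (cosine_similarity(target, tags), other)
        for other, tags in CATALOGUE.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]
```

In a Dash app, a function like `recommend` would sit behind the text input and run each time a movie name is entered.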
You can take a look at the recommendation system here (view it on a laptop or PC for a better UI experience).
That's it!
Links
Front-End Interface
Movie Dashboard
Recommender System
Code (Apologies, the code is pretty messy. I will clean it up and re-upload it soon.)
I hope you enjoyed this article and found the tips above helpful. Jupyter Notebooks are great, but don't let your projects just sit there.
Use your creativity to create something other people can interact with.
I've seen some incredible projects on GitHub with only one star. On the other hand, I've also seen some really simple projects gain a lot of attention just because of how they were presented.
Most importantly, though, create projects you like to work on and do what you find enjoyable!
Translated from: https://towardsdatascience.com/a-complete-data-science-portfolio-project-ebbced35ea84