Notes on Learning Data Science
When you try to learn anything entirely on your own, it is easy to lose motivation and get thrown off track.
In this article, I will share some tips that helped me stay focused on my data science journey.
The Problem
There are just too many resources available online.
Data science is a very deep field, with branches in statistics, mathematics, programming, and development.
Because of this, it is very easy to get sidetracked during the learning process.
There are so many online courses that promise to make you a data scientist in three months, and many students end up in tutorial hell.
However, taking ten of these online courses will not make you a data scientist. You will need to hone your skills in each of these areas, which means that you have to create personal projects and read more.
All of this takes time.
If you are working a full-time job or are a university student (I do both), you will need to find a way to manage your study time. If you don’t do this well, you will end up frustrated and eventually give up on trying to learn.
The Solution?
Have an End Goal
After trial and error, I have found that having an end goal really works for me.
I always make it a point to learn something new, and give myself a time frame to do it.
For example
I want to enhance my data collection skills and learn web scraping. I give myself one day to learn the basics of web scraping.
Then, I allocate two days to create a complete web scraping project.
In three days, I will have learnt something new: how to scrape data from the Internet. This will also deepen my knowledge of Python libraries.
When I didn’t give myself an end goal like this, I found my focus constantly shifting.
I used to get really excited about starting new things, but never ended up completing any of them. This made me feel like I wasn’t actually learning anything, and led to a lot of frustration.
Time Management
When I first started learning data science, I could spend around eight hours a day just studying.
However, university classes have started now. I am also doing a data science internship, which takes up a large portion of my time.
Even with all this, I still make it a point to put aside time to study and learn new things.
I like to put aside at least two hours a day on weekdays, and four hours a day on weekends, to study.
However, I found that I tend to waste a lot of this time because my focus jumps easily from one thing to another.
To prevent that from happening, and to make sure I’m actually getting things done every day, I use a Trello board to keep track of my tasks.
Here is an example, my Trello board for today:
I strongly suggest you create a Trello board, or simply write down a list of things you have to get done each day.
You might not end up finishing all of it, but it will help you keep track of how much you’ve achieved each day.
Remember, you don’t have to put too much pressure on yourself every day and burn yourself out. Even if you got one thing done today that you didn’t yesterday, it is progress.
Focus on Learning the Technique
I mention this a lot, but I think this point needs reiterating. Always focus on learning a technique, rather than the tools you can use to achieve it.
Let’s go back to the web scraping example.
There are many different languages that can be used to scrape the Internet. Even in Python, there is a large variety of libraries that can do the job: BeautifulSoup, Scrapy, Selenium, etc.
More important than the library you use, however, is the technique of web scraping. Once you learn the technique, you can quickly learn different tools to get the job done.
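To make the point about technique concrete, here is a minimal sketch of the core idea behind most scrapers: walk the HTML and pull out the elements you care about. It uses only Python’s standard-library `html.parser` rather than any of the libraries named above, and the `SAMPLE_HTML` snippet, the `LinkTextParser` class, and the `extract_links` helper are hypothetical names made up for illustration. A real project would first download the page and would likely use a dedicated library for the parsing.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect (link text, href) pairs from every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> tag we are currently inside
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return all (text, href) pairs found in an HTML string."""
    parser = LinkTextParser()
    parser.feed(html)
    return parser.links


# Hypothetical page content, standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li><a href="/movies/1">The Matrix</a></li>
  <li><a href="/movies/2"> Spirited <b>Away</b> </a></li>
  <li><a>No link here</a></li>
</ul>
"""

if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, "->", href)
```

The same three steps (get the markup, locate the elements, extract and tidy the values) carry over to whichever scraping library you pick up later; only the function names change.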
After learning the technique, make sure to apply it.
Create projects using the techniques you learnt. Use what you learnt in a variety of real-life scenarios, since this is where you will learn the most. Just following tutorials won’t get you very far, since you won’t really learn any topic in depth.
Learning is a Joy: Don’t Fear It
As mentioned above, data science is a very large field with branches in mathematics, statistics, and programming.
Learning even one of these topics can be daunting. There is just so much to know. People dedicate their entire lives to learning these individual topics.
As a beginner data scientist, it is easy to get overwhelmed by the sheer amount of things you need to know.
This can lead to anxiety, and the fear that you are never going to reach your end goal of learning data science.
To get over this fear, you first need to embrace the learning curve. Remember that everybody started somewhere.
Break down your end goal of learning data science into smaller chunks. Create a list of daily goals: things that you want to know by the end of each day.
When you do this, you will realize that you are making progress and learning something new every day. This will motivate you to continue your data science learning journey.
Breaking Down Large Tasks
If you have read any of my previous articles, you will know that I always advise aspiring data scientists to create projects. Creating personal projects helps your resume stand out, and shows your interest in the subject.
I always get questions from aspiring data scientists on getting started with data science projects, such as this one:
“How can I get started with creating data science projects? Online courses only teach us concepts. How do we apply these concepts in real-life scenarios and create an end-to-end project?”
If you have the same question, I understand exactly what you are going through.
I remember completing courses in statistics, data science, and programming, and thinking to myself, “I am now ready to start my own project.”
I was excited, and ready to apply what I learnt to real-life situations!
However, I didn’t know where to start. I searched for sample data science projects, and found some really amazing stuff on the Internet.
I saw people deploy complex machine learning algorithms with an interactive interface. I saw systems like a “fake news detector”: all you had to do was enter a URL, and it would predict whether or not the news was fake.
I was impressed, but wondered where they learnt how to do those things. I called it “real world stuff,” and was disappointed that there was no online course to teach me how to do it.
Over time, I learnt that the only way to start these projects was to just start.
Here are the steps I take when creating a data science project:
Have an idea: Come up with an idea first. Choose something that excites you, such as a Spotify music analysis project.
Break it down: Think of the different steps you will have to take to complete the project. In this case, it would be:
- Data Collection: Think about where you are going to get the data from. You might need to build a web scraper, or use an API. Google is your best friend in this case, and you will learn these skills along the way. Give yourself a deadline to complete this task.
- Data Analysis: Next, you will have to come up with a way to analyze this data. If it was scraped from the web, it is going to be messy. You will need to clean it first, and store it in a data frame.
- Data Visualization: Real-world datasets are very different from the kinds of data handed to you on Kaggle. Visualizing this data usually takes a bit of work. You will need to change variable types, and play around with different tools to get your desired result.
- Deployment: If you choose to deploy your project, you will need some knowledge of web development. There are a lot of tutorials out there on creating a user interface for your models and deploying them, and you will figure it out along the way.
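The cleaning and type-conversion work mentioned under Data Analysis and Data Visualization can be sketched roughly as follows. This is an illustration under assumptions, not the code from any actual project: the `RAW_ROWS` records, the field names, and the `Track` class are all hypothetical, and the sketch uses only the standard library. The cleaned records could then be loaded into a data frame with whichever library you prefer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    """One cleaned record with proper types (hypothetical schema)."""
    title: str
    duration_seconds: Optional[int]
    plays: Optional[int]


def parse_duration(raw):
    """Turn 'm:ss' into seconds; return None when the value is unusable."""
    parts = raw.strip().split(":")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        return None
    return int(parts[0]) * 60 + int(parts[1])


def parse_count(raw):
    """Turn '1,234,567' into 1234567; return None when the value is unusable."""
    digits = raw.strip().replace(",", "")
    return int(digits) if digits.isdigit() else None


def clean_rows(rows):
    """Drop rows without a title, de-duplicate by title, and fix the types."""
    seen = set()
    cleaned = []
    for row in rows:
        title = row.get("title", "").strip()
        if not title or title.lower() in seen:
            continue
        seen.add(title.lower())
        cleaned.append(
            Track(
                title=title,
                duration_seconds=parse_duration(row.get("duration", "")),
                plays=parse_count(row.get("plays", "")),
            )
        )
    return cleaned


# Made-up scraped rows: every value arrives as a messy string.
RAW_ROWS = [
    {"title": "  Song A ", "duration": "3:45", "plays": "1,234,567"},
    {"title": "song a", "duration": "3:45", "plays": "1,234,567"},  # duplicate
    {"title": "Song B", "duration": "n/a", "plays": "982"},         # bad duration
    {"title": "", "duration": "2:10", "plays": "10"},               # no title
]

if __name__ == "__main__":
    for track in clean_rows(RAW_ROWS):
        print(track)
```

Keeping unusable values as `None` instead of guessing makes the gaps visible later, when you decide how to plot or summarize the data.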
My previous data science project was a simple movie recommender system and dashboard with a front-end UI. It started with an idea, and I drew out the flow of the project.
If you take a look at the end product, you will see that my project is pretty similar to what I drew.
This is because I first had an idea, and then proceeded to break it down into different parts. I drew those parts out, and gave myself some time to complete each of them.
Breaking down large projects into simple tasks is very important. This way, you are slowly working towards your end goal one part at a time. You have some kind of direction on how to proceed. Most importantly, you will not get overwhelmed by the task at hand.
I understand that it is easy to get sidetracked and lose motivation when learning a subject as deep as data science, especially if you’re already working a full-time job and have other commitments.
Venturing into a completely new field and having to learn everything on your own can be daunting.
If you feel overwhelmed, or feel like you are losing interest, remember why you started your data science journey in the first place.
That’s it!
You don’t learn to walk by following the rules. You learn by doing, and by falling over.
Translated from: https://medium.com/swlh/how-to-stay-motivated-when-learning-data-science-ccab719ae7c1