Spotify data analysis
Spotisis /spo-ti-sis/ noun The analysis of one’s Spotify streaming history using Python.
I was reading through a lot of data science guides and project ideas when I came across an article in which the author compared his song choices with his friend’s. I wanted to do something similar, so I set out to analyse my own streaming history and compare it with what the world listens to.
Through this, I aim to find out more about my music preferences and how they differ from the world’s general picks.
I never really put much thought into my music preference before this project. It always depended on my mood, and when someone asked me what type of music I like, I had no answer, because it varied from one hour to the next.
I’ve split this project into two sections:
Part A is the analysis of my music streaming history:
- Timeline of my streaming history
- Day preference
- Favorite artist
- Favorite songs
- Spirit of the songs
- Diversity
Part B is the comparison of my 50 most-streamed songs with the top 50 songs streamed in 2019.
The data
Spotify allows every user to request a download of all their streaming history, so Part A is completely dependent on that. They also have an amazing Developer Platform through which the public can use the available data for their own projects. Along with my personal data, I used the audio features option, which breaks down a song and gives it a score for a number of different attributes. The attributes are as follows:
Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
Danceability: A description of how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
Energy: A measure from 0.0 to 1.0 that represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy.
Instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content.
Liveness: Detects the presence of an audience in the recording.
Loudness: The overall loudness of a track in decibels (dB). Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
Speechiness: Detects the presence of spoken words in a track.
Valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track.
Tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
Mode: Indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor by 0.
Key: The estimated overall key of the track.
The dataset was a little messy, so I used Pandas to clean it up according to what each section needed. The entire code can be found at the GitHub link at the end of this article.
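As a rough illustration of that kind of cleanup (not the code from the GitHub repository), here is a minimal sketch. It assumes each play in the export is a record with `endTime`, `artistName`, `trackName` and `msPlayed` fields; the sample rows, the `Track A` name and the `clean_history` helper are made up for the example.

```python
import pandas as pd

# Hypothetical rows, shaped like the entries assumed for the streaming history export.
SAMPLE_ROWS = [
    {"endTime": "2020-02-24 08:15", "artistName": "Lauv", "trackName": "Track A", "msPlayed": 252000},
    # Exact duplicate of the row above, to show de-duplication.
    {"endTime": "2020-02-24 08:15", "artistName": "Lauv", "trackName": "Track A", "msPlayed": 252000},
    {"endTime": "2020-02-24 08:20", "artistName": "Maroon 5", "trackName": "Memories", "msPlayed": 3000},
]


def clean_history(rows):
    """Return a tidy DataFrame with parsed timestamps and minutes played."""
    df = pd.DataFrame(rows).drop_duplicates().reset_index(drop=True)
    df["endTime"] = pd.to_datetime(df["endTime"])  # text -> datetime64
    df["minutes"] = df["msPlayed"] / 60_000        # milliseconds -> minutes
    return df


history = clean_history(SAMPLE_ROWS)
print(history[["endTime", "artistName", "minutes"]])
```

With a real export, the rows would come from the downloaded JSON file (for example via `pd.read_json`) instead of `SAMPLE_ROWS`.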
For Part B, I used this dataset from Kaggle.
Before we begin, I just want to say something… Don’t come at me for my music choice!
Part A
1. Timeline of my streaming history
I know that I spend a lot of time listening to music, but I didn’t know I spent that much time! The data dates back to late June 2019 and varies a lot from day to day.
On February 24th 2020, I spent a staggering 535 minutes (almost 9 hours) on Spotify, the most in the past year! There’s no definite answer as to why the gap between the highest and lowest daily totals (the lowest was measured in seconds) is so large, but I did register for Spotify Premium around that time, so maybe that was the reason? Push the promos harder, you guys ;)
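A timeline like this boils down to summing the play time per calendar day. The sketch below shows one way to do that with Pandas, assuming the same `endTime` and `msPlayed` columns as the export; the three sample plays and the `daily_minutes` helper are hypothetical.

```python
import pandas as pd

# Hypothetical plays spread over two days.
plays = pd.DataFrame({
    "endTime": ["2020-02-23 21:00", "2020-02-24 09:00", "2020-02-24 18:30"],
    "msPlayed": [180_000, 240_000, 300_000],
})


def daily_minutes(df):
    """Total listening minutes per calendar day, sorted by date."""
    timestamps = pd.to_datetime(df["endTime"])
    minutes = df["msPlayed"] / 60_000
    # Group the minutes by the date part of each timestamp.
    return minutes.groupby(timestamps.dt.date).sum().sort_index()


totals = daily_minutes(plays)
busiest_day = totals.idxmax()  # the day with the highest total
print(busiest_day, totals.max())
```

The highest and lowest daily values mentioned above would then be `totals.max()` and `totals.min()`.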
2. Day preference
Does the day of the week affect how long I spend listening to music?
I usually listen to music while walking to and from college, so I would have predicted more listening time on weekdays. But Sunday is my chill day, so it makes sense that it is when I spent the most time listening to music.
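The weekday question can be answered with the same grouping idea, keyed on the day name instead of the date. This is only a sketch under the same column assumptions; the sample plays, `DAY_ORDER` and `minutes_by_weekday` are illustrative names.

```python
import pandas as pd

DAY_ORDER = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical plays: two on Sundays, one on a Monday.
plays = pd.DataFrame({
    "endTime": ["2020-02-23 21:00", "2020-02-24 09:00", "2020-03-01 11:00"],
    "msPlayed": [180_000, 240_000, 300_000],
})


def minutes_by_weekday(df):
    """Total listening minutes per weekday, in calendar order."""
    timestamps = pd.to_datetime(df["endTime"])
    minutes = df["msPlayed"] / 60_000
    totals = minutes.groupby(timestamps.dt.day_name()).sum()
    # Days with no plays are filled with 0 so every weekday appears.
    return totals.reindex(DAY_ORDER, fill_value=0.0)


by_day = minutes_by_weekday(plays)
print(by_day)
```

Reindexing against `DAY_ORDER` keeps the days in calendar order, which is usually what a bar chart of this kind needs.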
3. Favorite artists
Do I have a favorite artist?
According to the data, I actually do. There were two factors I considered: the number of times I played an artist’s songs and the total amount of time I spent listening to them.
When looking through the data, I found that some of the songs were played for only a few seconds, which reduces the accuracy of the results.
The graphs below show the top 15 artists under both categories.
Lauv, Shawn Mendes, One Direction and Justin Bieber held the top 4 positions in both graphs, whereas the remaining artists appeared in a different order.
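One way to handle the very short plays is to drop them before ranking. The sketch below ranks artists by both play count and minutes; the 30-second cut-off is an assumption chosen for illustration (the article does not state a threshold), and the sample plays and `top_artists` helper are hypothetical.

```python
import pandas as pd

# Hypothetical plays; two of them last only a few seconds.
plays = pd.DataFrame({
    "artistName": ["Lauv", "Lauv", "Lauv", "Shawn Mendes", "One Direction"],
    "msPlayed": [200_000, 210_000, 4_000, 400_000, 5_000],
})


def top_artists(df, n=15, min_ms=30_000):
    """Rank artists by play count and by minutes, ignoring very short plays.

    min_ms is an assumed cut-off, not a value taken from the article.
    """
    kept = df[df["msPlayed"] >= min_ms]
    grouped = kept.groupby("artistName")["msPlayed"]
    by_plays = grouped.count().sort_values(ascending=False).head(n)
    by_minutes = (grouped.sum() / 60_000).sort_values(ascending=False).head(n)
    return by_plays, by_minutes


by_plays, by_minutes = top_artists(plays)
print(by_plays)
print(by_minutes)
```

Comparing the two rankings side by side shows which artists rank high because of many plays and which because of long listening time.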
4. Which songs were played most?
Was it by the same 15 artists?
Yes, it was. Lauv took 5 of the 15 spots!
I realised that some of the top 15 artists (based on the amount of time spent listening to their songs) were on the list because of one or two songs that were repeated multiple times.
For example, Memories by Maroon 5 was the most played song (played for a total of 184 minutes). Compared to the total time spent listening to the group (430 minutes), the difference was 246 minutes. In percentage terms, more than 40% of the time I spent listening to Maroon 5 went to Memories alone.
It’s a good song. Admit it.
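The Maroon 5 arithmetic generalises to any song: divide the song's play time by its artist's total. A minimal sketch, using the 184 and 246 minute figures from above and lumping everything other than Memories into one placeholder row; the `song_share_of_artist` helper and the placeholder track name are hypothetical.

```python
import pandas as pd

# Memories: 184 minutes; all other Maroon 5 listening: 430 - 184 = 246 minutes.
plays = pd.DataFrame({
    "artistName": ["Maroon 5", "Maroon 5"],
    "trackName": ["Memories", "Other tracks (combined)"],
    "msPlayed": [184 * 60_000, 246 * 60_000],
})


def song_share_of_artist(df):
    """Per-song play time and its share of the artist's total, in percent."""
    per_song = df.groupby(["artistName", "trackName"], as_index=False)["msPlayed"].sum()
    # Broadcast each artist's total back onto that artist's songs.
    per_song["artist_ms"] = per_song.groupby("artistName")["msPlayed"].transform("sum")
    per_song["share_pct"] = 100 * per_song["msPlayed"] / per_song["artist_ms"]
    return per_song.set_index("trackName")


shares = song_share_of_artist(plays)
memories_share = shares.loc["Memories", "share_pct"]
print(f"{memories_share:.1f}%")  # 184 / 430 of the total
```

A high share for a single track is a sign that the artist ranks well because of one song on repeat rather than a broad catalogue.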
5. Spirit of the song
Do I listen to positive songs?
Using the valence attribute from Spotify’s audio features, I tried to find out the general spirit of the top 50 songs I listen to. The valence scale runs from 0 to 1, with 1 conveying the most positiveness.
For the sake of classification:
- low spirit: 0 ≤ valence < 0.5
- neutral: 0.5 ≤ valence < 0.6
- high spirit: 0.6 ≤ valence ≤ 1
(I named it ‘spirit’ because ‘positive’ and ‘negative’ didn’t feel right.)
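The three valence bands translate directly into a small function. This sketch follows the thresholds listed above; the `spirit` function name and the sample valence values (other than 0.18) are made up for illustration.

```python
from collections import Counter


def spirit(valence):
    """Map a valence score in [0, 1] to one of the three spirit bands."""
    if not 0.0 <= valence <= 1.0:
        raise ValueError(f"valence must be between 0 and 1, got {valence}")
    if valence < 0.5:
        return "low spirit"
    if valence < 0.6:
        return "neutral"
    return "high spirit"


# Hypothetical valence scores for a handful of songs.
sample_valences = [0.18, 0.42, 0.55, 0.61, 0.90]
counts = Counter(spirit(v) for v in sample_valences)
print(counts)
```

Counting the labels over the top 50 songs gives the kind of breakdown discussed in this section.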
I was pretty unsure about this one and was utterly surprised by the results.
So I listen to more low spirit songs?? That doesn’t make sense!
When I cross-referenced the song names with their valence scores, I realised that this may not have been the most accurate representation. Ed Sheeran’s Photograph had a valence score of 0.18, so it was categorised as ‘low spirit’. Although it’s not a super high spirited song, it’s not so low either!
6. Diversity of songs
How do the audio features of the songs compare to one another?
The spirit of the songs made me curious about how the songs vary from one another in terms of their audio features, so I compared my top 3 most played songs. I believe that my song choices are highly diverse.
Those who are familiar with these songs know just how much they vary from one another. They give such different vibes, but I needed the data to prove it.
There is A LOT of difference, most noticeable in the loudness and acousticness attributes.
The next part builds on this diversity.
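One practical detail when comparing features side by side: loudness is in decibels (roughly -60 to 0), while most other features run from 0 to 1. The sketch below rescales loudness onto 0 to 1 and measures the spread of a feature across songs. The rescaling is an assumption for illustration, not necessarily what was done for the chart, and the three songs and their values are invented.

```python
def rescale_loudness(db, floor=-60.0):
    """Map loudness in dB from [floor, 0] onto [0, 1], clipping outliers."""
    clipped = min(max(db, floor), 0.0)
    return (clipped - floor) / -floor


def feature_spread(songs, feature):
    """Difference between the largest and smallest value of one feature."""
    values = [song[feature] for song in songs]
    return max(values) - min(values)


# Hypothetical feature values for three songs.
songs = [
    {"name": "Song A", "acousticness": 0.10, "loudness": -4.0},
    {"name": "Song B", "acousticness": 0.80, "loudness": -10.0},
    {"name": "Song C", "acousticness": 0.45, "loudness": -7.0},
]
for song in songs:
    song["loudness_01"] = rescale_loudness(song["loudness"])

print(feature_spread(songs, "acousticness"))
print(feature_spread(songs, "loudness_01"))
```

A larger spread for a feature means the songs differ more on that attribute.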
Part B
Is my music too diverse? How does it fare when compared to the global top 50?
Apart from the mode, everything is different! I prefer less groovy, instrumental-based songs with lower energy levels, while the global hits suggest people lean towards fast paced, energetic songs that they can dance to.
The difference between my music’s average tempo (beats per minute) and the global average is 4 BPM. According to research, songs at 120 BPM are considered fast paced. My preference seems to be a slightly slower pace, though not by much.
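A comparison like this comes down to the difference between two sets of feature averages. Here is a minimal sketch with Pandas; the two small tables hold invented numbers (not the actual top 50 data), and `mean_feature_gap` is a hypothetical helper.

```python
import pandas as pd

# Invented feature values standing in for the two top 50 lists.
mine = pd.DataFrame({"tempo": [110.0, 118.0, 120.0], "energy": [0.4, 0.5, 0.6]})
world = pd.DataFrame({"tempo": [116.0, 120.0, 124.0], "energy": [0.6, 0.7, 0.8]})


def mean_feature_gap(personal, global_top, features):
    """Average of each feature in the personal list minus the global average."""
    gap = personal[features].mean() - global_top[features].mean()
    return gap.rename("mine_minus_global")


gap = mean_feature_gap(mine, world, ["tempo", "energy"])
print(gap)
```

A negative value means the personal list sits below the global average for that feature, for example a slower average tempo.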
Conclusion
This project was a blast to do. I thoroughly enjoyed learning more about my music preferences and comparing them to the global hits. Now that I am backed by the data, I can say that my music is highly diversified and that I do have a favourite artist: Lauv (considering the amount of time I’ve spent listening to his songs, it wouldn’t be justified to say otherwise!).
Following this article, I would like to continue by applying some machine learning knowledge to create a recommender system based on my music preferences.
Feel free to comment and view the entire code on my GitHub!
Big thanks to Vlad Gheorghe for his brilliant explanation (a huge lifesaver!)
Translated from: https://medium.com/swlh/analysis-of-my-spotify-streaming-history-57a6088c3d3