Future of a Movie Studio

Data visualization is a key step in any data science project. During exploratory data analysis, visualizing data lets us locate outliers and identify distributions, which helps us control for possible biases in the data early on. Coupled with simple statistical tests, it can also answer many of our questions and help us prioritize areas to focus on.

Here, I will go through some exploratory data analysis and data visualization steps in Python using the Matplotlib and Seaborn libraries. The goal of the project is to analyze movie trends of the past decade and make suggestions for developing a new movie studio brand for a well-established corporation.

Approach

We explored the data with two primary goals in mind.

  1. Building a global brand: We don’t just make movies; we make good movies that appeal to a global audience.

  2. Establishing a sustainable long-term plan: Making a sustainable business plan, not just a movie production plan.

Data Structure

Figure: Our data frame structure

This is the basic structure of our cleaned Pandas data frame. We sourced our data from The Movie Database (TMDB), IMDB, and The Numbers. I recommend using the TMDB API for the preliminary movie data.

Exploration

Initial Setup
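
A minimal setup sketch for the kind of analysis described here. Only the column names genre, budget and glob_gross appear in the article's own code; title and dom_gross, along with every value in this tiny stand-in data frame, are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set_theme(style="whitegrid")  # one consistent look for every plot

# Tiny stand-in for the cleaned data frame (placeholder rows, not real movies).
df = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "genre": ["Action", "Horror", "Animation"],
    "budget": [150e6, 5e6, 90e6],
    "dom_gross": [200e6, 30e6, 180e6],    # assumed name for domestic gross
    "glob_gross": [600e6, 60e6, 520e6],
})
print(df.dtypes)
```

In a real project the data frame would be loaded from the cleaned source files rather than typed in by hand.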

Distribution of Gross Revenue

Let’s start by looking at the distribution of overall gross revenue, both domestic and worldwide. Seaborn’s distplot draws a histogram together with a KDE (kernel density estimate) curve.

We can see that the distribution is strongly right-skewed, which is typical for revenue data. Taking the log transformation of the data helps us see what is happening in the dense region more clearly.

Not surprisingly, the global market yields higher revenue on average. Let’s look at the relationship between budget and revenue.

Budget to Revenue

Now we want to visualize the relationship between production budget and gross revenue, two continuous variables, using scatter plots. There are many ways to do this. Here, I overlaid two scatter plots to look at global and domestic gross revenue together.
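
One way to overlay the two scatter plots with plain Matplotlib is sketched below. The data are synthetic and dom_gross is an assumed column name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
budget = rng.uniform(1e6, 2e8, size=200)
# Synthetic revenues: budget times a random multiplier, larger on average globally.
df = pd.DataFrame({
    "budget": budget,
    "dom_gross": budget * rng.lognormal(0.0, 0.6, size=200),
    "glob_gross": budget * rng.lognormal(0.8, 0.6, size=200),
})

fig, ax = plt.subplots(figsize=(8, 6))
# Two scatter calls on the same axes produce the overlay.
ax.scatter(df["budget"], df["glob_gross"], alpha=0.5, label="Global gross")
ax.scatter(df["budget"], df["dom_gross"], alpha=0.5, label="Domestic gross")
ax.set(xlabel="Production budget (USD)", ylabel="Gross revenue (USD)")
ax.legend()
```

Semi-transparent markers (alpha=0.5) keep the overlapping region readable.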

It seems that a high budget does not always lead to high revenue, especially in the domestic market. Also, some movies yield high revenue on relatively low budgets when they target the global market. Let’s take a closer look at which genres might yield the highest return on investment.

Distribution of Genre

A bar plot shows the percentage of each genre in our dataset.
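
A small sketch of how the genre percentages could be computed and drawn with pandas. The ten rows below are made up so that the arithmetic is easy to follow.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up genre labels: 3 of 10 rows are Action.
df = pd.DataFrame({
    "genre": ["Action"] * 3 + ["Drama"] * 2 + ["Horror"] * 2
             + ["Animation", "Comedy", "Biography"],
})

# normalize=True turns counts into fractions; scale to percent.
genre_pct = df["genre"].value_counts(normalize=True).mul(100).round(1)
print(genre_pct)

ax = genre_pct.plot.bar(figsize=(8, 4))
ax.set(xlabel="Genre", ylabel="Share of movies (%)")
```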

About 30% of the movies in our dataset are action movies.

Revenue to Cost Ratio of Each Genre

Which genres have the highest return on investment?
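
A minimal sketch of the ratio computation, assuming the ratio is global gross divided by budget and is averaged per genre. All figures are made up, and gross_to_budget is a hypothetical column name.

```python
import pandas as pd

# Made-up figures in millions of USD.
df = pd.DataFrame({
    "genre": ["Horror", "Horror", "Action", "Action", "Animation"],
    "budget": [5.0, 10.0, 150.0, 200.0, 100.0],
    "glob_gross": [50.0, 40.0, 450.0, 400.0, 500.0],
})

# Per-movie ratio first, then the mean of those ratios within each genre.
df["gross_to_budget"] = df["glob_gross"] / df["budget"]
ratio_by_genre = (
    df.groupby("genre")["gross_to_budget"].mean().sort_values(ascending=False)
)
print(ratio_by_genre)
```

Averaging per-movie ratios weights every film equally, so a few cheap hits can dominate a genre's mean; dividing total gross by total budget per genre would be a different, budget-weighted measure.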

Based on the ratio of global gross revenue to budget, horror films on average yield the highest return on investment. But this does not necessarily mean that horror movies bring in the most profit. Horror movies may simply cost less to produce, which yields a higher percentage return per dollar spent. We can compare the budgets of each genre using a box plot.
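
The budget comparison can be sketched with sns.boxplot. The budgets below are synthetic, and the "typical" levels are invented only so the boxes sit at different heights.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)
genres = ["Horror", "Action", "Animation"]
# Invented typical budgets (millions USD), for illustration only.
typical = {"Horror": 10, "Action": 120, "Animation": 100}

df = pd.DataFrame({"genre": rng.choice(genres, size=150)})
df["budget"] = df["genre"].map(typical) * rng.lognormal(0.0, 0.4, size=150)

fig, ax = plt.subplots(figsize=(8, 5))
# One box per genre: median, quartiles, and outliers of the budget.
sns.boxplot(data=df, x="genre", y="budget", ax=ax)
ax.set(xlabel="Genre", ylabel="Production budget (millions USD)")

median_budget = df.groupby("genre")["budget"].median()
print(median_budget)
```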

Average Production Budget of Each Genre

As we suspected, horror movies usually require only a small budget. On the other hand, action, animation, and some family films tend to have higher budgets. So which genre yields the most profit? (Here I’m using the term “profit” loosely to mean global gross revenue minus the production budget. In reality, we cannot know the total cost of producing, distributing, and marketing a movie, so we cannot validate this measure.)
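
Under that loose definition, profit is a single column subtraction. A short sketch with made-up figures follows; the profit column name is an assumption.

```python
import pandas as pd

# Made-up figures in millions of USD.
df = pd.DataFrame({
    "genre": ["Animation", "Animation", "Action", "Horror"],
    "budget": [100.0, 80.0, 150.0, 5.0],
    "glob_gross": [500.0, 380.0, 450.0, 40.0],
})

# "Profit" here is global gross minus production budget, nothing more.
df["profit"] = df["glob_gross"] - df["budget"]

profit_by_genre = (
    df.groupby("genre")["profit"]
    .agg(["mean", "median", "count"])
    .sort_values("mean", ascending=False)
)
print(profit_by_genre)
```

Reporting the median and count next to the mean helps show when a genre's average rests on only a few titles.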

Profit of Each Genre

(The code is similar to the above.)

In fact, the genre that usually yields the highest profit is animation, followed by family and action. We can also look at the relationship between production budget and gross revenue for each genre by drawing a linear model plot.
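
A linear model plot of this kind can be drawn with sns.lmplot, which fits one regression line per hue level. The sketch below uses synthetic data with invented slopes per genre.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(3)
n = 60
frames = []
# Invented slopes: how strongly gross follows budget in each synthetic genre.
for genre, slope in {"Action": 3.0, "Animation": 3.5, "Horror": 1.5}.items():
    budget = rng.uniform(5, 200, size=n)
    gross = slope * budget + rng.normal(0, 40, size=n)
    frames.append(pd.DataFrame({"genre": genre, "budget": budget, "glob_gross": gross}))
df = pd.concat(frames, ignore_index=True)

# lmplot returns a FacetGrid; hue gives each genre its own scatter and fit line.
g = sns.lmplot(
    data=df, x="budget", y="glob_gross", hue="genre",
    height=5, aspect=1.4, scatter_kws={"alpha": 0.5},
)
g.set_axis_labels("Production budget (millions USD)", "Global gross (millions USD)")
```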

Looking at the linear model plot, it’s clear that, with very few exceptions, horror movies are low-cost and do not bring in much revenue. Also, the high average profit for adventure films seems to come from a handful of rare successes. The more dependable money-makers appear to be action and animation. Action shows a stronger correlation between budget and gross revenue, while animation seems to allow some big successes on relatively low budgets.

We can simply compute the correlation for each genre to confirm this.

# Pearson correlation between budget and global gross, computed per genre
for g in df['genre'].unique():
    subset = df[df['genre'] == g]
    corr = subset['budget'].corr(subset['glob_gross'])
    print(f"{g}: {round(corr, 2)}")

# Action: 0.74
# Animation: 0.60
# Slightly higher correlation between global gross revenue and budget for action films.

But profit is not everything. As a brand-new studio, we want to build a reputation and raise our brand image to the level of other established studios. This requires making reputable, award-worthy movies as well as popular movies that go viral. Let’s see which genres tend to earn this status.

Ratings

A majority of horror movies don’t get high average ratings on IMDB, while biography and drama films tend to do well. We should investigate which types of biography or drama films are worth investing in. On the other hand, the all-around winner seems to be animation, which often yields both high revenue and high ratings. The only downside is that award opportunities for animated films are relatively slim.

Popularity

Based on the TMDB popularity score, action, adventure, and animation are the most popular genres, while comedy, horror, and biography films tend to be less popular. For building a global brand presence and earning high profit, action, adventure, and animation are good areas to target. We will look at these three genres first.

Superhero Action Films

One thing that stood out in our dataset was that 3 of the top 5 action movies by profit were superhero movies produced by Marvel. The superhero film market has skyrocketed in the past decade and will be a difficult one for a new studio to break into, since most of these films are sequels built on deep-rooted fandoms. So I decided to flag these superhero films based on the names of their writers and directors by adding a new column, ‘superhero’.
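
A sketch of how such a flag column might be built. The article does not list the names or say how writers are stored, so the creator set, the director and writers columns, and the list-per-film layout are all hypothetical.

```python
import pandas as pd

# Hypothetical names; the actual list of writers and directors is not given.
SUPERHERO_CREATORS = {"Director A", "Writer B"}

df = pd.DataFrame({
    "title": ["Film 1", "Film 2", "Film 3"],
    "director": ["Director A", "Director C", "Director D"],
    # Assumed layout: one list of writer names per film.
    "writers": [["Writer X"], ["Writer B", "Writer Y"], ["Writer Z"]],
})

def is_superhero(row) -> bool:
    """Return True if the director or any writer is in the curated set."""
    people = {row["director"], *row["writers"]}
    return not people.isdisjoint(SUPERHERO_CREATORS)

df["superhero"] = df.apply(is_superhero, axis=1)
print(df[["title", "superhero"]])
```

Matching on creator names is only a proxy: it will miss superhero films by other creators and may flag unrelated films by the same people.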

A swarm plot is a good way to look at the distribution of a continuous variable across two categorical variables. Here, we can see that a large share of the high-profit action movies are indeed superhero films. Also, although not depicted here, most of the successful non-superhero films are sequels (for both action and animation). It might be worthwhile to add a sequel feature for deeper analysis.
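
A minimal sns.swarmplot sketch with one continuous variable (profit) and two categorical ones (genre on the x-axis, the superhero flag as hue). The data are synthetic and the column names are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "genre": rng.choice(["Action", "Animation"], size=80),
    "superhero": rng.choice([True, False], size=80, p=[0.25, 0.75]),
    "profit": rng.lognormal(4.5, 0.8, size=80),  # synthetic, millions USD
})

fig, ax = plt.subplots(figsize=(8, 5))
# dodge=True separates the hue levels side by side within each genre.
sns.swarmplot(data=df, x="genre", y="profit", hue="superhero", dodge=True, ax=ax)
ax.set(xlabel="Genre", ylabel="Profit (millions USD)")
```

Swarm plots place every point without overlap, so they work best with at most a few hundred observations per category.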

Action, Animation, Adventure

We can see here that animation on average tends to be more successful, both globally and domestically.

Award-Winning Films

So far we have established that, given a high budget, animation is perhaps a less risky genre to invest in. But we also want to invest in non-animated films to improve our chances of winning awards and establishing a reputation. Earlier we saw that biography and drama films tend to be rated highly.

This plot shows that a higher rating is generally associated with higher profit, but not by much. There also seem to be some drama films that follow a different trend. We should look more closely at the sub-genres of drama films.

A strip plot is a scatter plot for a categorical variable, which adds a bit of horizontal jitter to make the density of values easier to see. It’s hard to observe strong trends here because there are too many categories and not enough observations, other than that many of the drama films have a romance sub-genre.
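
A short sns.stripplot sketch showing the jitter described above. The sub-genre labels, the sub_genre and rating column names, and all values are illustrative assumptions; the article does not say which variable was on the y-axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(7)
sub_genres = ["Romance", "Crime", "History", "War"]  # illustrative labels only
df = pd.DataFrame({
    "sub_genre": rng.choice(sub_genres, size=60),
    "rating": rng.normal(6.8, 0.7, size=60).clip(1, 10),  # synthetic ratings
})

fig, ax = plt.subplots(figsize=(8, 4))
# jitter spreads points horizontally so overlapping values stay visible.
sns.stripplot(data=df, x="sub_genre", y="rating", jitter=0.25, alpha=0.6, ax=ax)
ax.set(xlabel="Drama sub-genre", ylabel="Rating")

# Counts per category show how thin each group is.
counts = df["sub_genre"].value_counts()
print(counts)
```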

A simple t-test showed a statistically significant difference in average IMDB rating between drama and biography films (p < 0.01), but not in profit or budget. So we should focus on making biography films instead.
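
A two-sample t-test of this kind can be run with scipy.stats.ttest_ind. The sketch uses synthetic ratings with invented group means, and it assumes Welch's variant (equal_var=False); the article does not say which variant was used.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(5)
# Synthetic ratings; the group means are invented for illustration.
df = pd.DataFrame({
    "genre": ["Drama"] * 50 + ["Biography"] * 50,
    "rating": np.concatenate([
        rng.normal(6.0, 0.5, 50),
        rng.normal(7.0, 0.5, 50),
    ]),
})

drama = df.loc[df["genre"] == "Drama", "rating"]
bio = df.loc[df["genre"] == "Biography", "rating"]

# Welch's t-test: does not assume the two groups share a variance.
t_stat, p_value = stats.ttest_ind(bio, drama, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")
```

The same call can be repeated for the profit and budget columns to compare the two genres on those measures.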

Monthly Trend

Lastly, we used line plots to look at the best time of year to release a movie to maximize profit.
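
A sketch of a monthly trend line: group by release month, average the revenue, and plot it. The release_date column name and the synthetic data (one release per month over ten years) are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
# One synthetic release on the first of each month for ten years.
dates = pd.date_range("2010-01-01", periods=120, freq="MS")
df = pd.DataFrame({
    "release_date": dates,
    "glob_gross": rng.lognormal(4.5, 0.7, size=120),  # millions USD
})

# Average gross by calendar month, pooled across all years.
df["month"] = df["release_date"].dt.month
monthly = df.groupby("month")["glob_gross"].mean()

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(monthly.index, monthly.values, marker="o")
ax.set_xticks(range(1, 13))
ax.set(xlabel="Release month", ylabel="Mean global gross (millions USD)")
```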

Looking at the annual trend, we can see that movies released from April to June tend to yield the highest revenue. This would be a great time to release our globally appealing animation.

Highly acclaimed movies are released close to the end of the year, during “Oscar season”, to maximize their exposure to critics. We recommend releasing our award-worthy biography films during this time to raise our brand to the level of other established studios.

Conclusion

We reviewed movie data from the past decade to propose a few recommendations and guidelines for starting a movie studio. Horror movies yield the highest percentage return on investment and require only a small budget. But horror is not a good genre to start with, as it is usually neither popular nor highly rated and does not bring in high revenue. To maximize profit and develop a global presence, we encourage investing in animated films. To target awards and raise the brand’s reputation, we suggest making biography films. We also suggest an annual plan that coordinates production of these two separate lines of films (profitable animation and award-worthy biography).

For a more in-depth look at the process, you can check out the project’s GitHub page. This project was done in collaboration with my colleague Paul Torres.

Source: https://medium.com/swlh/future-of-a-movie-studio-29a65fcf48c
