If someone asked you to define Power Query, what would you say? If you've ever worked with Power BI, you have certainly used Power Query, even if you weren't aware of it. One could easily say that Power Query is the "heart and soul" of Power BI.
In more official wording, Power Query is Microsoft’s technology for connecting and transforming data from multiple sources. As Microsoft’s official documentation states, you can connect to hundreds of different data sources and perform more than 300 transformations on your data.
The key advantage of Power Query is that you can perform complex data transformations with little or no coding skills! Additionally, every step you apply during the data transformation process is saved, so each time you refresh your dataset, those steps are automatically reapplied to shape your data, which is a real time-saver.
Out of those 300+ transformations, it's extremely hard to choose the most useful ones, but I will share my top 3 tips related to Power Query (and its powerful M language). It is also worth learning tips unrelated to Power Query for boosting Power BI development.
Tip #1: Time Savers in Power Query Editor
I need to say this right away: there are tons of time-saving actions you can perform with clever use of Power Query Editor, so I will narrow my recommendations to the few I use most often.
I bet that you face this scenario almost every time you are preparing data for your Power BI report. You import a wide table with a lot of columns and you need to get rid of some of them. You are scrolling from left to right, choosing which columns to keep and which to remove.
But there is a much smarter way to achieve this:
As you can see in the illustration above, instead of exhausting scrolling, just open the Choose Columns drop-down menu, select Choose Columns, and tick the columns you want to keep! Soooo handy!
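Behind the scenes, Choose Columns is recorded as a single M step. A minimal sketch of what that step looks like, assuming a hypothetical customer table (the table and its column names are invented here for illustration):

```powerquery
let
    // Hypothetical wide table standing in for an imported source
    Source = #table(
        {"CustomerKey", "FirstName", "LastName", "BirthDate", "Phone"},
        {{1, "Ana", "Lee", #date(1985, 3, 14), "555-0100"}}
    ),
    // Keep only the listed columns; every other column is dropped
    #"Removed Other Columns" = Table.SelectColumns(Source, {"CustomerKey", "BirthDate"})
in
    #"Removed Other Columns"
```

Because the step lists the columns to keep rather than the ones to remove, columns added to the source later will not silently flow into the model.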
Another tip under the same drop-down menu: choose Go To Column and you will be taken straight to that column, so you can apply any transformation you want to it, again without wasting time hunting for it in wide "30+ column" tables.
Another hidden gem is the “Query Dependencies” button under the View tab.
This is extremely useful when working with complex models where data comes from multiple sources, or where not all data is loaded into the report.
Using Query Dependencies will give you a quick visual overview of your data model:
Instead of clicking on every single entity in your data model to check its status, you can have this all in one place, and even better, it’s visually represented!
Imagine having data coming from CSV files, a SQL Server database, and SharePoint lists, with part of that data not even being loaded into the report for whatever reason. This is a huuuuge time saver!
Tip #2: Use M Language to Perform Frequent Calculations
One of the most common business requests is to calculate the number of days between different events. For example, I want to know the age structure of my customers, so I need to calculate their age every time the data is refreshed.
Or I need to check how many days late customers are with their payments. As you can imagine, these figures need to be calculated dynamically, so here comes the M language to the rescue!
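For the late-payment case, a minimal sketch could look like the following. The Invoices table and its DueDate column are hypothetical names chosen for illustration; the calculation simply subtracts the due date from today's date on every refresh:

```powerquery
let
    // Hypothetical invoice table; replace with your own source
    Invoices = #table(
        type table [InvoiceId = Int64.Type, DueDate = date],
        {{1, #date(2020, 1, 15)}, {2, #date(2020, 2, 1)}}
    ),
    // Evaluated at refresh time, so the result stays current
    Today = Date.From(DateTime.LocalNow()),
    // Subtracting two dates yields a duration; Duration.Days extracts whole days
    #"Added Days Late" = Table.AddColumn(
        Invoices,
        "Days Late",
        each Duration.Days(Today - [DueDate]),
        Int64.Type
    )
in
    #"Added Days Late"
```

Note that invoices with a due date in the future would get a negative value here; whether to clamp those to zero depends on the report.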
Let's demonstrate this by calculating customers' ages. Basically, there are two ways to achieve this in Power Query: the first doesn't need any coding, but it requires multiple steps. Therefore, I prefer the second option, where you put the whole calculation into one step!
The method visualized above requires three separate steps. First, we insert a new column and, under the Date drop-down menu, choose the Age option. However, Power Query calculates Age as the number of days from BirthDate until today. Therefore, we need to convert this awkward number to years, which is done by selecting Total Years under the Duration drop-down menu. Again, we get an awkward result, because the age is displayed as a number with many decimal places (this is correct, but not intuitive). The last step is to round that number down, which is done under Rounding.
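For reference, the three clicks above translate into roughly the following M steps. This is a sketch of what the editor typically records, assuming a hypothetical table with a BirthDate column; the exact step names and generated code may differ slightly between versions:

```powerquery
let
    // Hypothetical customer table used only for illustration
    Source = #table(
        type table [Customer = text, BirthDate = date],
        {{"Ana", #date(1985, 3, 14)}}
    ),
    // Step 1: Date > Age produces a duration (today minus BirthDate)
    #"Inserted Age" = Table.AddColumn(
        Source, "Age",
        each Date.From(DateTime.LocalNow()) - [BirthDate],
        type duration
    ),
    // Step 2: Duration > Total Years divides the total days by 365
    #"Calculated Total Years" = Table.TransformColumns(
        #"Inserted Age",
        {{"Age", each Duration.TotalDays(_) / 365, type number}}
    ),
    // Step 3: Rounding > Round Down drops the decimal part
    #"Rounded Down" = Table.TransformColumns(
        #"Calculated Total Years",
        {{"Age", Number.RoundDown, Int64.Type}}
    )
in
    #"Rounded Down"
```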
Now, nothing is wrong with this approach, but if you are doing multiple calculations, your Applied Steps pane will end up cluttered with many unnecessary steps.
That's why I prefer another option: under the Add Column tab, choose Custom Column and enter the following formula:
Number.RoundDown(Duration.TotalDays(Date.From(DateTime.LocalNow()) - [BirthDate])/365)
This way, we perform all the operations from the previous version in one go, and only one step is applied! Mission accomplished in a more elegant way.
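One caveat: dividing by 365 ignores leap days, so the result can be off by one for a few days around a birthday. If that matters, a possible alternative is to compare calendar parts instead. The sketch below is my own variation, not part of the original formula; the AgeInYears helper and the sample table are hypothetical:

```powerquery
let
    // Hypothetical customer table used only for illustration
    Source = #table(
        type table [Customer = text, BirthDate = date],
        {{"Ana", #date(1985, 3, 14)}}
    ),
    Today = Date.From(DateTime.LocalNow()),
    // Hypothetical helper: full years elapsed since the birth date
    AgeInYears = (birth as date) as number =>
        let
            YearDiff = Date.Year(Today) - Date.Year(birth),
            // True when this year's birthday has not happened yet
            BirthdayPending =
                Date.Month(Today) < Date.Month(birth)
                or (Date.Month(Today) = Date.Month(birth)
                    and Date.Day(Today) < Date.Day(birth))
        in
            if BirthdayPending then YearDiff - 1 else YearDiff,
    #"Added Age" = Table.AddColumn(Source, "Age", each AgeInYears([BirthDate]), Int64.Type)
in
    #"Added Age"
```

This still keeps the calculation in a single applied step on the table, at the cost of a slightly longer expression.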
Tip #3: Custom Flexible Date Dimension
This one is my favorite! At first, I planned to dedicate a separate post to this, but in the end I decided to put it here, since I've already written a whole series on proper handling of the Date dimension.
I won't spend much time explaining the importance of having a separate Date dimension (that is the topic of another article). I will just briefly say: technically, Power BI allows you to "survive" without a separate Date dimension, but don't do it! Just don't…
There are multiple ways to create a separate Date dimension in your data model, but here I will focus on using the M language to achieve this.
There are plenty of ready-made scripts on the web for creating a fully functional Date dimension, but I’ve chosen this solution from Reza Rad (by the way, on his blog you can learn a lot of useful stuff).
Open a new Power BI file and choose Blank query under Get data:
This will take you to Power Query Editor. The next step is of key importance for getting a highly customizable Date dimension.
Under Manage Parameters, select New Parameter and configure it as shown in the following image:
This way, you define the year from which you want your Date dimension to start. Do exactly the same for EndYear:
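If you open a parameter in Advanced Editor, you will see that it is just a query holding a single value plus some metadata. A sketch of what the StartYear parameter may look like, assuming a Decimal Number type and an example value of 2015 (the value is a placeholder, and the exact metadata fields the editor writes may vary by version):

```powerquery
// StartYear parameter query; EndYear would be a second query of the same shape
2015 meta [IsParameterQuery = true, Type = "Number", IsParameterQueryRequired = true]
```

Since the parameter evaluates to a plain number, the script can pass it straight into #date(StartYear, 1, 1).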
Now that we have both parameters defined, we can switch to Advanced Editor and paste Reza's whole script for creating the specific columns of our Date dimension (of course, feel free to add or remove columns according to your needs).
And here is the whole script:
let
    // StartYear and EndYear are the parameters defined above
    StartDate = #date(StartYear,1,1),
    EndDate = #date(EndYear,12,31),
    NumberOfDays = Duration.Days( EndDate - StartDate ),
    // +1 so that EndDate itself is included in the list
    Dates = List.Dates(StartDate, NumberOfDays+1, #duration(1,0,0,0)),
    #"Converted to Table" = Table.FromList(Dates, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "FullDateAlternateKey"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"FullDateAlternateKey", type date}}),
    // One added column per calendar attribute
    #"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([FullDateAlternateKey]), type number),
    #"Inserted Month" = Table.AddColumn(#"Inserted Year", "Month", each Date.Month([FullDateAlternateKey]), type number),
    #"Inserted Month Name" = Table.AddColumn(#"Inserted Month", "Month Name", each Date.MonthName([FullDateAlternateKey]), type text),
    #"Inserted Quarter" = Table.AddColumn(#"Inserted Month Name", "Quarter", each Date.QuarterOfYear([FullDateAlternateKey]), type number),
    #"Inserted Week of Year" = Table.AddColumn(#"Inserted Quarter", "Week of Year", each Date.WeekOfYear([FullDateAlternateKey]), type number),
    #"Inserted Week of Month" = Table.AddColumn(#"Inserted Week of Year", "Week of Month", each Date.WeekOfMonth([FullDateAlternateKey]), type number),
    #"Inserted Day" = Table.AddColumn(#"Inserted Week of Month", "Day", each Date.Day([FullDateAlternateKey]), type number),
    #"Inserted Day of Week" = Table.AddColumn(#"Inserted Day", "Day of Week", each Date.DayOfWeek([FullDateAlternateKey]), type number),
    #"Inserted Day of Year" = Table.AddColumn(#"Inserted Day of Week", "Day of Year", each Date.DayOfYear([FullDateAlternateKey]), type number),
    #"Inserted Day Name" = Table.AddColumn(#"Inserted Day of Year", "Day Name", each Date.DayOfWeekName([FullDateAlternateKey]), type text)
in
    #"Inserted Day Name"
Hit Close & Apply and we now have a fully functional Date dimension in our data model!
We can easily change the time frame by managing parameters and switching year values.
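To see how the year values drive the size of the table, here is a small self-contained sketch that wraps the date-generation part of the script in a function and counts the rows for a two-year range. The BuildDates name and the function form are my own illustration, not part of Reza's script:

```powerquery
let
    // Hypothetical helper: same date-generation logic as the script, as a function
    BuildDates = (StartYear as number, EndYear as number) as table =>
        let
            StartDate = #date(StartYear, 1, 1),
            EndDate = #date(EndYear, 12, 31),
            NumberOfDays = Duration.Days(EndDate - StartDate),
            // +1 so that December 31 of EndYear is included
            Dates = List.Dates(StartDate, NumberOfDays + 1, #duration(1, 0, 0, 0)),
            AsTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"FullDateAlternateKey"})
        in
            AsTable,
    // 2020 is a leap year, so 2020 to 2021 gives 366 + 365 = 731 rows
    RowCount = Table.RowCount(BuildDates(2020, 2021))
in
    RowCount
```

Without the +1, the last day of EndYear would be missing from the dimension, which is an easy mistake to make when adapting the script.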
And now comes the icing on the cake, as a bonus tip: save your file as .pbit (Power BI Template file). This way, when you start a new Power BI project, you don't need to create a Date dimension from scratch, wasting your time and energy; it will already be there for you!
You want more? Here it is. Once you open your template file, you will be prompted to enter values for Start Year and End Year, which means that you can customize the time frame from report to report! How cool is that!
As soon as you enter the values, Power BI will automatically create the Date dimension for you, based on the values you defined!
Conclusion
Power Query offers a whole range of features for data retrieval and especially for data transformation. Describing all of them would require a book or two, so I wanted to pick out just a few that I consider most useful in my day-to-day work with Power BI.
What are your favorite Power Query features? Feel free to share them in the Comments section.
Translated from: https://towardsdatascience.com/power-query-tips-for-every-power-bi-developer-da9ebd3dcd93