COVID 19 Dashboard – Article 2 - Transforming the Data – Calculating between rows in Power Query


I will continue to build on my earlier posts about my COVID19 Dashboard.  In this series, I am writing a few different blog articles describing not only how I built it, but some things I learned in the process.

The current list of planned topics include:
  •  Working with growing CSV files
  •  Using index and merge when transforming data
  •  Using time functions to calculate the daily change
  •  Reporting based on the population
  •  Working with maps, setting data types correctly
  •  Possibly more to come depending on how long this all goes….


PART 2 – Continuing to transform the data
In my first post, I explained how I brought in data from a CSV file and used “Unpivot Columns” to transform the data into a format that works much better for reporting.  Once the data was in this format, I quickly noticed that the values (Cases, Deaths and Recoveries) were cumulative.  In other words, these amounts were ending balances not daily transactions, if I look at it as an accountant.  These balances were going to make my ideas for reports more difficult, since I wanted to see the change day over day, net change over day, etc.

Sample File
County-State
Date
Confirmed Cases
Orange, California
3/15/2020
14
Orange, California
3/16/2020
17
Orange, California
3/17/2020
22
Orange, California
3/18/2020
29
Orange, California
3/19/2020
53

Desired File
County-State
Date
Confirmed Cases
Daily Cases
Orange, California
3/15/2020
14
14
Orange, California
3/16/2020
17
3
Orange, California
3/17/2020
22
5
Orange, California
3/18/2020
29
7
Orange, California
3/19/2020
53
24

So, I started my normal approach to problem solving for Power BI, I did an internet search.  I was looking for “combining rows in Power Query” or “add column power bi from previous row”.  I found a tutorial on ExcelTown that helped me solve the problem.


Step by step solution

Step 1 – Add an index column to my data








Step 2 - Add an additional column add 1 to the index column


Step 3 – Since my data has repeated dates (one date for each county and state combination (my Combined_Key column), I needed to add some additional keys for the future merge that I was going to do, because I only want to look at the prior row, if it is the same county and state combination.  To do this, I simply added to new columns.  The first one combined Combined_Key with Index and the second one combined Combined_Key with Addition.
·         Merged column = Table.AddColumn(#"Inserted Addition", "Merged", each Text.Combine({[Combined_Key], "-", Text.From([Index], "en-US")}), type text)
·         Merged column2 = Table.AddColumn(#"Inserted Merged Column", "Merged.1", each Text.Combine({[Combined_Key], "-", Text.From([Addition], "en-US")}), type text)












Step 4 - Once the keys were in place, I used Merge to merge the table with itself.


Step 5 – Now I can bring in the Confirmed Cases from the prior row by clicking on the doublesided arrow at the header and show the value from the connected query.

Step 6 – I removed all the index and key columns that are no longer needed and renamed my new column from the prior day to Prior Confirmed Cases.


Step 7 – I added a new column to calculate net change (Confirmed Cases – Prior Confirmed Cases) Table.AddColumn(#"Renamed Columns2", "New Cases", each [Confirmed Cases]-[Prior Confirmed Cases])


I repeated these steps with other tables, so I had net change in every table.  I also used merge to combine Cases data with Death data, so I had a simpler data model when I work with visualizations.

In my next article, I will move away from Power Query and into DAX to discuss some of the measures I created in the report.


Comments

Popular posts from this blog

Creating a date table with a fiscal year in your Power BI data model

Showing the last 12 months based on a date slicer

Using Excel, OneDrive and Power Automate for Dynamics 365 Finance & Supply Chains Journals