Shape the initial data
Promote headers
Rename columns
Remove top rows (blanks)
Remove columns
Remove Columns > Remove Other Columns.
Unpivot columns (flatten)
select the Transform tab in Power Query, and then select Unpivot.
Pivot columns (unflatten)
On the Transform tab, select Transform > Pivot Columns.
Simplify the data structure
Rename a query
(vproduct to product)
Replace values
typos out of bound value null values
remove duplicate
Best practices for naming tables, columns, and values
- A best practice is to give your tables, columns, and measures descriptive business terms and replace underscores (” _ ”) with spaces.
- Be consistent with abbreviations, prefixes, and words like “number” and “ID.”
- Excessively short abbreviations can cause confusion if they are not commonly used within the organization.
- Also, by removing prefixes or suffixes that you might use in table names and instead naming them in a simple format, you will help avoid confusion.
- When replacing values, try to imagine how those values will appear on the report.
- Values that are too long might be difficult to read and fit on a visual.
- Values that are too short might be difficult to interpret.
- Avoiding acronyms in values is also a good idea, provided that the text will fit on the visual.
Evaluate and change column data types
Implications of incorrect data types
measures and calculation in dax can depend on data types to work.
inability to create a date hierarchy,
Combine multiple tables into a single table
Append queries
home tab ⇒ append queries ⇒ Append Queries as New
Merge queries
Home on ribbon ⇒ Merge Queries drop-down menu ⇒ select Merge Queries as New.
Profile data in Power BI
Examine data structures
On the Model tab, you can
- edit specific column and table properties by selecting a table or columns,
- transform the data by using the Transform Data button, which takes you to Power Query Editor.
- manage, create, edit, and delete relationships between different tables by using Manage Relationships, which is located on the ribbon.
Find data anomalies and data statistics
Column Distribution feature. Select View on the ribbon, ⇒ under Data Preview ⇒ Column Distribution, Column Quality, and Column Profile options.
By default, Power Query examines the first 1000 rows of your data set. To change this, select the profiling status in the status bar and select Column profiling based on entire data set. ]
Column Distribution, - distribution of the data within the column and the counts of distinct and unique values - Distinct values are all the different values in a column, including duplicates and null values, - unique values do not include duplicates or nulls. - above the dataframe Column Quality, - percentages of data that is valid, in error, and empty. - above the dataframe
Column Profile Value distribution graph Column Statistics below the dataframe