Describe Power BI model fundamentals
Data model
query-able data resource optimized for analytics
- Power BI uses DAX,
- paginated reports can use either DAX or MDX.
- The Analyze in Excel features uses MDX.
Power BI dataset
- A dataset is a Power BI artifact that’s a source of data for visualizations in Power BI reports and dashboards.
- once pbi model published to a workspace in the Power BI service, it’s then known as a dataset
Analytic query
An analytic query has three phases that are executed in this order:
- Filter (slicing)
- Group (dicing)
- Summarize
Tabular model
queried via dax not mdx comprises of
- one or more tables of columns.
- relationships,
- hierarchies,
- calculations.
Star schema design
dimensions table columns filter or group. Fact table columns are summarized.
Table storage mode
- Import – Queries retrieve data that’s stored, or cached, in the model.
- DirectQuery – Queries pass through to the data source.
- Dual – Queries retrieve stored data or pass through to the data source. Power BI determines the most efficient plan, striving to use cached data whenever possible.
Model framework
Table storage mode settings determine the model framework,
- An import model comprises tables that have their storage mode property set to Import.
- A DirectQuery model comprises tables that have their storage mode property set to DirectQuery, and they belong to the same source group. Source group is described later in this module.
- A composite model comprises more than one source group.
Determine when to develop an import model
pro all data source supported can integrate different source support dax and mquery support calculated table best analytic query performance con model size (strive to reduce the amount of data stored in tables) shared capacity 1gb limit per dataset dedicated capacity (premium) can grow beyond 10 GB should remove unneeded row, col summarize optimize dataatype with preference for numeric aim for custom col in mquery than dax calc col disable m query load disable auto date/time use direct query table storage periodic refresh (8/day shared capacity) (48 /day in dedicated)
Determine when to develop a DirectQuery model
pro Model large or fast-changing data sources Enforce source RLS using source RLS with SSO Data sovereignty restrictions (regulations that prevents export of data from premise) consider installing Power BI Report Server for on-premises reporting Create specialized datasets Typically, DirectQuery mode supports relational database sources need to translate analytics queries to native queries You can connect to a Power BI dataset (or Azure Analysis Services model) and convert it to a DirectQuery local model A local model is a relative term that describes a model’s relationship to another model. con only maojor relational sys, pdi dataset, azure analysis services m query not possible (no pivot and unpivot) slow (esp if source sys not optimized via indices/ materialized view) optimization Data source optimizations DirectQuery user-defined aggregation tables (how does this work with rls?)
Determine when to develop a composite model
pro flexibility balance between import and passthrough boost query performance. can extend with new calculated columns and tables. https://learn.microsoft.com/en-us/power-bi/guidance/powerbi-implementation-planning-usage-scenario-customizable-managed-self-service-bi con still require periodic refresh to prevent out of sync performance issue when consolidate import and direct query chained model may break due to upstream mod relationship cross source is limited (cant determine the 1 side of 1:n) Limited relationships may result in different evaluations of model queries and calculations Import aggregation tables Chained aggregations covering three or more tables aren’t allowed. For example, aggregations on Table A can’t refer to a Table B that has aggregations referring to a Table C. The Detail Table must use DirectQuery storage mode, not Import. Dual storage mode Deliver real-time data from an import model When you set up an import table with incremental refresh, you can enable the Get the latest data in real-time with DirectQuery option. this is called hybrid table
Choose a model framework
Choose the DirectQuery model framework when your data source stores large volumes of data and/or your report needs to deliver near real-time data.
Choose the composite model framework to:
- Boost the query performance of a DirectQuery model.
- Deliver near real-time query results from an import model.
- Extend a Power BI dataset (or AAS model) with additional data.
more resource on how to choose which framework https://powerdobs.nl/blog/power-bi-storage-modes-explained/