Articles

User Friendly Data Models in Power Pivot & Tabular

Organization and user friendly data model development in Power Pivot & Tabular environments take precedence over almost all other concerns.

We will discuss best practices of creating a data dictionary, naming rules, etc. of Power Pivot & Tabular data models in other, related postings.  

In this posting, we will discuss some options of organizing the data.  As stated previously, if the data model is not organized in a friendly manner (i.e. dimensions split from facts, etc.), it may prove to be difficult to navigate.

Creating dimensions separate from facts allows easier navigation of the data model.  Creating your calculated fields in appropriate fact tables makes it easier for the end user to find those measures.  Naming fields appropriately is also key.  

Power Pivot & Tabular data models differ in the development of user friendly solutions.  In a Power Pivot model, you are limited to these practices when making the model more user-friendly.  In some engagements, I have created a new “table” just to house some of my calculated fields.  This allowed me to provide a custom name to the fact table and eased the understanding of where to access custom measures for the end user.

In a tabular model, you have other options of organizing the data.  BIDS Helper allows extra features and functionality that is not available in a Power Pivot model.

BIDS Helper

BIDS Helper offers several features missing in SQL Server Data Tools that were available for multidimensional models but not for Tabular.  These include the following:

Tabular Translations Editor

Per BIDS Helper documentation:

Unlike Multidimensional projects, Tabular projects in SQL 2012 do not support translations. This feature gap makes Tabular models less user-friendly when the users are international since the field list in Excel PivotTables and Power View is not in their native language.

 

Another common scenario where this BIDS Helper feature helps is the following. Imagine you are a developer building a Tabular model for users who speak Hebrew natively. It may be easier to code the model with English table and column names so that your DAX would be more readable. But then you want to show your model to users with Hebrew table and column names. This BIDS Helper feature would let you code in English but display to users in their native language.

 

Warning: While translations of metadata work in Tabular models, they are not officially supported by Microsoft. If you encounter a bug in how Tabular handles translations and open a support case, Microsoft may not provide support.

BIDS Helper provides a UI for editing translations on metadata (not on data). The following objects can be translated:

  • Database caption
  • Cube caption
  • Perspective caption
  • Table caption and description
  • Column caption, display folder, and description
  • Measure caption, display folder, and description
  • Hierarchy caption, display folder, and description
  • Level caption
  • Action caption and description

Note: Only metadata is translated (i.e. column names) not the data (i.e. values in the rows in tables) since Tabular models do not support translating data like Multidimensional models do.

This feature allows interesting workarounds to common problems.  For instance, if you would like to utilize the same cube but have different departments that require different nomenclature for the same field, you would typically have one option; create a new calculated column that duplicates the field in question and renames the field to the required nomenclature while creating a different perspective for the department requiring the rename.  With tabular translations editor, you are able to reduce the memory footprint required by not requiring the extra calculated field(s).  Instead, you would create a new translation “language” for each department requiring different nomenclature

* Obviously, it is preferred for the same organization to utilize a synonymous nomenclature across departments.  This may be more useful in the case of a multi-tenant environment where each tenant utilizes different naming conventions.  As well, be careful not to make the maintenance of multiple translations and perspectives a spaghetti nightmare!

Tabular Display Folders

Per BIDS Helper documentation:

Unlike Multidimensional projects, Tabular projects in SQL 2012 do not support display folders. This feature gap makes Tabular models less user-friendly since the field list in Excel PivotTables is longer and less organized than it could be.

Warning: While display folders work in Tabular models, they are not officially supported by Microsoft. If you encounter a bug in how Tabular handles display folders and open a support case, Microsoft may not provide support.

BIDS Helper provides a UI for editing display folders on measures, columns, and hierarchies. All display folders are edited in the same place. Right click on the .bim file and choose Tabular Display Folders…

This is a great feature.  Typically, developers only have the naming convention and/or table creation as options to categorize their fields.  This is definitely a limitation of the product as you may want your most used attributes / measures to be first in the list or grouped differently.  This feature allows developers to organize attributes under dimensions or calculated fields / measures in fact tables into categories that make more sense to an end user.  For example, if the date dimension has standard calendar attributes as well as fiscal calendar attributes and time related attributes, those could be organized under three different display folders under the date dimension so as to not clutter the end user’s view and make it easier to navigate to the attribute they actually require.

We will likely add to this posting as more on this in a later post as well).

Normalization vs. Denormalization

normalization vs. denormalizationNormalization vs. Denormalization best practices for Power Pivot / Tabular data modeling is typically not disputed.  

First, let’s quickly define in human terms what we are referencing when we speak of normalization vs. denormalization. Normalization is reducing data duplication by splitting dimensional data and attributes into their dimension tables.  Typically, dimensions/attributes in a hierarchy are modeled in a snowflake structure (i.e. Fact Table –> Item Table –> ProductCode Table –> Category Table).  Denormalization is modeling in an attempt to optimize performance by taking data from higher hierarchy tables and forcing the data into the lower dimension tables regardless of duplication.  This reduces the number of joins required to report on the attributes available in the higher hierarchy data.  

I will highlight first what I have read and what has resonated with me for the duration of my Power Pivot / Tabular experience in regards to normalization vs. denormalization.

Normalization vs. Denormalization top principles

More rows are preferred over more columns.

More columns are preferred over more dimensions/tables, especially if the attributes have a lower cardinality.

Everyone tends to agree (and data backs it up) that a snowflake data model should be denormalized as much as possible for performance optimization for Power Pivot / Tabular data models.

This principle of normalization vs. denormalization goes against many typical data warehouse architect’s mantra.  This is one of the many in-memory concepts that is hard for them to swallow.  Normalizing data reduces data warehouse disk space by reducing data duplication and dimension cardinality.  The goal for Power Pivot / Tabular data modeling is two-fold and sometimes one interferes with the other.  We want to reduce the cardinality AND reduce the memory footprint of our data model.  Typically, the more important goal is to increase performance.  By denormalizing the data model into a simple star schema, bringing in attributes from a snowflaked dimension into the lowest dimension in the hierarchy, you do not reduce cardinality.  However, you do reduce the number of joins required to report on the higher level attributes.  I say you do not reduce cardinality because the same number of elements for each attribute still exists in your lower dimension.  They are indeed duplicated in this lower dimension; however, the cardinality has not changed!  

It is important to note that you typically do not want a single, flat table structure.  It has been shown in performance testing that Power Pivot / Tabular performs better when utilizing a dimension vs. an attribute in a flat fact table.  As well, you will increase memory consumption if you duplicate data in the fact table due to the increase in pointers when compression occurs.  Power Pivot and Tabular can compress data better if it is ordered (more on this later).  Obviously, you are not able to order all attributes of a table.  The more data not ordered optimally, the more pointers, and the more memory consumption.

As well, third normal form is not preferred.  However, third normal form and/or single, flat tables can be utilized for prototyping.  There is no harm and it may even help profile the data for designing the optimal star schema structure.

When designing a data model, one must take great care in picturing the end result.  This is the structure that your end users will see when creating reports or performing ad-hoc analysis.  If it is not organized in a friendly manner (i.e. dimensions split from facts), it may prove to be difficult to navigate (more on this in a later post as well).  

In Power Pivot, if you want to display attributes in a separate folder, you will be forced to normalize.  If the attribute cardinality is low, it is typically safe to normalize.  A relationship between two large tables will degrade performance.  So, it would be optimal to keep an attribute with very high cardinality in the fact table rather than normalizing.  If you are only retrieving a single attribute from a dimension, it would also be optimal to denormalize. 

Test, test, test.  Power Pivot / Tabular surprises the best of us and provides unexpected results sometimes.  Always keep the end user perspective into account and then follow these basic concepts to optimize.  Normalization vs. denormalization will always have exceptions to the rule.

Data Model Best Practices

Best Practice - DMWe will start off discussing the data model itself.  

The data model that is created in Power Pivot or a Tabular solution is the foundation of any Microsoft business intelligence solution.

The data model consists of the data structure (including tables and their relationships, columns in the tables, and all measures created), the data itself (actual content in the table rows), and the metadata (the table/column/measure descriptions, formatting, security, etc.).

Starting the solution with a solid foundation (the data model) is critical to the success of the solution.  An optimal data model lends itself to the speed and performance of the solution as well as the usability and user acceptance or adoption of the solution.
We will discuss the following topics related to Power Pivot and Tabular data modeling best practices:

We will modify this posting as we add new content.  We will add links to the new postings and very possibly add new topics as we go.  We have so much information already that we could at least put an outline around the majority of it.  We look forward to sharing this information with you and helping grow the Power Pivot and tabular community.  Check back often or sign up for a subscription so you don’t miss new information!

Best Practices of Microsoft BI

Best PracticesWe have seen several dispersed articles on Power Pivot, SSAS Tabular, and Data Analysis Expressions (DAX) best practices.

 

Over many articles, we will attempt to consolidate many of these into a single, consolidated location as well as adding new material to the list based on our experiences as well as any missing information we could not come across.  

We would love your feedback as we move through the list of best practices.  With newer technologies, there are new things learned every day.  If you feel like there is something that you would like to be added or you would like to request a posting on a specific topic, please let us know.  We are highly receptive to feedback and requests for topics.  As well, please comment and feel free to subscribe to ensure you are notified when we post new articles.

We have a ton of material coming and are excited to share it with the community!

The listing below will be linked to the underlying article category header as we create articles under each category. Each category will have multiple articles created under each one as well.  We will add new categories as needed and content will be created in no particular order.  Like we stated previously, we have a huge backlog of information and articles to share.

 

Power BI Preview

I am so excited for the future of Microsoft BI.  On January 27th, Microsoft unveiled a new Power BI Preview (link) offering that I believe will radically change the BI atmosphere.
power bi preview
Since the Power BI inception, the biggest drawback has been its pricing.  Power BI can now compare to Tableau, IBM, Qlikview, and Spotfire pricing.  Other drawbacks or deficiencies?
  1. The pricing model discussed above priced Power BI out of the market.  This is now addressed with the new pricing model.Power BI Preview
  2. No mobile component for on premises or for the iPad?  Microsoft has introduced both the iPad application (Android coming soon) as well as the Analysis Services Connection Preview.  This is exciting!  A connector to connect to your on premises tabular data models while gaining the new features and functionality of Power BI in the cloud as soon as they are released and mobile applications!
  3. Forcing users to utilize Excel?  Virtually everyone utilizes Excel, but OK, the choice to utilize Power BI decoupled from Excel is now an option!
  4. Don’t want to be forced into utilizing SharePoint just for the Power BI features?  OK!
  5. Your business does not have a strong enough supporting infrastructure and/or personnel to administer the hardware required for Power BI on premises?  No problem!  Let Microsoft handle the hardware and infrastructure….you just model the data and develop reporting.
  6. The Power BI learning curve?  So many people are familiar with Excel and Pivot Table functionality, PowerPivot, DAX, and Power View is somewhat a natural progression for power users.
  7. Do you have limited SQL skills?  Power Query helps to bridge that gap.  It is essentially a very advanced macro for ETL and data transformation.
  8. Don’t trust your data in the cloud?  A new SSAS on premises connector alleviates this concern.
  9. Dislike the reliance on Silverlight?  Microsoft has moved to HTML5 on the new preview.
  10. Want a consolidated view of the most important measurements/KPI(s) across completely unrelated, multiple data sources?  Done with the new dashboard functionality!
  11. Don’t want or need the full Office 365 Suite of tools?  OK!  Standalone Power BI is here in the preview and is priced accordingly.
  12. Power View lacks the ability for customization?  You cannot export the underlying data?  You cannot configure subscriptions or data alerts?  OK, yeah, these are still lacking.  However, there are a ton of ways to to work around this.  Some things can still not be done (i.e. variable segmentation, variable titles, custom colors, exporting data, etc.) but this is a first generation product….it’s coming.
Microsoft is quickly answering the concerns of the masses and has really impressed me with their offerings.
Continue Reading