Efficient Workbooks
What is an “efficient” workbook?
There are several factors that make a workbook efficient. Some of these factors are technical and some more user-focused but in general an efficient workbook is:
- Simple – Is it easy to create the workbook and will it be easy to maintain in the future? Does it clearly communicate the message of the author and the data?
- Flexible – Can the workbook answer multiple questions the users want to ask, or just one? Does it engage the user in an interactive experience or is it simply a static report?
- Fast – Does the workbook respond quickly enough for the users? This could mean time to open, time to refresh or time to respond to interaction. This is a subjective measure but in general we want a workbook to provide an initial display of information and to respond to user interactions within seconds.
The performance of a dashboard is impacted by:
- The visual design at both the dashboard and worksheet levels – e.g. how many elements, how many data points, use of filters and actions, etc.;
- The calculations – e.g. what kind of calculation, where is the calculation performed, etc.;
- The queries – e.g. how much data is returned, is it custom SQL;
- The data connections and underlying data sources;
- Some differences between Tableau Desktop and Tableau Server;
- Other environmental factors such as hardware configuration and capacity.
Scalability is not the same as performance
Scalability is about ensuring that we can support multiple users viewing shared workbooks. Performance is about ensuring that an individual workbook runs as quickly as possible. While many of the recommendations offered in this document will have a positive influence on scalability for workbooks published to Tableau Server, the principal focus of this document is about improving performance.
Data Strategy has the most impact on the performance
Extracts are typically much faster to work with than a live data source. The key is to use domain-specific data sets. Tableau Server is not intended to be a replacement for a data warehouse. Rather, it’s meant for data discovery. Push as much work as possible to database or ETL.
An extract is a columnar data store, the wider the data set, slower the query performance
- Minimize the number of fields
- Minimize the number of records, use extract filters to minimize the number of rows.
- Optimize extracts, by materializing calculations, removing columns and using aggregate data as much as possible.
Design Approach
Know your audience
The best visualizations have a clear purpose and work for their intended audience. What will you be trying to say with this dashboard? Are you presenting a conclusion or a key question?
In addition to knowing what you're trying to say, it's important to know who you're saying it to. Does your audience know this subject matter extremely well or will it be new to them? What kind of cues will they need?
Leverage the most-viewed spot
Most viewers scan web content starting at the top left of a web page. Once you know the main purpose, be sure to place your most important view so that it occupies or spans the upper-left corner of your dashboard.
Keep it simple, limit the number of views
In general, it’s a good idea to limit the number of views on your dashboard to 2-3. If you add two many views, visual clarity and the big picture can get lost in the details. You can always create more dashboards and story.
Color Palette
As with numbers of views on a dashboard, it’s true for colors as well. Limit the use of different color palette on a dashboard.
Enable Highlighting
You can use the highlighting feature to allow users to highlight parts of a view based on what they enter or select.
Filters
Filters are a very powerful feature of Tableau that allow us to create rich, interactive dashboards for end users. However, each filter can require a query in order to enumerate the options. So adding too many can unexpectedly cause the dashboard to take a long time to render. Also, when you use “show relevant values” on a filter, it requires a query to update the shown values each time other filters are changed. Use this feature sparingly. Also, if you have your filters applied to multiple worksheets, be aware that each change will trigger multiple queries as all the affected worksheets that are visible will update (non-visible worksheets are not run).
Tweak your dashboard for performance
Once we have ensured our dashboard is as simple as can be, we can tweak our design to take advantage of caching for further performance improvement. Fixed-size dashboards One of the easiest things we can do to improve performance is to check that your dashboard is a fixed size.
If we have multiple requests for the same dashboard but from windows of different sizes, we would need to generate a layout for each request. By setting the dashboard layout to a fixed size we ensure that we only need to create a single layout that can be reused across all requests.
Using the viz level of detail to reduce queries
Although best practice is generally to only use the fields you need for each worksheet, sometimes we can improve performance by pulling more information on one worksheet to avoid queries on another.
Consider the following dashboard:
Building this as one would expect, the execution plan will result in two queries – one for each worksheet:
SELECT [Superstore APAC].[City] AS [City],
SUM([Superstore APAC].[Sales]) AS [sum:Sales:ok]
FROM [dbo].[Superstore APAC] [Superstore APAC]
GROUP BY [Superstore APAC].[City]
SELECT [Superstore APAC].[Country] AS [Country],
SUM([Superstore APAC].[Sales]) AS [sum:Sales:ok]
FROM [dbo].[Superstore APAC] [Superstore APAC]
GROUP BY [Superstore APAC].[Country]
If we make a change to our dashboard design and add Country to the Cities worksheet (on the Detail shelf), Tableau can complete the dashboard with just a single query. Tableau is smart enough to run the query for the Cities worksheet first and then use the query results cache to provide the data for the Countries worksheet. This feature is called “query batching”.
SELECT [Superstore APAC].[City] AS [City],
[Superstore APAC].[Country] AS [Country],
SUM([Superstore APAC].[Sales]) AS [sum:Sales:ok]
FROM [dbo].[Superstore APAC] [Superstore APAC]
GROUP BY [Superstore APAC].[City], [Superstore APAC].[Country]
Clearly this can’t be done in all cases as adding a dimension to the viz alters the level of detail so it could result in more marks being displayed. However, when you have a hierarchical relationship in your data like the above example, this is a useful technique as it won’t affect the visible level of detail.
Using the viz level of detail to optimise actions
We can use a similar approach with actions to reduce the number of queries we need to run.
Good worksheet design
The next level down from the dashboard is the worksheet. In Tableau the design of the worksheet is intrinsically related to the queries that are run. Each worksheet will generate one or more queries, so at this level we are trying to ensure we generate the most optimal queries possible.
Only include the fields you need.
Look at the detail shelf and remove any fields that aren’t directly used on the viz, needed in the tooltip or required to drive the required level of mark detail. This makes the query faster in the data source, and requires less data to be returned in the query results.
Show the minimum number of marks to answer the question
In Tableau there are often multiple ways to calculate the same number. Use the chart which answers the question being asked with minimal marks on the view.
Efficient filters
Filtering in Tableau is extremely powerful and expressive. However, inefficient filters are one of the most common causes of poorly performing workbooks and dashboards. The following sections lay out a number of best practices for working with filters.
Note: the efficiency of filters is dramatically impacted by the presence and maintenance of indexes in the data source. See the section on indexes for more detail.
Extract filters
These filters are only applicable when using data extracts, but in that case are logically applied before all other filters. They limit the data that is fetched from the underlying data source, and can be either dimension or measure filters. Additionally, they can perform either a TOP or SAMPLE to reduce the number of records returned, depending on the source data platform.
Data source filters
Data source filters are the highest level of filter available on live connections. A key difference between data source filters and context filters is that data source filters are scoped to the whole data source whereas context filters are set for each worksheet. This means that when used in a published data source, data source filters can be enforced whereas context filters are applied at the worksheet level. Data source filters can be an effective way of putting a constraint on a data source to prevent end users from accidentally running a massive query – e.g. you could put a data source filter limiting queries on a transaction table to only be for the last 6 months.
Context filters
By default, all filters that you set in Tableau are computed independently. That is, each filter accesses all rows in your data source without regard to other filters. However, by specifying a context filter you can make any other filters that you define dependent because they process only the data that passes through the context filter.
Other things to keep in mind
- Use an include filter instead of exclude fitlers.
- Use a continuous date filter. Relative and range-of-date filters can take advantage of indexing in your database and perform much better in comparison to discrete date filters.
- Use Boolean or numeric filters.
- Use Parameters and Actions filters, they don’t add any extra load on queries.
Calculations vs. Native features
Sometimes, users create calculated fields to perform functions when these can easily be achieved with native features of Tableau. For example:
- Grouping dimension members together – consider using groups or sets;
- Grouping measured values together into ranged bands – consider using bins;
- Truncating dates to a coarser granularity e.g. month or week – consider using custom date fields;
- Creating a set of values that is the concatenation of two different dimensions – consider using combined fields;
- Displaying dates with a specific format, or converting numeric values to KPI indicators – consider using built-in formatting features;
- Changing the displayed values for dimension members – consider using aliases.
Refine your Dashboard
Trying to answer too many questions with a single dashboard
It’s easy to get overly ambitious and want to provide highly detailed, real-time dashboards that cover every business challenge and offer users lots of drill-down options. You’re not doing your viewers any favors unless you keep the scope of your dashboard tight—and remember, you can create new dashboards
Cluttering the dashboard with low-value graphics and widgets
Resist the temptation to make your dashboard flashy or use gauge-like graphics and widgets. As addicting as it is to customize dashboards, unnecessary objects get in the way of your dashboard’s objective, which is to quickly inform your audience. Keep your dashboard simple.
Not viewing your dashboard as your users will
Taking the time to view your dashboard as your audience will, on whichever devices they’ll be using, can highlight things you need to adjust or tweak. Testing always pays off.
Forgetting to check in with your audience
Rolling out a dashboard is not a once-and-done exercise. Be sure to check in with your audience on how the dashboard is working, or not working for them. If the metrics and questions you present in your dashboard aren’t relevant or if they get out of date, your audience won’t use it.
Tools - Performance Recorder
In order to understand the performance of your workbooks, you need to understand a) what is happening and b) how long it is taking. This information is captured in multiple locations depending on where you are running the workbook (i.e. in Tableau Desktop or Tableau Server) and at multiple levels of detail. This section outlines the various options available.
The first place you should look for performance information is the Performance Recorder feature of Tableau Desktop and Tableau Server. In Tableau Desktop you enable this feature under the Help menu:
Fire up Tableau and start performance recording, then open your workbook (it's best practice to not have other workbooks open while doing this so you are not inadvertently competing for resources). Interact with it as if you were an end user and when you feel you have gathered enough data go back in the help menu and stop recording. Another Tableau Desktop window will open at this point with the data captured:
You can now identify the actions in the workbook that take the most time - for example in the above image the selected query from the Timeline worksheet takes 30.66 seconds to complete. Clicking on the bar shows the text of the query being executed. As the output of the performance recorder is a Tableau Workbook, you can also create additional views to explore this information in other ways.
For any questions or suggestions, please e-mail to [email protected]