Applies to: Tableau Desktop

Best Practices for Published Data Sources

Publishing data sources to Tableau Online or Tableau Server is integral to maintaining a single source for your data. Publishing also enables sharing data among colleagues, including those who don’t use Tableau Desktop but have permission to edit workbooks in the web editing environment.

Updates to a published data source flow to all connected workbooks, whether the workbooks themselves are published or not.

What makes up a published data source

A Tableau data source consists of the following:

The data connection information that describes what data you want to bring into Tableau for analysis. When you connect to the data in Tableau Desktop, you can create joins, including joins between tables from different data types. You can rename fields on the Data Source page to be more descriptive for the people who work with your published data source.

An extract, if you decide to create one. Guidelines for when to create an extract are included below, as well as in the additional resources.

Information about how to access or refresh the data, such as the credentials used to connect to it.

For more information, see Set Credentials for Accessing Your Published Data.

Customization and cleanup that helps you and others use the data source efficiently. When you’re working with your view, you can add calculations, sets, groups, bins, and parameters; define any custom field formatting; hide unused fields; and so on.

All of these refinements become part of the metadata contained in the data source that you publish and maintain.

Preparing a data source for publishing

When you publish a data source, consider these best practices:

When to use an extract

Under the following conditions, you might be required, or might choose, to publish an extract instead of connecting live.

Publishing data to Tableau Online that it cannot reach directly

Tableau Online in the cloud cannot reach data sources that you maintain on your local network. Depending on the connection, you might be required to publish an extract and set up a refresh schedule using Tableau Bridge.

Some cloud-hosted data sources always require extracts. These include Google Analytics, Salesforce.com, Oracle, OData, and some ODBC data sources. You can set up refresh schedules for some of these data sources directly on Tableau Online; for others you use Tableau Bridge.

Web data connector data sources always require extracts. If you connect to the data source using standard user name and password authentication, you can refresh it using Tableau Bridge. If you connect to the WDC data source using OAuth authentication, you will need to use an alternative method to refresh it.

For more about how Tableau Bridge supports both extract and live connections to data Tableau Online cannot reach directly, see Use Tableau Bridge to Expand Data Freshness Options in the Tableau Online Help.

Improving performance

Even if the server supports live connections to your data, an extract might make more sense. For example, if the database is large or the connection slow, you can extract a subset that includes only the pertinent information. The extract can be easier and faster to work with than connecting live.

In cases where you can use a live connection or an extract that you refresh on a schedule, you might want to experiment with both options to see which works best for you.

Enabling functionality the data source does not inherently support

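For example, suppose you want to use the Median function with SQL Server data. If a live connection to that source does not offer the function, an extract of the same data can make it available.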

To learn more about creating data extracts, see Extract Your Data.
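
The conditions in this section can be read as a short checklist. The following Python sketch is one way to express that checklist; it is not part of any Tableau product or API. The DataSource class, its fields, and the extract_reasons function are hypothetical names introduced for illustration, and the connector list is copied from the text above (ODBC is left out because the text says only that some ODBC data sources qualify).

```python
from dataclasses import dataclass
from typing import List

# Connectors the text above lists as always requiring an extract on Tableau Online.
ALWAYS_EXTRACT = {
    "google analytics",
    "salesforce.com",
    "oracle",
    "odata",
    "web data connector",
}


@dataclass
class DataSource:
    """Hypothetical description of a data source; not a Tableau object."""
    connector: str
    server_can_reach: bool = True            # can the server reach the data directly?
    slow_or_large: bool = False              # large database or slow connection
    needs_unsupported_function: bool = False  # e.g. Median with SQL Server data


def extract_reasons(ds: DataSource) -> List[str]:
    """Return the reasons from this section that apply to the data source."""
    reasons = []
    if ds.connector.lower() in ALWAYS_EXTRACT:
        reasons.append("connector always requires an extract")
    if not ds.server_can_reach:
        reasons.append("server cannot reach the data directly")
    if ds.slow_or_large:
        reasons.append("an extracted subset may perform better")
    if ds.needs_unsupported_function:
        reasons.append("an extract can enable the missing function")
    return reasons


example = DataSource(connector="SQL Server", needs_unsupported_function=True)
print(extract_reasons(example))
```

An empty list means none of the conditions in this section applies, in which case a live connection remains a reasonable starting point.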

Publishing data separately or embedded in workbooks

You can publish data sources as standalone resources that workbooks connect to, or you can publish workbooks with the data sources included within them.

When you publish a workbook, if any connection specifies anything other than a Tableau data source published to the same project, the data is published as part of the workbook (sometimes referred to as embedded in the workbook).
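
The rule above can be sketched as a small Python function. This is an illustration only: the Connection class and the embedded_on_publish function are hypothetical, and the sketch assumes that "the same project" means the project the workbook is being published to.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Connection:
    """Hypothetical description of one connection in a workbook."""
    name: str
    is_published_data_source: bool
    project: Optional[str] = None  # project the data source was published to, if any


def embedded_on_publish(connections: List[Connection], target_project: str) -> List[str]:
    """Return the names of connections whose data is published inside the workbook."""
    return [
        c.name
        for c in connections
        if not (c.is_published_data_source and c.project == target_project)
    ]


connections = [
    Connection("Sales (published)", True, "Finance"),
    Connection("Local Excel file", False),
    Connection("Targets (published)", True, "Marketing"),
]
embedded = embedded_on_publish(connections, "Finance")
print(embedded)
```

In this example, only the connection to the data source already published to the Finance project stays external; the other two are treated as embedded.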

When data is embedded in a workbook:

Each way of publishing has its advantages. The table below shows a few common points of comparison. It is not a comprehensive list, and these are generalizations. How these and other factors apply to you depends on your environment.

Published separately: Publishing data sources is a step toward centralizing data management. You can create policies geared toward minimizing data source proliferation and helping people find the right data for the work they do.
Embedded in workbook: Each embedded data source has a separate connection to the data. Each has the potential to show something different than the other at any given time (and data source proliferation is common).

Published separately: Meant to be shared; becomes available for other Tableau users to connect to.
Embedded in workbook: Data is available only inside the workbook; it is not available for other Tableau Desktop users to connect to.

Published separately: Without content management and self-service guidelines, seeing a long list of data sources to connect to can be confusing to users who rely on the data to do their work, and is more difficult to manage on the server.
Embedded in workbook: Users create their own connections, and they know exactly what data they’re getting.

Published separately: Someone who changes a shared data source might be uncertain or unaware of the effects that those changes have on connected workbooks.
Embedded in workbook: Changing the data requires opening the workbook, where you can see the result of the change.

Published separately: Even if effects of data source changes on connected workbooks are planned, updating those connected workbooks is cumbersome.
Embedded in workbook: Same as above; however, if multiple workbooks use similar data and need to be updated, it might be worth connecting to a published data source instead.

Published separately: Extracts can be refreshed on a schedule. You set up one refresh schedule for the extract, and all workbooks that connect to it always show the most current data.
Embedded in workbook: Embedded extracts that aren’t refreshed can be useful for showing snapshots in time. If you want to keep the data fresh, each workbook must have its own refresh schedule.

Published separately: Generally helps you to optimize performance on the server or site.
Embedded in workbook: Performance might be affected when the server contains multiple workbooks that connect to the same original data, and each workbook has its own refresh schedule.
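
The difference in refresh load between the two approaches comes down to simple arithmetic. The Python sketch below is a rough illustration; the function name and the assumption that every workbook refreshes at the same frequency are not from Tableau.

```python
def scheduled_refreshes(workbooks: int, runs_per_day: int, published_separately: bool) -> int:
    """Count daily extract refreshes for workbooks that use the same original data.

    Assumes every schedule runs runs_per_day times (a simplification).
    """
    # One shared schedule for a published extract; one per workbook when embedded.
    schedules = 1 if published_separately else workbooks
    return schedules * runs_per_day


shared = scheduled_refreshes(workbooks=12, runs_per_day=4, published_separately=True)
embedded = scheduled_refreshes(workbooks=12, runs_per_day=4, published_separately=False)
print(shared, embedded)
```

With 12 workbooks refreshed four times a day, the published extract accounts for 4 refreshes, while the embedded extracts account for 48.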

Keeping extracts up-to-date

When you publish a data source with an extract, you can refresh it on a schedule. The way you schedule refreshes depends on the data source type and whether you're publishing to Tableau Server or Tableau Online.

For more information, see the following topics: 

Additional resources

Disclaimer: Although we make every effort to ensure these links to external websites are accurate, up to date, and relevant, Tableau cannot take responsibility for the accuracy or freshness of pages maintained by external providers. Contact the external site for answers to questions regarding its content.