Best Practices for Published Data Sources

Publishing data sources to Tableau Online or Tableau Server is integral to maintaining a single source for your data. Publishing also lets you share data with colleagues, including those who don’t use Tableau Desktop but have permission to edit workbooks in the web editing environment.

Updates to a published data source flow to all connected workbooks, whether the workbooks themselves are published or not.


What makes up a published data source

A Tableau data source consists of the following:

The data connection information that describes what data you want to bring into Tableau for analysis. When you connect to the data in Tableau Desktop, you can create joins, including joins between tables from different data types. You can rename fields on the Data Source page to be more descriptive for the people who work with your published data source.

An extract, if you decide to create one. Guidelines for when to create an extract are included below.

Information about how to access or refresh the data. The connection also includes access information.

For more information, see Set Credentials for Accessing Your Published Data.

Customization and cleanup that helps you and others use the data source efficiently. When you’re working with your view, you can add calculations, sets, groups, bins, and parameters; define any custom field formatting; hide unused fields; and so on.

All of these refinements become part of the metadata contained in the data source that you publish and maintain.

Preparing a data source for publishing

Before you publish a data source, consider the practices described in the sections that follow.

When to publish an extract

Under the following conditions, you might be required to publish an extract, or might choose to publish one, instead of connecting live.

Publishing data to Tableau Online that it cannot reach directly

Tableau Online in the cloud cannot reach data sources that you maintain on your local network. For these data sources, you must publish an extract and set up a refresh schedule using the Tableau Online sync client.

Some cloud-hosted data sources always require extracts. These include Google Analytics, Salesforce.com, Oracle, OData, and some ODBC data sources. You can set up refresh schedules for some of these data sources directly on Tableau Online; for others you use the sync client.

Web data connector data sources always require extracts. If you connect to the data source using standard user name and password authentication, you can refresh it using the sync client. If you connect to the data source using OAuth authentication, you will need to republish the data source to refresh it.

For more information, see Get your Data to Tableau Online in the Tableau Online Help.
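The rules above can be read as a small decision procedure. The following Python sketch is only an illustration of that reasoning: the `DataSource` class, the `extract_plan` function, and the string labels are hypothetical names invented for this example and are not part of any Tableau API. It also leaves out the "some ODBC data sources" case, because the text does not say which ones qualify.

```python
from dataclasses import dataclass

# Source types the text says always require extracts on Tableau Online.
# The labels are assumptions for this sketch, not Tableau identifiers.
ALWAYS_EXTRACT_SOURCES = {"Google Analytics", "Salesforce.com", "Oracle", "OData"}


@dataclass
class DataSource:
    name: str
    kind: str                       # for example "SQL Server" or "Web data connector"
    on_local_network: bool = False  # True if Tableau Online cannot reach it directly
    uses_oauth: bool = False        # only consulted for web data connectors


def extract_plan(source: DataSource) -> str:
    """Return a short recommendation for publishing to Tableau Online."""
    if source.kind == "Web data connector":
        # Web data connectors always need an extract; the refresh path
        # depends on how the connection authenticates.
        if source.uses_oauth:
            return "extract required; republish to refresh"
        return "extract required; refresh with the sync client"
    if source.on_local_network:
        return "extract required; refresh with the sync client"
    if source.kind in ALWAYS_EXTRACT_SOURCES:
        return "extract required; refresh on Tableau Online or with the sync client"
    # None of the rules above force an extract.
    return "live connection or extract"


if __name__ == "__main__":
    for src in (
        DataSource("Sales DB", "SQL Server", on_local_network=True),
        DataSource("CRM", "Salesforce.com"),
        DataSource("Custom feed", "Web data connector", uses_oauth=True),
    ):
        print(f"{src.name}: {extract_plan(src)}")
```

Treat the result as a starting point only. Which refresh method applies to a particular cloud-hosted source is covered in the Tableau Online Help.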

Improving performance

Even if the server supports live connections to your data, an extract might make more sense. For example, if the database is large or the connection slow, you can extract a subset that includes only the pertinent information. The extract can be easier and faster to work with than connecting live.

In cases where you can use a live connection or an extract that you refresh on a schedule, you might want to experiment with both options to see which works best for you.

Enabling functionality the data source does not inherently support

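For example, suppose you want to use the Median function with SQL Server data. If the live connection does not support that function, you can extract the data and use the function against the extract.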

To learn more about creating data extracts, see Extract Your Data.

Publishing data separately or embedded in workbooks

You can publish data sources as standalone resources that workbooks connect to, or you can publish workbooks with the data sources included within them.

Each way of publishing has its advantages. The table below shows a few common points of comparison. It is not a comprehensive list, and these are generalizations. How these and other factors apply to you depends on your environment.

| Published separately | Embedded in workbook |
| --- | --- |
| Publishing data sources centralizes data management, enables policies around “certified” data and governance, and can help to minimize data source proliferation. | Each embedded data source has a disparate connection to the data. Each has the potential to show something different than the other at any given time (and data source proliferation is common). |
| Meant to be shared; becomes available for other Tableau users to connect to. | Data is available only inside the workbook; it is not available for other Tableau Desktop users to connect to. |
| Extracts can be refreshed on a schedule. You set up one refresh schedule for the extract, and all workbooks that connect to it always show the most current data. | Embedded extracts that aren’t refreshed can be useful for showing snapshots in time. If you want to keep the data fresh, each workbook must have its own refresh schedule. |
| Generally helps you to optimize performance on the server or site. | Performance might be affected when the server contains multiple workbooks that connect to the same original data, and each workbook has its own refresh schedule. |
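The refresh-schedule and performance points in the comparison above come down to simple arithmetic. The sketch below makes that arithmetic explicit. The function name `daily_refresh_jobs` and its parameters are hypothetical, and the model assumes every schedule runs the same number of times per day, which may not match a real site.

```python
def daily_refresh_jobs(workbook_count: int,
                       refreshes_per_day: int,
                       published_separately: bool) -> int:
    """Estimate how many extract refresh jobs run per day.

    A separately published extract has one schedule shared by every
    connected workbook. Embedded extracts need one schedule per workbook.
    """
    if workbook_count < 0 or refreshes_per_day < 0:
        raise ValueError("counts must be non-negative")
    schedules = 1 if published_separately else workbook_count
    return schedules * refreshes_per_day


if __name__ == "__main__":
    # Ten workbooks over the same original data, refreshed four times a day.
    shared = daily_refresh_jobs(10, 4, published_separately=True)
    embedded = daily_refresh_jobs(10, 4, published_separately=False)
    print(f"published separately: {shared} jobs per day")
    print(f"embedded in workbooks: {embedded} jobs per day")
```

The gap grows linearly with the number of workbooks, which is one reason a shared published data source tends to be easier on the server.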

Keeping extracts up-to-date

When you publish a data source with an extract, you can refresh it on a schedule. The way you schedule refreshes depends on the data source type and whether you're publishing to Tableau Server or Tableau Online.

