Publishing data sources to Tableau Online or Tableau Server is integral to maintaining a single source for your data. Publishing also enables sharing data among colleagues, including those who don’t use Tableau Desktop but have permission to edit workbooks in the web editing environment.
Updates to a published data source flow to all connected workbooks, whether the workbooks themselves are published or not.
A Tableau data source consists of the following:
The data connection information that describes what data you want to bring into Tableau for analysis. When you connect to the data in Tableau Desktop, you can create joins, including joins between tables from different types of data sources. You can also rename fields on the Data Source page to make them more descriptive for the people who work with your published data source.
An extract, if you decide to create one. Guidelines for when to create an extract are included below, as well as in the additional resources.
Information about how to access or refresh the data. Examples of this type of information include:
The path to an original Excel file.
Embedded credentials or OAuth access tokens for accessing the data directly.
Alternatively, no credentials, so that users are prompted to enter them when they want to access the data (whether it’s to view a workbook that connects to it, or to connect a new workbook to it).
For more information, see Set Credentials for Accessing Your Published Data.
Customization and cleanup that helps you and others use the data source efficiently. When you’re working with your view, you can add calculations, sets, groups, bins, and parameters; define any custom field formatting; hide unused fields; and so on.
All of these refinements become part of the metadata contained in the data source that you publish and maintain.
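The parts listed above can be pictured as a single record. The following is a minimal sketch for illustration only: the class `PublishedDataSource` and all of its field names are hypothetical and are not part of any Tableau API or file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PublishedDataSource:
    """Hypothetical model of what a data source carries; not a Tableau API."""
    name: str
    connection: str                       # what data to bring in (tables, joins)
    has_extract: bool = False             # whether an extract is included
    source_path: Optional[str] = None     # for example, the path to an original Excel file
    embedded_credentials: bool = False    # False means no credentials are stored
    customizations: List[str] = field(default_factory=list)  # calculations, groups, hidden fields

    def prompts_for_credentials(self) -> bool:
        # Without embedded credentials or tokens, users are asked to sign in
        # when they access the data.
        return not self.embedded_credentials


sales = PublishedDataSource(
    name="Sales Orders",
    connection="orders joined to customers",
    has_extract=True,
    customizations=["calculation: Profit Ratio", "hidden field: Row ID"],
)
print(sales.prompts_for_credentials())
```

Everything in the record travels with the data source when you publish it, which is why cleanup done once benefits every workbook that connects.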
When you publish a data source, consider these best practices:
Create the connection for the information you want to bring into Tableau and do any customization and cleanup that will help you and others use the data source efficiently.
If appropriate, create an extract of the data you want to publish. For more information, see the following section, When to publish an extract.
Develop a data source naming convention.
After you publish a data source, you cannot rename it directly. Instead, you need to publish a new copy with the new name, and then update all workbook connections. A well-considered naming convention can also help other users of the data deduce which data source to connect to.
Consider designating the following roles among your Tableau users:
A data steward (or team) who creates and publishes data sources that meet your organization’s data requirements for the Tableau community.
A site administrator who manages published content, extract refreshes, and permissions on the server you publish to (Tableau Server or Tableau Online).
Central management helps to avoid data source proliferation. Authors who connect to managed data can be confident that the answers they find in it reflect the current state of the business.
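Because a published data source cannot be renamed directly, it can help to check proposed names before publishing. The sketch below assumes a made-up convention of the form `<Department> - <Subject> (Live)` or `(Extract)`. The pattern and the function name `follows_convention` are illustrative assumptions, not a Tableau requirement.

```python
import re

# Hypothetical convention: "<Department> - <Subject> (Live)" or "(Extract)".
NAME_PATTERN = re.compile(
    r"[A-Z][A-Za-z]+ - [A-Z][A-Za-z0-9 ]+ \((Live|Extract)\)"
)


def follows_convention(name: str) -> bool:
    """Return True if a proposed data source name matches the convention."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["Finance - Quarterly Revenue (Extract)", "rev_data_final2"]:
    print(candidate, "->", follows_convention(candidate))
```

A data steward could run a check like this before publishing, so that names stay consistent and nobody has to republish just to correct a name.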
Under the following conditions, you might be required, or might choose, to publish an extract instead of connecting live.
Tableau Online in the cloud cannot reach data sources that you maintain on your local network. For these data sources, you must publish an extract and set up a refresh schedule using the Tableau Online sync client.
Some cloud-hosted data sources always require extracts. These include Google Analytics, Salesforce.com, Oracle, OData, and some ODBC data sources. You can set up refresh schedules for some of these data sources directly on Tableau Online; for others you use the sync client.
Web data connector data sources always require extracts. If you connect to the data source using standard user name and password authentication, you can refresh it using the sync client. If you connect to the data source using OAuth authentication, you will need to republish the data source to refresh it.
For more information, see Get your Data to Tableau Online in the Tableau Online Help.
Even if the server supports live connections to your data, an extract might make more sense. For example, if the database is large or the connection slow, you can extract a subset that includes only the pertinent information. The extract can be easier and faster to work with than connecting live.
In cases where you can use a live connection or an extract that you refresh on a schedule, you might want to experiment with both options to see which works best for you.
An extract can also give you access to functionality that the original database might not support. For example, suppose you want to use the Median function with SQL Server data.
To learn more about creating data extracts, see Extract Your Data.
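The conditions in this section can be read as a short decision procedure. The following sketch is a simplification under stated assumptions: the function `recommend`, the `Recommendation` values, and the connection-type labels are hypothetical names chosen for illustration, and the set of types is taken only from the list in the text above (it leaves out the "some ODBC" case, which the text does not specify).

```python
from enum import Enum


class Recommendation(Enum):
    EXTRACT_REQUIRED = "publish an extract"
    EXTRACT_SUGGESTED = "an extract is likely the better fit"
    TRY_BOTH = "experiment with live and extract"


# Cloud-hosted types the text lists as always requiring extracts (labels are illustrative).
CLOUD_ALWAYS_EXTRACT = {"google analytics", "salesforce", "oracle", "odata"}


def recommend(connection_type: str, on_local_network: bool,
              publishing_to_online: bool, large_or_slow: bool = False) -> Recommendation:
    """Simplified reading of the conditions described above; not exhaustive."""
    kind = connection_type.strip().lower()
    # Web data connector data sources always require extracts.
    if kind == "web data connector":
        return Recommendation.EXTRACT_REQUIRED
    # Tableau Online cannot reach local-network data, and some cloud types need extracts.
    if publishing_to_online and (on_local_network or kind in CLOUD_ALWAYS_EXTRACT):
        return Recommendation.EXTRACT_REQUIRED
    # A large database or slow connection favors extracting a pertinent subset.
    if large_or_slow:
        return Recommendation.EXTRACT_SUGGESTED
    return Recommendation.TRY_BOTH


print(recommend("SQL Server", on_local_network=True, publishing_to_online=True).value)
print(recommend("SQL Server", on_local_network=True, publishing_to_online=False).value)
```

The last branch reflects the advice to try both options when either would work. The function does not attempt to decide that case for you.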
You can publish data sources as standalone resources that workbooks connect to, or you can publish workbooks with the data sources included within them.
Each way of publishing has its advantages. The table below shows a few common points of comparison. It is not a comprehensive list, and these are generalizations. How these and other factors apply to you depends on your environment.
|Published separately|Embedded in workbook|
|---|---|
|Publishing data sources centralizes data management, enables policies around “certified” data and governance, and can help to minimize data source proliferation.|Each embedded data source has a disparate connection to the data. Each has the potential to show something different than the other at any given time (and data source proliferation is common).|
|Meant to be shared; becomes available for other Tableau users to connect to.|Data is available only inside the workbook; it is not available for other Tableau Desktop users to connect to.|
|Extracts can be refreshed on a schedule. You set up one refresh schedule for the extract, and all workbooks that connect to it always show the most current data.|Embedded extracts that aren’t refreshed can be useful for showing snapshots in time. If you want to keep the data fresh, each workbook must have its own refresh schedule.|
|Generally helps you to optimize performance on the server or site.|Performance might be affected when the server contains multiple workbooks that connect to the same original data, and each workbook has its own refresh schedule.|
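The refresh-schedule difference in the comparison above comes down to simple arithmetic. As a rough sketch, with a hypothetical function name and made-up workbook counts:

```python
def refresh_schedules_needed(workbook_count: int, published_separately: bool) -> int:
    """Schedules needed to keep extract data fresh for workbooks that use the same data."""
    if workbook_count < 1:
        raise ValueError("workbook_count must be at least 1")
    # A separately published extract has one schedule shared by every workbook;
    # embedded extracts each need their own schedule.
    return 1 if published_separately else workbook_count


print(refresh_schedules_needed(12, published_separately=True))
print(refresh_schedules_needed(12, published_separately=False))
```

With twelve workbooks on the same original data, the embedded approach means twelve refresh jobs for the server to run instead of one, which is the performance effect the comparison describes.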
When you publish a data source with an extract, you can refresh it on a schedule. The way you schedule refreshes depends on the data source type and whether you're publishing to Tableau Server or Tableau Online.
For more information, see the following topics:
Data Server—Training video by Tableau, with a helpful overview of data sources and publishing.
A version-agnostic, three-part series by Gordon Rose on the Tableau blog. It includes an in-depth look at the extract's file structure, guidelines for when to use extracts, and best practices.
Posts by Tableau Zen Master Jonathan Drummey on his blog Drawing with Numbers. They include tips on extracts, explain the different file types, and describe different publishing scenarios. (Read the comments, too.)
From the blog maintained by The Information Lab, a Tableau Gold Partner.
Disclaimer: Although we make every effort to ensure these links to external websites are accurate, up to date, and relevant, Tableau cannot take responsibility for the accuracy or freshness of pages maintained by external providers. Contact the external site for answers to questions regarding its content.