Was this page helpful?
Yes No
Have a comment? Please leave it here.Thanks for your feedback!
All Tableau Help > Tableau Prep Help > 

Clean and Shape Data

Clean and shape data to make it easier to combine with other data or to make it simpler for other people to understand.

In this article

Build your flow

Pivot your data

Apply cleaning operations

Merge fields

Cleaning (fixing) variations of the same value

Aggregate and group values

Build your flow

After you connect to the data that you want to include in your flow, begin cleaning and shaping your data by adding new steps to the flow. Hover over a step until the plus icon appears, and then click the icon to display options.

Use these options to build your flow:

Remove steps from the flow

At any point in the flow, you can remove steps or the flow lines between steps.

(Back to top)


Pivot your data

Sometimes analyzing data from a spreadsheet or crosstab format can be difficult in Tableau. Tableau prefers data to be "tall" instead of "wide", which means that you often have to pivot your data from columns to rows so that Tableau can evaluate it properly.

With Tableau Prep, you can pivot on one or more groups of fields to get the results you want from your data. Simply select the fields that you want to work with, pivot the data from columns to rows, and then interact directly with the results to get your data looking just the way you want it.

Note: Pivoting on multiple groups of fields is not supported in Tableau Prep 2018.1.1. Editing a pivot that includes pivoted columns on multiple field groups in Tableau Prep version 2018.1.1 can result in errors or unexpected results.

You can also use Tableau Prep's smart default naming feature to automatically rename your pivoted fields and values.

To pivot your data:

  1. Connect to your data source.

  2. Drag the table that you want to pivot to the Flow pane.

  3. Click the plus icon, and select Add Pivot from the context menu.

  4. (Optional) Search for fields to pivot.

  5. Select one or more fields from the left pane, and drag them to the Pivot1 Values column in the Pivoted Fields pane.

  6. (Optional) In the Pivoted Fields pane, click the plus icon to add more columns to pivot on, then repeat the previous step to select more fields to pivot.

    Note: You must select the same number of fields that you selected in Step 5. For example if you selected 3 fields to initially pivot on, then each subsequent column that you pivot on must also contain 3 fields.

  7. (Optional) Select the Automatically rename pivoted fields and values check box to enable Tableau Prep to rename the new pivoted fields using common values in the data. If no common values are found, the default name is used.

    Your results appear immediately in both the Pivot Results pane and the data grid.

  8. If you didn't enable the default naming option in the previous step, edit the names of the fields. You can also edit the names of the original fields in this pane to best describe the data.

  9. Rename the new Pivot step to keep track of your changes. For example "Pivot months".

Example: Pivoting on multiple fields

This example shows a spreadsheet for pharmaceutical sales, taxes and totals by month and year.

By pivoting the data you can create rows for each month and year and individual columns for sales, taxes and totals so that Tableau can more easily interpret this data for analysis.

Watch "pivot on multiple field" in action.

(Back to top)


Apply cleaning operations

You clean data by applying cleaning operations such as filtering, adding, renaming, splitting, or removing fields. As you make changes to your data, annotations are added to the corresponding step in the Flow pane.

The different types of cleaning operations are represented by icons over cleaning steps. If more than four types of operations are applied to a cleaning step, an ellipsis displays over the step. Hover over these icons to view annotations showing applied operations and the order in which they are performed.

You can also select a step and then expand the Changes pane in the Profile pane toolbar, where you can view the details for each change, edit or remove your changes, or drag changes up or down to change the order in which they're applied.

Cleaning annotation Changes pane

To apply cleaning operations to a field:

  1. In the Profile pane, select the field you want to make changes to.

  2. From either the drop-down arrow for the field or the toolbar, select from the following options:

  3. Review the results of these operations in the Profile pane or data grid.

(Back to top)


Merge fields

If you have fields that contain the same values but are named differently, you can easily merge them to combine them into one by dragging one field on top of the other. When you merge the fields, the field name of the target field persists.

Example:

Wildcard union results in 3 fields with the same values Merge 3 fields into 1

To merge fields, do one of the following:

(Back to top)


Cleaning (fixing) variations of the same value

Multiple variations of the same value can prevent you from accurately summarizing your data. You can quickly and easily correct these variations using the following options in Tableau Prep.

Note: Any edits that you make to the values must be compatible with the field data type.

Edit a single value

  1. In the Profile pane, click the value you want to edit, and enter the new value. Alternatively, right-click a value and click Edit Value. The change is recorded in the Changes pane on the left side of the screen.

  2. View the results in the Profile pane and data grid.

Edit multiple values

You have several options to edit multiple values at once. For example, you can manually edit multiple values in a field at one time, use quick cleaning operations to remove punctuation for all values in a field, or automatically group values together using fuzzy-match algorithms that find similar values.

Note: When you map multiple values to a single value, the original field shows a group icon next to the value when you're in the Group and Replace editor, showing you which values are grouped together.

Edit multiple values inline

  1. In the Profile pane, select the field you want to edit.

  2. Press Ctrl or Command (on macOS), and select the values that you want to edit.

  3. Right-click, and select Edit Value from the context menu.

  4. Type your changes into one of the selected fields, and press Enter to group the selected fields together and apply the edited text to them.

Edit multiple values using quick cleaning operations

  1. In the Profile pane, select the field you want to edit. This option applies only to text fields.

  2. Click the drop-down arrow, select Clean, and then select one of the following options:

    You can stack operations to apply multiple cleaning operations to the fields. For example first select Clean > Remove Numbers and then select Clean > Remove Punctuation to remove all numbers and punctuation from the field values.

  3. To undo your changes, click the Undo arrow at the top of the Flow pane, or remove the change from the change list.

Edit multiple values manually using Group and Replace

Use Group and Replace to map the value of a field from one value to another. You can even add new values to set up mapping relationships to organize your data.

For example, let’s say you have three values in a field: My Company, My Company Incorporated, and My Company Inc. All these values represent the same company, My Company. You can use Group and Replace to map the values My Company Incorporated and My Company Inc to My Company, so that all three values appear as My Company in the field.

  1. In the Profile pane, select the field you want to edit.

  2. Click the drop-down arrow, and select Group and Replace > Manual Selection from the context menu.

  3. In the left pane of the Group and Replace editor, select the field value that you want to use as the grouping value. This value now shows at the top of the right pane.

  4. In the lower section of the right pane in the Group and Replace editor, select the values you want to add to the group.

    To remove values from the group, in the upper section of the right pane in the Group and Replace editor, clear the check box next to the values.

Edit multiple values using Group and Replace with fuzzy match

To search for and automatically group similar values, use one of the fuzzy match algorithms. Field values are grouped under the value that appears most frequently. Review the grouped values and add or remove values in the group as needed.

Choose one of the following options to group values:

To automatically group and replace values using fuzzy match, do the following:

  1. In the Profile pane, select the field you want to edit.

  2. Click the drop-down menu, select Group and Replace and select one of these options:

    Tableau Prep finds and groups values that match and replaces them with the value that occurs most frequently in the group.

  3. Review the groupings and manually add or remove values or edit them as needed. Then click Done.

Add and identify values that aren't present in the data set

If you want to map values in your data set to a new value that doesn't exist, you can add it using Group and Replace. To easily identify any values that are not in the data set, these values are marked with a red dot next to the value name in the Group and Replace editor.

For example, in the screenshot below, California, Arizona, and Colorado, Idaho, Montana, and New Mexico aren’t in the data set.

Some reasons why a value might not be in the data set include the following:

To add a new value:

  1. In the Profile pane, select the field you want to edit.

  2. Click the drop-down arrow, and select Group and Replace > Manual Selection from the context menu.

  3. In the left pane of the Group and Replace editor, click the plus to add a new value.

  4. Type a new value in the field and press Enter to add it.

  5. In the right pane, select the values that you want to map to the new value.

  6. (Optional) To add additional new values to your mapped value, click the plus button in the right pane in the Group and Replace editor.

(Back to top)


Aggregate and group values

Sometimes you’ll need to adjust the granularity of some data, either to reduce the amount of data produced from the flow, or to align data with other data you might want to join or union together. For example, you might want to aggregate sales data by customer before joining a sales table with a customer table.

If you need to adjust the granularity of your data, use the Add Aggregate option to create a step to aggregate or group data. Whether data is aggregated or grouped depends on the data type (string, number, or date).

  1. In the Flow pane, click the plus icon, and select Add Aggregate. A new aggregation step displays in the Flow pane and the Profile pane updates to show the aggregate and group profile.

  2. To group or aggregate fields, drag them from the left pane to one of the columns in the right pane.

    You can also:

    Fields are distributed between the Grouped Fields and Aggregated Fields columns based on their data type. Click the group or aggregation type (for example, AVG or SUM) headings to change the group or aggregation type.

    In the data grids below the aggregation and group profile, you can see a sample of the members of the group or aggregation .

(Back to top)