Learn how to properly connect and structure data in Tableau Public, including preparing it for use with common file formats and understanding the importance of clean, well-organized data. Understand the impact of data formatting on Tableau’s functionality and how to avoid common pitfalls that disrupt analysis.
Key Insights
- Tableau Public supports a variety of local file formats including Excel, CSV, JSON, PDF, and spatial files, and can connect to live data sources like Google Sheets and OData without requiring the paid version.
- Data must be structured in a table format—rows as records and columns as categories—and must be reviewed for accuracy before importing into Tableau, as errors can undermine the entire analysis process.
- The Corporate Superstore Sales Data is frequently used in training due to its clean structure and variety of data types, while poorly formatted data (e.g., with non-data headers or years listed as columns) can break Tableau and require restructuring.
This lesson is a preview from our Tableau Course Online (includes software).
There we go. Now we're going to talk about working with your data. Connecting your data in Tableau.
The formats that you'll be able to use in the free version of Tableau Public are Excel files, CSV, text files. You can even use JSON, PDF, statistical, and spatial file formats. Connecting your data to Tableau.
You can also use online versions of datasets, and that would be Google Sheets, a Web Data Connector that connects you to data over the web, and OData. For a server connection, you need the paid version of Tableau. Structuring your data.
Cleaning up data with Data Interpreter. We'll talk about that tomorrow. Creating relationships between sheets.
Sorting and filtering Tableau data. Again, just another list of the data types. Local files versus server-based.
Structuring your data. Tableau Desktop works best with data that is in tables, formatted like a spreadsheet. That is, data stored in rows and columns, with column headers in the first row.
So, what should be a row or column? Well, rows usually represent records of information, and columns usually represent categories of data, like price or title.
So, all similar data goes in one column, or field, and each row holds a different record. Now, the thing that's important to mention, before you even bring the information into Tableau, is that you need to review your data.
You cannot make assumptions about the accuracy of your source, because it'll take much more work to back out of your data if you bring wrong information in. You'll have to start all over. You won't be able to trust anything.
It's a very common mistake, and one you should avoid. So, review your data, because garbage in, garbage out. And every once in a while in the slides, we have links to additional resources that can help you.
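One way to act on the "review your data" advice before importing is a small script that flags values that cannot be what their column claims to be. The sketch below is only an illustration: the column names `Order Date` and `Sales`, the date format, and the sample rows are all assumptions made up for this example, not the actual Superstore headers.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data; column names and values are invented for illustration.
SAMPLE = """Order ID,Order Date,Sales
A-1,2024-01-05,120.50
A-2,2024-13-40,80.00
A-3,2024-02-11,n/a
"""

def review(text):
    """Return (line_number, column, value) for each value that fails a basic check."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            # Assumes dates are written as YYYY-MM-DD.
            datetime.strptime(row["Order Date"], "%Y-%m-%d")
        except ValueError:
            problems.append((line_number, "Order Date", row["Order Date"]))
        try:
            float(row["Sales"])
        except ValueError:
            problems.append((line_number, "Sales", row["Sales"]))
    return problems

problems = review(SAMPLE)
for problem in problems:
    print(problem)
```

A check like this only catches values that are impossible, such as a month 13 or text in a numeric column. It cannot tell you whether a plausible-looking number is actually correct, so it supplements a manual review rather than replacing it.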
In your folder for the class, there's a folder called Corporate Superstore Sales Data. 90% of the tutorials on Tableau use the Superstore dataset, because it's pre-formatted to work with Tableau. So, that's a good thing.
You don't have to worry about cleaning up the data, because it's pre-cleaned. 90% of the tutorials use the same information, and it has lots of data types. It has dates, text, numerical amounts.
It even has geographical information that we'll use to create maps tomorrow. Other files will be useful for different types of exercises, but we're mostly going to be working with the Superstore Sales Data. So, here is an example of the Superstore Sales Data.
You can feel free to review this file in your folder. I'm going to see if I can open it up here. Okay, my link didn't work.
I thought I created a link that takes me to the data. So, let me come out of this, and I'm going to jump over. I might just go in the folder if the link doesn't work.
Okay, so I had to double-click on it. It looks like something's happening. I have the hourglass.
There we go. This is the Superstore Sales Data. This is the information we're going to import into Tableau.
So, you have orders. We have dates. We have geographic information.
You can see the way the information is arranged. You have column headers at the top. You have data down below.
There are no blank rows or columns. This is properly formatted data. We have a tab for people, and we have a tab for returns.
So, we're going to bring in these three tabs. I'm going to close this. So, that's data that's considered good.
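The traits that make this sheet "good" (headers in the first row, no blank rows, no blank columns) can be checked mechanically. Here is a minimal sketch, assuming the sheet has already been read into a list of rows, where each row is a list of strings. The function name `structure_issues` and the sample rows are hypothetical.

```python
def structure_issues(rows):
    """Return a list of structural issues found in rows (a list of lists of strings)."""
    if not rows:
        return ["no rows"]
    issues = []
    header, data = rows[0], rows[1:]
    # Every column needs a name in the first row.
    if any(cell.strip() == "" for cell in header):
        issues.append("blank column header")
    # Look for fully blank rows and rows whose width differs from the header.
    for line_number, row in enumerate(data, start=2):
        if all(cell.strip() == "" for cell in row):
            issues.append(f"blank row at line {line_number}")
        elif len(row) != len(header):
            issues.append(f"line {line_number} has {len(row)} cells, expected {len(header)}")
    # Look for columns that are empty in every data row.
    for col in range(len(header)):
        if data and all(col >= len(row) or row[col].strip() == "" for row in data):
            issues.append(f"blank column {col + 1}")
    return issues

# Invented sample rows for illustration.
good = [["Order ID", "Region", "Sales"],
        ["A-1", "East", "120.50"],
        ["A-2", "West", "80.00"]]
bad = [["Order ID", "", "Sales"],
       ["A-1", "", "120.50"],
       ["", "", ""],
       ["A-2", "", "80.00"]]

print(structure_issues(good))
print(structure_issues(bad))
```

The well-formed sample produces an empty list, while the second sample is flagged for its unnamed header, its blank row, and its empty column.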
We're going to take a look at another type of data. This is not well-structured data. We'll learn how to fix it tomorrow.
What's the problem with this? Someone decided that they wanted to include a title for this report. This is not related directly to the data. This is just like information.
This is going to break Tableau in terms of how you bring the information in. This is unnecessary information that should be deleted. So, these first three rows should be gone.
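If you would rather remove those leading rows before the file ever reaches Tableau, a few lines of Python can do it. This is a sketch under assumptions: the sample text, the helper name `drop_leading_rows`, and the numbers are invented placeholders, and the count of three matches the three non-data rows described here.

```python
import csv
import io

# Hypothetical export: three non-data rows (title, note, empty row) above the real header.
# The numeric values are placeholders, not real life expectancy figures.
RAW = """Life Expectancy Report
Prepared for the class
,,
Country Name,1960,1961
Aruba,1.0,2.0
"""

def drop_leading_rows(text, count):
    """Parse CSV text and discard the first `count` rows."""
    rows = list(csv.reader(io.StringIO(text)))
    return rows[count:]

table = drop_leading_rows(RAW, 3)
for row in table:
    print(row)
```

After the call, the first remaining row is the real header, which is what Tableau expects to find at the top of the sheet.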
Another problem, and Tableau is not going to be able to fix this: the information is set up so that a person can easily read it. Here's Aruba, and this is the life expectancy data. You'll be able to create the first visualization we saw in the class using this information.
This is the raw data, but all the years show up as individual columns. We shouldn't have it set up this way. If we did this correctly, we would have three columns.
What? Yes, we would only have three columns. What would those columns be? Country name, year, and the life expectancy for each year. So, you mean for every country, I'd have to repeat 1960 and then put all these numbers all the way? Yeah.
So, this would definitely increase the size of your data because you're going to have to duplicate a lot just so you can get three columns. The layout with one column per year is confusing to Tableau, but we'll learn how to fix it. So, this is considered bad.
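To make the three-column idea concrete, here is a minimal sketch of the reshaping in plain Python. It is not how Tableau does it internally. The function name `wide_to_long`, the second country, and all numeric values are placeholders invented for illustration.

```python
def wide_to_long(rows):
    """Turn one-column-per-year rows into (country, year, value) rows.

    rows[0] is the header: the first cell names the country column,
    and every remaining cell is a year.
    """
    header, *data = rows
    years = header[1:]
    long_rows = [["Country Name", "Year", "Life Expectancy"]]
    for row in data:
        country = row[0]
        # The country name is repeated once for each year.
        for year, value in zip(years, row[1:]):
            long_rows.append([country, year, value])
    return long_rows

# Placeholder values, not real life expectancy figures.
wide = [
    ["Country Name", "1960", "1961"],
    ["Aruba", "1.0", "2.0"],
    ["Country B", "3.0", "4.0"],
]

long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

Two countries with two years each become four data rows, each with exactly three fields. That repetition of the country name and year is the size increase mentioned above, and it is the price of giving Tableau one field per kind of value.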
Data Interpreter is something that's automatically going to show up when it feels it can help you clean up your data, but it doesn't do everything. It does some things. Again, this is something we'll look at tomorrow.
That's why I put Tableau 2 in double quotes, I mean in parentheses. If you're really curious about Data Interpreter, this is what it looks like. How does it work? You just click this checkbox and it does its work for you and you're done.
That's it? Yep. I don't get to pick what it does? No, you don't get to pick what it does. Is it going to do everything? No, it's going to do what it wants to do, what it thinks it can do to help fix your data, and then you're on your own.
For real data cleaning, there's a separate product. It's called Tableau Prep and you have to pay for it. Tableau Public Desktop is not made for cleaning data. It assumes that you already did that work, but it'll help you out with very common, simple issues.