Tableau Data Analysis and Visualization
Since we started our data science bootcamp, we have had Seaborn and Matplotlib shoved down our throats for visualization. Fortunately, there are plenty of other visualization tools to use once we get out into the field. A very popular one that a lot of companies use is Tableau, an analytics platform for business intelligence and visualization. Job titles that use Tableau include data analyst and business intelligence analyst, and if you love it you can even become a Tableau developer.
You can download the public version for free, but you won’t get all the bells and whistles. With Tableau Public you can only connect to a limited number of file types, and you cannot connect to a server. Tableau Public allows you to use Excel, text, JSON and PDF files.
One problem I kept running into while practicing with Tableau is that the data needs to be clean before you upload it. If it is not, the headers and columns get wacky. If you take the time to clean the data first, it shouldn’t be a problem. I ended up using the trusty Titanic CSV from Kaggle. It uploaded and everything was smooth sailing from there.
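If you want to tidy a file before handing it to Tableau, here is a minimal sketch of the kind of cleanup I mean, using only Python's built-in csv module. The function name clean_csv and the sample text are made up for illustration, and the rules (trim header whitespace, drop blank lines, skip rows with the wrong number of columns) are just one reasonable assumption about what "clean" means for your file.

```python
import csv
import io


def clean_csv(text):
    """Return (headers, rows) with trimmed headers and no blank or short rows."""
    raw_rows = list(csv.reader(io.StringIO(text)))
    # Drop rows that are empty or contain only whitespace.
    raw_rows = [r for r in raw_rows if any(cell.strip() for cell in r)]
    # Trim stray whitespace from the header names.
    headers = [h.strip() for h in raw_rows[0]]
    rows = []
    for r in raw_rows[1:]:
        cells = [c.strip() for c in r]
        # Skip rows whose column count does not match the header.
        if len(cells) == len(headers):
            rows.append(dict(zip(headers, cells)))
    return headers, rows


# Made-up sample with messy headers, a blank line, and a short row.
sample = " PassengerId , Survived ,Embarked\n1,0,S\n\n2,1\n3,1,C\n"
headers, rows = clean_csv(sample)
print(headers)
print(len(rows))
```

In a real project you would probably do this with pandas, but the idea is the same: fix the headers and the row shapes first, then upload.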
I did not need to join any tables, but you can if you need to. It works like SQL: you can combine tables with inner, left, right, and full outer joins. Once your data is joined, you are ready to move on to the visualization. All you need to do is click the sheet button in the bottom left corner.
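If the join types are fuzzy, this small sketch shows the difference between an inner and a left join in plain Python. The passenger rows are made up, the ports lookup table is a hypothetical second table keyed on the Embarked code, and the helper names inner_join and left_join are mine. It only illustrates the idea and says nothing about how Tableau does it under the hood.

```python
# Made-up rows for illustration; not real Titanic records.
passengers = [
    {"PassengerId": 1, "Embarked": "S"},
    {"PassengerId": 2, "Embarked": "C"},
    {"PassengerId": 3, "Embarked": None},
]

# Hypothetical lookup table keyed on the Embarked code.
ports = {"S": "Southampton", "C": "Cherbourg", "Q": "Queenstown"}


def inner_join(left_rows, lookup, key):
    """Keep only the left rows whose key has a match in the lookup."""
    return [
        {**row, "Port": lookup[row[key]]}
        for row in left_rows
        if row[key] in lookup
    ]


def left_join(left_rows, lookup, key):
    """Keep every left row; rows without a match get Port=None."""
    return [{**row, "Port": lookup.get(row[key])} for row in left_rows]


inner = inner_join(passengers, ports, "Embarked")
left = left_join(passengers, ports, "Embarked")
print(len(inner), len(left))
```

The inner join drops the passenger with no Embarked value, while the left join keeps that passenger with an empty port.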
You can see that Tableau breaks up your data into Dimensions and Measures automatically. Measures are usually numerical, but you can easily move one to Dimensions by right-clicking on the field. You can also group and bin the data when you right-click it. For my project I wanted to look at survival by passenger class and by where passengers embarked from. All I had to do was drag “Survived” to the Rows shelf and “Embarked” to the Columns shelf.
Now, since “Pclass” is a Measure, I had to right-click it, “Group By” the classes (1, 2, 3), and move it to Dimensions. Once I did that, I just dragged it to the Columns shelf with “Embarked”.
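For anyone who wants to sanity-check the chart against the raw data, the same view can be sketched in a few lines of Python: sum “Survived” for each combination of “Embarked” and “Pclass”. The rows below are made up for illustration and are not real Titanic records, and sum_survived is just a name I picked.

```python
from collections import defaultdict

# Made-up rows for illustration; not real Titanic records.
rows = [
    {"Survived": 1, "Pclass": 1, "Embarked": "C"},
    {"Survived": 0, "Pclass": 3, "Embarked": "S"},
    {"Survived": 1, "Pclass": 3, "Embarked": "S"},
    {"Survived": 1, "Pclass": 1, "Embarked": "C"},
    {"Survived": 0, "Pclass": 2, "Embarked": "Q"},
]


def sum_survived(rows):
    """Sum Survived for each (Embarked, Pclass) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["Embarked"], row["Pclass"])] += row["Survived"]
    return dict(totals)


totals = sum_survived(rows)
# Print one line per bar: port, class, number of survivors.
for (embarked, pclass), survived in sorted(totals.items()):
    print(embarked, pclass, survived)
```

Each key in totals corresponds to one bar in the chart, and the value is the height of that bar.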
Embarked had a pesky Null in its column, but that is no problem. All we have to do is click on Embarked while holding the Command key (Ctrl on Windows) and drag it to the Filters shelf. Once we drop it there, we just uncheck the Null box and it’s gone.
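If you would rather drop those rows before they ever reach Tableau, a rough Python equivalent of that filter looks like this. The rows are made up, drop_null_embarked is a hypothetical helper name, and treating an empty string the same as a missing value is my assumption about how the Null shows up in the CSV.

```python
# Made-up rows for illustration; not real Titanic records.
rows = [
    {"PassengerId": 1, "Embarked": "S"},
    {"PassengerId": 2, "Embarked": ""},
    {"PassengerId": 3, "Embarked": None},
    {"PassengerId": 4, "Embarked": "Q"},
]


def drop_null_embarked(rows):
    """Keep only the rows that have a non-empty Embarked value."""
    return [row for row in rows if row["Embarked"] not in (None, "")]


kept = drop_null_embarked(rows)
print([row["PassengerId"] for row in kept])
```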
Finally, we can hold Command and drag “SUM(Survived)” to the Label box on the Marks card, and we’ll have the values at the top of the bars.
Once you are happy with the way it looks, you can add the sheet to a dashboard to put it on a page with multiple graphs for a presentation. That’s it for a quick introduction to Tableau. I would suggest trying it out to see how you like it and whether you want to use it for your next presentation.