Studio
We are going to continue uncovering what we need to know about the Walmart dataset from the exercises. First, add a new spreadsheet with the same dataset to your workbook so you have lots of room to work with.
- Calculate all 6 summary statistics with the Weekly_Sales column. Based on the results, do you think that the weekly sales data seems widely dispersed?
- Sales may differ if it is a holiday week. We can use the FILTER function to return the Weekly_Sales column values for when it is a holiday week and pass that to any of our statistical functions. Try it out with the guidance of the documentation and find the average weekly sales for holiday weeks and non-holiday weeks. Which average is higher? Does the result surprise you?
- Use visualizations to see what happens if you chart the unemployment numbers versus the weekly sales. Do you see anything worth noting here? What happens if you switch what is on the x-axis and the y-axis?
- Pick one other variable to compare with weekly sales in a visualization. What chart worked best for you? Did you gain any insights from this visualization?
- CPI helps measure inflation. At any point in this dataset, was there deflation?
- Is there a column in this dataset that produces a bell curve?
- Is there any data in this dataset that seems to be an outlier or is otherwise something that you are concerned about affecting your analysis?
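If the filter-then-average idea from the holiday week question feels abstract, here is a minimal sketch of the same logic in Python rather than in a spreadsheet. It is only an illustration under stated assumptions: the rows are made-up sample values (not taken from the Walmart dataset), the column name Holiday_Flag and the convention that 1 marks a holiday week are assumptions, and filter_column is a hypothetical helper that stands in for what FILTER does.

```python
from statistics import mean

# Hypothetical sample rows: made-up values, NOT taken from the Walmart dataset.
# "Holiday_Flag" is an assumed column name; 1 is assumed to mean a holiday week.
rows = [
    {"Weekly_Sales": 100.0, "Holiday_Flag": 0},
    {"Weekly_Sales": 150.0, "Holiday_Flag": 1},
    {"Weekly_Sales": 120.0, "Holiday_Flag": 0},
    {"Weekly_Sales": 170.0, "Holiday_Flag": 1},
]


def filter_column(rows, value_column, flag_column, flag_value):
    """Return the value_column entries of rows whose flag_column equals flag_value."""
    return [row[value_column] for row in rows if row[flag_column] == flag_value]


# Step 1: keep only the sales values that match the condition.
holiday_sales = filter_column(rows, "Weekly_Sales", "Holiday_Flag", 1)
regular_sales = filter_column(rows, "Weekly_Sales", "Holiday_Flag", 0)

# Step 2: pass each filtered list to a statistical function.
holiday_average = mean(holiday_sales)
regular_average = mean(regular_sales)

print("Holiday weeks:", holiday_average)
print("Non-holiday weeks:", regular_average)
```

The two steps are the same ones you take in the spreadsheet: the filter narrows the column down to the rows that meet the condition, and the statistical function only ever sees that narrowed list. Swapping mean for another statistical function works the same way.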
After you have answered these questions, partner up with a classmate and share your work. Was there something that they discovered in their EDA that you didn’t find? Are there any anomalies or concerns you two have about the data?