"B3 or D3?" An Excel Case Study Simulator for Google Data Analytics Capstone Project -Case study 3 -

"B3 or D3?"

An Excel Case Study Simulator

 for Google Data Analytics Capstone Project 

- Case Study 3 -

________________________


SCENARIO

 

The purpose of this case study is:

 

- compare the price lists of 8 supermarkets (S1, S2, S3,... S8)

- and then purchase from the one offering the best deal.

 

Also, if we are driving in a hypothetical area where these supermarkets are located, we wonder if it is more convenient:

 

- Going shopping to the nearest supermarket, or

- Going shopping to another supermarket that offers the best deal.


______________________________________________

 HOW THIS CASE STUDY ORIGINATED

 

In this case study I tried answer one of the many challenges of everyday life, namely how to save on the shopping list.

 

I had only a general idea of how to deal with the topic, but no data source to draw from, neither other material to rely on.

 

So, after many brainstorming and sleepless nights, I finally got an idea: stay tuned!


______________________________________________

THE HISTORY

 

I compile a shopping list in an app: I’d like to buy foodstuffs products (12, the number is purely indicative).

 

The app sends my list to 8 supermarkets, S1, S2, ... S8, and they send me an offer.

 

Excel or Google Sheets help me calculate the average, the minimum or the maximum price, and visualize everything with beautiful graphs. I could also turn everything into a small case study!

 

I buy from the supermarket that offers me the best price, pay by credit card and everything is shipped home: problem solved.

 

Simple, isn't it? No.


______________________________________________

CHAPTER 1: LET'S GO BEYOND

 

What would happen, instead, if in a car trip:

 

1) I receive on my smartphone offers from these supermarkets...

2) and if these 8 supermarkets are on the border of the area where our car is.


______________________________________________

 CHAPTER 2: THE BIRTH OF THE SIMULATOR "B3 OR D3?"

 

Google Sheets is very powerful and free, but I have been using Excel for years.

 

So, for my case study and to answer the questions above, I created a Simulator.

 

I called it "B3 or D3?": the name is a play on words consisting in referring to some hypothetical Excel coordinates (coordinate cell B3 or coordinate D3, which however have nothing to do with my study).

 

The name should be understood as "Be free", that is freedom to choose to shop where we want, or opt for a data-driven decision ("Data-Driven Decision").

 

This Simulator is a Data Analysis tool, located at the confluence of creativity, aesthetics, functionality, a game, an interactive way of learning, everything like a dashboard.

 

______________________________________________

 CHAPTER 3: LET'S GET TO WORK

 

Screenshot of the Simulator "B2 or D3"?


In the half top of an Excel sheet (see screenshot above) I placed two ranges.

 

Please, note that this screenshot was taken on the Italian version of Excel, but in the English one the decimal delimiter is a ".", not the comma ","

 

On the top right there are 8 supermarkets that send me the prices of Beans, Bread, Butter, Carrots, Cheese, Coffee, Eggs, Fish, Flour, Milk, Oil and Pasta.

 

The prices for each product category are controlled, verified and made public by supermarkets and, to ensure this situation, are randomly generated and fall within a range of credible values.

 

I choose to buy from the supermarket that offers me the lowest price, calculated by Excel.

 

In the range on the left, to simulate the possibility of receiving these purchase offers on the smartphone while driving:

 

1) I created a grid of 12 by 12 cells, each one having sides of hypothetical measure 1 Km (obviously not in scale);

 

2) the 8 supermarkets (S1, S2, S3... S8) are on the border of this area;

 

3) I represented my car with any letter, which I place in a cell wherever I want: in my case I entered the letter "X";

 

4) If I want to place the car in another cell, I click and delete the previous one: the data is sent again randomly;

 

5) I assume that my car travels 20 km with a liter of fuel and that...

 

6)... Fuel costs 1.80 euros per liter;

 

7) It is necessary to calculate the travel expenses to reach the supermarket closest to the location of the car or to reach the one that instead offers me the most convenient shopping list;

 

8) Travel expenses amount to (Km_to_reach_a_supermarket / 20) * 1.80

 

In the lower half of the Excel sheet, from left to right, the system shows in real time:

 

- a summary range consisting of 4 columns and 9 rows: the cells are filled with a gradation of colors ranging from red (longer distance or supermarket where shopping is more expensive) to blue (supermarket closest to car position or cheaper supermarket).

 

- a bar chart, "Best offer or nearest distance", and one to the right "Supermarket: Total Cost and Distance", which shows the differences in price and distance between the car and the 2 supermarkets

 

- another range, consisting of 6 columns and 3 rows, which shows the results in a "classic" way

 

- "Last but not least" I created an area where a pseudo artificial intelligence summarizes in a textual way what emerged from the analysis

 

As in an interactive dashboard, the graphs and text update automatically every time we carry out a new simulation.

 

To restart, we must first delete a car symbol used in a previous simulation and place it in another cell: instantaneously, other prices will be randomly generated.


______________________________________________

CHAPTER 4: THE CRUCIAL QUESTION

 

Given these premises, then the fateful question is:

 

Apart from the particular case in which the choice would be unique if the nearest supermarket was also the one that offers the best offer ...

 

... Would we choose the supermarket closest to the location of the car or would we go to another, perhaps further away, but which offered us the most advantageous offer at that time?

 

After this long dissertation, we recapitulate everything using the well-known steps of the data analysis process, which are Ask, Prepare, Process, Analyze, Share and Act.

 

I am also aware that this is an interactive simulation and that many answers may have already been provided, implicitly or explicitly, in previous chapters.


______________________________________________

ASK

 

What topic are you exploring?

Household economics

 

What is the problem you are trying to solve?

The ability to save on shopping list, based on different circumstances

 

Who is your audience?

 

All people who want to control the family budget, especially trying to save on the shopping list

 

Deliverables

 

I hope the results provided by the Simulator "B3 or D3?" stimulate reflection on the possibility of being able to save on the shopping list, depending on certain circumstances


______________________________________________

 PREPARE

 

The datasets come from and are incorporated as feature of the Simulator and consist of the price list (randomly generated) of each of the 8 supermarkets.

 

The numerical data, expressed in Euro, are provided virtually already cleaned and verified by the marketing managers of the supermarkets and their price list; in real life, is supposed to be public and verifiable visiting their websites.

 

In addition, the data are used in calculations made automatically by the simulator; the distance of the car (depending on where we have placed it in the area) from the nearest supermarket and all formulas are available on request.

 

The data are stored on the computers of supermarkets.

 

There is no bias, because the data source sends random data each time but still confined between a minimum and maximum credible value for each category of food.

 

Is Data ROCCC?

 

Limited to this case it is assumed that our data source can be considered ROCCC, especially in the "Current" aspect.

 

There are no privacy, security and accessibility issues, as data "live" in the virtual environment of the Simulator.

 

Deliverables

 

The datasets come from and are incorporated as an intrinsic feature of the Simulator and consist of the price list (randomly generated) of each of the 8 supermarkets.

 

The price of fuel per liter, is assumed to be 1.80 euros

 

It is also assumed that the car travels 20 km with 1 liter of fuel

 

In addition, the geographical position where we place our car is in a grid whose cells measure virtually 1 Km on each side (not in scale)

______________________________________________

 PROCESS

 

Everything was created in Excel 2019 for Mac and also tried in Excel for Windows.

 

The formulas I created identify the supermarket closest to the position in the grid, the supermarket that provides the best offer, travel expenses, Km and more.

 

Several simulations were carried out and screenshots taken.

 

Sometimes it has been necessary to copy and paste long Excel formulas into Word to check that parentheses or typos are correctly closed.

 

I posted this case study on https://mycasestudyhere.blogspot.com/

 

Deliverables

 

The simulator "B3 or D3?" and/or the formulas used are available on request.

 

These are mainly formulas such as vertical search, index, match, string concatenation, and conditional formatting.


______________________________________________

ANALYZE

After carrying out several simulations, looked at the graphs, data and even the textual response of a "pseudo artificial intelligence"...

- whereas the prices of shopping lists provided by supermarkets do not differ much from each other... 

- ... whereas the virtual area in which they are located, and/or the relative short distance between them and the position of the car, is achieved at...

Deliverables
 
... choosing between two supermarkets, for example S3 and S4, depends on personal priorities. 


- If saving money is the priority, then S3 would be more convenient, as its total cost is lower than S4's.

- If convenience and proximity are the priority, then S4 would be more convenient, as it is closer.
 

The final the decision depends on whether the goal is to save more on the total cost or to save time and effort by traveling a shorter distance. 

_______________________________________________________________________
SHARE

 

This case study, in the form of interactive simulation, is structured to provide bar charts, tables, comparative graphs with bars and lines and also a textual description as a sort of chat based on a pseudo artificial intelligence.

 

Deliverables

 

I paid deep attention to detail in the data visualization in particular (shape, size, color and labels) and in general to the whole design of the Simulator.


______________________________________________

ACT


As already emerged in the Analyze phase, choosing between two supermarkets, depends on whether the goal is to save more on the total cost or to save time and effort by traveling a shorter distance.

 

Deliverables

Also, probably because in the simulation:
 
- the final prices provided by supermarkets do not differ much from each other;
 
- the distances in relative terms are not excessive;
 
- or even because shopping in the nearest supermarket could be more convenient in terms of distance but not in economic terms instead of going shopping in a supermarket further away but more that has slightly lower prices and therefore you can save time or in economic terms;
 
- or because, despite everything, we prefer to travel a longer also for fun, without taking into account weather or traffic conditions (rush hours etc.),
 
I can say that there is not an automatic way to make an "absolute" decision as the choice depends on individual preferences and needs.
 
For example, some people may prioritize saving time and effort by traveling a shorter distance, while others may be willing to travel further to save money.
 
I don't know if, implementing an algorithm, we could generate a recommendation.
 
An algorithm could take into account many other factors such as an increase in fuel prices, or very large variations in prices between supermarkets, or particular traffic conditions (rush hours) or weather forecasts, or stops in other areas before reaching supermarkets and so on.
 
In the case treated, the final decision should be based on the user's personal preferences.


______________________________________________

CONCLUSION

 

Google Data Analytics Course by Coursera was an incredible tool for me to strengthen, integrate, expand and group concepts that I already had about Data Analytics, but that were scattered here and there in my mind in a confusing way.

 Thank you Google and thank you to Coursera!
 
And thank you everyone who had the patience to read my case study.

If you want to try the Simulator, you can e-mail here: filippo.cervellera@gmail.com


Filippo "Phil" Cervellera


Italy

February 2023