10 Real World Data Science Case Studies Projects with Example
Top 10 Data Science Case Studies Projects with Examples and Solutions in Python to inspire your data science learning in 2023.
Data science has been a trending buzzword in recent times. With wide applications in sectors like healthcare, education, retail, transportation, media, and banking, data science is at the core of pretty much every industry out there. The possibilities are endless: fraud analysis in the finance sector, personalized recommendations in eCommerce, and much more. We have developed ten exciting data science case studies to explain how data science is leveraged across various industries to make smarter decisions and develop innovative, personalized products tailored to specific customers.
Table of Contents
- Data Science Case Studies in Retail
- Data Science Case Study Examples in the Entertainment Industry
- Data Analytics Case Study Examples in the Travel Industry
- Case Studies for Data Analytics in Social Media
- Real-World Data Science Projects in Healthcare
- Data Analytics Case Studies in Oil and Gas
- What Is a Case Study in Data Science?
- How Do You Prepare a Data Science Case Study?
- 10 Most Interesting Data Science Case Studies with Examples
So, without much ado, let's get started with data science business case studies !
1) Walmart
With humble beginnings as a simple discount retailer, Walmart today operates 10,500 stores and clubs in 24 countries along with eCommerce websites, employing around 2.2 million people around the globe. For the fiscal year ended January 31, 2021, Walmart's total revenue was $559 billion, a growth of $35 billion driven by the expansion of its eCommerce business. Walmart is a data-driven company that works on the principle of 'Everyday Low Cost' for its consumers. To achieve this goal, it depends heavily on its data science and analytics department for research and development, also known as Walmart Labs. Walmart is home to the world's largest private cloud, which can manage 2.5 petabytes of data every hour. To analyze this humongous amount of data, Walmart has created 'Data Café,' a state-of-the-art analytics hub located within its Bentonville, Arkansas headquarters. The Walmart Labs team heavily invests in building and managing technologies like cloud, data, DevOps, infrastructure, and security.
Walmart is experiencing massive digital growth as the world's largest retailer . Walmart has been leveraging Big data and advances in data science to build solutions to enhance, optimize and customize the shopping experience and serve their customers in a better way. At Walmart Labs, data scientists are focused on creating data-driven solutions that power the efficiency and effectiveness of complex supply chain management processes. Here are some of the applications of data science at Walmart:
i) Personalized Customer Shopping Experience
Walmart analyses customer preferences and shopping patterns to optimize the stocking and display of merchandise in its stores. Big data analysis also helps them understand new-item sales, decide which products to discontinue, and evaluate brand performance.
ii) Order Sourcing and On-Time Delivery Promise
Millions of customers view items on Walmart.com, and Walmart provides each customer a real-time estimated delivery date for the items purchased. Walmart runs a backend algorithm that estimates this based on the distance between the customer and the fulfillment center, inventory levels, and shipping methods available. The supply chain management system determines the optimum fulfillment center based on distance and inventory levels for every order. It also has to decide on the shipping method to minimize transportation costs while meeting the promised delivery date.
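The sourcing logic described above amounts to a feasibility-plus-cost filter: keep only the fulfillment centers that have stock and can meet the promised date, then pick the cheapest. The sketch below illustrates that idea; the class fields, center names, and costs are all invented for this example and are not Walmart's actual system.

```python
from dataclasses import dataclass

@dataclass
class FulfillmentCenter:
    name: str
    in_stock: bool
    transit_days: int   # fastest available shipping method
    ship_cost: float    # cost of that method (illustrative numbers)

def pick_center(centers, promised_days):
    """Pick the cheapest center that has inventory and can still meet
    the promised delivery date; return None if no center qualifies."""
    feasible = [c for c in centers
                if c.in_stock and c.transit_days <= promised_days]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.ship_cost)

centers = [
    FulfillmentCenter("Dallas", True, 2, 4.10),
    FulfillmentCenter("Reno", True, 4, 2.80),
    FulfillmentCenter("Atlanta", False, 2, 3.50),
]
best = pick_center(centers, promised_days=3)
print(best.name)  # Dallas: Reno is cheaper but too slow, Atlanta lacks stock
```

With a looser 5-day promise, the cheaper Reno center wins instead, which is exactly the cost-versus-deadline trade-off the paragraph describes.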
iii) Packing Optimization
Packing optimization, also known as box recommendation, is a daily task in the shipping of items for retail and eCommerce businesses. Whenever the items of an order, or of multiple orders placed by the same customer, are picked from the shelf and are ready for packing, Walmart's recommender system determines the best-sized box that holds all the ordered items with the least in-box space wasted, within a fixed amount of time. This is the Bin Packing Problem, a classic NP-hard problem familiar to data scientists.
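Because bin packing is NP-hard, practical systems lean on fast heuristics. A common baseline is first-fit decreasing: sort items largest-first and place each into the first box with room. The sketch below is a generic illustration of that heuristic, not Walmart's actual recommender.

```python
def first_fit_decreasing(item_volumes, box_capacity):
    """First-fit-decreasing heuristic for bin packing: sort items
    largest-first, place each into the first open box with room,
    and open a new box otherwise. Fast but not guaranteed optimal."""
    boxes = []  # remaining capacity of each open box
    for vol in sorted(item_volumes, reverse=True):
        for i, remaining in enumerate(boxes):
            if vol <= remaining:
                boxes[i] -= vol
                break
        else:
            boxes.append(box_capacity - vol)
    return len(boxes)

# Six items with a total volume of 20 fit into two boxes of capacity 10.
print(first_fit_decreasing([4, 8, 1, 4, 2, 1], box_capacity=10))  # 2
```

Production systems would add dimensions, weight limits, and fragility constraints on top of this one-dimensional core.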
Here is a link to a sales prediction data science case study to help you understand the applications of data science in the real world. The Walmart Sales Forecasting Project uses historical sales data for 45 Walmart stores located in different regions. Each store contains many departments, and you must build a model to project the sales for each department in each store. This data science case study aims to create a predictive model to predict the sales of each product. You can also try your hand at the Inventory Demand Forecasting Data Science Project to develop a machine learning model that forecasts inventory demand accurately based on historical sales data.
2) Amazon
Amazon is an American multinational technology company headquartered in Seattle, USA. It started as an online bookseller, but today it focuses on eCommerce, cloud computing, digital streaming, and artificial intelligence. It hosts an estimated 1,000,000,000 gigabytes of data across more than 1,400,000 servers. Through its constant innovation in data science and big data, Amazon stays ahead in understanding its customers. Here are a few data analytics case study examples at Amazon:
i) Recommendation Systems
Data science models help Amazon understand customers' needs and recommend products before a customer even searches for them; these models use collaborative filtering. Amazon draws on data from 152 million customer purchases to help users decide which products to buy, and the company generates 35% of its annual sales through its recommendation-based systems (RBS).
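Item-based collaborative filtering, one common flavor of the technique mentioned above, recommends items whose rating patterns across users resemble those of items a customer already likes. The toy ratings below are invented for illustration; Amazon's production system is far more sophisticated, but the core similarity computation looks like this.

```python
import math

# Hypothetical user-item ratings (1-5 stars).
ratings = {
    "alice": {"book": 5, "lamp": 3, "kettle": 4},
    "bob":   {"book": 4, "lamp": 1, "kettle": 5},
    "carol": {"book": 1, "lamp": 5},
}

def item_vector(item):
    """Collect every user's rating of one item."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# Which item is most similar to "book"? Alice and Bob rate both "book"
# and "kettle" highly, so "kettle" should win over "lamp".
sims = {i: cosine(item_vector("book"), item_vector(i))
        for i in ["lamp", "kettle"]}
print(max(sims, key=sims.get))  # kettle
```

A recommender would then surface the highest-similarity items the user has not yet purchased.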
Here is a Recommender System Project to help you build a recommendation system using collaborative filtering.
ii) Retail Price Optimization
Amazon product prices are optimized based on a predictive model that determines the best price so that users do not refuse to buy an item because of its price. The model carefully determines the optimal price by considering the customers' likelihood of purchasing the product and how the price will affect their future buying patterns. The price of a product is determined according to your activity on the website, competitors' pricing, product availability, item preferences, order history, expected profit margin, and other factors.
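At its simplest, price optimization means searching for the price that maximizes expected revenue under a demand model. The linear demand curve and its coefficients below are made up for the sketch; real retailers estimate demand from the signals listed above.

```python
def expected_revenue(price, base_demand=1000, sensitivity=40.0):
    """Revenue = price x demand, under an illustrative linear demand
    curve where demand falls as price rises (coefficients invented)."""
    demand = max(0.0, base_demand - sensitivity * price)
    return price * demand

def best_price(prices):
    """Grid search: return the candidate price with highest revenue."""
    return max(prices, key=expected_revenue)

candidates = [p / 2 for p in range(10, 50)]  # $5.00 .. $24.50 in 50c steps
p = best_price(candidates)
print(f"optimal price = ${p:.2f}")  # $12.50 for this demand curve
```

For this curve, revenue p(1000 - 40p) peaks analytically at p = 12.5, which the grid search recovers; a production system would re-fit the demand model continuously.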
Check Out this Retail Price Optimization Project to build a Dynamic Pricing Model.
iii) Fraud Detection
Being a significant eCommerce business, Amazon remains at high risk of retail fraud. As a preemptive measure, the company collects historical and real-time data for every order. It uses Machine learning algorithms to find transactions with a higher probability of being fraudulent. This proactive measure has helped the company restrict clients with an excessive number of returns of products.
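A first intuition for fraud detection is an additive risk score over order signals like those the paragraph mentions (mismatched addresses, account age, return history). The signals, weights, and thresholds below are invented; Amazon's real systems use learned models, not hand-set rules.

```python
def fraud_risk_score(order):
    """Toy additive risk score (0-100) over a few common fraud
    signals. All weights and cut-offs are illustrative only."""
    score = 0
    if order["ship_country"] != order["billing_country"]:
        score += 40   # shipping and billing countries disagree
    if order["account_age_days"] < 7:
        score += 30   # very new account
    if order["return_rate"] > 0.5:
        score += 20   # excessive product returns
    if order["order_value"] > 1000:
        score += 10   # unusually large order
    return score

order = {"ship_country": "US", "billing_country": "DE",
         "account_age_days": 2, "return_rate": 0.1, "order_value": 1500}
print(fraud_risk_score(order))  # 40 + 30 + 10 = 80
```

A learned classifier effectively discovers such weights (and far subtler interactions) from labeled historical transactions instead of having them set by hand.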
You can look at this Credit Card Fraud Detection Project to implement a fraud detection model to classify fraudulent credit card transactions.
Let us explore data analytics case study examples in the entertainment industry.
3) Netflix
Netflix started as a DVD rental service in 1997 and has since expanded into the streaming business. Headquartered in Los Gatos, California, Netflix is the largest content streaming company in the world. Currently, Netflix has over 208 million paid subscribers worldwide, and with streaming supported on thousands of smart devices, around 3 billion hours of content are watched on Netflix every month. The secret to this massive growth and popularity is Netflix's advanced use of data analytics and recommendation systems to provide personalized and relevant content recommendations to its users. Netflix collects data from over 100 billion events every day. Here are a few examples of data analysis case studies applied at Netflix:
i) Personalized Recommendation System
Netflix uses over 1,300 recommendation clusters based on consumer viewing preferences to provide a personalized experience. The data Netflix collects from its users includes viewing time, platform searches for keywords, and metadata related to content abandonment, such as when content was paused, rewound, or rewatched. Using this data, Netflix can predict what a viewer is likely to watch and give each user a personalized watchlist. Some of the algorithms used by the Netflix recommendation system are the Personalized Video Ranker, the Trending Now ranker, and the Continue Watching ranker.
ii) Content Development using Data Analytics
Netflix uses data science to analyze the behavior and patterns of its users to recognize themes and categories that the masses prefer to watch. This data is used to produce shows like The Umbrella Academy, Orange Is the New Black, and The Queen's Gambit. Such shows might seem like huge risks, but they were greenlit largely on the strength of data analytics, which assured Netflix that they would succeed with its audience. Data analytics is helping Netflix come up with content that its viewers want to watch even before they know they want to watch it.
iii) Marketing Analytics for Campaigns
Netflix uses data analytics to find the right time to launch shows and ad campaigns for maximum impact on the target audience. Marketing analytics also helps produce different trailers and thumbnails for different groups of viewers. For example, the House of Cards Season 5 trailer featuring a giant American flag was launched during the American presidential elections, as it would resonate well with the audience.
Here is a Customer Segmentation Project using association rule mining to understand the primary grouping of customers based on various parameters.
4) Spotify
In a world where purchasing music is a thing of the past and streaming music is the current trend, Spotify has emerged as one of the most popular streaming platforms. With 320 million monthly users, around 4 billion playlists, and approximately 2 million podcasts, Spotify leads the pack among well-known streaming platforms like Apple Music, Wynk, Songza, Amazon Music, etc. The success of Spotify has depended mainly on data analytics. By analyzing massive volumes of listener data, Spotify provides real-time, personalized services to its listeners. Most of Spotify's revenue comes from paid premium subscriptions. Here are some examples of the data analytics Spotify uses to provide enhanced services to its listeners:
i) Personalization of Content using Recommendation Systems
Spotify uses BaRT (Bandits for Recommendations as Treatments) to generate music recommendations for its listeners in real time. BaRT treats any song a user listens to for less than 30 seconds as a negative signal, and the model is retrained every day to provide updated recommendations. A patent granted to Spotify covers an AI application that identifies a user's musical tastes based on audio signals and attributes such as gender, age, and accent to make better music recommendations.
Spotify creates daily playlists for its listeners, based on the taste profiles called 'Daily Mixes,' which have songs the user has added to their playlists or created by the artists that the user has included in their playlists. It also includes new artists and songs that the user might be unfamiliar with but might improve the playlist. Similar to it is the weekly 'Release Radar' playlists that have newly released artists' songs that the listener follows or has liked before.
ii) Targeted Marketing through Customer Segmentation
Besides enhancing personalized song recommendations, Spotify uses this massive dataset for targeted ad campaigns and personalized service recommendations for its users. Spotify uses ML models to analyze listener behavior and group listeners based on music preferences, age, gender, ethnicity, etc. These insights help them create ad campaigns for a specific target audience. One of their well-known ad campaigns was the meme-inspired ads for potential target customers, which was a huge success globally.
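The grouping idea can be illustrated with a toy rule-based segmentation. The listener records, age bands, and genres below are invented; Spotify's actual segments are learned by ML models rather than hand-written rules like these.

```python
from collections import defaultdict

# Hypothetical listener records for illustration only.
listeners = [
    {"id": 1, "age": 17, "top_genre": "pop"},
    {"id": 2, "age": 24, "top_genre": "pop"},
    {"id": 3, "age": 45, "top_genre": "jazz"},
    {"id": 4, "age": 19, "top_genre": "hip-hop"},
]

def segment(user):
    """Assign each listener to an (age band, genre) segment."""
    band = "under-25" if user["age"] < 25 else "25-plus"
    return (band, user["top_genre"])

segments = defaultdict(list)
for u in listeners:
    segments[segment(u)].append(u["id"])

print(dict(segments))
```

Each resulting segment (e.g. under-25 pop listeners) could then be targeted with its own ad creative, which is the point of the campaign work described above.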
iii) CNNs for Classification of Songs and Audio Tracks
Spotify builds audio models to evaluate songs and tracks, which helps develop better playlists and recommendations for its users. These allow Spotify to filter new tracks based on their lyrics and rhythms and recommend them to users who like similar tracks (collaborative filtering). Spotify also uses NLP (natural language processing) to scan articles and blogs and analyze the words used to describe songs and artists. These analytical insights help group and identify similar artists and songs and can be leveraged to build playlists.
Here is a Music Recommender System Project for you to start learning. We have listed another music recommendations dataset for you to use in your projects: Dataset1 . You can use this dataset of Spotify metadata to classify songs based on artists, mood, and liveliness. Plot histograms and heatmaps to get a better understanding of the dataset, then use classification algorithms like logistic regression and SVM, along with principal component analysis, to generate valuable insights from it.
Below you will find case studies for data analytics in the travel and tourism industry.
5) Airbnb
Airbnb was born in 2007 in San Francisco and has since grown to 4 million hosts and 5.6 million listings worldwide, welcoming more than 1 billion guest arrivals in almost every country across the globe. Airbnb is active in every country on the planet except Iran, Sudan, Syria, and North Korea, which is around 97.95% of the world. Treating data as the voice of its customers, Airbnb uses its large volume of customer reviews and host inputs to understand trends across communities, rate user experiences, and make informed decisions to build a better business model. The data scientists at Airbnb are developing exciting new solutions to boost the business and create a perfect match between guests and hosts for a supreme customer experience. Airbnb's data servers serve approximately 10 million requests a day and process around one million search queries.
i) Recommendation Systems and Search Ranking Algorithms
Airbnb helps people find 'local experiences' in a place with the help of search algorithms that make searches and listings precise. Airbnb uses a 'listing quality score' to rank homes based on proximity to the searched location and previous guest reviews. Airbnb uses deep neural networks to build models that take the guest's earlier stays and area information into account to find a perfect match. The search algorithms are optimized based on guest and host preferences, rankings, pricing, and availability to understand users' needs and provide the best match possible.
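A listing quality score of the kind described above can be pictured as a weighted blend of proximity and review quality. The field names, weights, and listings below are invented for the sketch; Airbnb's real ranking models are learned neural networks over many more signals.

```python
def listing_score(listing, search_lat, search_lon):
    """Blend proximity to the searched point with review quality
    (weights and normalization are illustrative only)."""
    dist = ((listing["lat"] - search_lat) ** 2 +
            (listing["lon"] - search_lon) ** 2) ** 0.5
    proximity = 1 / (1 + dist)            # closer -> nearer to 1
    reviews = listing["avg_review"] / 5   # normalize 0..1
    return 0.6 * proximity + 0.4 * reviews

listings = [
    {"name": "Cosy loft", "lat": 0.0, "lon": 0.1, "avg_review": 4.0},
    {"name": "Far villa", "lat": 2.0, "lon": 2.0, "avg_review": 5.0},
]
ranked = sorted(listings,
                key=lambda l: listing_score(l, 0.0, 0.0), reverse=True)
print(ranked[0]["name"])  # Cosy loft: nearby beats a distant 5-star villa
```

A learned ranker would fit such trade-offs (and guest-specific preferences) from booking data instead of fixing the weights by hand.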
ii) Natural Language Processing for Review Analysis
Airbnb characterizes data as the voice of its customers. Customer and host reviews give a direct insight into the experience, but star ratings alone are not a good way to understand it quantitatively. Hence, Airbnb uses natural language processing to understand reviews and the sentiments behind them. The NLP models are developed using convolutional neural networks.
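A minimal baseline for review sentiment is a lexicon-based scorer: count positive and negative words. The tiny word lists below are invented; as the paragraph notes, Airbnb's production models are CNNs that learn these cues from data rather than relying on hand-picked lists.

```python
POSITIVE = {"great", "clean", "friendly", "perfect", "lovely"}
NEGATIVE = {"dirty", "noisy", "rude", "broken", "awful"}

def review_sentiment(text):
    """Score +1 per positive word, -1 per negative word, then map
    the total to a label. Purely illustrative baseline."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    score = (sum(w in POSITIVE for w in words)
             - sum(w in NEGATIVE for w in words))
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(review_sentiment("Great location, clean rooms, friendly host."))  # positive
print(review_sentiment("The room was noisy and the shower was broken."))  # negative
```

The obvious failure modes (negation, sarcasm, unseen vocabulary) are exactly what neural models are brought in to handle.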
Practice this Sentiment Analysis Project for analyzing product reviews to understand the basic concepts of natural language processing.
iii) Smart Pricing using Predictive Analytics
The Airbnb host community uses the service as a supplementary income. The vacation homes and guest houses rented to customers help raise local community earnings, as Airbnb guests stay 2.4 times longer and spend approximately 2.3 times as much money as a hotel guest. The profits have a significant positive impact on the local neighborhood community. Airbnb uses predictive analytics to predict the prices of listings and help hosts set a competitive and optimal price. The overall profitability of an Airbnb host depends on factors like the time invested by the host and responsiveness to changing demands across seasons. The factors that impact real-time smart pricing are the location of the listing, proximity to transport options, the season, and the amenities available in the neighborhood of the listing.
Here is a Price Prediction Project to help you understand the concept of predictive analysis which is widely common in case studies for data analytics.
6) Uber
Uber is the biggest global taxi service provider. As of December 2018, Uber had 91 million monthly active consumers and 3.8 million drivers, completing 14 million trips each day. Uber uses data analytics and big-data-driven technologies to optimize its business processes and provide enhanced customer service. The data science team at Uber has constantly been exploring futuristic technologies to provide better service. Machine learning and data analytics help Uber make data-driven decisions that enable benefits like ride-sharing, dynamic price surges, better customer support, and demand forecasting. Here are some of the real-world data science projects used by Uber:
i) Dynamic Pricing for Price Surges and Demand Forecasting
Uber prices change at peak hours based on demand. Uber uses surge pricing to encourage more cab drivers to sign up with the company and to meet passenger demand. When prices increase, both the driver and the passenger are informed about the surge. Uber uses a patented predictive model for price surging called 'Geosurge,' based on the demand for rides and the location.
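The core surge idea is a multiplier that grows with the demand-to-supply ratio. The rule and cap below are invented to illustrate the principle; Uber's patented Geosurge model is location-aware and far more elaborate.

```python
def surge_multiplier(ride_requests, available_drivers,
                     base=1.0, cap=3.0):
    """Toy surge rule: the multiplier tracks the demand/supply ratio,
    never drops below the base fare, and is capped. Illustrative only."""
    if available_drivers == 0:
        return cap
    ratio = ride_requests / available_drivers
    return round(min(cap, max(base, ratio)), 2)

print(surge_multiplier(90, 60))   # 1.5x: demand outstrips supply
print(surge_multiplier(40, 80))   # 1.0x: supply exceeds demand, no surge
print(surge_multiplier(500, 50))  # 3.0x: capped during extreme demand
```

The cap reflects a real design concern: unbounded multipliers would maximize short-term revenue but damage rider trust.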
ii) One-Click Chat
Uber has developed a machine learning and natural language processing solution called one-click chat (OCC) for coordination between drivers and riders. This feature anticipates responses to commonly asked questions, making it easy for drivers to respond to customer messages with the click of just one button. One-click chat is built on Uber's machine learning platform, Michelangelo, to perform NLP on rider chat messages and generate appropriate responses.
iii) Customer Retention
Failure to meet customer demand for cabs could lead users to opt for other services. Uber uses machine learning models to bridge this demand-supply gap. By using prediction models to predict the demand in any location, Uber retains its customers. Uber also uses a tier-based reward system, which segments customers into different levels based on usage: the higher the level a user achieves, the better the perks. Uber also provides personalized destination suggestions based on the user's history and frequently traveled destinations.
You can take a look at this Python Chatbot Project and build a simple chatbot application to better understand the techniques used for natural language processing. You can also practice building a demand forecasting model with this project using time series analysis, or look at this project, which uses time series forecasting and clustering on a dataset containing geospatial data to forecast customer demand for Ola rides.
7) LinkedIn
LinkedIn is the largest professional social networking site with nearly 800 million members in more than 200 countries worldwide. Almost 40% of the users access LinkedIn daily, clocking around 1 billion interactions per month. The data science team at LinkedIn works with this massive pool of data to generate insights to build strategies, apply algorithms and statistical inferences to optimize engineering solutions, and help the company achieve its goals. Here are some of the real world data science projects at LinkedIn:
i) LinkedIn Recruiter: Search Algorithms and Recommendation Systems
LinkedIn Recruiter helps recruiters build and manage a talent pool to optimize the chances of hiring candidates successfully. This sophisticated product works on search and recommendation engines. LinkedIn Recruiter handles complex queries and filters on a constantly growing, large dataset, and the results delivered have to be relevant and specific. The initial search model was based on linear regression but was eventually upgraded to gradient boosted decision trees to capture non-linear correlations in the dataset. In addition to these models, LinkedIn Recruiter also uses a Generalized Linear Mixed model to improve prediction quality and give personalized results.
ii) Recommendation Systems Personalized for News Feed
The LinkedIn news feed is the heart and soul of the professional community. A member's newsfeed is a place to discover conversations among connections, career news, posts, suggestions, photos, and videos. Every time a member visits LinkedIn, machine learning algorithms identify the best exchanges to be displayed on the feed by sorting through posts and ranking the most relevant results on top. The algorithms help LinkedIn understand member preferences and help provide personalized news feeds. The algorithms used include logistic regression, gradient boosted decision trees and neural networks for recommendation systems.
iii) CNNs to Detect Inappropriate Content
Providing a professional space where people can trust and express themselves professionally in a safe community has been a critical goal at LinkedIn. LinkedIn has heavily invested in building solutions to detect fake accounts and abusive behavior on its platform. Any form of spam, harassment, or inappropriate content is immediately flagged and taken down; these can range from profanity to advertisements for illegal services. LinkedIn uses a machine learning model based on convolutional neural networks. This classifier trains on a dataset of accounts labeled as either "inappropriate" or "appropriate." The inappropriate list consists of accounts containing "blocklisted" phrases or words, along with a small portion of manually reviewed accounts reported by the user community.
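The blocklist step that seeds those training labels can be sketched as a simple phrase match. The phrases below are invented examples; as the paragraph notes, a match like this would only generate candidate labels for LinkedIn's CNN classifier, not final moderation decisions.

```python
# Invented example phrases; a real blocklist is curated and much larger.
BLOCKLIST = {"free money", "guaranteed followers", "escort"}

def flag_profile(text):
    """First-pass filter: flag any profile text containing a
    blocklisted phrase for downstream review/labeling."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKLIST)

print(flag_profile("Data engineer. Get FREE MONEY fast!"))  # True
print(flag_profile("Data engineer at Acme Corp."))          # False
```

The learned classifier then generalizes beyond exact phrases, catching obfuscated spellings and context the blocklist cannot.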
Here is a Text Classification Project to help you understand NLP basics for text classification. You can find a news recommendation system dataset to help you build a personalized news recommender system. You can also use this dataset to build a classifier using logistic regression, Naive Bayes, or Neural networks to classify toxic comments.
8) Pfizer
Pfizer is a multinational pharmaceutical company headquartered in New York, USA, and one of the largest pharmaceutical companies globally, known for developing a wide range of medicines and vaccines in disciplines like immunology, oncology, cardiology, and neurology. Pfizer became a household name in 2020 when its COVID-19 vaccine became the first to receive FDA emergency use authorization. In early November 2021, the CDC approved the Pfizer vaccine for kids aged 5 to 11. Pfizer has been using machine learning and artificial intelligence to develop drugs and streamline trials, which played a massive role in developing and deploying the COVID-19 vaccine. Here are a few data analytics case studies from Pfizer:
i) Identifying Patients for Clinical Trials
Artificial intelligence and machine learning are used to streamline and optimize clinical trials and increase their efficiency. Natural language processing and exploratory data analysis of patient records can help identify suitable patients for clinical trials, such as those with distinct symptoms. They can also help examine interactions with potential trial members' specific biomarkers and predict drug interactions and side effects, which helps avoid complications. Pfizer's AI implementation helped rapidly identify signals within the noise of millions of data points across its 44,000-candidate COVID-19 clinical trial.
ii) Supply Chain and Manufacturing
Data science and machine learning techniques help pharmaceutical companies better forecast demand for vaccines and drugs and distribute them efficiently. Machine learning models can help identify efficient supply systems by automating and optimizing the production steps. These will help supply drugs customized to small pools of patients in specific gene pools. Pfizer uses Machine learning to predict the maintenance cost of equipment used. Predictive maintenance using AI is the next big step for Pharmaceutical companies to reduce costs.
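Demand forecasting of the kind mentioned above often starts from a naive baseline before seasonality and supply constraints are layered on. The shipment figures below are hypothetical; this moving-average sketch only illustrates the baseline idea, not Pfizer's actual models.

```python
def moving_average_forecast(history, window=3):
    """Naive baseline: forecast the next period as the mean of the
    last `window` observations. Real pharmaceutical demand models add
    seasonality, epidemiological signals, and supply constraints."""
    recent = history[-window:]
    return sum(recent) / len(recent)

monthly_doses = [120, 130, 125, 160, 170, 165]  # hypothetical shipments
print(moving_average_forecast(monthly_doses))   # (160 + 170 + 165) / 3 = 165.0
```

Baselines like this matter in practice: a sophisticated model that cannot beat a moving average is not worth deploying.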
iii) Drug Development
Computer simulations of proteins, tests of their interactions, and yield analysis help researchers develop and test drugs more efficiently. In 2016, Watson Health and Pfizer announced a collaboration to utilize IBM Watson for Drug Discovery to help accelerate Pfizer's research in immuno-oncology, an approach to cancer treatment that uses the body's immune system to help fight cancer. Deep learning models have recently been used for bioactivity and synthesis prediction for drugs and vaccines, in addition to molecular design. Deep learning has been a revolutionary technique for drug discovery, as it factors in everything from new applications of medications to possible toxic reactions, which can save millions in drug trials.
You can create a Machine learning model to predict molecular activity to help design medicine using this dataset . You may build a CNN or a Deep neural network for this data analyst case study project.
9) Shell Data Analyst Case Study Project
Shell is a global group of energy and petrochemical companies with over 80,000 employees in around 70 countries. Shell uses advanced technologies and innovations to help build a sustainable energy future, and is going through a significant transition: it aims to become a clean energy company by 2050 as the world needs more, and cleaner, energy solutions. This requires substantial changes in the way energy is produced and used. Digital technologies, including AI and machine learning, play an essential role in this transformation, enabling more efficient exploration and energy production, more reliable manufacturing, more nimble trading, and a personalized customer experience. Using AI across the organization will help achieve this goal and stay competitive in the market. Here are a few data analytics case studies in the petrochemical industry:
i) Precision Drilling
Shell is involved in the entire oil and gas supply chain, from extracting hydrocarbons to refining the fuel to retailing it to customers. Recently, Shell has adopted reinforcement learning to control the drilling equipment used in extraction. Reinforcement learning works on a reward system based on the outcomes of the AI model's actions. The algorithm is designed to guide the drills as they move through the subsurface, based on historical data from drilling records, including information such as the size of drill bits, temperatures, pressures, and knowledge of seismic activity. This model helps the human operator understand the environment better, leading to better and faster results with less damage to the machinery used.
ii) Efficient Charging Terminals
Due to climate change, governments have encouraged people to switch to electric vehicles to reduce carbon dioxide emissions. However, the lack of public charging terminals has deterred people from switching to electric cars. Shell uses AI to monitor and predict the demand for terminals to provide an efficient supply. Multiple vehicles charging from a single terminal may create a considerable grid load, and demand predictions can help make this process more efficient.
iii) Monitoring Service and Charging Stations
Another Shell initiative, trialed in Thailand and Singapore, is the use of computer vision cameras that watch out for potentially hazardous activities, like lighting a cigarette in the vicinity of the pumps while refueling. The model processes the content of the captured images and labels and classifies them, and the algorithm can then alert the staff and hence reduce the risk of fires. The model could be further trained to detect rash driving or theft in the future.
Here is a project to help you understand multiclass image classification. You can use the Hourly Energy Consumption Dataset to build an energy consumption prediction model. You can use time series with XGBoost to develop your model.
10) Zomato Case Study on Data Analytics
Zomato was founded in 2010 and is currently one of the most well-known food tech companies. Zomato offers services like restaurant discovery, home delivery, online table reservation, online payments for dining, etc. Zomato partners with restaurants to provide tools to acquire more customers while also providing delivery services and easy procurement of ingredients and kitchen supplies. Currently, Zomato has over 200,000 restaurant partners and around 100,000 delivery partners, and has completed over 100 million delivery orders to date. Zomato uses ML and AI to boost its business growth, drawing on the massive amount of data collected over the years from food orders and user consumption patterns. Here are a few examples of data analyst case study projects developed by the data scientists at Zomato:
i) Personalized Recommendation System for Homepage
Zomato uses data analytics to create personalized homepages for its users. Zomato uses data science to provide order personalization, like giving recommendations to the customers for specific cuisines, locations, prices, brands, etc. Restaurant recommendations are made based on a customer's past purchases, browsing history, and what other similar customers in the vicinity are ordering. This personalized recommendation system has led to a 15% improvement in order conversions and click-through rates for Zomato.
You can use the Restaurant Recommendation Dataset to build a restaurant recommendation system to predict what restaurants customers are most likely to order from, given the customer location, restaurant information, and customer order history.
ii) Analyzing Customer Sentiment
Zomato uses natural language processing and machine learning to understand customer sentiment from social media posts and customer reviews. These help the company gauge the inclination of its customer base towards the brand. Deep learning models analyze the sentiment of brand mentions on social networking sites like Twitter, Instagram, LinkedIn, and Facebook. These analytics give the company insights that help build the brand and understand the target audience.
iii) Predicting Food Preparation Time (FPT)
Food preparation time is an essential variable in the estimated delivery time of an order placed through Zomato. It depends on numerous factors, like the number of dishes ordered, the time of day, footfall in the restaurant, the day of the week, etc. Accurate prediction of the food preparation time enables a better estimate of the delivery time, making delivery partners less likely to breach it. Zomato uses a bidirectional LSTM-based deep learning model that considers all these features and predicts the food preparation time for each order in real time.
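The feature set described above can be pictured with a hand-set linear approximation. All the coefficients below are invented for illustration; Zomato's real model is a bidirectional LSTM trained on many more signals, not fixed weights like these.

```python
def estimate_prep_minutes(n_dishes, hour_of_day, restaurant_load):
    """Toy linear estimate of food-preparation time in minutes.
    Coefficients are invented; a production model learns them."""
    base = 10.0                   # fixed kitchen overhead
    per_dish = 4.0 * n_dishes     # each dish adds prep time
    peak = 6.0 if hour_of_day in (13, 14, 20, 21) else 0.0  # lunch/dinner rush
    load = 8.0 * restaurant_load  # current kitchen occupancy, 0..1
    return base + per_dish + peak + load

# 3 dishes, 8 p.m. dinner rush, kitchen half-busy:
print(estimate_prep_minutes(n_dishes=3, hour_of_day=20, restaurant_load=0.5))
# 10 + 12 + 6 + 4 = 32.0 minutes
```

A sequence model like an LSTM improves on this by also using the order in which recent orders hit the kitchen, which a static formula cannot capture.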
Data scientists are companies' secret weapons when it comes to analyzing customer sentiment and behavior and leveraging it to drive conversion, loyalty, and profits. These 10 data science case study projects with examples and solutions show you how various organizations use data science technologies to succeed and stay at the top of their field! To summarize, data science has not only accelerated the performance of companies but has also made it possible to manage and sustain that performance with ease.
FAQs on Data Analysis Case Studies
A case study in data science is an in-depth analysis of a real-world problem using data-driven approaches. It involves collecting, cleaning, and analyzing data to extract insights and solve challenges, offering practical insights into how data science techniques can address complex issues across various industries.
To create a data science case study, identify a relevant problem, define objectives, and gather suitable data. Clean and preprocess data, perform exploratory data analysis, and apply appropriate algorithms for analysis. Summarize findings, visualize results, and provide actionable recommendations, showcasing the problem-solving potential of data science techniques.
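For instance, the clean, explore, and summarize steps described above might look like this minimal pandas sketch (the data is synthetic and purely illustrative):

```python
import pandas as pd

# Tiny synthetic dataset with the kinds of gaps real data has.
df = pd.DataFrame({
    "region": ["north", "south", "south", "north", None],
    "sales":  [120.0, 95.0, 87.0, None, 60.0],
})

df = df.dropna()                                # clean: drop incomplete rows
summary = df.groupby("region")["sales"].mean()  # explore: average sales by region
print(summary)
```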
About the Author
ProjectPro is the only online platform designed to help professionals gain practical, hands-on experience in big data, data engineering, data science, and machine learning related technologies, with over 270 reusable project templates in data science and big data, each with step-by-step walkthroughs.
© 2024 Iconiq Inc.
Google Data Analytics Certificate case study of Fitbit data for Bella Beat
MirAnalysis/Google-Data-Analytics-Case-Study
Introduction:
Welcome to my capstone project for the Google Data Analytics Certificate! This study showcases the skills learned during the course, including SQL and Tableau. I will be analyzing Fitbit data to make a recommendation to Bellabeat using the data analysis process.
Business Task:
Bellabeat, a wellness and tech company whose mission is to empower women to reach their full potential, requests help with marketing their products. The company offers smart devices such as the Leaf, Ivy, and Time. These items can track health data such as activity, sleep, menstrual cycles, heart rate, and hydration. In this scenario, the Bellabeat marketing team requests recommendations based on competitor data. Bellabeat's competitor, Fitbit, will be analyzed to reveal user trends in the wellness device market. The findings will offer insights into areas of growth opportunity for Bellabeat going forward.
Data Sources
The data source, "Fitbit Fitness Tracker Data," was found on Kaggle, a data science and coding website, where it was uploaded by the data scientist Möbius. The datasets were sourced from a survey of Amazon Mechanical Turk workers for a study that collected Fitbit tracking data. The original study states that 30 participants were surveyed; however, 33 can be found in the data. No demographic information such as age, height, or sex was provided. The exact Fitbit models are not specified, but it is noted that variation across the datasets is potentially due to varying device models and user tracking preferences. My analysis focuses on data from 4-12-2016 to 5-12-2016. The data includes a total of 33 users across 4 datasets, tracking physical activity, step counts, sleep time, and weight information.
"Daily Activity Merged" includes daily activity logs for 33 users. This set compiles 3 activity types, their distances, and the minutes spent performing them. The 3 activity types are: light, fairly active, and very active. The distance columns are not defined, but based on the step data provided they appear to be in kilometers. Minutes spent without activity are categorized as sedentary time. This set also includes steps taken and calories burned.
"Hourly Steps Merged" includes the same 33 user Ids but expands the daily steps into hourly increments in 24-hour format. As mentioned previously, there was a variance between the total steps calculated in this set and the daily logs in the "Daily Activity Merged" set above, likely due to device usage. Because of this variance, I used the step information in this set only for my analysis of steps per time of day.
"Sleep Day Merged" details 24 user Ids, their minutes asleep, and minutes in bed but not asleep. Fitbit's website states that the watch tracks heart rate and movement patterns to determine if the user is awake or asleep. Fitbit also states that the "Awake" category includes when users are somewhere in a sleep cycle but are restless and wake up briefly.
"Weight Log Info Merged" includes only 8 user Ids, weight (kg and lbs), BMI, and whether the data was logged manually or automatically. The set also included a "Fat" column, but it was populated in only 2 cells.
The Cleaning Process
For this project I used Microsoft Excel and SQL for data cleaning. I started the cleaning process by checking all of my datasets for the same issues: blank spaces, duplicates, and inconsistencies. The following is my changelog for the cleaning process in Excel:
Shared Changes Across All Tables
- Removed blank spaces using conditional formatting
- Verified User Id column entries were uniform (10 characters) in length using LEN function (i.e. =LEN(A2))
- Added underscores between words in column names
- Added column “Day” using date function ( i.e. =TEXT(B2, "dddd"))
- Changed “DateTime” columns into two separate columns, “Date” and “Time” using INT function (i.e. =INT(A2), =A2 - INT(A2))
Changes to "Daily Activity Merged"

- Changed column name “activitydate” to “Date”
- Changed column name “totalsteps” to “steps”
- Removed "Tracker Distance", "Logged_Activities_Distance", "Very_Active_Distance", "Moderately_Active_Distance", "Light_Active_Distance", and "Sedentary_Active_Distance" columns

Changes to "Sleep Day Merged"

- Changed column name “sleepday” to “Date”
- Subtracted "Time Asleep" from column "Total Time In Bed" and created new column "Time Awake" from the results
- Removed column "Total Sleep Records"

Changes to "Weight Log Info Merged"

- Changed column name “Is Manual Report” to “Report_Type”
- Changed “Report_Type” responses from True/False to Manual/Automatic respectively
- Removed column “Fat”
- Removed column “LogId”
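A few of the steps above can be sketched in pandas instead of Excel. The rows below are invented stand-ins for the Fitbit "Sleep Day" export, not real values from the dataset:

```python
import pandas as pd

# Hypothetical rows mirroring the "Sleep Day Merged" export (values illustrative).
sleep = pd.DataFrame({
    "Id": [1503960366, 1644430081],
    "SleepDay": pd.to_datetime(["2016-04-12 00:00:00", "2016-04-13 00:00:00"]),
    "Total_Minutes_Asleep": [327, 340],
    "Total_Time_In_Bed": [346, 367],
})

# Verify Id values are uniformly 10 characters (the LEN check from Excel).
assert sleep["Id"].astype(str).str.len().eq(10).all()

# Split the datetime into Date / Day columns and derive Time_Awake.
sleep["Date"] = sleep["SleepDay"].dt.date
sleep["Day"] = sleep["SleepDay"].dt.day_name()
sleep["Time_Awake"] = sleep["Total_Time_In_Bed"] - sleep["Total_Minutes_Asleep"]

print(sleep[["Day", "Time_Awake"]])
```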
Data Manipulation and Analysis
I then uploaded my 4 tables into BigQuery SQL Console to begin my data manipulation. Each phase of manipulation was guided by a question in search of a trend.
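As a sketch of one such question-driven query ("at what time of day are users most active?"), here is a pandas equivalent on invented stand-in rows; the actual analysis ran as SQL in BigQuery, and these Ids and step counts are illustrative:

```python
import pandas as pd

# Illustrative stand-in for the "Hourly Steps Merged" table.
hourly = pd.DataFrame({
    "Id":    [1, 1, 2, 2, 1, 2],
    "Time":  ["08:00", "18:00", "08:00", "18:00", "12:00", "12:00"],
    "Steps": [500, 1200, 700, 900, 300, 450],
})

# Average steps per hour across users, highest first.
avg_by_hour = hourly.groupby("Time")["Steps"].mean().sort_values(ascending=False)
print(avg_by_hour.head(1))  # the most active hour in this sample
```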
Continue to:
- SQL_Queries for all queries
- Data Table Link for access to view all tables resulting for queries
- Analysis for analysis of data
- Visualizations for data graphs
- Recommendations for my answer to the business task
- Sources Cited for resource credits
8 case studies and real-world examples of how Big Data has helped companies keep on top of the competition
Fast, data-informed decision-making can drive business success. Faced with high customer expectations, marketing challenges, and global competition, many organizations look to data analytics and business intelligence for a competitive advantage.
Serving personalized ads based on browsing history, giving all employees contextual access to KPI data, and centralizing data from across the business into one digital ecosystem so processes can be more thoroughly reviewed are all examples of business intelligence.
Organizations invest in data science because it promises to bring competitive advantages.
Data is transforming into an actionable asset, and new tools are using that reality to move the needle with ML. As a result, organizations are on the brink of mobilizing data to not only predict the future but also to increase the likelihood of certain outcomes through prescriptive analytics.
Here are some case studies that show some ways BI is making a difference for companies around the world:
1) Starbucks:
With 90 million transactions a week in 25,000 stores worldwide, the coffee giant is in many ways on the cutting edge of using big data and artificial intelligence to direct marketing, sales, and business decisions.
Through its popular loyalty card program and mobile application, Starbucks owns individual purchase data from millions of customers. Using this information and BI tools, the company predicts purchases and sends individual offers of what customers will likely prefer via their app and email. This system draws existing customers into its stores more frequently and increases sales volumes.
The same intel that helps Starbucks suggest new products to try also helps the company send personalized offers and discounts that go far beyond a special birthday discount. Additionally, a customized email goes out to any customer who hasn’t visited a Starbucks recently with enticing offers—built from that individual’s purchase history—to re-engage them.
2) Netflix:
The online entertainment company’s 151 million subscribers give it a massive BI advantage.
Netflix has digitized its interactions with its 151 million subscribers. It collects data from each of its users and with the help of data analytics understands the behavior of subscribers and their watching patterns. It then leverages that information to recommend movies and TV shows customized as per the subscriber’s choice and preferences.
As per Netflix, around 80% of the viewer’s activity is triggered by personalized algorithmic recommendations. Where Netflix gains an edge over its peers is that by collecting different data points, it creates detailed profiles of its subscribers which helps them engage with them better.
The recommendation system of Netflix contributes to more than 80% of the content streamed by its subscribers, which has helped Netflix earn a whopping $1 billion via customer retention. For this reason, Netflix doesn’t have to invest too much in advertising and marketing its shows; it already has a precise estimate of how many people will be interested in watching a show.
3) Coca-Cola:
Coca-Cola is the world’s largest beverage company, with over 500 soft drink brands sold in more than 200 countries. Given the size of its operations, Coca-Cola generates a substantial amount of data across its value chain – including sourcing, production, distribution, sales, and customer feedback – which it can leverage to drive successful business decisions.
Coca-Cola has been investing extensively in research and development, especially in AI, to better leverage the mountain of data it collects from customers all around the world. This initiative has helped it better understand consumer trends in terms of price, flavors, packaging, and consumers’ preference for healthier options in certain regions.
With 35 million Twitter followers and a whopping 105 million Facebook fans, Coca-Cola benefits from its social media data. Using AI-powered image-recognition technology, they can track when photographs of its drinks are posted online. This data, paired with the power of BI, gives the company important insights into who is drinking their beverages, where they are and why they mention the brand online. The information helps serve consumers more targeted advertising, which is four times more likely than a regular ad to result in a click.
Coca-Cola is increasingly betting on BI, data analytics, and AI to drive its strategic business decisions. From its innovative Freestyle fountain machine to finding new ways to engage with customers, Coca-Cola is well-equipped to remain at the top of the competition in the future. In a digital world that is increasingly dynamic, with changing customer behavior, Coca-Cola is relying on Big Data to gain and maintain its competitive advantage.
4) American Express GBT
The American Express Global Business Travel company, popularly known as Amex GBT, is an American multinational travel and meetings programs management corporation which operates in over 120 countries and has over 14,000 employees.
Challenges:
Scalability – Creating a single portal for around 945 separate data files from internal and customer systems using the current BI tool would require over 6 months to complete. The earlier tool was used for internal purposes and scaling the solution to such a large population while keeping the costs optimum was a major challenge
Performance – Their existing system had limitations shifting to Cloud. The amount of time and manual effort required was immense
Data Governance – Maintaining user data security and privacy was of utmost importance for Amex GBT
The company was looking to protect and increase its market share by differentiating its core services and was seeking a resource to manage and drive their online travel program capabilities forward. Amex GBT decided to make a strategic investment in creating smart analytics around their booking software.
The solution equipped users to view their travel ROI across three categories: cost, time, and value. Each category has individual KPIs that are measured to evaluate the performance of a travel plan.
Results:

Reduced travel expenses by 30%.
Time to Value – Initially it took a week for new users to be on-boarded onto the platform. With Premier Insights that time had now been reduced to a single day and the process had become much simpler and more effective.
Savings on Spends – The product notifies users of any available booking offers that can help them save on their expenditure. It recommends users of possible saving potential such as flight timings, date of the booking, date of travel, etc.
Adoption – Ease of use of the product, quick scale-up, real-time implementation of reports, and interactive dashboards of Premier Insights increased the global online adoption for Amex GBT
5) Airline Solutions Company: BI Accelerates Business Insights
Airline Solutions provides booking tools, revenue management, web, and mobile itinerary tools, as well as other technology, for airlines, hotels and other companies in the travel industry.
Challenge: The travel industry is remarkably dynamic and fast paced. And the airline solution provider’s clients needed advanced tools that could provide real-time data on customer behavior and actions.
Solution: They developed an enterprise travel data warehouse (ETDW) to hold its enormous amounts of data. Its executive dashboards provide near real-time insights in user-friendly environments, with a 360-degree overview of business health, reservations, operational performance, and ticketing.
Results: The scalable infrastructure, graphic user interface, data aggregation and ability to work collaboratively have led to more revenue and increased client satisfaction.
6) A specialty US Retail Provider: Leveraging prescriptive analytics
Challenge/Objective: A specialty US retail provider wanted to modernize its data platform to help the business make real-time decisions while also leveraging prescriptive analytics. They wanted to discover the true value of the data being generated from their multiple systems and understand the patterns (both known and unknown) of sales, operations, and omni-channel retail performance.
Solution: We helped build a modern data solution that consolidated their data in a data lake and data warehouse, making it easier to extract value in real time. We integrated our solution with their OMS, CRM, Google Analytics, Salesforce, and inventory management system. The data was modeled so that it could be fed into machine learning algorithms and leveraged easily in the future.
Results: The customer had visibility into their data from day 1, which is something they had wanted for some time. In addition, they were able to build more reports, dashboards, and charts to understand and interpret the data. In some cases, they were able to get real-time visibility and analysis of in-store purchases based on geography!
7) Logistics startup with an objective to become the “Uber of the Trucking Sector” with the help of data analytics
Challenge: A startup specializing in analyzing vehicle and driver performance by collecting data from sensors within the vehicle (a.k.a. vehicle telemetry) and order patterns, with the objective of becoming the “Uber of the Trucking Sector.”
Solution: We developed a customized backend of the client’s trucking platform so that they could monetize empty return trips of transporters by creating a marketplace for them. The approach used a combination of AWS Data Lake, AWS microservices, machine learning and analytics.
Results:

- Reduced fuel costs
- Optimized reloads
- More accurate driver/truck schedule planning
- Smarter routing
- Fewer empty return trips
- Deeper analysis of driver patterns, breaks, routes, etc.
8) Challenge/Objective: A niche segment customer competing against market behemoths looking to become a “Niche Segment Leader”
Solution: We developed a customized analytics platform that can ingest CRM, OMS, Ecommerce, and Inventory data and produce real time and batch driven analytics and AI platform. The approach used a combination of AWS microservices, machine learning and analytics.
Results:

- Reduced customer churn
- Optimized order fulfillment
- More accurate demand schedule planning
- Improved product recommendations
- Improved last-mile delivery
How can we help you harness the power of data?
At Systems Plus our BI and analytics specialists help you leverage data to understand trends and derive insights by streamlining the searching, merging, and querying of data. From improving your CX and employee performance to predicting new revenue streams, our BI and analytics expertise helps you make data-driven decisions for saving costs and taking your growth to the next level.
Inflow: eCommerce Marketing Agency
GA4 Case Study: Tracking Data for eCommerce & Non-eCommerce Sites
Over the last year, Inflow’s digital analytics team has been working hard to migrate our clients to Google Analytics 4 in preparation for the sunsetting of Universal Analytics.
To date, we’ve successfully configured the setup for more than 60 websites, both eCommerce and non-eCommerce. By replicating (or improving upon!) their existing data tracking in UA as closely as possible, we’ve provided our clients valuable historical data within GA4, giving them a leg up when the transition officially occurred on July 1, 2023.
In today’s blog, we’ll explore the work we did for two such clients — KEH Camera and Worldwide Business Research — including the unique challenges and solutions we discovered along the way.
Keep reading for the full details, or contact our team to have them audit (and recommend improvements for) your GA4 configuration today.
The eCommerce Site: KEH Camera
KEH is a reCommerce business that resells professional, collectible, and everyday camera gear. They’ve been an Inflow client since 2019 for a variety of services, including paid social, search engine optimization, and more.
The Challenge
Unlike traditional eCommerce brands, KEH has two sides to their business: Shop (for customers buying products) and Sell (for customers selling their products to the brand).
While KEH was able to successfully track both of these audiences separately through Enhanced eCommerce in Universal Analytics, that functionality no longer exists in the new version of Google Analytics — forcing the brand to get creative with their new configuration and attribution, especially when it came to existing custom events used to track “Sell” conversions in UA.
In short, KEH needed a new data-collection solution in GA4 that would segment out purchases from both Shop and Sell (as well as the user data for each audience) to better inform their digital marketing strategy.
Before our partnership, the KEH team had used an Enhanced eCommerce converter to replicate their UA data layer for GA4. While it mostly worked, it wasn’t as clean of an installation as our team could provide and would have eventually needed to be revisited when the complete transition to GA4 was made.
The Solution
Using our Google Analytics 4 setup process as a foundation, we took the concept of enhanced eCommerce forward into GA4-style events, values, and more with a custom configuration for KEH.
We started by using GA4’s eCommerce setup to track both Shop and Sell activity from the website. With customized purchase and eCommerce events, we were able to pass in where each transaction was coming from (Shop or Sell) to not only track website actions but also user-level actions (with a similar custom setup on the user side of the analytics platform).
Combined, these configurations would give KEH plentiful options to segment their data, either at the event or user level. In turn, they could better understand their customer journey — where different audiences were browsing on their site, where purchases were coming from, and more.
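One plausible way to tag each transaction with its business side is a custom event parameter. The sketch below builds a GA4 Measurement Protocol-style `purchase` payload with an illustrative `business_unit` parameter; the parameter name and values are assumptions for illustration, not KEH's actual schema:

```python
import json

def purchase_event(transaction_id, value, business_unit):
    """Build a GA4 Measurement Protocol-style payload; `business_unit`
    is an illustrative custom parameter, not KEH's actual schema."""
    return {
        "client_id": "555.123",  # placeholder client id
        "events": [{
            "name": "purchase",
            "params": {
                "transaction_id": transaction_id,
                "currency": "USD",
                "value": value,
                "business_unit": business_unit,  # e.g. "shop" or "sell"
            },
        }],
    }

payload = purchase_event("T-1001", 249.99, "sell")
print(json.dumps(payload, indent=2))
# Sending would be an HTTP POST to
# https://www.google-analytics.com/mp/collect?measurement_id=...&api_secret=...
```

In practice the same custom parameter would be pushed from the site's data layer via Google Tag Manager rather than posted server-side, but the segmentation idea is identical.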
To push our tracking live onto KEH’s site, however, we needed one more step: a custom data layer.
While many eCommerce platforms have plugins that assist with GTM data layers, few can handle the complexity of a site like KEH — or the ability to parallel-track UA and GA4, as we’re recommending for our clients until next year’s deadline.
So, we worked with KEH’s web team to create and implement a custom data layer that would set their GA4 tracking into motion.
Get our free eCommerce data layer in our GA4 tracking toolkit today.
The Results
Even with its Shop and Sell complexities, KEH’s GA4 tracking is performing as expected by our team.
In our reporting dashboards, eCommerce purchases compare closely across UA and GA4, with users sitting at a typical 10% discrepancy due to the difference in the platforms’ configuration (user- vs. event-based). These results are common across all of our GA4 clients, eCommerce and non-eCommerce.
Fortunately, KEH had already included BigQuery in their overarching web analytics strategy, making the data warehousing required by GA4 much simpler to implement.
As a reminder, Google Analytics 4 only stores 14 months of historical data within its platform. For your site to have access to more historical data, you’ll need an integration with BigQuery — which will store your site’s data and allow you to compare longer periods in applications like Google Data Studio.
BigQuery is also technically the most “accurate” source of GA4 data.
Although Google Signals data does not come through to BigQuery, we’ve been successfully using the integration so far for KEH’s needs.
The non-eCommerce Site: Worldwide Business Research
Worldwide Business Research is a company that plans and hosts more than 100 annual worldwide conferences (both in-person and virtual). They also execute the marketing needed for those events, including email marketing, digital advertising, and more.
As a partner to our current client IQPC , WBR reached out to Inflow for GA4 migration services earlier this year.
Note: To avoid confusion with GA4 “events,” we’ve capitalized Event in reference to WBR’s conferences in the Google Analytics case study below.
As a non-eCommerce site, WBR needed to track data across three global offices and hundreds of subdomains.
In Universal Analytics, WBR had relied heavily on views for each of their Events/subdomains. However, with views no longer existing in GA4, WBR needed a solution to get Event-level data from each subdomain.
Their goal: Streamline an entire office’s tracking while keeping the ability to segment out data by Event/conference.
In case that wasn’t enough, the company also needed to change their Google Ads tracking to meet GA4’s capabilities. (Previously, they had imported conversions from UA views, which, as mentioned, no longer exist in Google Analytics 4.)
In short, WBR needed a completely custom architecture recommendation for their Google Analytics 4 configuration and tracking.
To consolidate WBR’s data-tracking and reporting options, we recommended setting up one GA4 property per office, with different segments for the Events/conferences hosted by each location passed into GA4 from Google Tag Manager (GTM).
In other words, individual Event data could be viewed by applying segments (comparisons, filters, audiences, etc.) to their reports or through filtered Data Studio dashboards. Any unsegmented reports would be a comprehensive report of all the office’s Events.
That way, WBR could more clearly distinguish the KPIs for each Event they hosted across the globe with much less effort than before.
Using our confidence in and knowledge of GA4 capabilities, combined with custom event setup to track Google Ads, we implemented WBR’s new architecture smoothly — giving the brand deeper insights into its Event performance without the multi-property headache of the past.
An added bonus: By setting up one property per office, we avoided the need to set up BigQuery and Google Ads tracking for every single Event as done in UA.
Like most clients, WBR continues to report most of their data out of Universal Analytics. But, by completing this setup far before next year’s deadline, we’ve given WBR’s marketing team more flexibility in not only learning their new GA4 setup but also how to best report out of it for their future marketing needs.
In addition to the configuration described above, we also created a custom Data Studio template for the “segments” of each Event — avoiding any need for WBR’s team to dig around in GA4 (and get more confused than before) while giving them every tool needed to evaluate each Event’s performance and make appropriate business decisions.
Still Need to Set Up Your GA4?
When it comes to the new Google Analytics 4, the clock is ticking.
To get as much historical data as possible for future comparison, now is the time to start configuring your analytics data tracking in the platform.
If you need help making it happen — or would like an expert to evaluate your current setup — Inflow is always here to help.
Request a free GA4 migration proposal now to learn how we can help get your site set for future data-tracking success.
About The Author
Mike Belasco
Mike Belasco has been an entrepreneur and digital marketer since 2003. Mike founded Inflow (previously known as seOverflow) in 2007 and led Inflow to five Denver’s Fastest-Growing Private Company awards and three Inc. 5000 awards. In 2009, he also founded ConversionIQ, which was subsequently acquired by Inflow. After 20 years of serving as Inflow’s Founding CEO, in 2023 Mike completed a sale of Inflow. He now takes on entrepreneurial adventures and continues to be a raving fan of the Inflow team while consulting as a Strategic Advisor.
Data Monetization and Consumer Tracking
In late September 2014, Facebook, the world’s largest social network, announced the launch of Atlas, an ad serving and measurement platform that would allow advertisers to target ads to real people using Facebook’s unique user IDs rather than information based on often inaccurate and unreliable cookies. The concept, called people-based marketing, would provide marketers with more accurate demographic, reach, and frequency information across the Internet and in-app, while preserving privacy by anonymizing users. At the time, the launch of Atlas was heralded as one of the most dramatic steps toward solving for cross-device reporting and cross-channel (particularly online and offline) issues (see Exhibit 1 for a description of cross-device and cross-channel components). This case serves to explain consumer tracking within the context of data monetization.
Learning Objective
Students are provided with a primer on consumer tracking as the foundation for further discussion on monetization.
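The core idea of people-based marketing, matching ad impressions to a stable, anonymized person identifier rather than a per-device cookie, can be sketched in a few lines. A salted hash of a logged-in account ID yields the same token on every device, so one person counts once for reach. This is an illustrative sketch only; the salt, account IDs, and data layout below are hypothetical and not Facebook's actual implementation.

```python
import hashlib

def anonymize_user_id(account_id: str, salt: str) -> str:
    """Derive a stable, anonymized token from a logged-in account ID.

    The same account seen on phone and desktop hashes to the same token,
    enabling cross-device reach/frequency counts without exposing raw IDs.
    """
    return hashlib.sha256((salt + account_id).encode("utf-8")).hexdigest()

# Ad impressions logged on two devices for the same (hypothetical) account
impressions = [
    ("user-42", "phone"),
    ("user-42", "desktop"),
    ("user-99", "desktop"),
]

SALT = "rotate-me-periodically"  # hypothetical salt value
frequency: dict[str, int] = {}
for account_id, _device in impressions:
    token = anonymize_user_id(account_id, SALT)
    frequency[token] = frequency.get(token, 0) + 1

# Reach = unique people; frequency = impressions per person
print("reach:", len(frequency))                   # 2 unique people
print("max frequency:", max(frequency.values()))  # 2 impressions for one person
```

Cookie-based counting would have reported three distinct "users" here; the person-level token collapses the two devices into one.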
Major U.S. Airline
A Case Study
Applications: Fast data tracking for flights, passengers, and messages
Server configuration: Multiple server farms in several locations run ScaleOut StateServer® Pro on more than 100 physical and virtual servers to manage data for hundreds of connected web and application servers.
Reason for Deployment: Needed low latency data storage and scalable data access for mission-critical applications serving hundreds of thousands of global passengers, flights, and operations functions.
Results: Removed data access bottlenecks to provide real-time data when it is needed for improved customer satisfaction and operational efficiency, developed a trusted technology partnership, and lowered cost by replacing another technology solution with ScaleOut Software.
A major U.S. airline has been a ScaleOut Software customer for more than a decade, using ScaleOut StateServer Pro with its in-memory computing and caching technologies to help manage the airline’s critical global flight tracking, passenger, baggage, and operations data with fast and highly available access to all stored data at any time.
Over the years, ScaleOut has become a trusted technology partner for the airline. It has helped to build a customized data access layer on top of ScaleOut StateServer Pro to meet the airline’s specific data access and management needs, to deliver high-value solutions with substantial cost efficiencies, and most recently to navigate additional complications spurred on by COVID-19’s impacts on the travel industry.
At the airline, ScaleOut Software primarily supports a team that is responsible for sourcing and persisting real-time data into enterprise data stores and for delivering events to application clients for business operations around the globe. Having fast and reliable access to the data stored in ScaleOut Software’s in-memory data grid is critical to managing passenger information, flight and baggage tracking, and operations control.
This team functions at the core of the airline’s nervous system and sends server-based information to data centers and systems around the world. Whether for passenger data or flight tracking and positioning information, the airline depends on ScaleOut Software to provide fast access to its data from any location without issues or outages.
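The access pattern an in-memory data grid provides can be illustrated with a toy single-process cache. This is not ScaleOut's actual API; a real grid partitions and replicates entries across many servers, while this sketch only shows the put/get-with-expiry pattern the airline relies on for fast, always-fresh reads.

```python
import time
from typing import Any, Optional

class InMemoryCache:
    """Minimal single-process sketch of an in-memory key-value cache.

    Illustrates store/read/expire semantics only; a production data grid
    adds partitioning, replication, and cross-server availability.
    """

    def __init__(self) -> None:
        # key -> (value, absolute expiry time or None for no expiry)
        self._store: dict[str, tuple[Any, Optional[float]]] = {}

    def put(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        expires = time.monotonic() + ttl_seconds if ttl_seconds is not None else None
        self._store[key] = (value, expires)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and time.monotonic() > expires:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

cache = InMemoryCache()
# Hypothetical flight-status entry; stale data expires after 60 seconds
cache.put("flight:UA123", {"status": "boarding", "gate": "C7"}, ttl_seconds=60)
print(cache.get("flight:UA123"))  # fresh entry is returned
print(cache.get("flight:XX999"))  # unknown key returns None
```

The time-to-live keeps readers from acting on stale operational data: once an entry expires, callers fall through to the authoritative store and re-populate the cache.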
“We really needed a fast system to store our data, keep it persisted, and make sure it doesn’t get lost or damaged.” – Senior Software Developer
Since deploying ScaleOut StateServer Pro, the airline has increased the performance of its applications and gained a reliable tool for caching critical .NET objects and their associations across locations that meets the need to process hundreds of thousands of data points each day.
“The product has been awesome, stable. And that’s exactly what we want, you know? We don’t have to do anything to it. So, it’s perfect. It just runs and runs.” – Senior Software Developer
Due to ScaleOut’s reliability and ease of use, the airline has consolidated from three caching technologies to two and standardized on ScaleOut StateServer Pro. In addition to improving software engineering workflows, this change provides additional business and cost-saving efficiencies.
“When looking at competing technologies, ScaleOut Software is a much better value and it delivers consistently. They have also been a really great partner to us, taking our feedback seriously and helping to keep our data operations running smoothly.” – Resource Development Manager
This major U.S. airline values having a true partnership with ScaleOut Software and its development, sales, and leadership teams. Whether providing additional coding support to complete a software transformation project, customizing its product for the airline’s needs, or offering flexibility to mitigate COVID-19-driven challenges, ScaleOut Software has been on call to help.
Author: Kayley King
Case Study: Tracking and Tracing Drugs in the Pharmaceutical Supply Chain
Failures or lack of visibility in the many-tiered pharmaceutical supply chain have multiple repercussions. Drug shortages have adverse economic and clinical effects on patients — they are more likely to face increased out-of-pocket costs, higher rates of drug errors, and, yes, higher mortality. Hospitals and health systems allocate over 8.6 million additional labor hours to […]
The US had about 150 to 300 drug shortages every quarter from 2014 to 2019.
For drug managers, maintaining excess inventory to try to avoid shortages brings significant costs in storing pharmaceuticals — and waste when they are not used. They also struggle to predict where a particular drug is likely to be needed at a particular time.
The average pharma company holds 180 days of finished-goods inventory and could free up $25 billion if it reduced that to a target of 80 to 100 days. With increased competition from generics and rival brands, cutting supply chain costs lets companies redirect money to competitive ends such as funding product development.
Compliance is another issue. The serialization requirements of the FDA’s Drug Supply Chain Security Act oblige manufacturers, re-packagers, wholesale distributors, and pharmacies to be capable of lot-level product tracing and to provide the applicable transaction information, history, and statements.
To protect patients and prevent falsified medicines from entering the supply chain, the EU’s Falsified Medicines Directive was passed to increase the security of the manufacturing and delivery of medicines across Europe. The main focus is on counterfeit and falsified drugs that can be ineffective or even dangerous.
By 2023 in the US, lot-level tracing will move to unit-level serialization. Russia’s serialization rules give pharma companies until this year to achieve complete unit- and batch-level traceability. Brazil’s track and trace regulations go into effect in May 2022. In South Korea and India, companies must uniquely serialize drug products. Saudi Arabia’s Vision 2030 plan includes adopting technology for tracking all human registered drugs manufactured in Saudi Arabia and those imported from abroad. China has published regulations providing for the development of a new national drug traceability system by 2022.
Regulations that require manufacturers to add serial numbers to medications give them more data than before, providing visibility into the status of drugs wherever they are in the supply chain. But getting this right requires that partners across the supply chain participate in the tracking.
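The difference between lot-level tracing and unit-level serialization can be shown with a toy model: every saleable unit carries its own unique serial alongside the shared product code and lot number, and chain-of-custody events accumulate against that serial. The GTIN value, actor names, and event vocabulary below are made up for illustration and do not follow any specific regulation's data format.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class SerializedUnit:
    """One saleable drug unit and its chain-of-custody history."""
    gtin: str                      # product-level code (hypothetical GTIN)
    lot: str                       # shared by every unit in the batch
    serial: str = field(default_factory=lambda: uuid.uuid4().hex[:12])
    history: list = field(default_factory=list)

    def record_event(self, actor: str, action: str) -> None:
        # Append a custody event; a real system would also capture
        # timestamps and the transaction documents regulators require.
        self.history.append({"actor": actor, "action": action})

# Serialize a lot at unit level: each unit gets its own serial number
lot_units = [SerializedUnit(gtin="00312345678906", lot="A2024") for _ in range(3)]
print("unique serials:", len({u.serial for u in lot_units}))  # 3

unit = lot_units[0]
unit.record_event("manufacturer", "commissioned")
unit.record_event("wholesaler", "received")
print(unit.serial, [e["action"] for e in unit.history])
```

With only lot-level tracing, all three units above would be indistinguishable; unit-level serials are what let a recall or a counterfeit check target one specific package.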
Track and Trace in Action
Global pharmaceutical company Merck KGaA Healthcare is working on this issue. It maintains about 150 days of drug inventory, which is expensive to keep in-house and particularly wasteful when it comes to personalized drug therapies with short shelf lives. Its supply-and-demand forecasts are 85 percent accurate today.
One class of drugs it manufactures is immuno-oncology therapies, and it was looking for a way to improve forecasting for these potentially life-changing and life-saving drugs. These are personalized medications, expensive and valuable, so it is imperative to ensure the drugs make it to the right place at the right time. It all starts with drawing blood from the patient and sending it to the lab, where a therapy is formulated based on the patient’s DNA. The drug created from this must travel along the supply chain in temperature-controlled environments, and it must reach the patient within a specified time frame for treatment.
Merck KGaA Healthcare has piloted a project with TraceLink, using the vendor’s Digital Network Platform to improve supply-and-demand forecasting and reduce shortages of critical immuno-oncology drugs. Serialization on its own is still fairly new, and even as it matures, TraceLink’s platform focuses on further enhancing the supply chain process. Not only does it generate serial numbers, but it also provides a centralized hub where third-party participants in drug companies’ supply chains can share relevant information, such as manifests and product master data, with each other. Bringing everyone together on the same platform is a more efficient way of trading this information than having each drug company create point-to-point connections from its internal systems to the systems of every company it needs to share data with.
Contract drug manufacturers, which in many cases make generics for multiple drug companies, use the platform, as do big brand-name drug companies and smaller ones that are sometimes the creators of blockbuster drugs. It is hard for all of them to track those many relationships.
“With the network, you integrate once and interoperate all down the supply chain to increase visibility and lower the bar to sharing data,” said John Hogan, TraceLink Senior Vice President of Engineering. “The network changes the process of integrating between individual supply chain and inventory systems, which is difficult for pharmaceutical companies and is not necessarily their strong point.”
The company has defined canonical data formats for internal management; it maps data from pharma supply chain partners (logistics companies, dispensers, and wholesalers) into that format and then back out into the format another member of the network might need.
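The canonical-format idea can be sketched as two mapping steps: translate each partner's field names into one shared schema on the way in, and back out into another partner's names on the way out, so the hub maintains one mapping per partner instead of one per partner pair. The partner names and field mappings below are hypothetical, not TraceLink's actual schema.

```python
# Partner systems label the same shipment fields differently; a canonical
# schema lets the hub translate once per partner instead of once per pair.
CANONICAL_FIELDS = ("product_code", "lot_number", "quantity")

# Per-partner field mappings (partner field -> canonical field); names are hypothetical
INBOUND_MAPS = {
    "logistics_co": {"sku": "product_code", "batch": "lot_number", "qty": "quantity"},
    "wholesaler":   {"ndc": "product_code", "lot": "lot_number", "units": "quantity"},
}

def to_canonical(partner: str, record: dict) -> dict:
    """Translate a partner-specific record into the canonical schema."""
    mapping = INBOUND_MAPS[partner]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

def from_canonical(partner: str, record: dict) -> dict:
    """Translate a canonical record into a partner's own field names."""
    reverse = {v: k for k, v in INBOUND_MAPS[partner].items()}
    return {reverse[k]: v for k, v in record.items()}

# A logistics record enters the hub and is re-emitted in the wholesaler's format
inbound = {"sku": "0031-2345", "batch": "A2024", "qty": 120}
canonical = to_canonical("logistics_co", inbound)
outbound = from_canonical("wholesaler", canonical)
print(outbound)  # {'ndc': '0031-2345', 'lot': 'A2024', 'units': 120}
```

With N partners, this hub-and-spoke design needs N mappings; direct point-to-point integration would need on the order of N² translations, which is the efficiency argument the article makes.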
Compliance was the first problem TraceLink tackled, but it realized the data holds huge value for many other purposes. “If you know how particular products traveled along the supply chain, you have unique insight that can be used for dealing with recalls or obstacles in the supply chain,” he said. “You can make things more efficient and avoid having those problems repeat themselves.”
It is also providing APIs that other businesses can use when they see other use cases for the Digital Network Platform that leverage its core construction — for instance, to create new user experiences or to provide a different preferred view into the same information using built-in machine learning algorithms. “In the future, you can imagine use cases where people involved in clinical trials might want to let their information be shared to prove the efficacy of those trials,” Hogan said.
Over the next five years, the pharma track and trace solutions market is expected to surpass $2.38 billion. Other vendors in the pharma track and trace space include rfxcel, Adents, Acsis, Frequentz, Optel Group, Arvato Systems, E2open, Retail Solutions, UpNet, iControl, and Nulogy.