Customer Lifetime Value Product Recommendation for Retail
Introduction
On any given day, countless transactions are made in the retail space. All of these transactions generate data, which merchants can use to improve their sales and make important business decisions. As part of our capstone, we consulted for two retail clients, exploring and identifying trends in their customer behavior by building visualizations as well as predictive models. We have split the blog into two parts, one for each client and dataset, covering our exploration and modeling for each.
Part 1. Predictive Customer Lifetime Value and Product Recommendation for Retail
1. Exploratory Data Analysis (EDA)
Upon receiving the data for the first client, we realized that the product items listed were in a semi-structured format. That is, some of the item names were in a “product name – color” format, but many items did not follow it, which made it difficult to separate the product from the color. To simplify things, we decomposed all the item names into individual words in a corpus, thereby allowing us to see the top generic items and colors sold by analyzing word count frequencies.
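A minimal sketch of that decomposition, using Python's `Counter` (the item names below are invented for illustration):

```python
from collections import Counter

# Hypothetical item names in the semi-structured "product name - color" format.
item_names = [
    "Teardrop Necklace - Gold",
    "Hoop Earring - Silver",
    "Chain Necklace - Gold",
    "Stud Earring",               # no color suffix
]

# Decompose every item name into lowercase words and count frequencies.
words = [w.lower().strip("-") for name in item_names for w in name.split()]
word_counts = Counter(w for w in words if w)  # drop the empty tokens from "-"

print(word_counts.most_common(3))
```

Ranking `word_counts` by frequency then surfaces the top generic items and colors even when the "product – color" format is inconsistent.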
We wanted to analyze sales and item trends by year, so the frequencies were standardized to allow a meaningful comparison across years. As the graphs below show, necklaces and earrings are hot sellers, and gold-colored jewelry is in demand.
For retailers, November tends to be a month of high sales volume due to the holiday season and Black Friday deals. We wanted to see if there were any specific trends within November that could help the business determine the best time to increase its advertising budget and promotional efforts. Indeed, we found that the first week of November had the weakest sales volume in every year. We therefore advised the company to consider spending more marketing dollars during that first week as an early holiday special promotion.
2. Modeling
2.1 RFM Segmentation, Analysis and Model
Since we did not have a target variable to predict, we had to get creative during our modeling phase. After doing some research, we decided to first do an RFM analysis (recency, frequency, monetary). The goal of RFM analysis is to utilize data regarding the recency (how recently a customer has purchased), frequency (the number of repeat purchases of a customer), and the monetary value of the orders to determine how valuable a customer is, as well as how many times a customer will return over the course of the next x time periods.
In our case, we were specifically interested in the CLV (customer lifetime value) and the number of times a customer will return. We used these results to perform a customer segmentation by creating the additional variable “Target_Group.” Let’s begin with data preparation.
In order to perform RFM analysis on our data, we had to transform it. Luckily, the “lifetimes” package in Python provides a function to do so. After having transformed our data, our data frame looked like this:
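The “lifetimes” helper for this is `summary_data_from_transaction_data`; as a plain-Python sketch of what that transformation computes (toy transactions, time measured in days), it looks roughly like this:

```python
from datetime import date
from collections import defaultdict

# Toy transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2016, 1, 5), 20.0),
    ("C1", date(2016, 3, 9), 35.0),
    ("C1", date(2016, 6, 1), 15.0),
    ("C2", date(2016, 2, 14), 50.0),
]
observation_end = date(2016, 7, 1)

def rfm_summary(transactions, observation_end):
    """Mimic the frequency/recency/T/monetary_value summary that
    lifetimes' summary_data_from_transaction_data produces (here in days)."""
    by_customer = defaultdict(list)
    for cid, day, amount in transactions:
        by_customer[cid].append((day, amount))
    summary = {}
    for cid, rows in by_customer.items():
        rows.sort()
        first, last = rows[0][0], rows[-1][0]
        repeats = rows[1:]  # everything after the first purchase
        summary[cid] = {
            "frequency": len(repeats),            # number of repeat purchases
            "recency": (last - first).days,       # age at last purchase
            "T": (observation_end - first).days,  # customer age
            "monetary_value": (sum(a for _, a in repeats) / len(repeats)
                               if repeats else 0.0),
        }
    return summary

summary = rfm_summary(transactions, observation_end)
```

The customer IDs, dates, and amounts are invented; the point is only the shape of the output frame described next.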
The ‘T’ column in this data frame simply represents the age of each customer. Equipped with our prepared data, we first investigated the specific characteristics of our clients’ best customers with regards to frequency and recency. We decided to plot our result using a heatmap that turns more yellow if a customer is more likely to return within the next period of time (as part of the data preparation, you have to specify a unit of time you want to base your analysis on; we set this parameter to ‘M’ for months).
This heatmap shows that the customers most likely to return have a historical frequency of around 16, meaning they’ve come back to purchase 16 more times, and a recency a little over 30, meaning that these customers had an age of a little over 30 when they last purchased.
Before starting to segment our clients’ customers, we wanted to make sure the model we were using was making accurate predictions. As in other machine learning approaches to prevent overfitting, we divided our data into a calibration set and a holdout set. Then, we fit the model on the calibration set and made predictions on the holdout set that the model had not yet seen.
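The split itself is just a cutoff date (“lifetimes” provides `calibration_and_holdout_data` for this); a hand-rolled sketch on toy data:

```python
from datetime import date, timedelta

# Toy transaction log: (customer_id, purchase_date).
transactions = [
    ("C1", date(2016, 1, 5)),
    ("C1", date(2016, 8, 20)),
    ("C2", date(2016, 2, 14)),
    ("C2", date(2016, 11, 3)),
]

observation_end = date(2016, 12, 31)
cutoff = observation_end - timedelta(days=183)  # keep ~6 months for holdout

# Fit on the calibration window; validate on the unseen holdout window.
calibration = [t for t in transactions if t[1] <= cutoff]
holdout = [t for t in transactions if t[1] > cutoff]
```

The dates are invented; in practice the cutoff mirrors the six-month holdout we used.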
Since we only had data for a relatively short period of time, we used the last 6 months of our data to test our model and got the below result. As we can see, despite our model not fitting the actual purchases perfectly, it was able to capture trends and significant turning points over the course of six months:
While this information is valuable, it does not satisfy our goal for insights yet. We wanted to produce actionable insights that could be implemented immediately to create business value. To do so, we decided to take a look at the number of times a customer is predicted to return within the next month, which can be interpreted as the probability of the customer returning in the next month.
Based on these insights, customers most likely to return can be targeted specifically with ad/marketing campaigns. By doing so, the amount of money spent on marketing can be reduced, and the return on these expenses can be increased.
We also included a more generalized version of this technique in our final product that, after specifying a time range and selecting a customer, returned the number of expected repeat purchases by this specific customer.
The other aspect of RFM analysis that interested us as a basis for our clustering was the CLV. Calculating the CLV with the “lifetimes” package is straightforward once the data is prepared the right way. In order to use the DCF (discounted cash flow) method, we needed to add a column with the monetary value of each customer’s purchases.
Then, it was just a matter of fitting the model and making the computation. We sorted our data in descending order to identify the most valuable customers for our clients. Here is an example of what our result looked like: We also wanted to let our clients know the probability that a particular customer will make repeat purchases.
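As a rough illustration of the DCF idea (not the exact “lifetimes” implementation), each future month's expected revenue is discounted back to today; all inputs below are invented:

```python
def clv_dcf(expected_purchases_per_month, avg_order_value,
            months=12, monthly_discount_rate=0.01):
    """Discounted-cash-flow CLV: sum each month's expected revenue,
    discounted back to the present. Inputs are illustrative assumptions."""
    clv = 0.0
    for month in range(1, months + 1):
        expected_revenue = expected_purchases_per_month * avg_order_value
        clv += expected_revenue / (1 + monthly_discount_rate) ** month
    return clv

# A hypothetical customer expected to buy 0.5 times a month at ~$40 per order.
value = clv_dcf(0.5, 40.0)
```

In the real model, the expected purchases per month come from the fitted frequency/recency model rather than a constant.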
In other words, we needed to understand the probability that the customer is still “alive” or active in the customer lifecycle. The “lifetimes” Python library includes the tools to allow us to do these types of analyses. Take for instance the following customer, who made their initial purchase back in October of 2016 and hasn’t made another purchase for a few months. The likelihood of the customer being a recurring customer drops until they make their second purchase.
For the client, this information can signal when to send out targeted promotions: whenever the customer’s aliveness probability drops below a certain threshold. In the life cycle plot below, the dashed lines represent purchases and the solid line describes the probability of this customer being alive at each date.
From a business perspective, it may also be helpful to segment customers based on their buying patterns. There are many ways to do this; the route we took was to use the CLV values, mapping them to their corresponding percentiles and finally binning them into “Low Priority,” “Bronze,” “Silver,” and “Gold” tiers. Through this approach, Company A can easily see who their most important customers are and also create new strategies to bump lower-tier customers up to the higher tiers.
One such approach that we recommended was a tier-specific rewards program, with periodic progress emails letting each customer know how close they are to reaching the next tier, in an attempt to encourage higher purchasing volume.
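The percentile binning behind the tiers can be sketched like this (the CLV values are illustrative, and the 25th/50th/75th-percentile cut points are our assumption):

```python
def tier(clv, all_clvs):
    """Map a CLV to its percentile within the customer base, then bin it."""
    rank = sum(c <= clv for c in all_clvs) / len(all_clvs)
    if rank <= 0.25:
        return "Low Priority"
    elif rank <= 0.50:
        return "Bronze"
    elif rank <= 0.75:
        return "Silver"
    return "Gold"

clvs = [12.0, 45.0, 80.0, 150.0, 300.0, 20.0, 60.0, 500.0]
segments = {c: tier(c, clvs) for c in clvs}
```

With real data, `pandas.qcut` does the same quartile binning in one call.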
2.2 Association Rules and the Apriori Algorithm
Another model we built was a product recommendation system that pushes potentially interesting products to customers. One of the bigger costs in the retail business is unsold inventory sitting in the warehouse. To address this, we designed a recommendation system using association rules, with an added feature: the business can input items it wants to move out of inventory, and the recommendation system will prioritize those items whenever they are associated with items a customer has purchased or intends to purchase.
The system was based on the Apriori algorithm, an association rule learning method for frequent itemset mining over transactional databases. The goal is to find frequently co-occurring item combinations in transactions and derive “rules” that guide which products to recommend.
Each rule has three parameters: support, confidence, and lift. Generally speaking, it is desirable to use rules with high support, since they apply to a large number of transactions and are therefore more interesting and profitable to act on from a business standpoint.
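For concreteness, the three parameters can be computed directly on a toy transaction database: support is the share of transactions containing an itemset, confidence is the support of LHS and RHS together over the support of the LHS, and lift is confidence over the support of the RHS.

```python
# Toy transaction database: each transaction is a set of purchased items.
transactions = [
    {"gold earring", "silver earring", "necklace"},
    {"gold earring", "necklace"},
    {"silver earring", "bracelet"},
    {"gold earring", "silver earring"},
]
n = len(transactions)

def support(itemset):
    # Fraction of transactions that contain every item in the itemset.
    return sum(itemset <= t for t in transactions) / n

def confidence(lhs, rhs):
    # P(RHS in basket | LHS in basket).
    return support(lhs | rhs) / support(lhs)

def lift(lhs, rhs):
    # Confidence relative to how often RHS appears regardless of LHS.
    return confidence(lhs, rhs) / support(rhs)
```

A lift above 1 means the LHS genuinely raises the chance of the RHS appearing, rather than the RHS simply being popular.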
The biggest challenge we encountered in rule mining was due to the unique nature of the client’s customers. Because the customers are businesses rather than individual retail customers, many of the generated rules were trivial. For example, jewelry retailers will buy earrings in bulk across multiple colors to appeal to different customers, whereas an individual retail customer will purchase just one or two colors of any particular item.
The result is that a typical generated rule said that customers who buy “Gold Earring” also buy “Silver Earring,” which is not insightful (we want rules that link different products).
To generate more meaningful rules, we created a new feature, “vendor : category,” which, as the name suggests, stores the vendor and category information of each item. By mining on this feature instead of individual line items, we reduced the computational burden and resolved the problem discussed above, thereby obtaining interesting rules.
The following figure is an example of the new rules created using the R package “arules.” Compared to its peer Python packages, it produces more complete results, and R also offers a robust visualization package for Apriori rules, “arulesViz,” which has no direct Python equivalent. The figure below shows a sample of three rules out of 2,138 in total. Arrows pointing from an item to a rule vertex indicate LHS items, and arrows from a rule to an item indicate the RHS. The size of a rule vertex indicates the support of the rule, and its color indicates lift; larger size and darker color mean higher support and lift, respectively.
Graph-based plot with items and rules as vertices (3 rules)
Grouped Matrix-based plot (2138 rules)
The next step is to push products based on the rules. Here is how our model works: first, once a customer generates an order, each item in the order is transformed into its “vendor : category” form, producing the LHS “vendor : category” list; second, the model looks up the rules library and generates the highest-support RHS “vendor : category” list for that LHS; third, it finds all items under the RHS “vendor : category” list, ranks them by frequency, and recommends the top three.
As mentioned at the beginning of this section, our model has a unique feature that lets the business direct attention to items it wants to offload. This is done by modifying the third step: say the business has a list of items taking up a lot of warehouse inventory; if any of these items appear under the RHS “vendor : category” list, they override the ranking and are pushed to customers first.
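A compact sketch of the three-step lookup with the overstock override (the rule library, catalog, and item names are all hypothetical):

```python
# Hypothetical rules library: LHS "vendor : category" -> (RHS, support).
rules = {
    "VendorA : Earring": ("VendorB : Necklace", 0.12),
    "VendorB : Necklace": ("VendorA : Bracelet", 0.08),
}
# Items per "vendor : category", pre-ranked by sales frequency (descending).
catalog = {
    "VendorB : Necklace": ["chain necklace", "teardrop necklace", "pendant"],
    "VendorA : Bracelet": ["bangle", "cuff"],
}
item_to_vc = {"hoop earring": "VendorA : Earring"}

def recommend(order_items, overstock=(), top_n=3):
    # Step 1: map each ordered item to its "vendor : category".
    lhs_vcs = {item_to_vc[i] for i in order_items if i in item_to_vc}
    recs = []
    for vc in lhs_vcs:
        # Step 2: look up the highest-support RHS for this LHS.
        if vc in rules:
            rhs_vc, _ = rules[vc]
            # Step 3: rank RHS items by frequency...
            ranked = list(catalog.get(rhs_vc, []))
            # ...but let flagged overstock items override the ranking.
            ranked.sort(key=lambda item: item not in overstock)
            recs.extend(ranked)
    return recs[:top_n]
```

Because Python's sort is stable, non-overstock items keep their frequency order while overstock items jump to the front.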
Through the CLV analysis and improved recommendation system, we aimed to help the business better target customer segments and improve its inventory turnover ratio. We learned a great deal working with this business: applying our data science skill sets in a real-world environment is often quite different from the classroom setting. In part 2, we will continue our journey with yet another retail company, hoping to uncover hidden sales patterns through a Shiny dashboard and to forecast future sales with a time series analysis.
About Authors
Jo Wen (Iris) Chen
Esther Chang
Raymond Liang
Nutchaphol Chaivorapongsa
10 Unique Data Science Capstone Project Ideas
A capstone project is a culminating assignment that allows students to demonstrate the skills and knowledge they’ve acquired throughout their degree program. For data science students, it’s a chance to tackle a substantial real-world data problem.
If you’re short on time, here’s a quick answer to your question: Some great data science capstone ideas include analyzing health trends, building a predictive movie recommendation system, optimizing traffic patterns, forecasting cryptocurrency prices, and more.
In this comprehensive guide, we will explore 10 unique capstone project ideas for data science students. We’ll overview potential data sources, analysis methods, and practical applications for each idea.
Whether you want to work with social media datasets, geospatial data, or anything in between, you’re sure to find an interesting capstone topic.
Project Idea #1: Analyzing Health Trends
When it comes to data science capstone projects, analyzing health trends is an intriguing idea that can have a significant impact on public health. By leveraging data from various sources, data scientists can uncover valuable insights that can help improve healthcare outcomes and inform policy decisions.
Data Sources
There are several data sources that can be used to analyze health trends. One of the most common sources is electronic health records (EHRs), which contain a wealth of information about patient demographics, medical history, and treatment outcomes.
Other sources include health surveys, wearable devices, social media, and even environmental data.
Analysis Approaches
When analyzing health trends, data scientists can employ a variety of analysis approaches. Descriptive analysis can provide a snapshot of current health trends, such as the prevalence of certain diseases or the distribution of risk factors.
Predictive analysis can be used to forecast future health outcomes, such as predicting disease outbreaks or identifying individuals at high risk for certain conditions. Machine learning algorithms can be trained to identify patterns and make accurate predictions based on large datasets.
Applications
The applications of analyzing health trends are vast and far-reaching. By understanding patterns and trends in health data, policymakers can make informed decisions about resource allocation and public health initiatives.
Healthcare providers can use these insights to develop personalized treatment plans and interventions. Researchers can uncover new insights into disease progression and identify potential targets for intervention.
Ultimately, analyzing health trends has the potential to improve overall population health and reduce healthcare costs.
Project Idea #2: Movie Recommendation System
When developing a movie recommendation system, there are several data sources that can be used to gather information about movies and user preferences. One popular data source is the MovieLens dataset, which contains a large collection of movie ratings provided by users.
Another source is IMDb, a trusted website that provides comprehensive information about movies, including user ratings and reviews. Additionally, streaming platforms like Netflix and Amazon Prime also provide access to user ratings and viewing history, which can be valuable for building an accurate recommendation system.
There are several analysis approaches that can be employed to build a movie recommendation system. One common approach is collaborative filtering, which uses user ratings and preferences to identify patterns and make recommendations based on similar users’ preferences.
Another approach is content-based filtering, which analyzes the characteristics of movies (such as genre, director, and actors) to recommend similar movies to users. Hybrid approaches that combine both collaborative and content-based filtering techniques are also popular, as they can provide more accurate and diverse recommendations.
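As a minimal illustration of content-based filtering, one can score movies by the Jaccard overlap of their genre tags (titles and tags below are invented):

```python
# Hypothetical movie catalog with genre tags.
movies = {
    "Movie A": {"action", "sci-fi"},
    "Movie B": {"action", "thriller"},
    "Movie C": {"romance", "drama"},
}

def jaccard(a, b):
    # Overlap of two tag sets: |intersection| / |union|.
    return len(a & b) / len(a | b)

def similar_to(title, top_n=2):
    """Rank the other movies by genre overlap with the given title."""
    target = movies[title]
    scores = {
        other: jaccard(target, genres)
        for other, genres in movies.items() if other != title
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

A real system would extend the feature set to directors, actors, and plot keywords, but the ranking logic is the same.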
A movie recommendation system has numerous applications in the entertainment industry. One application is to enhance the user experience on streaming platforms by providing personalized movie recommendations based on individual preferences.
This can help users discover new movies they might enjoy and improve overall satisfaction with the platform. Additionally, movie recommendation systems can be used by movie production companies to analyze user preferences and trends, aiding in the decision-making process for creating new movies.
Finally, movie recommendation systems can also be utilized by movie critics and reviewers to identify movies that are likely to be well-received by audiences.
For more information on movie recommendation systems, you can visit https://www.kaggle.com/rounakbanik/movie-recommender-systems or https://www.researchgate.net/publication/221364567_A_new_movie_recommendation_system_for_large-scale_data .
Project Idea #3: Optimizing Traffic Patterns
When it comes to optimizing traffic patterns, there are several data sources that can be utilized. One of the most prominent sources is real-time traffic data collected from various sources such as GPS devices, traffic cameras, and mobile applications.
This data provides valuable insights into the current traffic conditions, including congestion, accidents, and road closures. Additionally, historical traffic data can also be used to identify recurring patterns and trends in traffic flow.
Other data sources that can be used include weather data, which can help in understanding how weather conditions impact traffic patterns, and social media data, which can provide information about events or incidents that may affect traffic.
Optimizing traffic patterns requires the use of advanced data analysis techniques. One approach is to use machine learning algorithms to predict traffic patterns based on historical and real-time data.
These algorithms can analyze various factors such as time of day, day of the week, weather conditions, and events to predict traffic congestion and suggest alternative routes.
Another approach is to use network analysis to identify bottlenecks and areas of congestion in the road network. By analyzing the flow of traffic and identifying areas where traffic slows down or comes to a halt, transportation authorities can make informed decisions on how to optimize traffic flow.
The optimization of traffic patterns has numerous applications and benefits. One of the main benefits is the reduction of traffic congestion, which can lead to significant time and fuel savings for commuters.
By optimizing traffic patterns, transportation authorities can also improve road safety by reducing the likelihood of accidents caused by congestion.
Additionally, optimizing traffic patterns can have positive environmental impacts by reducing greenhouse gas emissions. By minimizing the time spent idling in traffic, vehicles can operate more efficiently and emit fewer pollutants.
Furthermore, optimizing traffic patterns can have economic benefits by improving the flow of goods and services. Efficient traffic patterns can reduce delivery times and increase productivity for businesses.
Project Idea #4: Forecasting Cryptocurrency Prices
With the growing popularity of cryptocurrencies like Bitcoin and Ethereum, forecasting their prices has become an exciting and challenging task for data scientists. This project idea involves using historical data to predict future price movements and trends in the cryptocurrency market.
When working on this project, data scientists can gather cryptocurrency price data from various sources such as cryptocurrency exchanges, financial websites, or APIs. Websites like CoinMarketCap (https://coinmarketcap.com/) provide comprehensive data on various cryptocurrencies, including historical price data.
Additionally, platforms like CryptoCompare (https://www.cryptocompare.com/) offer real-time and historical data for different cryptocurrencies.
To forecast cryptocurrency prices, data scientists can employ various analysis approaches. Some common techniques include:
- Time Series Analysis: This approach involves analyzing historical price data to identify patterns, trends, and seasonality in cryptocurrency prices. Techniques like moving averages, autoregressive integrated moving average (ARIMA), or exponential smoothing can be used to make predictions.
- Machine Learning: Machine learning algorithms, such as random forests, support vector machines, or neural networks, can be trained on historical cryptocurrency data to predict future price movements. These algorithms can consider multiple variables, such as trading volume, market sentiment, or external factors, to make accurate predictions.
- Sentiment Analysis: This approach involves analyzing social media sentiment and news articles related to cryptocurrencies to gauge market sentiment. By considering the collective sentiment, data scientists can predict how positive or negative sentiment can impact cryptocurrency prices.
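The simplest of the smoothers mentioned above, a trailing moving average, can serve as a baseline forecast (the prices below are invented):

```python
# Illustrative daily closing prices (invented numbers, not real market data).
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 110.0, 108.0]

def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

next_price = moving_average_forecast(prices)
```

ARIMA or exponential smoothing would replace this baseline in a full project, but any fancier model should at least beat it on the holdout period.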
Forecasting cryptocurrency prices can have several practical applications:
- Investment Decision Making: Accurate price forecasts can help investors make informed decisions when buying or selling cryptocurrencies. By considering the predicted price movements, investors can optimize their investment strategies and potentially maximize their returns.
- Trading Strategies: Traders can use price forecasts to develop trading strategies, such as trend following or mean reversion. By leveraging predicted price movements, traders can make profitable trades in the volatile cryptocurrency market.
- Risk Management: Cryptocurrency price forecasts can help individuals and organizations manage their risk exposure. By understanding potential price fluctuations, risk management strategies can be implemented to mitigate losses.
Project Idea #5: Predicting Flight Delays
One interesting and practical data science capstone project idea is to create a model that can predict flight delays. Flight delays can cause a lot of inconvenience for passengers and can have a significant impact on travel plans.
By developing a predictive model, airlines and travelers can be better prepared for potential delays and take appropriate actions.
To create a flight delay prediction model, you would need to gather relevant data from various sources. Some potential data sources include:
- Flight data from airlines or aviation organizations
- Weather data from meteorological agencies
- Historical flight delay data from airports
By combining these different data sources, you can build a comprehensive dataset that captures the factors contributing to flight delays.
Once you have collected the necessary data, you can employ different analysis approaches to predict flight delays. Some common approaches include:
- Machine learning algorithms such as decision trees, random forests, or neural networks
- Time series analysis to identify patterns and trends in flight delay data
- Feature engineering to extract relevant features from the dataset
By applying these analysis techniques, you can develop a model that can accurately predict flight delays based on the available data.
The applications of a flight delay prediction model are numerous. Airlines can use the model to optimize their operations, improve scheduling, and minimize disruptions caused by delays. Travelers can benefit from the model by being alerted in advance about potential delays and making necessary adjustments to their travel plans.
Additionally, airports can use the model to improve resource allocation and manage passenger flow during periods of high delay probability. Overall, a flight delay prediction model can significantly enhance efficiency and customer satisfaction in the aviation industry.
Project Idea #6: Fighting Fake News
With the rise of social media and the easy access to information, the spread of fake news has become a significant concern. Data science can play a crucial role in combating this issue by developing innovative solutions.
Here are some aspects to consider when working on a project that aims to fight fake news.
When it comes to fighting fake news, having reliable data sources is essential. There are several trustworthy platforms that provide access to credible news articles and fact-checking databases. Websites like Snopes and FactCheck.org are good starting points for obtaining accurate information.
Additionally, social media platforms such as Twitter and Facebook can be valuable sources for analyzing the spread of misinformation.
One approach to analyzing fake news is by utilizing natural language processing (NLP) techniques. NLP can help identify patterns and linguistic cues that indicate the presence of misleading information.
Sentiment analysis can also be employed to determine the emotional tone of news articles or social media posts, which can be an indicator of potential bias or misinformation.
Another approach is network analysis, which focuses on understanding how information spreads through social networks. By analyzing the connections between users and the content they share, it becomes possible to identify patterns of misinformation dissemination.
Network analysis can also help in identifying influential sources and detecting coordinated efforts to spread fake news.
The applications of a project aiming to fight fake news are numerous. One possible application is the development of a browser extension or a mobile application that provides users with real-time fact-checking information.
This tool could flag potentially misleading articles or social media posts and provide users with accurate information to help them make informed decisions.
Another application could be the creation of an algorithm that automatically identifies fake news articles and separates them from reliable sources. This algorithm could be integrated into news aggregation platforms to help users distinguish between credible and non-credible information.
Project Idea #7: Analyzing Social Media Sentiment
Social media platforms have become a treasure trove of valuable data for businesses and researchers alike. When analyzing social media sentiment, there are several data sources that can be tapped into. The most popular ones include:
- Twitter: With its vast user base and real-time nature, Twitter is often the go-to platform for sentiment analysis. Researchers can gather tweets containing specific keywords or hashtags to analyze the sentiment of a particular topic.
- Facebook: Facebook offers rich data for sentiment analysis, including posts, comments, and reactions. Analyzing the sentiment of Facebook posts can provide valuable insights into user opinions and preferences.
- Instagram: Instagram’s visual nature makes it an interesting platform for sentiment analysis. By analyzing the comments and captions on Instagram posts, researchers can gain insights into the sentiment associated with different images or topics.
- Reddit: Reddit is a popular platform for discussions on various topics. By analyzing the sentiment of comments and posts on specific subreddits, researchers can gain insights into the sentiment of different communities.
These are just a few examples of the data sources that can be used for analyzing social media sentiment. Depending on the research goals, other platforms such as LinkedIn, YouTube, and TikTok can also be explored.
When it comes to analyzing social media sentiment, there are various approaches that can be employed. Some commonly used analysis techniques include:
- Lexicon-based analysis: This approach involves using predefined sentiment lexicons to assign sentiment scores to words or phrases in social media posts. By aggregating these scores, researchers can determine the overall sentiment of a post or a collection of posts.
- Machine learning: Machine learning algorithms can be trained to classify social media posts into positive, negative, or neutral sentiment categories. These algorithms learn from labeled data and can make predictions on new, unlabeled data.
- Deep learning: Deep learning techniques, such as recurrent neural networks (RNNs) or convolutional neural networks (CNNs), can be used to capture the complex patterns and dependencies in social media data. These models can learn to extract sentiment information from textual or visual content.
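A toy lexicon-based scorer shows the core idea behind the first approach (a real lexicon such as VADER's is far larger; this one is invented):

```python
# A tiny illustrative sentiment lexicon: word -> polarity score.
lexicon = {"great": 2, "good": 1, "bad": -1, "terrible": -2, "love": 2}

def sentiment_score(text):
    """Sum lexicon scores over the post's words; the sign gives the label."""
    words = text.lower().split()
    score = sum(lexicon.get(w.strip(".,!?"), 0) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Machine learning and deep learning approaches replace the fixed lexicon with weights learned from labeled posts, but they produce the same kind of label.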
It is important to note that the choice of analysis approach depends on the specific research objectives, available resources, and the nature of the social media data being analyzed.
Analyzing social media sentiment has a wide range of applications across different industries. Here are a few examples:
- Brand reputation management: By analyzing social media sentiment, businesses can monitor and manage their brand reputation. They can identify potential issues, respond to customer feedback, and take proactive measures to maintain a positive image.
- Market research: Social media sentiment analysis can provide valuable insights into consumer opinions and preferences. Businesses can use this information to understand market trends, identify customer needs, and develop targeted marketing strategies.
- Customer feedback analysis: Social media sentiment analysis can help businesses understand customer satisfaction levels and identify areas for improvement. By analyzing sentiment in customer feedback, companies can make data-driven decisions to enhance their products or services.
- Public opinion analysis: Researchers can analyze social media sentiment to study public opinion on various topics, such as political events, social issues, or product launches. This information can be used to understand public sentiment, predict trends, and inform decision-making.
These are just a few examples of how analyzing social media sentiment can be applied in real-world scenarios. The insights gained from sentiment analysis can help businesses and researchers make informed decisions, improve customer experience, and drive innovation.
Project Idea #8: Improving Online Ad Targeting
Improving online ad targeting involves analyzing various data sources to gain insights into users’ preferences and behaviors. These data sources may include:
- Website analytics: Gathering data from websites to understand user engagement, page views, and click-through rates.
- Demographic data: Utilizing information such as age, gender, location, and income to create targeted ad campaigns.
- Social media data: Extracting data from platforms like Facebook, Twitter, and Instagram to understand users’ interests and online behavior.
- Search engine data: Analyzing search queries and user behavior on search engines to identify intent and preferences.
By combining and analyzing these diverse data sources, data scientists can gain a comprehensive understanding of users and their ad preferences.
To improve online ad targeting, data scientists can employ various analysis approaches:
- Segmentation analysis: Dividing users into distinct groups based on shared characteristics and preferences.
- Collaborative filtering: Recommending ads based on users with similar preferences and behaviors.
- Predictive modeling: Developing algorithms to predict users’ likelihood of engaging with specific ads.
- Machine learning: Utilizing algorithms that can continuously learn from user interactions to optimize ad targeting.
These analysis approaches help data scientists uncover patterns and insights that can enhance the effectiveness of online ad campaigns.
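Of the approaches above, collaborative filtering is perhaps the easiest to sketch: recommend ads a user has not seen, weighted by how similar other users are to them. The user names, ad identifiers, and click counts below are entirely hypothetical.

```python
from math import sqrt

# Toy user-ad click matrix; values are made-up click counts.
clicks = {
    "alice": {"ad_shoes": 3, "ad_travel": 1},
    "bob":   {"ad_shoes": 2, "ad_travel": 1, "ad_books": 4},
    "carol": {"ad_books": 5, "ad_travel": 2},
}

def cosine(u, v):
    """Cosine similarity between two sparse click vectors."""
    common = set(u) & set(v)
    num = sum(u[k] * v[k] for k in common)
    den = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def recommend(user, k=1):
    """Score unseen ads by neighbor similarity times neighbor clicks."""
    scores = {}
    for other, vec in clicks.items():
        if other == user:
            continue
        sim = cosine(clicks[user], vec)
        for ad, count in vec.items():
            if ad not in clicks[user]:
                scores[ad] = scores.get(ad, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("alice"))
```

Production systems replace this brute-force neighbor scan with matrix factorization or approximate nearest-neighbor search, but the scoring idea is the same.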
Improved online ad targeting has numerous applications:
- Increased ad revenue: By delivering more relevant ads to users, advertisers can expect higher click-through rates and conversions.
- Better user experience: Users are more likely to engage with ads that align with their interests, leading to a more positive browsing experience.
- Reduced ad fatigue: By targeting ads more effectively, users are less likely to feel overwhelmed by irrelevant or repetitive advertisements.
- Maximized ad budget: Advertisers can optimize their budget by focusing on the most promising target audiences.
Project Idea #9: Enhancing Customer Segmentation
Enhancing customer segmentation involves gathering relevant data from various sources to gain insights into customer behavior, preferences, and demographics. Some common data sources include:
- Customer transaction data
- Customer surveys and feedback
- Social media data
- Website analytics
- Customer support interactions
By combining data from these sources, businesses can create a comprehensive profile of their customers and identify patterns and trends that will help in improving their segmentation strategies.
There are several analysis approaches that can be used to enhance customer segmentation:
- Clustering: Using clustering algorithms to group customers based on similar characteristics or behaviors.
- Classification: Building predictive models to assign customers to different segments based on their attributes.
- Association Rule Mining: Identifying relationships and patterns in customer data to uncover hidden insights.
- Sentiment Analysis: Analyzing customer feedback and social media data to understand customer sentiment and preferences.
These analysis approaches can be used individually or in combination to enhance customer segmentation and create more targeted marketing strategies.
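One common, simple route into segmentation from transaction data is RFM (Recency, Frequency, Monetary) scoring. The sketch below uses made-up transactions and toy cutoff rules; in a real project the cutoffs would come from quantiles of the data, and the resulting features often feed a clustering algorithm.

```python
from datetime import date

# Hypothetical transaction records: (customer_id, order_date, amount).
transactions = [
    ("c1", date(2024, 1, 5), 120.0),
    ("c1", date(2024, 3, 20), 80.0),
    ("c2", date(2023, 6, 1), 40.0),
    ("c3", date(2024, 3, 28), 300.0),
    ("c3", date(2024, 2, 14), 150.0),
    ("c3", date(2024, 1, 2), 90.0),
]
today = date(2024, 4, 1)

def rfm(txns):
    """Compute Recency (days since last order), Frequency, Monetary."""
    out = {}
    for cust, d, amt in txns:
        r, f, m = out.get(cust, (None, 0, 0.0))
        recency = (today - d).days
        r = recency if r is None else min(r, recency)
        out[cust] = (r, f + 1, m + amt)
    return out

def segment(r, f, m):
    """Toy rules; real cutoffs would be derived from the data."""
    if r <= 30 and f >= 3:
        return "loyal"
    if r > 180:
        return "at_risk"
    return "regular"

for cust, (r, f, m) in sorted(rfm(transactions).items()):
    print(cust, (r, f, m), segment(r, f, m))
```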
Enhancing customer segmentation can have numerous applications across industries:
- Personalized marketing campaigns: By understanding customer preferences and behaviors, businesses can tailor their marketing messages to individual customers, increasing the likelihood of engagement and conversion.
- Product recommendations: By segmenting customers based on their purchase history and preferences, businesses can provide personalized product recommendations, leading to higher customer satisfaction and sales.
- Customer retention: By identifying at-risk customers and understanding their needs, businesses can implement targeted retention strategies to reduce churn and improve customer loyalty.
- Market segmentation: By identifying distinct customer segments, businesses can develop tailored product offerings and marketing strategies for each segment, maximizing the effectiveness of their marketing efforts.
Project Idea #10: Building a Chatbot
A chatbot is a computer program that uses artificial intelligence to simulate human conversation. It can interact with users in a natural language through text or voice. Building a chatbot can be an exciting and challenging data science capstone project.
It requires a combination of natural language processing, machine learning, and programming skills.
When building a chatbot, data sources play a crucial role in training and improving its performance. There are various data sources that can be used:
- Chat logs: Analyzing existing chat logs can help in understanding common user queries, responses, and patterns. This data can be used to train the chatbot on how to respond to different types of questions and scenarios.
- Knowledge bases: Integrating a knowledge base can provide the chatbot with a wide range of information and facts. This can be useful in answering specific questions or providing detailed explanations on certain topics.
- APIs: Utilizing APIs from different platforms can enhance the chatbot’s capabilities. For example, integrating a weather API can allow the chatbot to provide real-time weather information based on user queries.
There are several analysis approaches that can be used to build an efficient and effective chatbot:
- Natural Language Processing (NLP): NLP techniques enable the chatbot to understand and interpret user queries. This involves tasks such as tokenization, part-of-speech tagging, named entity recognition, and sentiment analysis.
- Intent recognition: Identifying the intent behind user queries is crucial for providing accurate responses. Machine learning algorithms can be trained to classify user intents based on the input text.
- Contextual understanding: Chatbots need to understand the context of the conversation to provide relevant and meaningful responses. Techniques such as sequence-to-sequence models or attention mechanisms can be used to capture contextual information.
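Intent recognition is normally learned from labeled utterances, but the core idea can be sketched with a simple keyword-overlap classifier. The intent names and keyword sets below are invented for illustration.

```python
# Toy keyword-based intent classifier; a production chatbot would train
# an ML model on labeled utterances instead. All names are made up.
INTENTS = {
    "check_order": {"order", "shipment", "tracking", "delivery"},
    "refund": {"refund", "return", "money"},
    "greeting": {"hello", "hi", "hey"},
}

def classify_intent(utterance: str) -> str:
    """Pick the intent whose keywords overlap the utterance most."""
    tokens = {w.strip(".,!?").lower() for w in utterance.split()}
    best, best_score = "fallback", 0
    for intent, keywords in INTENTS.items():
        score = len(tokens & keywords)
        if score > best_score:
            best, best_score = intent, score
    return best

print(classify_intent("Hi, where is my order tracking number?"))
```

A fallback intent, as shown, is important in practice: routing unrecognized queries to a human agent is usually better than guessing.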
Chatbots have a wide range of applications in various industries:
- Customer support: Chatbots can be used to handle customer queries and provide instant support. They can assist with common troubleshooting issues, answer frequently asked questions, and escalate complex queries to human agents when necessary.
- E-commerce: Chatbots can enhance the shopping experience by assisting users in finding products, providing recommendations, and answering product-related queries.
- Healthcare: Chatbots can be deployed in healthcare settings to provide preliminary medical advice, answer general health-related questions, and assist with appointment scheduling.
Building a chatbot as a data science capstone project not only showcases your technical skills but also allows you to explore the exciting field of artificial intelligence and natural language processing.
It can be a great opportunity to create a practical and useful tool that can benefit users in various domains.
Completing an in-depth capstone project is the perfect way for data science students to demonstrate their technical skills and business acumen. This guide outlined 10 unique project ideas spanning industries like healthcare, transportation, finance, and more.
By identifying the ideal data sources, analysis techniques, and practical applications for their chosen project, students can produce an impressive capstone that solves real-world problems and showcases their abilities.
Capstone Projects
The capstone project experience.
In the final two quarters of the program, students gain real world experience working in small groups on a data science challenge facing a company or not-for-profit. At the conclusion of the capstone project, sponsoring organizations are invited to attend a formal Capstone Event where students showcase their work. Capstone projects typically span a wide range of interests, including energy, agriculture, retail, urban planning, healthcare, marketing, and education.
Examples of Previous Capstone Sponsors
- Applied Physics Lab, UW
- Civil & Environmental Engineering, WSU
- Equal Opportunity Schools
- The Hershey Company
- Jacksonville Zoo and Gardens
- Kids on 45th
- Seattle Children’s Hospital
- Urban Planning, UW
- Virginia Mason
Capstone Archives
Capstone projects take a variety of forms. These include, but are not limited to, dashboard development, data analysis, pipeline building, and machine learning models. The scope and goal of each project is developed to satisfy sponsor needs and student interests.
2024 Cohort
In 2024, sixteen teams presented capstone posters at our MSDS co-working space. These projects included audio signal analysis (Bats!, SonarSquad, Hydrophonatics), pipeline development (Ocastra, Virufy), dashboards (DataNuggets, EqualOpportunitySchools, Koalified), image analysis (Diateam, TreeMusketeers, PixelPioneers), large language model tools (EquityEngine, MetaMinds, SCubed, Trojans), and data collection and analysis (Virginia Mason). Many of these projects combined data collection, analysis, modeling, and dashboard development.
Please find PDF versions of all posters here . (These files are enclosed in a zip folder for your convenience.)
Gather Interactive Archives
Due to the pandemic, our Capstone 2021 was held entirely online in the Gather.Town platform, to which we added galleries of our 2020 and 2022 Capstone projects for an archive you can digitally wander and browse.
Gather presents a map-based, interactive platform where you can wander among projects, see media like posters, infographics, and video, and do video/audio chat with others who are logged into the space. You can read some basics about using this platform at the Gather site. One of the other benefits of Gather is that it created a persistent archive of our Capstone 2020-2022 projects, which you can view and digitally wander among here:
https://tinyurl.com/msdsfair
Admissions Timelines
Applications are now open for Autumn 2025
International Deadline: January 7, 2025 at 11:59pm PST
Domestic Deadline: January 14, 2025 at 11:59pm PST
McGill MMA (EXP)
Real-world Exposure through the Experiential Module
- Corporate projects
- Community projects
- Our community
- Industry partners
Need help with business analytics? Learn how MMA can help you or your partners at no cost.
On this page: → EXP Analytics Consulting module → Capstone projects → Experiential Learning Spotlight → Experiential Coaches
A core component of the Master of Management in Analytics (MMA) program, the EXP Analytics Consulting module has McGill MMA students working alongside industry professionals to solve a significant data and analytics problem aimed at boosting the client's top or bottom line.
Be part of a Data Science team of 4 student specialists
- Business Strategist - What is the problem? - How do we solve it?
- Data Analyst/Modeler - Identify core data needs - Define formulas/algorithms
- Data Engineer/Coder - Automate data sourcing - Integrate solution components
- Visualization/UI Designer - Design front end for best user adoption - Articulate User Experience
With this team structure, McGill MMA (EXP) Analytics projects have the full range of skills needed to drive a strong result.
Build a Data Driven Solution over the Program Long Tenure
All students undertake a technical consulting role by working in teams with real companies and attempting to solve a live data-driven problem.
- Produce a robust analytic solution over 10 months
- Practice using real data and market-leading software
- Benefit from industry mentorship and faculty coaching
- Gain unparalleled training for the job market
Capstone projects
The Master of Management in Analytics’ experiential capstone projects are year-long opportunities where student teams get to work in the private and public sectors to solve pressing issues of the day using data analytics. Here are a few examples of our students' real-world projects.
Professional Services-Consulting Forensic Analytics Anomaly Detection
The world’s governments and large companies spend trillions of dollars on procurement each year. That is a lot of transactions – and an enormous amount of data. Identifying fraudulent transactions is no easy task, but it can help protect a company’s reputation, and avoid costly fines from regulators. To do that, large organizations often work with auditors at large professional services firms like KPMG, which specializes in this type of work. With such large and complex data sets, automation is key to efficiency. And Master of Management Analytics students have been working with the Big 4 accounting firm to apply analytics techniques to detecting anomalies in these data sets – and helping to identify suspicious transactions.
Public Sector-Provincial Govt Communications & Media Buying
Public Media Topic Modeling & Media Channel Optimization
Governments represent the voters who elected them – but the halls of power can be pretty far removed from the everyday experiences of ordinary people. Topic modelling can help bridge this gap, and Master of Management Analytics students have used the technique to help the Government of Ontario understand what their voters care about most. Topic modelling analyzes a set of textual documents to identify specific topics and how frequently they are being discussed. Governments can use that information to shape communications strategies and ensure they are focused on what people care about the most.
Retail: Consumer Goods Digital Retail: Beauty
Product Recommendation & Bundling Engine
Loblaws began as a single grocery store in Toronto, but it grew into a retail giant with stores across the country. Today, Loblaws is Canada’s largest retail chain by revenue, and it sells a lot more than groceries. The company has branched out into clothing, household items, pharmacy, and beauty products, and its PC Optimum loyalty rewards program is one way the company nudges its customers to buy additional products from the company. Master of Management Analytics students have worked with Loblaws to develop a product recommendation and bundling engine that will help them identify which products would complement other purchases.
MMA Experiential Learning Spotlight
The experiential learning project offers students the chance to gain valuable hands-on experience and develop their skills in analytics while making an impact on their professional growth. From real-world experience to mentorship from industry leaders, learn more about the benefits of the MMA program.
MMA Experiential Coaches
Complementing the Academic side of the learning, Professional Coaches play an integral role in helping navigate students through expectations of the MMA industry projects as well as that of the course. They ensure that students keep the project on point, guide them through aspects of the deliverables that are unique to each client and mentor on effective and successful collaboration with client teams. Leveraging their industry expertise, they also act as advisers on solution development and challenge students to get to the edge of their abilities.
Dino Stamatiou, MSc Director of Business Intelligence | Tempo Software
Dino guides MMA students as an Analytics Consulting Coach to share his experiences in having designed, developed, and delivered mission-critical decision support and analytics solutions for leading corporations across various industries and ranging from Fortune 100s to technology startups. His passion lies in building high performing teams, tackling real-world challenges using data, and elevating the analytical capabilities of his clients and stakeholders.
Kenneth Richardson, MBA Consultant, Strategist | AI/Data/Finance/Sales
Ken guides MMA students as an Analytics Consulting Coach, sharing his passion for distilling disparate sources of information into the key points of relevant business or academic knowledge that help people and organizations operate more successfully.
Dimitris Lianoudakis, MSc Founder and Principal | LP Group Payments Consulting
Dimitris guides MMA students as an Analytics Consulting Coach to share his experiences in having built data science teams from the ground up with a focus on learning and development and a proven track record of delivering something of value to the customer. Dimitris has worked in the intersection of payments and data for over 10 years across multiple countries. He's held positions in Business Intelligence, Product, Payments, and various senior leadership roles which he leverages to quickly adapt to a changing market.
If you are interested in becoming an MMA coach please feel free to connect with us.
Become an MMA coach
Capstone Projects
M.S. in Data Science students are required to complete a capstone project. Capstone projects challenge students to acquire and analyze data to solve real-world problems. Project teams consist of two to four students and a faculty advisor. Teams select their capstone project at the beginning of the year and work on the project over the course of two semesters.
Most projects are sponsored by an organization—academic, commercial, non-profit, and government—seeking valuable recommendations to address strategic and operational issues. Depending on the needs of the sponsor, teams may develop web-based applications that can support ongoing decision-making. The capstone project concludes with a paper and presentation.
Key takeaways:
- Synthesizing the concepts you have learned throughout the program in various courses (this requires that the question posed by the project be complex enough to require the application of appropriate analytical approaches learned in the program and that the available data be of sufficient size to qualify as ‘big’)
- Experience working with ‘raw’ data exposing you to the data pipeline process you are likely to encounter in the ‘real world’
- Demonstrating oral and written communication skills through a formal paper and presentation of project outcomes
- Acquisition of team building skills on a long-term, complex, data science project
- Addressing an actual client’s need by building a data product that can be shared with the client
Capstone projects have been sponsored by a variety of organizations and industries, including: Capital One, City of Charlottesville, Deloitte Consulting LLP, Metropolitan Museum of Art, MITRE Corporation, a multinational banking firm, The Public Library of Science, S&P Global Market Intelligence, UVA Brain Institute, UVA Center for Diabetes Technology, UVA Health System, U.S. Army Research Laboratory, Virginia Department of Health, Virginia Department of Motor Vehicles, Virginia Office of the Governor, Wikipedia, and more.
Sponsor a Capstone Project
View previous examples of capstone projects and check out answers to frequently asked questions.
What does the process look like?
- The School of Data Science periodically puts out a Call for Proposals . Prospective project sponsors submit official proposals, vetted by the Associate Director for Research Development, Capstone Director, and faculty.
- Sponsors present their projects to students at “Pitch Day” near the start of the Fall term, where students have the opportunity to ask questions.
- Students individually rank their top project choices. An algorithm sorts students into capstone groups of approximately 3 to 4 students per group.
- Adjustments are made by hand as necessary to finalize groups.
- Each group is assigned a faculty mentor, who will meet groups each week in a seminar-style format.
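The matching algorithm itself is not described here; purely as a rough illustration, a greedy scheme that scans each student's ranked choices and assigns the first project with remaining capacity might look like the sketch below (student and project names are hypothetical).

```python
# Hypothetical greedy sketch of preference-based group matching.
# Not the actual algorithm used; for illustration only.
def match(rankings, capacity):
    """rankings: {student: [project, ...] best-first}; capacity: {project: int}."""
    remaining = dict(capacity)
    groups = {p: [] for p in capacity}
    unmatched = []
    for student, prefs in rankings.items():
        for project in prefs:
            if remaining.get(project, 0) > 0:
                groups[project].append(student)
                remaining[project] -= 1
                break
        else:  # no ranked project had capacity left
            unmatched.append(student)
    return groups, unmatched

rankings = {
    "s1": ["wiki", "health"],
    "s2": ["wiki", "health"],
    "s3": ["wiki", "health"],
    "s4": ["health", "wiki"],
}
groups, unmatched = match(rankings, {"wiki": 2, "health": 2})
print(groups, unmatched)
```

Real assignment problems like this are often solved with optimization (e.g., stable matching or integer programming) rather than a greedy pass, which is why hand adjustments afterward can still be needed.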
What is the seminar approach to mentoring capstones?
We utilize a seminar approach to managing capstones to provide faculty mentorship and streamlined logistics. This approach involves one mentor supervising three to four loosely related projects and meeting with these groups on a regular basis. Project teams often encounter similar roadblocks and issues so meeting together to share information and report on progress toward key milestones is highly beneficial.
Do all capstone projects have corporate sponsors?
Not necessarily. Generally, each group works with a sponsor from outside the School of Data Science. Some sponsors are corporations, some are from nonprofit and governmental organizations, and some are from other departments at UVA.
Curating capstone projects with external sponsors poses recurring challenges: scoping and defining a question of sufficient depth for our students, obtaining data of sufficient size, gaining access to that data early enough for adequate analysis, and navigating a myriad of legal issues (including conflicts of interest). While we continue to strive to use sponsored projects and work to solve these issues, we also look for ways to leverage openly available data to solve interesting societal problems that allow students to apply the skills learned throughout the program. While not all capstones have sponsors, all capstones have clients. That is, the work is being done for someone who cares about and has an investment in the outcome.
Why do we have to work in groups?
Because data science is a team sport!
All capstone projects are completed by group work. While this requires additional coordination, this collaborative component of the program reflects the way companies expect their employees to work. Building this skill is one of our core learning objectives for the program.
I didn’t get my first choice of capstone project from the algorithm matching. What can I do?
Remember that the point of the capstone projects isn’t the subject matter; it’s the data science. Professional data scientists may find themselves in positions in which they work on topics assigned to them, but they use methods they enjoy and still learn much through the process. That said, there are many ways to tackle a subject, and we are more than happy to work with you to find an approach to the work that most aligns with your interests.
Your ability to influence which project you work on is in the ranking process after “pitch day” and in encouraging your company or department to submit a proposal during the Call for Proposal process. At a minimum it takes several months to work with a sponsor to adequately scope a project, confirm access to the data and put the appropriate legal agreements into place. Before you ever see a project presented on pitch day, a lot of work has taken place to get it to that point!
Can I work on a project for my current employer?
Each spring, we put forward a public call for capstone projects. You are encouraged to share this call widely with your community, including your employer, non-profit organizations, or any entity that might have a big data problem that we can help solve. As a reminder, capstone projects are group projects so the project would require sufficient student interest after ‘pitch day’. In addition, you (the student) cannot serve as the project sponsor (someone else within your employer organization must serve in that capacity).
If my project doesn’t have a corporate sponsor, am I losing out on a career opportunity?
The capstone project will provide you with the opportunity to do relevant, high-quality work which can be included on a resume and discussed during job interviews. The project paper and your code on Github will provide more career opportunities than the sponsor of the project. Although it does happen from time to time, it is rare that capstones lead to a direct job offer with the capstone sponsor's company. Capstone projects are just one networking opportunity available to you in the program.
Capstone Project Reflections From Alumni
"For my Capstone project, I used Python to train machine learning models for visual analysis – also known as computer vision. Computer vision helped my Capstone team analyze the ergonomic posture of workers at risk of developing musculoskeletal injuries. We automated the process, and hope our work further protects the health and safety of people working in the United States.” — Theophilus Braimoh, MSDS Online Program 2023, Admissions Student Ambassador
“My Capstone experience with the ALMA Observatory and NRAO was a pivotal chapter in my UVA Master’s in Data Science journey. It fostered profound growth in my data science expertise and instilled a confidence that I'm ready to make meaningful contributions in the professional realm.” — Haley Egan, MSDS Online Program 2023, Admissions Student Ambassador
“Our Capstone projects gave us the opportunity to gain new domain knowledge and answer big data questions beyond the classroom setting.” — Mina Kim, MSDS Residential Program 2023, Ph.D. in Psychology Candidate
Capstone Project Reflections From Sponsors
“For us, the level of expertise, and special expertise, of the capstone students gives us ‘extra legs’ and an extra push to move a project forward. The team was asked to provide a replicable prototype air quality sensor that connected to the Cville Things Network, a free and community supported IoT network in Charlottesville. Their final product was a fantastic example that included clear circuit diagrams for replication by citizen scientists.” — Lucas Ames, Founder, Smart Cville
“Working with students on an exploratory project allowed us to focus on the data part of the problem rather than the business part, while testing with little risk. If our hypothesis falls flat, we gain valuable information; if it is validated or exceeded, we gain valuable information and are a few steps closer to a new product offering than when we started.” — Ellen Loeshelle, Senior Director of Product Management, Clarabridge
Data Science Capstone Project Examines COVID's Impact on Alcohol-Related Health Incidents at UVA
Student Capstone Project Looks To Improve Electrolarynx Speech-to-Text
Master’s Students Strengthen Ability of LLM to Recommend Scholarly Works
My MSDS Capstone Project: Predicting California’s Hydroclimate
Data Science Master’s Students Tackle Diverse, Real-World Challenges in Capstone Projects
Data Science: Capstone
Show what you’ve learned from the Professional Certificate Program in Data Science.
- Introductory
Associated Schools
Harvard T.H. Chan School of Public Health
What you'll learn.
How to apply the knowledge base and skills learned throughout the series to a real-world problem
Independently work on a data analysis project
Course description
To become an expert data scientist you need practice and experience. By completing this capstone project you will get an opportunity to apply the knowledge and skills in R data analysis that you have gained throughout the series. This final project will test your skills in data visualization, probability, inference and modeling, data wrangling, data organization, regression, and machine learning.
Unlike the rest of our Professional Certificate Program in Data Science, in this course, you will receive much less guidance from the instructors. When you complete the project you will have a data product to show off to potential employers or educational programs, a strong indicator of your expertise in the field of data science.
Instructors
Rafael Irizarry
You may also like.
Data Science: Probability
Learn probability theory — essential for a data scientist — using a case study on the financial crisis of 2007–2008.
Data Science: Inference and Modeling
Learn inference and modeling: two of the most widely used statistical tools in data analysis.
High-Dimensional Data Analysis
A focus on several techniques that are widely used in the analysis of high-dimensional data.
I wanted to showcase to companies how data analysis can help improve company profit.
ghadikq/Capstone_Project_Online_Retail
Capstone Project: Online Retail
Understand the data better and extract insights from it, providing decision-makers with what they need to improve the company's marketing and increase sales. Also, showcase how adding a recommender engine can help increase the company's sales.
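To illustrate the recommender-engine idea, here is a minimal sketch of item co-occurrence recommendation on hypothetical basket data (the item names are made up; this is not the implementation in this repository):

```python
from collections import defaultdict
from itertools import combinations

# Each basket is the set of items bought on one invoice (hypothetical sample).
baskets = [
    {"mug", "teapot", "coaster"},
    {"mug", "teapot"},
    {"mug", "candle"},
    {"teapot", "coaster"},
]

# Count how often each ordered pair of items appears in the same basket.
co_counts = defaultdict(int)
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_counts[(a, b)] += 1
        co_counts[(b, a)] += 1

def recommend(item, k=2):
    """Return the top-k items most often co-purchased with `item`."""
    scores = {b: n for (a, b), n in co_counts.items() if a == item}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Counting co-purchases like this is the simplest collaborative signal; a production engine would typically normalize by item popularity or use matrix factorization instead.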
RESEARCH QUESTIONS
- COUNTRY - How many customers come from each country, and does profit vary by country?
- QUANTITY - How does the quantity trend change over time?
- PROFIT - How did this year's profit break down by month and day?
- PRODUCTS - What are the most-sold products in the store?
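Assuming the standard Online Retail II column layout (Invoice, Description, Quantity, InvoiceDate, Price, Customer ID, Country), each of these questions maps to a pandas aggregation. A sketch on a toy sample:

```python
import pandas as pd

# Toy transactions mimicking the Online Retail II schema (assumed column names).
df = pd.DataFrame({
    "Invoice": ["536365", "536365", "536366", "536367"],
    "Description": ["WHITE HANGING HEART", "RED WOOLLY HOTTIE",
                    "WHITE HANGING HEART", "JUMBO BAG RED"],
    "Quantity": [6, 3, 8, 10],
    "Price": [2.55, 3.39, 2.55, 1.95],
    "InvoiceDate": pd.to_datetime(["2010-12-01", "2010-12-01",
                                   "2010-12-02", "2011-01-04"]),
    "Customer ID": [17850, 17850, 13047, 12583],
    "Country": ["United Kingdom", "United Kingdom", "United Kingdom", "France"],
})
df["Revenue"] = df["Quantity"] * df["Price"]

# COUNTRY: unique customers and revenue per country.
by_country = df.groupby("Country").agg(
    customers=("Customer ID", "nunique"), revenue=("Revenue", "sum"))

# QUANTITY: units sold per day.
daily_qty = df.groupby(df["InvoiceDate"].dt.date)["Quantity"].sum()

# PROFIT (revenue as a proxy): totals per calendar month.
monthly_rev = df.groupby(df["InvoiceDate"].dt.to_period("M"))["Revenue"].sum()

# PRODUCTS: most-sold products by units.
top_products = df.groupby("Description")["Quantity"].sum().sort_values(ascending=False)
```

Revenue stands in for profit here because the dataset carries prices but not costs.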
This analysis uses the Online_Retail_II dataset provided by UCI.
The dataset contains transactions for a UK-based, registered online shop. The company mainly sells unique all-occasion gift-ware; many of its customers are wholesalers.
The dataset contains the following variables:
You can access the dataset directly from the data folder in this repository.
Repository content
- Online Retail R Markdown.
- Online Retail html.
- Online Retail ppt slide for presentation.
I worked on this capstone project toward completion of the final assessment for the PGP in Data Science course from Simplilearn-Purdue University. My job was to analyze transactional data for an online retail company and create customer segmentation so that the company could run effective marketing campaigns.
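Customer segmentation for campaigns like this is commonly built on RFM (recency, frequency, monetary) features. A hedged sketch on hypothetical transactions — the column names `customer_id`, `date`, and `amount` are assumptions, not this project's schema:

```python
import pandas as pd

# Hypothetical transaction log: one row per purchase.
tx = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "date": pd.to_datetime(["2024-01-05", "2024-03-01", "2024-02-10",
                            "2024-02-20", "2024-03-05", "2023-11-30"]),
    "amount": [50.0, 20.0, 5.0, 8.0, 7.0, 300.0],
})
now = pd.Timestamp("2024-03-10")

# RFM: days since last purchase, number of purchases, total spend.
rfm = tx.groupby("customer_id").agg(
    recency=("date", lambda d: (now - d.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
)

# Flag lapsed customers (no purchase in the last 90 days) as one simple segment.
rfm["lapsed"] = rfm["recency"] > 90
```

In practice each RFM column would be binned into quantile scores and the score combinations mapped to named segments (champions, at-risk, lapsed, and so on).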
AlmaBetter Capstone Project (machine learning, project type: regression). Sales forecasting is an approach retailers use to anticipate future sales by analyzing past sales, identifying trends, and projecting data into the future. GitHub: samchak18/Capstone_Project_2_Retail_Sales_Prediction.
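The trend-projection idea this snippet describes can be shown with a least-squares line fit on hypothetical monthly sales — an illustration only, not the AlmaBetter project's actual model:

```python
import numpy as np

# Hypothetical monthly sales: a 100-unit base, ~5 units of growth per month,
# plus small fixed fluctuations.
months = np.arange(12)
noise = np.array([0, 1, -1, 2, 0, -2, 1, 0, -1, 2, 0, -1])
sales = 100 + 5 * months + noise

# Fit a straight line through the history and project it one month ahead.
slope, intercept = np.polyfit(months, sales, 1)
forecast_next = intercept + slope * 12
```

Real retail forecasting would add seasonality and promotions as features; the straight-line fit only captures the long-run trend.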
In this article I want to present a full data science portfolio project. In this project I perform retail data analytics using Amazon Web Services and different machine learning algorithms. The full code, including a project proposal and a final project report, can be found in my GitHub repository.
The notebook 1_Data_Exploration.ipynb contains some code for the data analysis of the dataset. The notebook 2_Create_Train_and_Test_Data.ipynb contains the code for merging all data together and creating the final csv files for training and testing. The folder Documentation contains the Proposal for this Project.
Project Idea #10: Building a Chatbot. A chatbot is a computer program that uses artificial intelligence to simulate human conversation. It can interact with users in a natural language through text or voice. Building a chatbot can be an exciting and challenging data science capstone project.
The Capstone Project Experience. In the final two quarters of the program, students gain real world experience working in small groups on a data science challenge facing a company or not-for-profit. At the conclusion of the capstone project, sponsoring organizations are invited to attend a formal Capstone Event where students showcase their work.
Capstone projects are specifically designed to encourage students to think critically, solve challenging data science problems, and develop analytical skills. Two groups of students built an end-to-end data science solution using Azure Machine Learning to accurately forecast sales.
This is a transactional data set which contains all the transactions that occurred between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail business. The company mainly sells unique, all-occasion gifts.
Capstone projects. The Master of Management in Analytics' experiential capstone projects are year-long opportunities where student teams work in the private and public sectors to solve pressing issues of the day using data analytics. Here are a few examples of our students' real-world projects. Professional Services-Consulting
Data Science Capstone Retail Project. Contribute to kc2019/DS_Capstone_Retail development by creating an account on GitHub.
Capstone Projects. M.S. in Data Science students are required to complete a capstone project. Capstone projects challenge students to acquire and analyze data to solve real-world problems. Project teams consist of two to four students and a faculty advisor. Teams select their capstone project at the beginning of the year and work on the project ...
rohitlog/Data-Science-capstone-Retail-project
Hello and welcome. I'm Vijayraj K Poojary, and together with Snehil and Vinay we will give our capstone project presentation on "OList Marketing and Retail Ana...