Market Basket Analysis with Association Rule Learning
The promise of data mining was that algorithms would crunch data and find interesting patterns that you could exploit in your business.
The exemplar of this promise is market basket analysis (Wikipedia refers to it as affinity analysis). Given a pile of transactional records, the goal is to discover interesting purchasing patterns that could be exploited in the store, such as offers and product layout.
In this post, you will work through a market basket analysis tutorial using association rule learning in Weka. If you follow the step-by-step instructions, you will run a market basket analysis on point of sale data in under five minutes.
Association Rule Learning
Consider a case study in which a start-up was investigating customer behaviour in a SaaS application. The patterns of interest were behaviours that signal churn or conversion from free to paid accounts.
Weeks were spent poring over the data, looking at correlations and plots. The team came up with a set of rules that indicated outcomes and suggested possible interventions to influence those outcomes.
They devised rules such as “if a user creates x widgets in y days and logs in n times, then they will convert”. Numbers were assigned to the rules, such as support (the proportion of all records that match the rule) and lift (how much more often the outcome occurs when the rule applies than it does across all records).
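To make support and lift concrete, here is a minimal Python sketch that computes them, together with confidence, for a single rule. The records and the behaviour names (created_widgets, logged_in_often, converted) are invented for illustration and are not taken from the case study.

```python
# Toy records (made up): each set lists what one user did.
records = [
    {"created_widgets", "logged_in_often", "converted"},
    {"created_widgets", "logged_in_often", "converted"},
    {"created_widgets", "converted"},
    {"logged_in_often"},
    {"created_widgets", "logged_in_often"},
    {"converted"},
    set(),
    {"logged_in_often"},
]


def fraction_with(items, records):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)


# Hypothetical rule: created_widgets AND logged_in_often ==> converted
antecedent = {"created_widgets", "logged_in_often"}
consequent = {"converted"}

# Support: share of all records matching the whole rule.
support = fraction_with(antecedent | consequent, records)
# Confidence: of the records matching the antecedent, the share that also match the consequent.
confidence = support / fraction_with(antecedent, records)
# Lift: confidence relative to the baseline rate of the consequent.
lift = confidence / fraction_with(consequent, records)

print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the outcome is more common among records matching the antecedent than in the data overall. A lift near 1 means the rule tells you little.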
It was only after the report was delivered that they realized what a massive mistake they had made. They had performed association rule learning by hand, when off-the-shelf algorithms could have done the work for them.
If you are searching large datasets for patterns of interest, association rule learning is a suite of methods you should be using.
1] Start the Weka Explorer
This guide assumes you already know how to run a classifier, design and run an experiment, and use ensemble methods in Weka. If you need help downloading and installing Weka, look out for our upcoming articles at AICorespot.
2] Load the Supermarket dataset
Weka ships with a number of real datasets in the “data” directory of the Weka installation. This is very handy because you can explore and experiment on these well-known problems and learn about the various methods Weka puts at your disposal.
Load the Supermarket dataset (data/supermarket.arff). This is a dataset of point of sale data. The data is nominal, and every instance represents a customer transaction at a supermarket, the products purchased and the departments involved. There is not much information about this dataset online, although you can see this comment.
The data contains 4,627 instances and 217 attributes. The data is denormalized. Every attribute is binary and either has a value (“t” for true) or no value (“?” for missing). There is a nominal class attribute called “total” that indicates whether the transaction was less than 100 (low) or greater than 100 (high).
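As a rough illustration of this “t” / “?” encoding, the Python sketch below turns a few rows into baskets of attribute=value items. The four column names appear in the rules later in this post, but the rows themselves are invented and the helper row_to_basket is hypothetical. It is not a reader for the real ARFF file.

```python
# A handful of the 217 attributes, plus the class attribute "total".
columns = ["bread and cake", "biscuits", "fruit", "total"]

# Invented rows in the dataset's style: "t" = purchased, "?" = missing value.
rows = [
    ["t", "t", "?", "high"],
    ["?", "t", "t", "low"],
    ["t", "?", "?", "low"],
]


def row_to_basket(row, columns):
    """Keep only the attributes that have a value; '?' means missing."""
    return {f"{name}={value}" for name, value in zip(columns, row) if value != "?"}


baskets = [row_to_basket(row, columns) for row in rows]

for basket in baskets:
    print(sorted(basket))
```

Treating “?” as absent rather than as a “false” value means only purchased items enter a basket, which matches the item=t form of the rules shown later.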
We are not interested in building a predictive model for total. Instead, we are interested in which items were purchased together. We want to find useful patterns in this data that may or may not be related to the predicted attribute.
3] Discover Association Rules
Select the “Associate” tab in the Weka Explorer. The “Apriori” algorithm will already be selected. This is the best-known association rule learning method, possibly because it was among the first (Agrawal and Srikant, 1994), and it is very efficient.
In principle the algorithm is quite simple. It builds up attribute-value (item) sets that maximize the number of instances that can be explained (coverage of the dataset). The search through item space is very similar to the problem faced in attribute selection and subset search.
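The level-wise idea behind Apriori can be sketched in a few lines of Python. This is a generic textbook-style illustration under simplifying assumptions, not Weka's implementation. The function name frequent_itemsets, the toy transactions and the 0.5 support threshold are all made up for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori-style search; returns {itemset: support fraction}."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every individual item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate itemset.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, then prune any
        # candidate that has an infrequent k-item subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Invented toy transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "biscuits", "fruit"},
    {"milk", "biscuits", "fruit"},
    {"bread", "milk", "biscuits", "fruit"},
    {"bread", "milk", "fruit"},
]

result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The pruning step is what keeps the search manageable: an itemset can only be frequent if all of its subsets are frequent, so most larger combinations are never counted.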
Click the “Start” button to run Apriori on the dataset.
4] Analyze Results
The real work in association rule learning is interpreting the results.
In the “Associator output” window, you can see that the algorithm presented 10 rules learned from the supermarket dataset. The algorithm is configured to stop at 10 rules by default. You can click on the algorithm name and configure it to find and report more rules by changing the “numRules” value.
The rules discovered were:
- biscuits=t frozen foods=t fruit=t total=high 788 ==> bread and cake=t 723 conf:(0.92)
- baking needs=t biscuits=t fruit=t total=high 760 ==> bread and cake=t 696 conf:(0.92)
- baking needs=t frozen foods=t fruit=t total=high 770 ==> bread and cake=t 705 conf:(0.92)
- biscuits=t fruit=t vegetables=t total=high 815 ==> bread and cake=t 746 conf:(0.92)
- party snack foods=t fruit=t total=high 854 ==> bread and cake=t 779 conf:(0.91)
- biscuits=t frozen foods=t vegetables=t total=high 797 ==> bread and cake=t 725 conf:(0.91)
- baking needs=t biscuits=t vegetables=t total=high 772 ==> bread and cake=t 701 conf:(0.91)
- biscuits=t fruit=t total=high 954 ==> bread and cake=t 866 conf:(0.91)
- frozen foods=t fruit=t vegetables=t total=high 834 ==> bread and cake=t 757 conf:(0.91)
- frozen foods=t fruit=t total=high 969 ==> bread and cake=t 877 conf:(0.91)
You can see that rules are presented in antecedent ==> consequent format. The number after the antecedent is its absolute coverage in the dataset (in this case, a count out of a possible 4,627). The number after the consequent is the absolute number of instances that match both the antecedent and the consequent. The number in brackets at the end is the confidence of the rule (the number of instances matching both the antecedent and the consequent divided by the number matching the antecedent); for the first rule that is 723 / 788 ≈ 0.92. A minimum confidence cutoff was used in selecting rules, as reported in the “Associator output” window (Weka’s default is 0.9), and no listed rule has a confidence below 0.91.
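The arithmetic behind these numbers can be checked with a short Python sketch. It parses two of the rule lines listed above and recomputes the confidence, plus the rule's support as a fraction of the 4,627 transactions. The regular expression and the parse_rule helper are written for this post's listing and are assumptions, not part of Weka.

```python
import re

TOTAL_TRANSACTIONS = 4627  # dataset size quoted in this post

# Two of the rules listed above, copied verbatim.
rule_lines = [
    "biscuits=t frozen foods=t fruit=t total=high 788 ==> bread and cake=t 723 conf:(0.92)",
    "frozen foods=t fruit=t total=high 969 ==> bread and cake=t 877 conf:(0.91)",
]

# Groups: antecedent items, antecedent count, consequent items,
# count matching both sides, reported confidence.
RULE_PATTERN = re.compile(r"^(.*?) (\d+) ==> (.*?) (\d+) conf:\(([\d.]+)\)$")


def parse_rule(line):
    """Split one rule line into its parts and recompute its metrics."""
    antecedent, n_antecedent, consequent, n_both, reported = RULE_PATTERN.match(line).groups()
    n_antecedent, n_both = int(n_antecedent), int(n_both)
    return {
        "antecedent": antecedent,
        "consequent": consequent,
        "n_antecedent": n_antecedent,
        "n_both": n_both,
        "reported_confidence": float(reported),
        "confidence": n_both / n_antecedent,
        "support": n_both / TOTAL_TRANSACTIONS,
    }


parsed = [parse_rule(line) for line in rule_lines]
for rule in parsed:
    print(
        f"{rule['antecedent']} ==> {rule['consequent']}: "
        f"confidence={rule['confidence']:.4f} support={rule['support']:.4f}"
    )
```

Recomputing the values this way shows that the bracketed figure is a ratio of the two counts on the line, rounded to two decimal places.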
We don’t want to go through all 10 rules one by one; that would be too onerous. Here are a few observations:
- All presented rules have a consequent of “bread and cake”.
- All presented rules include a high total transaction amount.
- “biscuits” and “frozen foods” appear in several of the presented rules.
You have to be careful when interpreting association rules. They are associations (think correlations), not necessarily causally related. In addition, short antecedents are likely to be more robust than long antecedents, which are more likely to be fragile.
If we are interested in the total, for example, we might want to persuade people who buy biscuits, frozen foods and fruit to also buy bread and cake so that they end up with a high total transaction amount (Rule #1). This may sound plausible, but it is flawed reasoning. The product combination does not cause a high total; it is only associated with a high total. Those 723 transactions may contain a vast assortment of other items in addition to those in the rule.
What might be interesting to test is modelling the path through the store required to collect associated items, and seeing whether changes to that path (shorter, longer, displayed offers, etc.) have an effect on transaction size or basket size.
In this AICorespot post you discovered the power of automatically learning association rules from large datasets. You learned that it is much more efficient to use an algorithm like Apriori than to deduce rules by hand.
You performed your first market basket analysis in Weka and learned that the real work is in analyzing the results. You saw the careful attention to detail required when interpreting rules, and that association (correlation) is not the same as causation.