Principles of Data Mining
Lab 2
Type your answers in the answer sheet, NOT in this file.
For questions 3-5: show all your work and calculations; no points are awarded for a final answer alone.
What is market basket analysis?
What is the “Apriori principle”? Why is it useful in association rule mining?
Consider the data set shown in the table below, then answer the following questions:
Customer ID | Transaction ID | Items Bought |
1 | 0001 | {a, d, e} |
1 | 0024 | {a, b, c, e} |
2 | 0012 | {a, b, d, e} |
2 | 0031 | {a, c, d, e} |
3 | 0015 | {b, c, e} |
3 | 0022 | {b, d, e} |
4 | 0029 | {c, d} |
4 | 0040 | {a, b, c} |
5 | 0033 | {a, d, e} |
5 | 0038 | {a, b, e} |
(a) Compute the support for the itemsets {e}, {b, d}, and {b, d, e} by treating each transaction ID as a market basket.
(b) Use the results in part (a) to compute the confidence for the association rules {b, d} → {e} and {e} → {b, d}.
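For checking hand calculations, the two quantities asked for above can be sketched in Python. This is a minimal illustration, not part of the lab: the function names `support_count`, `support`, and `confidence` are hypothetical, and it assumes the usual definitions, where the support of an itemset X is the fraction of baskets containing every item of X, and the confidence of X → Y is support(X ∪ Y) / support(X).

```python
# Transactions copied from the table above, keyed by transaction ID.
transactions = {
    "0001": {"a", "d", "e"},
    "0024": {"a", "b", "c", "e"},
    "0012": {"a", "b", "d", "e"},
    "0031": {"a", "c", "d", "e"},
    "0015": {"b", "c", "e"},
    "0022": {"b", "d", "e"},
    "0029": {"c", "d"},
    "0040": {"a", "b", "c"},
    "0033": {"a", "d", "e"},
    "0038": {"a", "b", "e"},
}

# Each transaction ID is treated as one market basket.
baskets = list(transactions.values())


def support_count(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return support_count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Confidence of antecedent -> consequent, computed from raw counts."""
    both = support_count(antecedent | consequent, baskets)
    return both / support_count(antecedent, baskets)


if __name__ == "__main__":
    for itemset in ({"e"}, {"b", "d"}, {"b", "d", "e"}):
        print(sorted(itemset), support(itemset, baskets))
    print("{b, d} -> {e}:", confidence({"b", "d"}, {"e"}, baskets))
    print("{e} -> {b, d}:", confidence({"e"}, {"b", "d"}, baskets))
```

Confidence is computed from counts rather than from the rounded support values, which avoids carrying rounding error from one part into the next. The written answer should still show each count and division step by hand.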
Study the table below, then answer the following questions (minimum support = 40%, minimum confidence = 40%):
Transaction ID | Items Bought |
1 | A,B,C |
2 | A,B,C,D,E |
3 | A,C,D |
4 | A,C,D,E |
5 | A,B,C,D |
Find all the frequent itemsets using the Apriori algorithm. Use tables to represent Ck (the candidate k-itemsets) and Lk (the frequent k-itemsets).
Generate all possible association rules from each of the frequent itemsets you obtained in the previous question, and give the confidence of each rule.
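The level-wise procedure behind these two questions can be sketched in Python as a way to cross-check tables built by hand. This is an illustrative sketch under stated assumptions, not a reference solution: the names `apriori` and `generate_rules` are hypothetical, the item lists are read as A to E only (any stray trailing "T" in the table is assumed to be extraction residue, not an item), and thresholds are treated as inclusive (support or confidence of exactly 40% passes).

```python
from itertools import combinations

# Baskets from the table above (assumption: items are A-E only).
baskets = [set("ABC"), set("ABCDE"), set("ACD"), set("ACDE"), set("ABCD")]


def apriori(baskets, min_support):
    """Return {frozenset: support count} for every frequent itemset."""
    n = len(baskets)
    # C1: every single item that appears in some basket.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate in Ck, then keep those meeting min_support (Lk).
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori principle): every (k-1)-subset must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for each qualifying rule."""
    rules = []
    for itemset, count in frequent.items():
        # Every non-empty proper subset can serve as an antecedent.
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = count / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


frequent = apriori(baskets, min_support=0.4)
rules = generate_rules(frequent, min_confidence=0.4)

if __name__ == "__main__":
    for itemset in sorted(frequent, key=lambda s: (len(s), sorted(s))):
        print(sorted(itemset), frequent[itemset])
    for antecedent, consequent, conf in rules:
        print(sorted(antecedent), "->", sorted(consequent), round(conf, 2))
```

The lookup `frequent[antecedent]` relies on the Apriori principle: every subset of a frequent itemset is itself frequent, so its count was already recorded at an earlier level. The sketch prints only the final frequent itemsets; the Ck and Lk tables requested above still need to be written out level by level.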