Background on market basket analysis and association rules
When people shop at a supermarket, they often have a list of things to buy. A household might buy salt, pepper, and vegetables for a family dinner, while a young man might buy beer and chips for the weekend. Understanding these buying patterns can help increase supermarket sales in several ways, which is the idea behind market basket analysis.
Market basket analysis (also called affinity analysis) is a data analysis and data mining technique that discovers co-occurrence relationships among activities performed by specific individuals or groups. It gives retailers information about the purchasing behavior of buyers. With this information, retailers can understand buyers' needs and redesign the store's layout accordingly, develop cross-promotional programs, or even capture new buyers.
There’s a famous story about beer and diaper purchases. According to the story, a study found that when married men went to a store to buy diapers, they usually bought beer as well. Supermarkets saw this as a cross-selling opportunity and placed the diaper aisle next to the beer aisle. They immediately saw a steep growth in beer sales to men who bought diapers. However, putting beer and diapers together didn't help diaper sales. Analysis showed that men purchasing diapers are more likely to buy beer, but men purchasing beer might not buy diapers. To capture this asymmetry, market basket analysis uses ‘association rules’ to identify which products will experience cross-sells when two or more products are placed together.
According to Wikipedia, association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness. There are three common ways to measure association.
The three measures described here follow the tutorial Association Rules and the Apriori Algorithm: A Tutorial.
Measure 1: Support. This says how popular an itemset is, as measured by the proportion of transactions in which the itemset appears. In Table 1 below, the support of {apple} is 4 out of 8, or 50%. An itemset can also contain multiple items. For instance, the support of {apple, beer, rice} is 2 out of 8, or 25%.


Table 1. Example Transactions
If you discover that sales of items beyond a certain proportion tend to have a significant impact on your profits, you might consider using that proportion as your support threshold. You may then identify itemsets with support values above this threshold as significant itemsets.
Measure 2: Confidence. This says how likely item Y is to be purchased when item X is purchased, expressed as {X -> Y}. It is measured by the proportion of transactions containing item X in which item Y also appears. In Table 1, the confidence of {apple -> beer} is 3 out of 4, or 75%.

One drawback of the confidence measure is that it might misrepresent the importance of an association. This is because it only accounts for how popular apples are, but not beers. If beers are also very popular in general, there will be a higher chance that a transaction containing apples will also contain beers, thus inflating the confidence measure. To account for the base popularity of both constituent items, we use a third measure called lift.
Measure 3: Lift. This says how likely item Y is to be purchased when item X is purchased, while controlling for how popular item Y is. In Table 1, the lift of {apple -> beer} is 1, which implies no association between the items. A lift value greater than 1 means that item Y is likely to be bought if item X is bought, while a value less than 1 means that item Y is unlikely to be bought if item X is bought.
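The three measures can be checked with a few lines of Python. This is only a sketch: the transaction list is hypothetical and was constructed so that it reproduces the figures quoted above (it is not the original Table 1), and the function names are chosen here for illustration.

```python
# Hypothetical transactions, constructed only to reproduce the quoted figures:
# support 4/8 and 2/8, confidence 3/4, and lift 1.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset):
    """Proportion of transactions that contain every item in the itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(x, y):
    """Proportion of transactions containing x that also contain y."""
    return support(set(x) | set(y)) / support(x)


def lift(x, y):
    """Confidence of {x -> y} divided by the overall popularity of y."""
    return confidence(x, y) / support(y)


print(support({"apple"}))                  # 0.5
print(support({"apple", "beer", "rice"}))  # 0.25
print(confidence({"apple"}, {"beer"}))     # 0.75
print(lift({"apple"}, {"beer"}))           # 1.0
```

Beer appears in 6 of the 8 hypothetical transactions, so the 75% confidence of {apple -> beer} is exactly what beer's overall popularity would predict, which is why the lift is 1.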

The advantage of association rule algorithms over the more standard decision tree algorithms (C5.0 and C&R Trees) is that associations can exist between any of the attributes. A decision tree algorithm will build rules with only a single conclusion, whereas association algorithms attempt to find many rules, each of which may have a different conclusion.
The disadvantage of association algorithms is that they are trying to find patterns within a potentially very large search space and, hence, can require much more time to run than a decision tree algorithm.
The following are well-known association rule mining algorithms:
- The Apriori algorithm uses a breadth-first search strategy to count the support of itemsets and uses a candidate generation function which exploits the downward closure property of support.
- The Equivalence Class Transformation (Eclat) algorithm is a depth-first search algorithm based on set intersection. It is suitable for both sequential and parallel execution with locality-enhancing properties.
- The Frequent pattern (FP)-growth algorithm. In the first pass, the algorithm counts the occurrences of items (attribute-value pairs) in the dataset and stores them in the 'header table'. In the second pass, it builds the FP-tree structure by inserting instances. Items in each instance have to be sorted in descending order of their frequency in the dataset, so that the tree can be processed quickly.
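To make the first of these concrete, here is a minimal, unoptimized sketch of the Apriori idea in Python: itemsets are explored level by level (breadth-first), and a candidate is kept only if all of its smaller subsets are already frequent (downward closure). The function name `apriori`, the `baskets` data, and the threshold are assumptions made for illustration; production libraries use far more efficient counting.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # Count each candidate and keep those that reach the threshold.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    # Level 1: single items.
    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items})
    result = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (downward closure): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


# Hypothetical baskets, used only to exercise the function.
baskets = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]
frequent_itemsets = apriori(baskets, min_support=0.25)
print(frequent_itemsets[frozenset({"apple", "beer", "rice"})])  # 0.25
```

The prune step is where the savings come from: {apple, milk} never appears in a basket, so no larger itemset containing both is ever counted.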
Generalized Spatial Association Rule (GSAR)
In recent years, because of the importance and necessity of analyzing geospatial data in different industries, spatial data mining approaches have attracted considerable interest. Among the existing spatial data mining approaches, the spatial association rule mining proposed by Koperski and Han (1995) is one of the most typical approaches for spatial pattern discovery.
As defined by Koperski and Han, a spatial association rule is a rule which describes the implication of one or a set of spatial objects by another set of spatial objects in spatial databases. The spatial objects involved therefore can be classified into two groups, the event object and the geo-context object:
- The event objects are the research targets of rule mining, which means that the rules discovered are about the spatial patterns of the event objects.
- The geo-context objects are used to describe the patterns of the event objects.
The patterns are represented by spatial relationships, for example topological relationships defined between each pair of a reference object and a task-relevant object.
Here’s an example of a spatial association rule, with its event objects, geo-context objects, and spatial relationships.
Example: The rule “most crime cases that occurred within census tract No. 1 are close to Freya St (street)” is a spatial association rule discovered in a spatial database containing crime cases and map elements. The crime cases are event objects. Census tract No. 1 and Freya St are specific geo-context objects. The whole set of geo-context objects may include all the census tracts, streets and roads, and other map elements in the database. “Within” and “close to” are spatial relationships defined between the crime cases and the census tract and the street, respectively.
The rule in this example can be written as:
<Within, Tract1> -> <Close to, Freya St> (a%, l%)
where <Within, Tract1> is the condition of the rule, and <Close to, Freya St> is the prediction of the rule. In the parentheses, a% denotes the condition support, that is, the percentage of crime cases that satisfy the condition of the rule. The symbol l% denotes the rule lift, that is, the ratio of the confidence of the rule (the probability that the prediction is true given that the condition is true) to the prior probability of the prediction. Lift therefore measures the gain in prediction accuracy obtained by using the rule.
In short, a spatial association rule describes the spatial distribution pattern of a set of event objects by their spatial relationships with the geo-context objects. However, as analyzed by Dong et al. (2012b), existing spatial association rule mining approaches have a major limitation: they cannot effectively involve all available non-spatial information of the spatial objects. As a result, many interesting rules expressing richer information (e.g., the combinations of spatial and non-spatial information) cannot be found even if non-spatial information that could be useful for rule discovery is available.
The GSAR mining algorithm overcomes this significant shortcoming. GSAR can exploit all available information of the spatial objects, including spatial, non-spatial, and ontological information (represented by taxonomy). It has two advantages compared to existing spatial association rule mining approaches.
- More information can be effectively utilized and exploited for spatial association rule discovery than ever before.
- GSAR can be implemented by extending a traditional association rule mining algorithm, such as Apriori or FP-Growth, and no significant additional computation is introduced.
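One way to picture why a traditional miner can be reused is to flatten each event object into a set of items that mixes spatial relationships, non-spatial attributes, and taxonomy levels. The sketch below is an illustration of that idea only, not the GSAR algorithm of Dong et al.: the taxonomy, the attribute values, and the helper names `ancestors` and `to_itemset` are all hypothetical.

```python
# All names and values below are hypothetical illustrations.
TAXONOMY = {"Freya St": "Major street", "Tract1": "Census tract"}  # child -> parent
CONTEXT_ATTRIBUTES = {"Tract1": {"POPDEN": "Very high"}}  # non-spatial context data


def ancestors(name):
    """Walk the taxonomy upward and return every ancestor of name."""
    found = []
    while name in TAXONOMY:
        name = TAXONOMY[name]
        found.append(name)
    return found


def to_itemset(event):
    """Flatten one event object into a set of items for a standard miner."""
    # Non-spatial attributes of the event object itself.
    items = {f"{key}={value}" for key, value in event["attributes"].items()}
    for relation, target in event["relations"]:
        # Spatial relationship with a specific geo-context object.
        items.add(f"<{relation}, {target}>")
        # Ontological information: the same relation at each taxonomy level.
        for parent in ancestors(target):
            items.add(f"<{relation}, {parent}>")
        # Non-spatial attributes of the related geo-context object.
        for key, value in CONTEXT_ATTRIBUTES.get(target, {}).items():
            items.add(f"{key}={value}")
    return items


event = {
    "attributes": {"day": "Wednesday", "type": "Drugs"},
    "relations": [("Within", "Tract1"), ("Close to", "Freya St")],
}
for item in sorted(to_itemset(event)):
    print(item)
```

Once every event is represented this way, the resulting itemsets can be passed to any frequent-itemset miner, and rules such as POPDEN=Very high -> type=Drugs can mix spatial and non-spatial items.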
Use case – Spokane Crime Analysis
This use case is based on the real crime history of the city of Spokane, WA, USA, downloaded from City of Spokane Open GIS Data in March 2010 (the source is now deprecated). In this use case, GSAR mining is conducted to find crime patterns in different census tracts.
Data Overview
Here’s a map overview of the datasets. It’s a set of ESRI shapefiles. There are 816 crime incidents of three crime types (Drugs, Vehicle Theft, and Robbery), 10 census tracts, and 23 major streets in the datasets.

Figure 1: Map Overview of Crime Dataset
Event objects
In total there are 816 crime incidents of three types: Drugs (167 incidents), Vehicle Theft (552 incidents), and Robbery (97 incidents). The incidents were reported in the northeast part of the city (covering 10 census tracts) from January 2009 to March 2010 and constitute the crime history, where each incident is regarded as an event spatial object. The day of the week on which the incident happened and its general type are included in pattern mining as non-spatial attributes of event objects. Sample data is shown in Table 1.
Table 1: Sample Crime Data
| tid | day | type |
| --- | --- | --- |
| 4 | Wednesday | Drugs |
| 251 | Thursday | Robbery |
| 266 | Friday | VehicleTheft |
| … | … | … |
Geo-context Objects
A total of 10 census tracts and 23 major streets in the area are regarded as geo-context objects in the analysis. The original data is also in ESRI shapefile format. There are two shapefiles: One shapefile represents the census tract layer, and the other represents the street layer. The geometries of the census tracts (in polygon) and the streets (in polyline) are used to derive spatial features for each crime incident, including two spatial relationships: 1) within a tract, and 2) close to a street (defined by distance < 500 feet).
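The two spatial relationships can be derived with basic geometry. The following is a rough sketch in plain Python, assuming the coordinates are in a projected system measured in feet; the tract polygon, street polyline, and incident location are hypothetical, and a real workflow would read the geometries from the shapefiles with a GIS library.

```python
import math


def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(point, a, b):
    """Shortest distance from point to the segment a-b."""
    px, py = point
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(point, polyline):
    """Shortest distance from point to any segment of the polyline."""
    return min(distance_to_segment(point, a, b)
               for a, b in zip(polyline, polyline[1:]))


# Hypothetical geometries, in feet.
tract = [(0, 0), (4000, 0), (4000, 3000), (0, 3000)]
street = [(1000, -500), (1000, 3500)]
crime = (1300, 1500)

features = {
    "within_tract": point_in_polygon(crime, tract),
    "close_to_street": distance_to_polyline(crime, street) < 500,
}
print(features)  # {'within_tract': True, 'close_to_street': True}
```

The hypothetical incident lies inside the tract and 300 feet from the street, so both relationships hold under the 500-foot definition.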
The census tract data supplies the non-spatial attributes of the geo-context objects. The population density (per sq mile) and the ratio of males to females of each census tract are considered as contexts of crimes and included in the analysis, because, according to domain knowledge, these two attributes can influence the distribution of crime occurrences. The tracts’ non-spatial attributes are summarized in Table 2. To run an association rule mining algorithm, non-spatial attributes with continuous values need to be discretized. So, we derive POPDEN from the original POPDENSITY attribute and RMF from the original RATIO_MF attribute. POPDEN and RMF are used instead of the original attributes in rule mining.
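As an illustration of the discretization step, the sketch below bins a continuous attribute into five equal-width categories. The text does not state which binning method or cut points were used to derive POPDEN and RMF, so the equal-width scheme, the labels, and the density values here are assumptions for illustration only.

```python
def equal_width_bins(values, labels):
    """Assign each value to one of len(labels) equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels)
    binned = []
    for v in values:
        if width == 0:
            index = 0  # all values identical: use the first bin
        else:
            # The maximum value goes in the last bin rather than past it.
            index = min(int((v - lo) / width), len(labels) - 1)
        binned.append(labels[index])
    return binned


# Hypothetical labels and population densities (per sq mile).
labels = ["Very low", "Low", "Medium", "High", "Very high"]
popdensity = [1000.0, 2500.0, 4500.0, 6100.0, 9000.0]

popden = equal_width_bins(popdensity, labels)
print(popden)  # ['Very low', 'Very low', 'Medium', 'High', 'Very high']
```

After this step each tract carries a categorical value that can be treated as an item in rule mining, in the same way as the day of the week or the crime type.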
Table 2: Attributes of Census Tracts
Using SPSS Statistics to Analyze the data
All the crimes should be specified as event objects. Because all of the crimes share a similar data model, we can combine the three shapefiles into one. The census tracts and the streets can be specified as geo-context objects. We are interested in the influence of population density and the male to female ratio on crime, so in this rule mining, POPDEN and RMF are used as conditions and the crime type, OFFGEN, is used as the prediction.
Setting – SPSS Statistics Syntax
SPATIAL ASSOCIATION RULES
/MAPSPEC FILE='C:\Users\IBM_ADMIN\Desktop\crimeAnalysis.mplan'
/AUTOBINNING BINS=5
/AGGREGATION AGGMAP=YES CONTINUOUS=MEAN ORDINAL=MODE
/DATASET DATAID=Drug_1 Robbery_1 VehicleTheft_1 KEEP=OFFGEN(Drug_1,Robbery_1,VehicleTheft_1 STRING NOMINAL) PREDICTIONS=OFFGEN
/DATASET DATAID=Context CONDITIONS=MALE_FEMAL POPDEN_BIN
/DATASET DATAID=Context_1 CONDITIONS=Location
/RULEGENERATION MAXCONDITION=5 MAXPREDICTION=5 MINSUPPORT=0.1 MINCONDITIONSUPPORT=0.01 MINCONFIDENCE=0.7 MINLIFT=1 EXCLUDE=Location(Context_1) OFFGEN(EVENT) ; Location(Context_1) MALE_FEMAL(Context) ; Location(Context_1) POPDEN_BIN(Context)
/MODELTABLES FIELDTRANSFORMATION=NO RECORDSUMMARY=NO EVALUATION=YES ITEMFREQ=NO FIELDFREQ=YES EXCLUDEDINPUTS=NO
/MAPOUTPUT DISPLAY=YES CRITERION=CONFIDENCE NUMRULES=5
/WORDCLOUD DISPLAY=YES CRITERION=CONFIDENCE NUMRULES=10
/RULESTABLE DISPLAY=YES CRITERION=CONFIDENCE NUMRULES=30.
Output - Crime_Analysis_Output.spv
We are interested in the Rules Tables output and in the interactive bar chart and map of the top rules.
- Rules Tables - rules tables display the top rules and values for confidence, rule support, lift, condition support, and deployability. Each table is sorted by values of the selected criterion. You can display all rules or the top number of rules based on the selected criterion.
- Map - Interactive bar chart and map of the top rules based on the selected criterion. Each interactive output object contains the top rules for confidence, rule support, lift, condition support, and deployability. The selected criterion determines the list of rules that is displayed by default. You can select a different criterion interactively in the output. Max rules to display determines the number of rules that are displayed in the output.

The light green areas are context objects that do not meet the condition of the rule, meaning that the population density of the area is not very high. The dark green areas are context objects that meet the condition, meaning that the population density is very high. The blue points are events that satisfy both the condition and the prediction, that is, vehicle thefts that occurred in very high population density areas. From the rule, we can say that vehicle thefts in very high population density areas account for 19.49% of the crimes (rule support), and that a crime in a very high population density area is a vehicle theft with 72.94% confidence. The red points are events that satisfy only the prediction, that is, vehicle thefts that occurred in areas where the population density is not very high.
In this analysis, only the “Vehicle theft” crime type appears in the rules, because we set the minimum confidence to 70%, which means that a rule must identify the crime type of a given area with at least 70% confidence. In the raw data set, however, Drugs accounts for only about 20% of the incidents (167 of 816) and Robbery for only about 12% (97 of 816). There is no area in which more than 70% of the crimes are Drugs or Robbery. To find rules about Drugs and Robbery, we have to lower the minimum confidence to 10% (MINCONFIDENCE=0.1) and the minimum rule support to 5% (MINSUPPORT=0.05). Let’s see what happens.
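The effect of the confidence threshold can be seen from the incident counts alone. The short calculation below uses the counts given in the Event objects section; the variable names are chosen here for illustration. Because lift is confidence divided by the prior probability of the prediction, the lift a rule would need in order to reach 70% confidence is 0.7 divided by the overall share of that crime type.

```python
# Incident counts as reported in the Event objects section.
counts = {"Drugs": 167, "Vehicle Theft": 552, "Robbery": 97}
total = sum(counts.values())  # 816

MIN_CONFIDENCE = 0.7

shares = {crime: n / total for crime, n in counts.items()}
# Lift a rule would need so that confidence = lift * share reaches the threshold.
required_lift = {crime: MIN_CONFIDENCE / share for crime, share in shares.items()}

for crime in counts:
    print(f"{crime}: share {shares[crime]:.1%}, "
          f"lift needed for 70% confidence {required_lift[crime]:.2f}")
```

Vehicle Theft already makes up about two thirds of the incidents, so a lift only slightly above 1 is enough to pass the threshold, whereas a Drugs rule would need a lift above 3 and a Robbery rule a lift close to 6.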
SPATIAL ASSOCIATION RULES
/MAPSPEC FILE='C:\Users\IBM_ADMIN\Desktop\crimeAnalysis.mplan'
/AUTOBINNING BINS=5
/AGGREGATION AGGMAP=YES CONTINUOUS=MEAN ORDINAL=MODE
/DATASET DATAID=Drug_1 Robbery_1 VehicleTheft_1 KEEP=OFFGEN(Drug_1,Robbery_1,VehicleTheft_1 STRING NOMINAL) PREDICTIONS=OFFGEN
/DATASET DATAID=Context CONDITIONS=MALE_FEMAL POPDEN_BIN
/DATASET DATAID=Context_1 CONDITIONS=Location
/RULEGENERATION MAXCONDITION=5 MAXPREDICTION=5 MINSUPPORT=0.05 MINCONDITIONSUPPORT=0.01 MINCONFIDENCE=0.1 MINLIFT=1 EXCLUDE=Location(Context_1) OFFGEN(EVENT) ; Location(Context_1) MALE_FEMAL(Context) ; Location(Context_1) POPDEN_BIN(Context)
/MODELTABLES FIELDTRANSFORMATION=NO RECORDSUMMARY=NO EVALUATION=YES ITEMFREQ=NO FIELDFREQ=YES EXCLUDEDINPUTS=NO
/MAPOUTPUT DISPLAY=YES CRITERION=LIFT NUMRULES=5
/WORDCLOUD DISPLAY=YES CRITERION=LIFT NUMRULES=10
/RULESTABLE DISPLAY=YES CRITERION=LIFT NUMRULES=10.
Output - Crime_Analysis_Output_lift.spv

Based on the results for rule 8, we can say that drug crimes in high population density areas with a low male to female ratio account for 5.88% of the crimes. In those areas, we have only 22.75% confidence that the crime type is Drugs. In fact, most crimes in these areas are vehicle thefts.
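The two figures quoted for rule 8 are enough to back out the other measures. This sketch assumes that 5.88% is the rule support and 22.75% is the confidence, and it uses the Drugs share of all incidents (167 of 816) as the prior probability; the function name is chosen here for illustration, and the derived values are arithmetic consequences of the definitions, not numbers read from the SPSS output.

```python
def derived_measures(rule_support, confidence, prior):
    """Back out condition support and lift from the quoted rule figures."""
    # confidence = rule support / condition support
    condition_support = rule_support / confidence
    # lift = confidence / prior probability of the prediction
    lift = confidence / prior
    return condition_support, lift


drugs_prior = 167 / 816  # share of Drugs among all incidents
condition_support, lift = derived_measures(0.0588, 0.2275, drugs_prior)

print(f"condition support: {condition_support:.1%}")
print(f"lift: {lift:.2f}")
```

Under these assumptions the lift is only slightly above 1, which matches the reading above: the condition raises the chance of a Drugs incident a little, but Drugs remains far from the most likely crime type in those areas.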

More information about SPSS GSAR
You can find more information about product integration with the SPSS UI
Spark and Python API
The following videos provide additional information about applying GSAR:
1. Using the Generalized Spatial Association Rule in IBM SPSS Statistics 23
2. Geospatial Analytics with IBM SPSS Modeler
3. Apriori Algorithm (Associated Learning) - Fun and Easy Machine Learning
Use the following websites to download geospatial data: