This is to let you know that a new dataset called Chicago_Crimes_2001_to_2017 has been released on the datasets page of SPMF.
This dataset can be used for high utility itemset mining and frequent itemset mining.
To download the Chicago dataset, see the Datasets page and select either the version of the dataset for high utility itemset mining or frequent itemset mining.
This Chicago dataset was obtained from UCI and converted by Chongjie Zhang to a format that is suitable for itemset mining and donated to SPMF.
Here is a brief description of the version of the dataset for high utility itemset mining. It contains 2,662,309 transactions and 35 items, and it has real utility values.
The dataset records the crimes that occurred in Chicago from 2001 to 2017.
Every transaction corresponds to a <month, area> pair. A transaction describes the crimes that occurred in a specific area during a specific month. The utility of an item is the number of times that crime occurred, and the names of the items are shown in the 'NAMES'.
For example, '1 2:4:2 2' means that the crime 'THEFT' (item 1) occurred twice and the crime 'OTHER OFFENSE' (item 2) occurred twice in the <month, area> represented by this transaction, for a total utility of 4. Here are the definitions of the items:
2: OTHER OFFENSE
3: OFFENSE INVOLVING CHILDREN
4: CRIM SEXUAL ASSAULT
5: MOTOR VEHICLE THEFT
6: SEX OFFENSE
7: DECEPTIVE PRACTICE
10: WEAPONS VIOLATION
11: PUBLIC PEACE VIOLATION
15: LIQUOR LAW VIOLATION
16: INTERFERENCE WITH PUBLIC OFFICER
17: CRIMINAL DAMAGE
21: CRIMINAL TRESPASS
28: DOMESTIC VIOLENCE
29: OTHER NARCOTIC VIOLATION
30: PUBLIC INDECENCY
32: HUMAN TRAFFICKING
33: CONCEALED CARRY LICENSE VIOLATION
34: NON - CRIMINAL
35: NON-CRIMINAL (SUBJECT SPECIFIED)
For frequent itemset mining, the dataset is the same except that there are no utility values.
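
To make the utility format more concrete, here is a minimal Python sketch that reads one transaction line such as the example '1 2:4:2 2'. It assumes that each line has three parts separated by ':' (the items, the total utility of the transaction, and the utility of each item) and that the total equals the sum of the item utilities, as the example suggests. The function name parse_utility_transaction and the small ITEM_NAMES table are hypothetical and only for illustration. They are not part of SPMF.

```python
def parse_utility_transaction(line):
    """Split 'items:total_utility:item_utilities' into a dict and a total.

    Assumption: the three parts are separated by ':' and the values inside
    each part are separated by single spaces, as in '1 2:4:2 2'.
    """
    items_part, total_part, utilities_part = line.strip().split(":")
    items = [int(tok) for tok in items_part.split()]
    utilities = [int(tok) for tok in utilities_part.split()]
    total = int(total_part)
    if len(items) != len(utilities):
        raise ValueError("the number of items and utilities differ")
    # Assumption: the transaction utility is the sum of the item utilities.
    if sum(utilities) != total:
        raise ValueError("item utilities do not sum to the transaction utility")
    return dict(zip(items, utilities)), total


# Hypothetical partial name table, covering only the two items of the example.
ITEM_NAMES = {1: "THEFT", 2: "OTHER OFFENSE"}

counts, total = parse_utility_transaction("1 2:4:2 2")
for item, count in counts.items():
    print(ITEM_NAMES.get(item, f"item {item}"), count)
```

For the frequent itemset mining version, the same idea should apply with only the list of items on each line, so the split on ':' would not be needed.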