SPSS Statistics

  • 1.  Assistance With Large Data Analysis

    Posted Mon May 08, 2023 11:30 AM

    Hello, 

    I am having difficulty analysing a large crime data set. Each record contains a sanitised ID number, the offence committed, the date of the offence and the outcome of that offence, covering a five-year period (01/01/2018 - 31/12/2022).

    My aim is to identify, for each ID, which offence occurred first, what its outcome was, and whether the person re-offended within one year of that date. If they did re-offend, I want to repeat the check from the second offence to see whether they offended a third time, and so on. From this I want, for each outcome, a count of how many times people did and did not re-offend.

    I have attempted this in SPSS but struggled because of my limited knowledge of the software. Is this possible in SPSS and, if so, what would the syntax be?

    Thank you for any assistance.

    #SPSS Statistics



    ------------------------------
    Mitchell Wills
    ------------------------------


  • 2.  RE: Assistance With Large Data Analysis

    Posted Tue May 09, 2023 09:11 AM

    Suggestion:
    * Sample data with one row per offence and dates as yyyy-mm-dd .
    data list list / row_id (F) offender_id (F) date_of_offence (SDATE) .
    begin data .
    1 1 2021-07-22
    2 1 2021-09-10
    3 1 2021-10-01
    4 1 2023-04-13
    5 1 2023-05-10
    6 2 2022-10-01
    7 2 2023-02-01
    end data .
    * Put the offences of each offender in date order .
    sort cases by offender_id date_of_offence .
    * By default every offence starts a new sequence of its own .
    compute sequence_id = row_id .
    compute sequence_number = 1 .
    * Same offender and at most 365 days after the previous offence: continue that sequence .
    do if offender_id = lag(offender_id) & date_of_offence - lag(date_of_offence) <= time.days(365) .
    - compute sequence_id = lag(sequence_id) .
    - compute sequence_number = lag(sequence_number) + 1 .
    end if .
    execute .
    list variables = all .
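    The code gives a common sequence_id to each chain of offences in which every offence falls within 365 days of the previous offence by the same offender. sequence_number is the position of an offence in its chain (1 = first offence, 2 = first re-offence, 3 = second re-offence, and so on). Does this help?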
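
    One possible way to get from these sequences to the per-outcome counts asked for in the original question is sketched below. It is a minimal, untested sketch rather than a definitive solution. The variable outcome and its codes 1 and 2 are made up for illustration (the sample data above has no outcome column), and sequence_length and reoffended are helper names introduced here. The idea is that an offence was followed by a re-offence within 365 days exactly when it is not the last offence in its sequence.

    ```spss
    * Hypothetical sample: the rows from the example above plus a made-up outcome code .
    data list list / row_id (F) offender_id (F) date_of_offence (SDATE) outcome (F) .
    begin data .
    1 1 2021-07-22 1
    2 1 2021-09-10 2
    3 1 2021-10-01 1
    4 1 2023-04-13 2
    5 1 2023-05-10 1
    6 2 2022-10-01 1
    7 2 2023-02-01 2
    end data .
    value labels outcome 1 'Outcome A (placeholder)' 2 'Outcome B (placeholder)' .
    * Build the sequences in the same way as the syntax above .
    sort cases by offender_id date_of_offence .
    compute sequence_id = row_id .
    compute sequence_number = 1 .
    do if offender_id = lag(offender_id) & date_of_offence - lag(date_of_offence) <= time.days(365) .
    - compute sequence_id = lag(sequence_id) .
    - compute sequence_number = lag(sequence_number) + 1 .
    end if .
    execute .
    * Attach the length of its sequence to every offence .
    aggregate
      /outfile = * mode = addvariables
      /break = sequence_id
      /sequence_length = max(sequence_number) .
    * An offence was followed by a re-offence if it is not the last one in its sequence .
    compute reoffended = (sequence_number < sequence_length) .
    value labels reoffended 0 'No re-offence within 365 days' 1 'Re-offended within 365 days' .
    * Count re-offended versus not re-offended for each outcome .
    crosstabs /tables = outcome by reoffended /cells = count row .
    ```

    Offences in the final 365 days of the data cannot be followed for a full year, so it may be worth excluding them (for example with SELECT IF on date_of_offence) before interpreting the "no re-offence" counts.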



    ------------------------------
    Kai Borgolte
    ------------------------------