SPSS Statistics

  • 1.  extracting mixed alphanumeric string

    Posted Thu September 26, 2024 09:12 AM

    I need to extract two components from an alphanumeric column to create two new columns, as seen below.  The source column always starts with a defined string, which ends where the course number begins (the course number can have a letter or letters appended to it).  Any help would be greatly appreciated.



    ------------------------------
    David Wright
    ------------------------------


  • 2.  RE: extracting mixed alphanumeric string

    Posted Thu September 26, 2024 09:58 AM
    Here is an easy way to do this.  It uses the SPSSINC TRANS extension command, so install that via Extensions > Extension Hub if you don't already have it.
    spssinc trans result=subj cata type=10 10
        /formula "re.match(r'(\D+)(\d.+)', code).groups()".

    It splits the input variable, code, into two parts: the leading run of nondigit characters, up to but not including the first digit, and the rest of the string starting at that digit.  I set the output string variables, subj and cata, to be 10 bytes long, but you can change that as needed.
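
    To see what the pattern does outside SPSS, here is a minimal Python sketch that applies the same regular expression to a few course codes.  The sample values and the helper name split_code are hypothetical, made up for illustration; only the pattern itself comes from the formula above.

```python
import re

# Same pattern as in the formula: one or more nondigits,
# then a digit followed by at least one more character.
PATTERN = re.compile(r'(\D+)(\d.+)')

def split_code(code):
    """Return (subject, catalog) as a tuple, or None if the pattern does not match."""
    m = PATTERN.match(code)
    return m.groups() if m else None

# Hypothetical sample values, not taken from the original data.
samples = ["BIOL101", "CHEM220L", "ENGL1A", "MATH5"]
results = [split_code(s) for s in samples]

for s, r in zip(samples, results):
    print(s, "->", r)
```

    Note that the last sample returns None: because the second group is \d.+, the course number must be at least two characters long.  In the formula as posted, calling .groups() on a non-match would raise an error, so if your data can contain such values, a pattern ending in \d.* may be a safer choice.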






  • 3.  RE: extracting mixed alphanumeric string

    Posted Thu September 26, 2024 11:04 AM

    Jon,

    Perfect.  I ran the extension and it cleanly parsed the data into the two columns I need.  As always, you are a lifesaver.  Thank you!

    David



    ------------------------------
    David Wright
    ------------------------------



  • 4.  RE: extracting mixed alphanumeric string

    Posted Thu September 26, 2024 01:08 PM
    Thanks.  This problem could be solved in traditional syntax in a half dozen lines, but IMO the pattern matching approach I used is better.  Regular expressions are very powerful, but they can be abused: if they are too complex, they are unreadable.  The difference is that with traditional syntax, you say how to compute the answer; with a regular expression, you say what you want to compute and let the re engine figure out how to compute it.
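
    The "how" versus "what" contrast can be sketched in Python.  This is only an illustration, not the traditional SPSS syntax mentioned above: the function names are hypothetical, and the pattern is a slightly relaxed variant (\D* and \d.*) of the one posted earlier, so that both functions handle edge cases the same way.  It assumes plain ASCII digits.

```python
import re

def split_procedural(code):
    """Say *how*: walk the string until the first digit, then cut there."""
    for i, ch in enumerate(code):
        if ch.isdigit():
            return code[:i], code[i:]
    return code, ""  # no digit found: everything is the subject part

def split_declarative(code):
    """Say *what*: nondigits, then a digit and whatever follows it."""
    m = re.match(r'(\D*)(\d.*)', code)
    return m.groups() if m else (code, "")

# Hypothetical sample values; both approaches should agree on each one.
for sample in ["CHEM220L", "MATH5", "ABC"]:
    print(sample, split_procedural(sample), split_declarative(sample))
```

    The procedural version spells out the scan and the cut; the regular expression only describes the shape of the two parts and leaves the search to the engine.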