SPSS Statistics

  • 1.  Regarding string variable

    Posted Thu January 04, 2024 09:56 AM

    Hi,

    I have a string variable (blood pressure) that contains only digits, so it can easily be converted to numeric. Each value holds the systolic and diastolic readings run together: "12080" means the blood pressure is 120/80, and "180110" means it is 180/110. I want to create two new variables called "systolic_blood_pressure" and "diastolic_blood_pressure". The systolic variable should take the first two digits of (blood pressure) when the value has 4 digits, and the first three digits when it has 5 or 6 digits. The diastolic variable should take the last two digits when the value has 4 or 5 digits, and the last three digits when it has 6 digits. I need syntax to do that. I have been struggling with this problem for a month. Any help will be appreciated.
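
    The rule described above depends only on the number of digits, so it can be sketched in a few lines of Python. This is an illustration of the rule as posted, not SPSS syntax; the function name `split_by_length` is hypothetical, and the sketch assumes every value is a clean 4-, 5-, or 6-digit string.

```python
def split_by_length(bp: str) -> tuple[int, int]:
    """Split a combined reading using only its digit count (the rule as posted)."""
    digits = bp.strip()
    if not digits.isdigit() or len(digits) not in (4, 5, 6):
        raise ValueError(f"expected 4 to 6 digits, got {bp!r}")
    # 4 digits -> 2-digit systolic; 5 or 6 digits -> 3-digit systolic.
    cut = 2 if len(digits) == 4 else 3
    return int(digits[:cut]), int(digits[cut:])


for raw in ("9075", "12080", "180110"):
    print(raw, split_by_length(raw))
```

    Note that under this rule a 5-digit value is always read as a 3-digit systolic followed by a 2-digit diastolic.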

    Khalid



    ------------------------------
    Khalid Orayj
    ------------------------------


  • 2.  RE: Regarding string variable

    Posted Thu January 04, 2024 11:57 AM
    Edited by David Dwyer Thu January 04, 2024 11:58 AM

    Hi @Khalid Orayj,

    You've really hamstrung yourself with the data entry. Here is one approach to resolving your issue. As I saw it, 4-digit and 6-digit values were easy, but 5-digit values presented a problem. So I arbitrarily decided to parse the value differently when the leftmost digit of the 'bp' variable is 1 (100 <= systolic < 200) than when it is any other digit. If a valid systolic value of 200 or more is actually possible, my approach will still have problems for you.
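
    To make the 5-digit problem concrete, here is a rough Python sketch (not the syntax in the attachment, which is not reproduced here). The helper names `candidate_splits` and `pick_by_leading_digit` are hypothetical. The first lists every way to cut a value into two parts of 2 or 3 digits each; the second applies the leading-digit rule described above.

```python
def candidate_splits(bp: str) -> list[tuple[str, str]]:
    """List every split where both parts are 2 or 3 digits long."""
    digits = bp.strip()
    splits = []
    for cut in (2, 3):
        left, right = digits[:cut], digits[cut:]
        if len(right) in (2, 3):
            splits.append((left, right))
    return splits


def pick_by_leading_digit(bp: str) -> tuple[int, int]:
    """Leading '1' -> 3-digit systolic; any other leading digit -> 2-digit."""
    digits = bp.strip()
    cut = 3 if digits.startswith("1") else 2
    return int(digits[:cut]), int(digits[cut:])


for raw in ("9075", "12080", "99101", "180110"):
    print(raw, candidate_splits(raw), pick_by_leading_digit(raw))
```

    Only the 5-digit values produce two candidates, which is why they need an extra rule. A value such as "20090" also shows the stated limitation: the leading-digit rule reads it as 20/90 rather than 200/90.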



    ------------------------------
    David Dwyer
    SPSS Technical Support
    IBM Software
    ------------------------------

    Attachment(s): Khalid.zip (zip, 6 KB, 1 version)


  • 3.  RE: Regarding string variable

    Posted Fri January 05, 2024 02:58 PM

    You could also do something like this.

    * Sample data covering 4-, 5-, and 6-digit readings.
    DATASET CLOSE ALL.
    DATA LIST FREE / id (F1) bp (A6).
    BEGIN DATA
    1 "9075"
    2 "12080"
    3 "99101"
    4 "180110"
    END DATA.
    LIST.
    NUMERIC systolic diastolic (F3).
    * A leading 1 or 2 means a three-digit systolic; otherwise it has two digits.
    DO IF ANY(CHAR.SUBSTR(bp,1,1),'1','2').
      COMPUTE systolic = NUMBER(CHAR.SUBSTR(bp,1,3),F3).
      COMPUTE diastolic = NUMBER(CHAR.SUBSTR(bp,4),F3).
    ELSE.
      COMPUTE systolic = NUMBER(CHAR.SUBSTR(bp,1,2),F3).
      COMPUTE diastolic = NUMBER(CHAR.SUBSTR(bp,3),F3).
    END IF.
    EXECUTE.



    ------------------------------
    Art Jack
    ------------------------------



  • 4.  RE: Regarding string variable

    Posted Fri January 05, 2024 04:45 PM

    Thanks @Art Jack!
    As usual, I'm over-thinking.  That's pretty elegant.



    ------------------------------
    David Dwyer
    SPSS Technical Support
    IBM Software
    ------------------------------



  • 5.  RE: Regarding string variable

    Posted Fri January 05, 2024 05:52 PM

    Just for fun, here is a one-liner to do this using the SPSSINC TRANS extension command.

    spssinc trans result = systolic diastolic
        /formula "(bp[:3], bp[3:]) if str.startswith(bp, '1') else (bp[:2], bp[2:])".

    The resulting values are numeric.  If SPSSINC TRANS is not already installed or is not the latest version, it can be installed or updated via Extensions > Extension Hub.
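
    The conditional expression inside the formula string is ordinary Python, so its splitting logic can be tried outside SPSS. The sketch below wraps the same expression in a hypothetical function named `formula` and converts the pieces to integers by hand; it assumes the values carry no trailing blanks and says nothing about how SPSSINC TRANS itself passes strings.

```python
def formula(bp: str) -> tuple[str, str]:
    # Same conditional expression as in the /formula string above.
    return (bp[:3], bp[3:]) if str.startswith(bp, '1') else (bp[:2], bp[2:])


for raw in ("9075", "12080", "99101", "180110"):
    systolic, diastolic = (int(part) for part in formula(raw))
    print(raw, systolic, diastolic)
```

    As written, the expression tests only for a leading '1', so a systolic value of 200 or more would be cut after two digits.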


  • 6.  RE: Regarding string variable

    Posted Sun January 07, 2024 08:22 PM

    In my world the systolic bp must be higher than the diastolic bp. I don't understand why the data are stored as strings when they are clearly numeric.

    Entering 5 or 6 digits in a single field carries a high risk of errors, such as mistyped or omitted digits.

    You need to check for invalid and non-physiological values.

    ******************************.

    DATASET CLOSE ALL.

    * Note: Systolic bp is higher than Diastolic bp *.
    * Case 3 is invalid! *.

    DATA LIST FREE/ id (F1) bp (A6).
    BEGIN DATA
    1 "9075"
    2 "12080"
    3 "99101"
    4 "180110"
    END DATA.

    * probably should be using a copy to preserve original data... *.

    ALTER TYPE bp (F8) /PRINT NONE.

    DO IF (BP GT 99999).
      COMPUTE Systolic  = TRUNC(BP / 1000).
      COMPUTE Diastolic = BP - Systolic * 1000.
    ELSE.
      COMPUTE Systolic  = TRUNC(BP / 100).
      COMPUTE Diastolic = BP - Systolic * 100.
    END IF.

    EXECUTE.

    ******************************.
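
    The same arithmetic, plus the kind of check mentioned above, can be sketched in Python. The names `split_numeric` and `is_plausible` are hypothetical, and the range limits are placeholder assumptions chosen only for illustration, not clinical thresholds.

```python
# Placeholder limits (assumptions for illustration only).
SYSTOLIC_RANGE = (50, 300)
DIASTOLIC_RANGE = (20, 200)


def split_numeric(bp: int) -> tuple[int, int]:
    """Arithmetic split: divide by 1000 for 6-digit values, else by 100."""
    divisor = 1000 if bp > 99999 else 100
    return divmod(bp, divisor)  # (quotient, remainder)


def is_plausible(systolic: int, diastolic: int) -> bool:
    """Require systolic > diastolic and both inside the placeholder ranges."""
    return (
        systolic > diastolic
        and SYSTOLIC_RANGE[0] <= systolic <= SYSTOLIC_RANGE[1]
        and DIASTOLIC_RANGE[0] <= diastolic <= DIASTOLIC_RANGE[1]
    )


for raw in (9075, 12080, 99101, 180110):
    systolic, diastolic = split_numeric(raw)
    print(raw, systolic, diastolic, is_plausible(systolic, diastolic))
```

    With this split, 99101 becomes 991/1. That pair passes a bare systolic > diastolic comparison, so a range check is needed as well to flag it.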



    ------------------------------
    Peder Rogmark
    ------------------------------



  • 7.  RE: Regarding string variable

    Posted Sun January 07, 2024 08:23 PM

    You will have to check the data for invalid numbers. Systolic blood pressure is always higher than diastolic (see case #3).

    /PR

    dataset close all.

    DATA LIST FREE/ id (F1) bp (A6).
    BEGIN DATA
    1 "9075"
    2 "12080"
    3 "99101"
    4 "180110"
    END DATA.

    * Note:  Systolic >= Diastolic *.
    * Case 3 is invalid! *.

    ALTER TYPE bp (F8) /PRINT NONE.

    DO IF (BP GT 99999).
      COMPUTE Systolic  = TRUNC(BP / 1000).
      COMPUTE Diastolic = BP - Systolic * 1000.
    ELSE.
      COMPUTE Systolic  = TRUNC(BP / 100).
      COMPUTE Diastolic = BP - Systolic * 100.
    END IF.
    EXECUTE.



    ------------------------------
    Peder Rogmark
    ------------------------------