Programming Languages on Power

Power Programming Languages

IBM Power, including the AIX, IBM i, and Linux operating systems, support a wide range of programming languages, catering to both traditional enterprise applications and modern development needs.


#Power

 View Only
Expand all | Collapse all

distinct not returning unique results

  • 1.  distinct not returning unique results

    Posted Fri August 18, 2023 04:22 PM

    I have the following SQL script with a distinct on pm.ssno:

    select
       distinct(substr(right(digits(pm.ssno), 9), 1, 3) concat '-' 
          concat substr(right(digits(pm.ssno), 9), 4, 2) concat '-' 
          concat substr(right(digits(pm.ssno), 9), 6, 4)) as "SSN",
       digits(pm.empnum) as "Employee Number",
       nm.fn as "First Name",
       nm.in as "Middle Initial",
       nm.ln as "Last Name",
       nm.sx as "Suffix",
       pm.addr as "Home Address Line 1",

    .

    .

    from gpl.paymasvw pm 
    join gpl.prsnamevw nm on nm.ssno = pm.ssno
    left join dlsf.emsemps em on ucase(em.FrstName) = nm.fn and ucase(em.LastName) = nm.ln
    where dot >= (bigint(current_date) - 5000) or dot = 0
    order by SSN
    ;    

    Here is the result set - can't show ssn. you will just have to trust me that they are the same...

    I may be missing something about the distinct statement, but not sure what



    ------------------------------
    David Strawn
    ------------------------------

    #SQL


  • 2.  RE: distinct not returning unique results

    Posted Fri August 18, 2023 04:59 PM

    Distinct cannot be placed upon a subset of columns being selected.
    It can only act upon the whole row you are selecting.

    If the "Employee Number" was omitted from the selection, then we would only receive 1 record.
    The query would find that both rows being returned match, so it would remove the duplicate.

    -Mike Z



    ------------------------------
    Mike Zaringhalam
    ------------------------------



  • 3.  RE: distinct not returning unique results

    Posted Sat August 19, 2023 05:12 AM

    DISTINCT removes duplicates if all column values of two or more rows are identical.

    If I look at the result. In the second column you have different values in the first and second row.

    So DISTINCT works correct.



    ------------------------------
    Birgitta Hauser
    Database and Software Engineer
    Selfemployed - Modernization-Education-Consulting on IBM i
    Kaufering
    +49 170 5269964
    ------------------------------



  • 4.  RE: distinct not returning unique results

    Posted Mon August 21, 2023 03:24 PM

    I have specified distinct on SSN, so I guess it is not working wrong, just not how I specified...  is there a way to get only the first unique SSN occurrence, and use the columns from that occurrence and drop the other occurrences, like a chain and update when not found in RPG?



    ------------------------------
    David Strawn
    ------------------------------



  • 5.  RE: distinct not returning unique results

    Posted Mon August 21, 2023 09:55 PM

    DISTINCT applies to ALL columns you specify in SELECT.  If you want a distinct SSN, then you need to specify only SSN column in SELECT but this is not what you want.   For what you want, in DB2 for i, there is this FETCH FIRST n ROWS ONLY for use.  If you want only the very first row of the result set, you use FETCH FIRST ROW ONLY at the end of SELECT.



    ------------------------------
    Education is not the learning of facts but the training of the mind to think. -- Albert Einstein.
    ------------------------------
    Satid S.
    ------------------------------



  • 6.  RE: distinct not returning unique results

    Posted Tue August 22, 2023 10:08 AM

    fetch first row only gives you just 1 row in the file.  I want a result set that includes the first row only of each multiple occurrence of an SSN...



    ------------------------------
    David Strawn
    ------------------------------



  • 7.  RE: distinct not returning unique results

    Posted Wed August 23, 2023 12:21 PM

    To perform a 1 row join, look into a lateral join while specifying 'limit 1' within the lateral query.
    Now, since employee number is within the same table as the SSN, i'm not sure how you want to handle that.
    it seems you must select it, but that column is the reason your distinct is not working as you expect.
    If you don't need it, omit it and the original select distinct should work.
    If you need a list of employee numbers along with the distinct ssn, then use LISTAGG().
    Otherwise, use the lateral join method and you will get 1 of the employee numbers.
    If you need to control which one you get, add an order by to the lateral query as well.



    ------------------------------
    Mike Zaringhalam
    ------------------------------



  • 8.  RE: distinct not returning unique results

    Posted Mon August 21, 2023 09:59 PM

    This article may also give you more choices to consider : How to Join Only the First Row in SQL at https://learnsql.com/blog/sql-join-only-first-row/



    ------------------------------
    Education is not the learning of facts but the training of the mind to think. -- Albert Einstein.
    ------------------------------
    Satid S.
    ------------------------------



  • 9.  RE: distinct not returning unique results

    Posted Thu August 24, 2023 10:41 AM

    Thanks for the link to the article!  I used solution 1 to solve my problem...



    ------------------------------
    David Strawn
    ------------------------------



  • 10.  RE: distinct not returning unique results

    Posted Fri August 25, 2023 02:11 PM

    FYI here is the option I ended up using:

    with serialized as (
       select 
          pm.*, 
          row_number() over (
             partition by pm.ssno order by pm.ssno
          ) as rownum
       from gpl.paymasvw pm
    )
    select 
       *
    from serialized
    where (dot >= (bigint(current_date) - 5000) or dot = 0)
       and substr(digits(empnum), 1, 1)!= '9' and rownum = 1
    ;



    ------------------------------
    David Strawn
    ------------------------------



  • 11.  RE: distinct not returning unique results

    Posted Tue August 22, 2023 08:14 AM

    The follow-up question is "why does distinct allow a column name if it applies to the whole row only?"



    ------------------------------
    Patrick Conner
    ------------------------------



  • 12.  RE: distinct not returning unique results

    Posted Tue August 22, 2023 10:11 AM

    correct.  why would it not throw an error like sum()...



    ------------------------------
    David Strawn
    ------------------------------



  • 13.  RE: distinct not returning unique results

    Posted Tue August 22, 2023 10:14 AM

    It does not apply to the whole row only the selected columns.

    It is the same as:

    SELECT Col1, Col2, Col3 ... ColN
    From Table
    Group By Col1, Col2, Col3, ... ColN


    ------------------------------
    Birgitta Hauser
    Database and Software Engineer
    Selfemployed - Modernization-Education-Consulting on IBM i
    Kaufering
    +49 170 5269964
    ------------------------------



  • 14.  RE: distinct not returning unique results

    Posted Wed August 23, 2023 08:11 AM

    Birgitta,

    The result row...
    Based on your example you could write a query such as:

    select distinct( col1 ), col2, col3 ... colN
    from Table

    "Distinct" doesn't apply to the uniqueness of col1, but all selected columns. Why allow the statement to be written this way? It is misleading.



    ------------------------------
    Patrick Conner
    ------------------------------



  • 15.  RE: distinct not returning unique results

    Posted Wed August 23, 2023 09:05 AM

    DISTINCT is a Predicate and NOT a function. If you specify a parenthesis around COL1, so it is just a parenthesis around a column/value, nothing else.

    For everything else you may ask the guys who describe the SQL Standard.



    ------------------------------
    Birgitta Hauser
    Database and Software Engineer
    Selfemployed - Modernization-Education-Consulting on IBM i
    Kaufering
    +49 170 5269964
    ------------------------------



  • 16.  RE: distinct not returning unique results

    Posted Wed August 23, 2023 12:13 PM

    As Birgitta said, Distinct is NOT a function nor can it be used on a subset of columns being selected.
    Once this is known, queries written with immediate parenthesis will no longer be misleading.

    What if you were writing a query like this?
    'select distinct (select 1 from table1 where ...), field2, field3 from table1 where...'

    Sure, this may look misleading, but if we didn't allow parenthesis, then contained logic or sub-selects wouldn't be allowed when using distinct.



    ------------------------------
    Mike Zaringhalam
    ------------------------------



  • 17.  RE: distinct not returning unique results

    Posted Thu August 24, 2023 09:19 AM

    Distinct, when used in column selection, is confusing. I've instructed my shop to not use it and specify the Group By clause. I haven't thought much about the value of 'Is Distinct From' (Distinct predicate). My advice to David is to do the same and use Group By.



    ------------------------------
    Patrick Conner
    ------------------------------



  • 18.  RE: distinct not returning unique results

    Posted Thu August 24, 2023 10:40 AM

    agreed!



    ------------------------------
    David Strawn
    ------------------------------



  • 19.  RE: distinct not returning unique results

    Posted Thu August 24, 2023 11:41 AM
    Edited by David Strawn Thu August 24, 2023 11:42 AM

    I may be thinking too much like an RPG programmer, but I think it would be helpful to have a conditional group by/distinct, where you can do something like ...group by ssno use first occurence..., this would enable you to use the non-grouped columns.  i know that implies a non normalized database, but in my world that is common.  I know there are other ways to get around this, but for a one off sql select script, it would be nice...   imo



    ------------------------------
    David Strawn
    ------------------------------



  • 20.  RE: distinct not returning unique results

    Posted Tue August 29, 2023 10:42 AM

    David, you're not the only one. There's an Idea at https://ideas.ibm.com/search?query=IBMI-I-3674 that I think does what you want. Feel free to upvote it.



    ------------------------------
    Tim Clark
    DB2 for IBM i / SQL Optimizer
    ------------------------------