Cognos Analytics

Cognos Analytics

Connect, learn, and share with thousands of IBM Cognos Analytics users! 

 View Only
Expand all | Collapse all

Join with many records e return exception

  • 1.  Join with many records e return exception

    Posted Mon June 05, 2023 04:24 PM

    Hello everyone, how are you?

    I need to do a join between two datasets, A and B, and I would like the most performative way for this case.

    • Dataset A - Participants (40.000 rows) 
    • Dataset B - Clients (5.000.000 rows)

    Participants (ID - NAME)
    789465   JONH
    987465   MARIAN
    147789   JESSYCA

    Client (ID - NAME)
    963258   JONH

    987465   MARIAN
    321456   ROSE

    I need it to return all clients (dataset B), except participants (dataset A).



    ------------------------------
    Gustavo Andrade
    ------------------------------


  • 2.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 05:35 AM

    Hi Gustavo - I don't know if it's the most performant way of doing it, but I'd create a left outer join from dataset B to dataset A on Client ID = Participant ID, then filter for Participant ID is missing



    ------------------------------
    Chris Turner
    ------------------------------



  • 3.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 09:53 AM

    Hi,

    If you are looking for an Except (rather than a Join), and the data sets have the same number of items (with the same basic data types), you can use an Except in various places in Cognos Analytics.

    • In Framework Manager, you can create a Query Set (Actions > Define Query Set) to except the two sets of data.
    • In a Data Module, you can select the two sets of data and from the ellipsis choose New > Table > Except of tables
    • In a report you can bring in each data set in a separate query, and on the Queries page use an Except tool to except them into a new query

    Cheers!

    MF.



    ------------------------------
    Mark Fry
    Technical Consultant
    ------------------------------



  • 4.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 11:01 AM

    Okay, I'll test these options and come back here to report.

    My great difficulty is that there is a technological limitation that allows the query of only 10,000 records at a time in Join.

    If I only need the aggregated result (without the customer granularity), is it possible to pass the IDs as a query filter?


    Example
    [ID_CLIENT] NOT IN(789465;987465;147789....)



    ------------------------------
    Gustavo Andrade
    ------------------------------



  • 5.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 12:37 PM

    Hi Gustavo,

    Is the performance acceptable when you add a filter like:
    DS_B.col_x not in ( DS_A.col_y )
    When I execute this with DS_B having 43,000 rows and DS_A having 43,000 rows in Reporting, projecting sum(DS_B.qty) it took 1.7 seconds, including the planning of the query. I was using a Thinkpad P52 with both datasets on local ssd.
    What is the number that you end up with?



    ------------------------------
    Kind regards,
    Henk Cazemier
    ------------------------------



  • 6.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 02:49 PM

    In my scenario here is

    CLIENT.ID not in ( PARTICIPANTS.ID )

    Dataset A - Participants (40.000 rows) 
    Dataset B - Clients (5.000.000 rows)

    When pasting the IDs in () and trying to process, the error occurs:

    XQE-DAT-0001 Erro do adaptador da origem de dados: org.netezza.error.NzSQLException: ERROR:  pg_atoi: error reading "22954902000108": Numerical result out of range
    RSV-SRV-0042 



    ------------------------------
    Gustavo Andrade
    ------------------------------



  • 7.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 04:05 PM

    Hi Gustavo,
    I wouldn't try to copy those literals into the expression.
    I'm assuming from the above that both tables are in NZ ? Or do you have one local and one in NZ ?
    I created the example based on 2 Cognos stored datasets.
    If you create a Data Module that references both tables.
    And use this Data Module to create a report and add a filter to that report along the lines of client.id not in ( participants.id ) then it will get all pushed to the database. In this case if one is a Cognos dataset and the other a NZ table, then Cognos will do this operation locally. Most of the time will likely be consumed in moving the 5,000,000 rows (hopefully narrow) to Cognos.
    If you'd like we can do a webex and walk through it.



    ------------------------------
    Kind regards,
    Henk Cazemier
    ------------------------------



  • 8.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 04:55 PM

    Hello HENK,

    My dataset "Participants" comes from several spreadsheets of Excel units, and my dataset "Customers" comes from my organization's Customer data warehouse (I am author user only).

    I wanted a new set of data, the result of this Join, containing the already consolidated data, because I don't need to go down to the client.id level, just the companies, branches and totals.

    About webex, you can check release with IT department.



    ------------------------------
    Gustavo Andrade
    ------------------------------



  • 9.  RE: Join with many records e return exception

    Posted Tue June 06, 2023 06:54 PM

    It appears your data resides in two locations. One being a local set of data (flat file) and one being a database table.

    <query1> EXCEPT <query2> would allow you to returns rows of <query1> not found in <query2>
    Given you have data in two locations, that would require CA to retrieve the data from the database to complete the EXCEPT set operation locally.

    Using Filter joins would result in CA trying to construct a distinct set of key values from a query which it then pushes to the server in a quantified predicate (ie. in-list)
    https://www.ibm.com/docs/en/cognos-analytics/12.0.0?topic=mode-optimizing-joins-by-applying-filters

    The potentially issue being how effectively an RDBMS handles a large in-list (i.e. some may create a dynamic temporary table and rewrite the query).
    The other being the finite number of terms that can be used.

    The type conversion error being thrown by Netezza suggests it has assumed to convert the values into an integer datatype and one of the values is too large for an integer (vs bigint).

    pg_atoi: error reading "22954902000108": Numerical result out of range

    https://www.ibm.com/docs/en/psfa/7.2.1?topic=types-integer-data


    Had you opened a call with IBM support?




    ------------------------------
    NIGEL CAMPBELL
    ------------------------------



  • 10.  RE: Join with many records e return exception

    Posted Wed June 07, 2023 06:04 AM
    Edited by Thomas Meier Wed June 07, 2023 06:07 AM

    Hi Gustavo,

    I 'm a friend of solving something like this with an outer join:

    select a.* from Clients c
    left outer join Participants p on p.ID = c.ID and p.ID is null

    Edit: I just saw that this solution was already proposed yesterday by chris.

    ------------------------------
    Thomas Meier
    ------------------------------



  • 11.  RE: Join with many records e return exception

    Posted Fri June 09, 2023 04:19 PM

    Hello, sorry for the delay in responding.

    I was testing the suggestions offered, and I realized through contact with my IT that my query results at the lowest level of granularity (client) return only 2.5M of records.

    As I said earlier, I only need the result after aggregating the branches.

    In that case I think of 2 options:
    1. Pass this exception of IDs in the query (I don't think it will work)

    2. Use each returned consolidated query (already with aggregation) that I need and make the difference in balances, quantity, etc.



    ------------------------------
    Gustavo Andrade
    ------------------------------