webMethods

Exponential time for query processing / avoiding Tamino Post

  • 1.  Exponential time for query processing / avoiding Tamino Post

    Posted Wed January 30, 2002 02:08 PM

    Hi Tamino fans,

    We are concerned about a Tamino query, its processing time, and the interpretation of the ino:explain command:

    The query expression:
    FpML[trade/swap/swapStream/calculationPeriodAmount/calculation
    [notionalSchedule/notionalStepSchedule/currency='USD' and
    floatingRateCalculation/floatingRateIndex='USD-LIBOR-BBA' and
    dayCountFraction='ACT/360']]

    When we test the query in steps of 100,000 instances, the processing time grows linearly for the first 400,000 instances, but from 500,000 to 900,000 instances it increases exponentially. The results:

    instances   time [msec]
        20000          1309
        40000          2445
        60000          3579
       100000          6541
       200000         11793
       300000         17906
       500000         43124
       700000         98195
       900000        127128
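
    One way to see where the curve leaves the linear regime is to normalize each measurement to a cost per 1000 instances. The following is only a small Python sketch over the figures in the table above; the helper name msec_per_1000 is made up for illustration.

```python
# Measurements copied from the table above: (instances, time in msec).
measurements = [
    (20000, 1309), (40000, 2445), (60000, 3579), (100000, 6541),
    (200000, 11793), (300000, 17906), (500000, 43124),
    (700000, 98195), (900000, 127128),
]


def msec_per_1000(instances, msec):
    """Average query time per 1000 stored instances (illustrative helper)."""
    return msec / instances * 1000


costs = [(n, msec_per_1000(n, t)) for n, t in measurements]

for n, cost in costs:
    print(f"{n:>7} instances: {cost:6.1f} msec per 1000 instances")
```

    With linear scaling this figure would stay roughly constant. In the data it stays below 70 msec up to 300,000 instances and is more than twice that from 700,000 onward.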

    All fields used in the filter expression are indexed with the 'standard' option, but the ino:explain command always returns ino:postprocessing="TRUE".

    Does anyone have an idea why Tamino uses postprocessing, or why the processing time increases exponentially?

    We are using a Sun Fire 880 with 4 GB RAM and two processors; the buffer pool size is 1 GB.

    Enclosed you will find the schema, the result of ino:explain and a sample instance.

    Thanks in advance
    Michael

    Michael Pollecker

    SAG Systemhaus GmbH
    Niederlassung Darmstadt
    Professional Services

    Alsfelder Str. 15-19, D-64289 Darmstadt
    Telefon +49 (6151) 92 31 28, Fax +49 (6151) 92 31 11
    E-Mail: Michael.Pollecker@softwareag.com
    Michael.Pollecker@partner.commerzbank.com
    ino_explain.zip (7.6 KB)




  • 2.  RE: Exponential time for query processing / avoiding Tamino Post

    Posted Fri February 01, 2002 10:52 AM

    Standard index means that Tamino remembers in which documents a particular value occurs.

    So if your query is something like

    path[node1='xxx' and node2='yyy']

    Tamino finds two lists of document IDs, one where node1 has the value and one where node2 has the value, and returns only the documents common to both.

    This technique seems to be fast enough (from O(n) down to even constant time, depending on what type of indexing is used; unfortunately I don't know which).
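
    The list intersection described above can be sketched in a few lines of Python. This is not Tamino's actual implementation, just an illustration of the idea; the index layout, the function names and the document IDs are all invented.

```python
# Hypothetical standard index: for each node name, map a value to the set of
# document IDs in which that value occurs. All data here is invented.
standard_index = {
    "node1": {"xxx": {1, 2, 5, 8}, "zzz": {3, 4}},
    "node2": {"yyy": {2, 3, 8, 9}},
}


def lookup(node, value):
    """Return the IDs of all documents where `node` has `value`."""
    return standard_index.get(node, {}).get(value, set())


def documents_matching_all(conditions):
    """Intersect the ID sets of every (node, value) condition."""
    id_sets = [lookup(node, value) for node, value in conditions]
    if not id_sets:
        return set()
    return set.intersection(*id_sets)


# path[node1='xxx' and node2='yyy']
hits = documents_matching_all([("node1", "xxx"), ("node2", "yyy")])
print(sorted(hits))
```

    Note that the result only says that both values occur somewhere in the same document, nothing more.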

    Your query is a little more difficult because of the nested conditions. Standard indexes cannot be applied directly here. That is why the post-processor is involved in the evaluation.

    The only thing I can advise is to restructure your documents so that the queries you run most often take less time.

    Alexander




  • 3.  RE: Exponential time for query processing / avoiding Tamino Post

    Posted Mon February 04, 2002 11:04 AM

    Hi Alexander,

    Thanks for your answer! In the meantime we found out why Tamino uses the postprocessor: it is the cardinality > 1 of a node we query. In detail:

    A query like
    /A/B/C[D/E[F='xxx' and G='yyy']]
    causes no (!) postprocessing if the cardinality of the nodes D and/or E is zero or one. If the cardinality of these nodes is > 1, the postprocessor is invoked.
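
    To illustrate why cardinality matters, here is a small Python sketch with two invented documents. It assumes the index only records which documents contain a value, and it is not a description of Tamino's internals. In document 2 the values 'xxx' and 'yyy' occur in different E elements, so a document-level lookup reports it as a candidate and a second pass over the document is needed to reject it.

```python
import xml.etree.ElementTree as ET

# Two invented documents. In document 2, E occurs twice and the values
# 'xxx' and 'yyy' sit in different E elements.
docs = {
    1: "<A><B><C><D><E><F>xxx</F><G>yyy</G></E></D></C></B></A>",
    2: "<A><B><C><D>"
       "<E><F>xxx</F><G>other</G></E>"
       "<E><F>other</F><G>yyy</G></E>"
       "</D></C></B></A>",
}
parsed = {doc_id: ET.fromstring(xml) for doc_id, xml in docs.items()}


def docs_containing(tag, value):
    """Document-level lookup: documents with <tag>value</tag> anywhere."""
    return {
        doc_id
        for doc_id, root in parsed.items()
        if any(el.text == value for el in root.iter(tag))
    }


def matches_query(root):
    """Evaluate /A/B/C[D/E[F='xxx' and G='yyy']] on one parsed document."""
    if root.tag != "A":
        return False
    for e in root.findall("B/C/D/E"):
        has_f = any(f.text == "xxx" for f in e.findall("F"))
        has_g = any(g.text == "yyy" for g in e.findall("G"))
        if has_f and has_g:  # both values inside the SAME E element
            return True
    return False


# Step 1: index-style candidate selection (intersection of document IDs).
candidates = docs_containing("F", "xxx") & docs_containing("G", "yyy")
# Step 2: "postprocessing": re-check every candidate document.
confirmed = {d for d in candidates if matches_query(parsed[d])}

print("candidates:", sorted(candidates))
print("confirmed: ", sorted(confirmed))
```

    If D and E can occur at most once per C, the two values can never land in different E elements, so the second step would have nothing to reject in that case.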

    Your suggestion of changing the schema is not useful in our case because the schema is standardized.

    regards
    Michael




  • 4.  RE: Exponential time for query processing / avoiding Tamino Post

    Posted Mon February 04, 2002 01:16 PM

    Yes, I wanted to explain the same thing. Standard indexes can only help answer the question: are there DOCUMENTS that have this value in this node or not? The word "document" is crucial.

    Your query is more difficult precisely because of the multiple cardinality of the nodes you mentioned (and, by the way, of B and C as well). Otherwise the query could have been simplified to

    /A/B/C[D/E/F='xxx' and D/E/G='yyy']

    If you are not permitted to change the schema entirely, you could rely on the open schema concept and add a small auxiliary node with a standard index. For example, an auxiliaryNode under calculation that contains a combination of all the values you need in the query.

    Now this will be faster:

    FpML[trade/swap/swapStream/calculationPeriodAmount/calculation/auxiliaryNode='USD;USD-LIBOR-BBA;ACT/360']

    You can add the node when loading documents in Tamino and delete it if necessary when retrieving.
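
    Adding and removing the node could look roughly like the following Python sketch, run on the client side before storing and after retrieving a document. It assumes documents without XML namespaces and exactly the element names from the query above; the function names and the sample document are invented, and the actual load/retrieve calls against Tamino are left out.

```python
import xml.etree.ElementTree as ET

AUX_TAG = "auxiliaryNode"  # name taken from the suggestion above
# Paths relative to <calculation>, in the order used in the combined value.
VALUE_PATHS = [
    "notionalSchedule/notionalStepSchedule/currency",
    "floatingRateCalculation/floatingRateIndex",
    "dayCountFraction",
]


def add_auxiliary_nodes(root):
    """Before loading: add one combined node to every <calculation>."""
    for calc in list(root.iter("calculation")):
        values = [calc.findtext(path) for path in VALUE_PATHS]
        if any(v is None for v in values):
            continue  # skip calculations that lack one of the values
        aux = ET.SubElement(calc, AUX_TAG)
        aux.text = ";".join(v.strip() for v in values)


def strip_auxiliary_nodes(root):
    """After retrieving: remove the helper nodes again."""
    for calc in list(root.iter("calculation")):
        for aux in calc.findall(AUX_TAG):
            calc.remove(aux)


# Invented minimal instance, reduced to the elements used in the query.
sample = ET.fromstring(
    "<FpML><trade><swap><swapStream><calculationPeriodAmount><calculation>"
    "<notionalSchedule><notionalStepSchedule><currency>USD</currency>"
    "</notionalStepSchedule></notionalSchedule>"
    "<floatingRateCalculation><floatingRateIndex>USD-LIBOR-BBA"
    "</floatingRateIndex></floatingRateCalculation>"
    "<dayCountFraction>ACT/360</dayCountFraction>"
    "</calculation></calculationPeriodAmount></swapStream></swap></trade></FpML>"
)
AUX_PATH = "trade/swap/swapStream/calculationPeriodAmount/calculation/" + AUX_TAG

add_auxiliary_nodes(sample)
aux_text = sample.findtext(AUX_PATH)
print(aux_text)

strip_auxiliary_nodes(sample)
remaining = sample.find(AUX_PATH)
```

    Because the combined value lives in a single node, the query becomes a plain equality test on one indexed node, so the cardinality of the surrounding elements no longer matters.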

    Of course, this approach is only justified if you have a very limited number of queries that need to run fast.

