Hi Houman,
when i wrote ‘figure out where your application is taking its time’, you said that this was a good point, but you do not seem to have followed up on that thought.
please try to find out these times:
- time from start until the connection to the database is completed
- time from query until the first document is returned.
- time to process the first document
- time to process documents 4999, 5000 and 5001
- time to process documents 79999, 80000 and 80001 (increase max transaction duration to 99999 for that test)
- (please keep an eye on memory consumption of the JVM during the last two tests.)
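as a rough sketch of how to take those measurements (plain Java, no Tamino calls; `fetchNextDocument` here is a hypothetical stub standing in for whatever call actually retrieves the next document from the database):

```java
// Hypothetical timing/memory harness. fetchNextDocument() is a placeholder
// for the real retrieval call; replace it with your actual API usage.
public class TimingHarness {

    static String fetchNextDocument(int i) {
        return "<doc id=\"" + i + "\"/>"; // placeholder document
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... open the database connection here ...
        long connected = System.nanoTime();
        System.out.printf("connect took %.3f ms%n", (connected - start) / 1e6);

        for (int i = 1; i <= 80001; i++) {
            long t0 = System.nanoTime();
            String doc = fetchNextDocument(i);
            long t1 = System.nanoTime();
            // report only the documents of interest (1, around 5000, around 80000)
            if (i == 1 || (i >= 4999 && i <= 5001) || (i >= 79999 && i <= 80001)) {
                Runtime rt = Runtime.getRuntime();
                long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
                System.out.printf("doc %d: %.3f ms, heap used: %d MB (%d chars)%n",
                        i, (t1 - t0) / 1e6, usedMb, doc.length());
            }
        }
    }
}
```

the heap figures from `Runtime` are what to watch during the last two tests.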
i am sure that from the numbers you get there you will see that
- yes, it does take some time to set up a connection to a database
- once the first document is there, the time will be fairly constant in the beginning
- the time per document will degrade after a while (that is, when the java VM runs low on memory and starts doing significant garbage collection)
admitted, retrieving data from a full-scale database will never be as fast as retrieving the same data from an in-memory setup, but the comparison is far from fair (or we’ll continue the comparison once 1000 parallel users are posting queries at both Tamino and Saxon in parallel and doing concurrent, transactional updates :-).
but, perhaps we are on a completely false track altogether. do you - from the viewpoint of your application - even need to get all those 80000 documents in ONE GO and in the scope of ONE SINGLE transaction?
and then, what do you do with them? do you simply ‘read’ them, or do you intend to update some of them / all of them? are the updates logically interlinked so that the manipulation of the 80000 documents NEEDS to be in the scope of one database transaction?
perhaps you can read the documents outside the scope of a transaction (without cursoring, i.e. without using the LocalTransaction) or in considerably smaller and more manageable chunks, and do the updates in a separate processing step - within a transaction.
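to sketch that chunking idea in plain Java (both `readChunk` and `updateChunk` are hypothetical placeholders, not Tamino API calls): read manageable chunks non-transactionally, then update each chunk in its own short transaction:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedProcessing {
    static final int CHUNK_SIZE = 1000; // far below 80000; tune to your memory budget

    // Hypothetical placeholder: read the next chunk of document ids, outside any transaction.
    static List<Integer> readChunk(int offset, int size, int total) {
        List<Integer> chunk = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + size, total); i++) {
            chunk.add(i);
        }
        return chunk;
    }

    // Hypothetical placeholder: apply the updates for one chunk inside one short
    // transaction (begin, write, commit), so locks are held only briefly.
    static void updateChunk(List<Integer> ids) {
        // begin transaction ... update ids ... commit
    }

    public static void main(String[] args) {
        int total = 80000;
        for (int offset = 0; offset < total; offset += CHUNK_SIZE) {
            List<Integer> chunk = readChunk(offset, CHUNK_SIZE, total);
            updateChunk(chunk);
        }
    }
}
```

each transaction then only has to stay within the max transaction duration for 1000 documents, not 80000.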
or, if all manipulations of all docs are one transaction after all, can you justify the value of 900 seconds for the max transaction duration? how many parallel users are there, how many parallel threads may wish to manipulate the same data you are keeping a lock on?
again: we really don’t know enough of your complete usage scenario to provide really useful help. as i am suggesting above, maybe a scenario that neither you nor anybody else here has been able to come up with will be the perfect answer to your problems.
continuing that thought: i am not sure whether simply switching to the SAX object model will be THE answer. it all pretty much depends on what you finally want to do with your data (see above), as you may run out of the max. transaction duration with that object model as well.
sorry, not all that many answers, but basic food for thought,
andreas f.
#webMethods-Tamino-XML-Server-APIs#webMethods#API-Management