Hi Anish,
Thanks for your reply! This is very helpful!
Have a good day!
Joost
------------------------------
Joost Vos
------------------------------
Original Message:
Sent: Fri November 20, 2020 04:49 PM
From: ANISH MATHUR
Subject: How to get a list of files ingested into a Watson Discovery collection?
Hi Joost,
The best way to return all documents is to retrieve them in chunks of at most 10,000 documents using filters. Depending on what metadata you have available, you could filter on ranges of document_ids or on something like the document hash (under extracted_metadata). There's an example described in the last comment here: https://github.com/watson-developer-cloud/python-sdk/issues/314
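A minimal sketch of that chunking idea, under a few assumptions: every document carries a lowercase hex hash in extracted_metadata.sha1, a trailing-wildcard filter such as extracted_metadata.sha1:a* selects one slice of the collection, and the query response has "matching_results" and "results" keys. The fetch_page callable and both function names are hypothetical, introduced only for illustration; this is not the code from the linked issue.

```python
from typing import Callable, Dict, List

HEX_DIGITS = "0123456789abcdef"
MAX_WINDOW = 10_000  # Discovery rejects requests where count + offset exceeds this

# Hypothetical callable: (filter_expression, count, offset) -> response dict
# containing "matching_results" (int) and "results" (list of documents).
FetchPage = Callable[[str, int, int], Dict]


def list_slice(fetch_page: FetchPage, field: str, prefix: str,
               page_size: int = 1000, max_prefix_len: int = 40) -> List[Dict]:
    """Return every document whose hash starts with `prefix`."""
    filter_expr = f"{field}:{prefix}*"
    first = fetch_page(filter_expr, page_size, 0)
    total = first["matching_results"]

    if total > MAX_WINDOW:
        # Slice does not fit in one result window: split on the next hex digit.
        if len(prefix) >= max_prefix_len:
            raise RuntimeError(f"cannot split slice {prefix!r} any further")
        docs: List[Dict] = []
        for digit in HEX_DIGITS:
            docs.extend(list_slice(fetch_page, field, prefix + digit,
                                   page_size, max_prefix_len))
        return docs

    # Slice fits: page through it while keeping count + offset <= MAX_WINDOW.
    docs = list(first["results"])
    while len(docs) < total:
        offset = len(docs)
        count = min(page_size, MAX_WINDOW - offset)
        page = fetch_page(filter_expr, count, offset)
        if not page["results"]:
            break  # the service returned fewer documents than it reported
        docs.extend(page["results"])
    return docs


def list_all_documents(fetch_page: FetchPage,
                       field: str = "extracted_metadata.sha1",
                       page_size: int = 1000) -> List[Dict]:
    """Walk the 16 one-digit hash prefixes and concatenate the slices."""
    docs: List[Dict] = []
    for digit in HEX_DIGITS:
        docs.extend(list_slice(fetch_page, field, digit, page_size))
    return docs
```

With the v1 Python SDK, fetch_page could probably be a small wrapper around discovery.query(environment_id, collection_id, filter=..., count=..., offset=...).get_result(), but that wiring is an assumption to check against your SDK version. Documents without the hash field would not match any slice, so it is worth comparing the number of documents returned with the collection's reported document count.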
------------------------------
ANISH MATHUR
Original Message:
Sent: Mon October 12, 2020 04:40 AM
From: Joost Vos
Subject: How to get a list of files ingested into a Watson Discovery collection?
Hello everyone,
I am working with a Watson Discovery collection containing more than 10,000 documents, and I would like to retrieve a list of all ingested documents in that collection. I am using the Python SDK to interact with the WDC API. I can get the first 10,000 documents (the maximum retrieval count) using a query parameter (query="*.*"). But if I adjust the offset and count parameters (offset=10000 and count=20000), the following error is raised:
"error" : "Result window is too large, count + offset must be less than or equal to 10000"
Does anyone know how to retrieve a list of all documents in a Discovery collection?
Looking forward to your reply!
Best Regards,
Joost
------------------------------
Joost Vos
------------------------------
#WatsonDiscovery