This post is a brief overview of the answer finding API beta in Watson Discovery v2. The answer finding API extends the passage retrieval API, allowing you to find concise answer spans within a passage. It uses the deep-learning-based Reading Comprehension technology announced in our recent press release. This post explains the API and how to use it. For a broader explanation of why to use it and what it is good for, see our Medium blog post, which will be published soon.
The answer finding API beta adds two new parameters within the passages block of the query API in Watson Discovery v2:
find_answers
is optional and defaults to false. If it is set to true (and the natural_language_query parameter is set to some query string), the new answer finding feature is enabled.
max_answers_per_passage
is optional and defaults to 1. The answer finding feature will find at most this many answers in any one passage.
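Putting the two parameters together, the query body can be assembled as in the following Python sketch. This only builds the JSON body; authentication and the project/endpoint details are omitted, and the helper function name is our own, not part of any SDK:

```python
import json

def build_query_body(query: str, max_answers: int = 1) -> dict:
    """Build a Discovery v2 query body with answer finding enabled.

    Note: find_answers only takes effect when natural_language_query
    is also set, as described above.
    """
    return {
        "natural_language_query": query,
        "passages": {
            "enabled": True,            # passage retrieval must be on
            "find_answers": True,       # enable the answer finding beta
            "max_answers_per_passage": max_answers,
        },
    }

body = build_query_body("InfoSphere Information Server 1.3 Firefox versions")
print(json.dumps(body, indent=2))
```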
When answer finding is used, a new block is added to the return value within each passage object. That new block is called answers, and it is a list of answer objects. The list can be up to max_answers_per_passage in length. Each answer object has the following fields:
answer_text
is the text of a concise answer to the query.
confidence
is a number from 0 to 1 that estimates the probability that the answer is correct. Note that some answers have very low confidence and are very unlikely to be correct, so we recommend being selective about what you do with an answer depending on this value.
start_offset
is the start character offset (the index of the first character) of the answer within the field that the passage came from. It is guaranteed to be greater than or equal to the start offset of the passage (since the answer must be within the passage).
end_offset
is the end character offset (the index of the last character, plus one) of the answer within the field that the passage came from. It is guaranteed to be less than or equal to the end offset of the passage.
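To illustrate the offset semantics (using made-up field text, not real API output), Python slicing with these offsets recovers the answer directly from the field, and the guarantees above can be checked:

```python
# Hypothetical field text; in a real response, the offsets arrive
# alongside answer_text and refer to positions in this field.
field_text = ("IBM InfoSphere Information Server supports "
              "Mozilla Firefox (ESR 17 and 24) browsers.")

# Suppose the passage spans the whole field and the API reported this answer:
passage_start, passage_end = 0, len(field_text)
answer_start = field_text.index("(ESR")             # computed here for the demo
answer_end = answer_start + len("(ESR 17 and 24)")  # exclusive end

# end_offset is "last character plus one", so Python slicing applies directly.
assert field_text[answer_start:answer_end] == "(ESR 17 and 24)"

# Documented guarantees: the answer lies within its passage.
assert passage_start <= answer_start and answer_end <= passage_end
```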
Here is an example of a query using this API (this example also appears in the Medium blog post linked to above):
{
  "natural_language_query": "InfoSphere Information Server 1.3 Firefox versions",
  "passages": {
    "enabled": true,
    "max_per_document": 3,
    "characters": 850,
    "fields": ["title", "content"],
    "find_answers": true,
    "max_answers_per_passage": 1
  }
}
Here is a corresponding response:
{
  "passage_text": "<em>InfoSphere</em> <em>Information</em> <em>Server</em> Web Console with Internet Explorer 11, you may get the error message: IBM <em>InfoSphere</em> <em>Information</em> <em>Server</em> supports Mozilla <em>Firefox</em> (ESR 17 and 24) and Microsoft Internet Explorer (<em>version</em> 9.0 and 10.0) browsers.",
  "start_offset": 287,
  "end_offset": 526,
  "field": "content",
  "answers": [{
    "answer_text": "(ESR 17 and 24)",
    "start_offset": 446,
    "end_offset": 461,
    "confidence": 0.6925222
  }]
}
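Given a response like the one above, a caller will usually want to apply a confidence threshold before showing answers, per the recommendation earlier. A minimal sketch (the 0.5 threshold is an arbitrary choice for illustration, not an API recommendation):

```python
def confident_answers(passage: dict, threshold: float = 0.5) -> list:
    """Return the answers in a passage object at or above a confidence threshold."""
    return [a for a in passage.get("answers", []) if a["confidence"] >= threshold]

# A trimmed-down passage object mirroring the example response.
passage = {
    "passage_text": "... supports Mozilla Firefox (ESR 17 and 24) ...",
    "answers": [{
        "answer_text": "(ESR 17 and 24)",
        "confidence": 0.6925222,
    }],
}

for ans in confident_answers(passage):
    print(f'{ans["answer_text"]} (confidence {ans["confidence"]:.2f})')
```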
The confidence
values shown in the answers are not merely the direct output of the answer finding model, which attempts to find the most likely answer within any single passage. Instead, the confidence
values that we provide as output reflect a combined estimate of how likely the document is to be relevant, how likely the passage is to be relevant, and how likely the answer is to be correct given that passage in that document.
We update the confidence
values for documents, and the ordering of documents and passages that have answers, using the same signals from document retrieval, passage retrieval, and the answer finding model*. As a result, document retrieval and/or passage retrieval may be more or less accurate when you enable answer finding. For applications where end users ask a lot of explicit questions (e.g., "What versions of Firefox does InfoSphere Information Server 1.3 support?") or implicit questions (e.g., "InfoSphere Information Server 1.3 Firefox versions"), we have found that turning on answer finding can substantially improve accuracy.
Because we only perform answer finding on as many documents and passages as you request, consider requesting more documents and/or more passages per document than you actually need, so the answer finding model can be combined with more candidate documents and passages. For example, if you want to show 10 documents and 1 passage from each document, consider asking for 20 documents and up to 3 passages from each document with answer finding. That allows answer finding to search for answers in up to 20*3 = 60 passages. If it is confident that it has found an answer in one of those passages, that confidence is combined with the document and passage scores to produce a final ranking that can promote a document or passage you might otherwise have missed.
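The over-requesting strategy above can be sketched as follows. The reranking itself happens inside Discovery; this only illustrates the candidate-pool arithmetic and a client-side top-k cut, and the confidence values are made up for the example:

```python
# If you ultimately want to show 10 documents with 1 passage each,
# request a larger candidate pool so answer finding can rerank it.
docs_requested = 20
passages_per_doc = 3
candidate_passages = docs_requested * passages_per_doc   # 60

# Hypothetical reranked passages (Discovery combines document, passage,
# and answer signals server-side; these confidences are illustrative).
reranked = sorted(
    [{"doc": "A", "confidence": 0.31},
     {"doc": "B", "confidence": 0.69},
     {"doc": "C", "confidence": 0.12}],
    key=lambda p: p["confidence"],
    reverse=True,
)
top = reranked[:10]   # keep only as many as you actually display
print([p["doc"] for p in top])
```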
* Note: Reordering documents based on answer confidences does not occur if the per_document
parameter of passage retrieval is false. However, that parameter setting is rarely used, so most users can ignore this issue.