Content Management and Capture

  • 1.  Content-Based Retrieval - how to not index whole class

    Posted Wed March 20, 2024 03:47 PM

    Hello all,

    our client wants to use full-text search but also wants control over which documents (of the same class) are indexed. As far as I know, this is impossible, but I want to ask whether anyone has an idea how to index only some of the documents of a class. For example, I was thinking about cancelling the Class index job and creating only an Object Index job per document, but I don't know if this will work. Many thanks, Jana



    ------------------------------
    Jana Kolodziejová
    ------------------------------


  • 2.  RE: Content-Based Retrieval - how to not index whole class

    IBM Champion
    Posted Thu March 21, 2024 05:41 AM

    Hi Jana,

    this isn't by any chance a customer we both know :-)?

    The only way that I can think of is to create a subclass of the class in question and make that subclass full-text indexed. When someone or something determines that full-text indexing is needed, change the document's class to the subclass.

    This - as you know - might have some side effects and you might have to update search templates and the like...

    Let me know if you come up with something better.

    Kind regards,

    Gerold



    ------------------------------
    Gerold Krommer
    ------------------------------



  • 3.  RE: Content-Based Retrieval - how to not index whole class

    Posted Thu March 21, 2024 05:55 AM

    Hi Gerold,

    thanks for your answer. Yes, it is the client we both know :) 
    Of course, I've proposed the subclass approach to the client, but for various reasons the client asked us whether there is any other way to control what is indexed and what is not.


    Best regards, Jana



    ------------------------------
    Jana Kolodziejová
    ------------------------------



  • 4.  RE: Content-Based Retrieval - how to not index whole class

    Posted Thu March 21, 2024 12:46 PM

    Jana

    Assuming there is some metadata on the documents that indicates "I should be text indexed", could you create a subclass that is identical to the higher-level class (except for the name), classify those documents in that subclass, and turn indexing on for the subclass?
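
    The routing rule described here can be sketched in a few lines. This is only an illustration of the decision itself, not FileNet API code: the class names `Contract` and `ContractIndexed` and the property `NeedsFullText` are made-up placeholders, and actually changing a document's class would still have to be done through whatever tooling or event handler the installation uses.

```python
# Hypothetical names for illustration only; substitute the real symbolic names.
BASE_CLASS = "Contract"                # parent class, not CBR-enabled
INDEXED_SUBCLASS = "ContractIndexed"   # identical subclass with indexing turned on
FLAG_PROPERTY = "NeedsFullText"        # metadata that says "I should be text indexed"


def target_class(properties: dict) -> str:
    """Return the class a document should belong to, based on its flag property."""
    # Only an explicit True moves the document into the indexed subclass;
    # a missing or False flag keeps it in the non-indexed parent class.
    if properties.get(FLAG_PROPERTY) is True:
        return INDEXED_SUBCLASS
    return BASE_CLASS


if __name__ == "__main__":
    print(target_class({"DocumentTitle": "Master agreement", "NeedsFullText": True}))
    print(target_class({"DocumentTitle": "Scratch notes"}))
```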



    ------------------------------
    RUTH Hildebrand-Lund
    ------------------------------



  • 5.  RE: Content-Based Retrieval - how to not index whole class

    Posted Fri March 22, 2024 06:39 AM

    I would say that it depends which search engine you are using. 

    With Watson Explorer (formerly Content Analytics) it is possible to create a custom crawler, but unfortunately WEX doesn't support content-based retrieval (CBR), so it cannot be used in a FileNet query.

    I would bet that it is possible with Elasticsearch or OpenSearch, but you would have to find an expert in this domain (and the customer has to have the latest FN version, 5.5.12).

    But with the "default" Content Search Services I don't think any such customization is supported, although Lucene (which it is based on) may allow it.

    Btw. if you have control over the FileNet query (with CBR), then simply add a WHERE condition to it that filters out the documents you don't want to return (instead of not indexing them), provided CBR is used in the query. Or use a response filter in ICN.
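
    A rough sketch of what such a query could look like, built as a string in Python. The join on `ContentSearch` and the `CONTAINS` function follow the usual Content Engine SQL pattern for CBR as far as I know, but please check it against your release. The boolean property `IncludeInSearch` is purely hypothetical and stands for whatever metadata marks the documents that should be searchable. Note that this only narrows what the query returns; the other documents are still indexed.

```python
def build_cbr_query(search_term: str,
                    class_name: str = "Document",
                    flag_property: str = "IncludeInSearch") -> str:
    """Build a CBR query string with an extra WHERE condition on a flag property.

    flag_property is a hypothetical custom boolean property, not a system property.
    """
    # Double any single quotes so the term stays a valid SQL string literal.
    term = search_term.replace("'", "''")
    return (
        f"SELECT d.Id, d.DocumentTitle FROM {class_name} d "
        "INNER JOIN ContentSearch c ON d.This = c.QueriedObject "
        f"WHERE CONTAINS(d.*, '{term}') "
        f"AND d.{flag_property} = TRUE"
    )


if __name__ == "__main__":
    print(build_cbr_query("invoice"))
```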



    ------------------------------
    Marcel Kostal
    ------------------------------



  • 6.  RE: Content-Based Retrieval - how to not index whole class

    Posted Mon March 25, 2024 03:25 AM

    Hi all, 

    thanks for the answers. For now, the client is satisfied with creating a separate subclass. As we are not going to implement it yet, we will also look at Elasticsearch after the upgrade (to 5.5.12), which is planned for June in our case.
    But I am confused about the CPE version. I've read that "Content Platform Engine 5.5.8 introduces a technology preview of a new content-based search feature that uses either the Elasticsearch search engine or the OpenSearch search engine"
    Source: https://www.ibm.com/support/pages/technology-preview-new-content-based-search-feature-content-platform-engine
    It sounds like Elasticsearch can be used from 5.5.8 on. But 5.5.12 is also OK for us, as we plan to upgrade soon.

    BR, Jana



    ------------------------------
    Jana Kolodziejová
    ------------------------------



  • 7.  RE: Content-Based Retrieval - how to not index whole class

    Posted Mon March 25, 2024 11:06 AM

    Jana

    You are correct that we released the support for Elasticsearch/OpenSearch initially as a tech preview in 5.5.8. In 5.5.12 we made it a regularly supported feature. So it can be used with any release starting from 5.5.8.



    ------------------------------
    RUTH Hildebrand-Lund
    ------------------------------



  • 8.  RE: Content-Based Retrieval - how to not index whole class

    Posted Mon March 25, 2024 10:55 AM

    Just to follow up on the comment about Elasticsearch/OpenSearch: while these are supported in FileNet 5.5.12, they won't help address this issue. The setup is different compared to CSS, but how you indicate which items are to be indexed is the same.



    ------------------------------
    RUTH Hildebrand-Lund
    ------------------------------