Content Management and Capture

  • 1.  Migrating massive amounts of documents to cloud

    Posted Wed October 23, 2024 04:01 PM

    We are currently using FileNet 5.5.8 and are in the middle of a migration from an on-premises environment to the cloud. Our FileNet server contains one object store with two storage policies spread across 20+ storage areas. We are trying to move the documents from on-premises disk drives to AWS S3, which has been configured as an advanced storage area in FileNet.

    The problem comes when we try to configure the correct sweep job for this activity. What kind of sweep do we need, and how should it be configured? We landed on a bulk move sweep, but we feel that the configuration we used was not effective, making the migration process painfully slow. If anyone has experience migrating documents between storage areas and can help, it would be really appreciated, as this is our first time doing this.

    Thanks!



    ------------------------------
    Lucky Setiadarma
    ------------------------------


  • 2.  RE: Migrating massive amounts of documents to cloud

    Posted Thu October 24, 2024 11:55 AM
      |   view attached

    Lucky

    Move sweep is the right mechanism to use. Please see the customer use case and performance information in the attached presentation. The customer use case covers information on optimizing throughput, and the performance information details the indexes you need to add to the database.



    ------------------------------
    Ruth Hildebrand-Lund
    ------------------------------

    Attachment(s)

    Chicago - Sweep Framework.pdf (PDF, 1.33 MB)


  • 3.  RE: Migrating massive amounts of documents to cloud

    Posted Fri October 25, 2024 08:39 AM

    Hi Lucky, 

    I have used sweep jobs, as others have, for several different IBM customers. Most of the time this has worked without issue, but there have been occasions where performance was lacking, usually due to the customer environment but occasionally within the FileNet application layer. Examples: using a sweep to move CFS-IS content to P8 storage when the customer's IS system was a bit undersized caused the sweep to blow up/fail and be unable to recover. When I used the APIs directly, I could get sustained content moves until the CFS agent/repository handlers on IS got overwhelmed. There is an N-retry scenario when that failure occurs, so instead of just stopping, each batch of the sweep kept retrying. In another case a customer had several SAN drives that got really worked up, and we had to do a recovery at the disk level; I don't know the root cause, and I doubt it was the fault of the sweep, it could have been just a ton of I/O.

    After doing this several times and working with support, I ended up developing a storage migration utility (SMU) that uses the same APIs sweep uses. The difference is that you have complete control over what is submitted and when, and you can benchmark the results. This helps pinpoint bottlenecks without relying on a ton of general tracing, debugging, or support sessions. I am completely in favor of the OOTB sweep tools for simple FileNet systems, but I have found a lot of value in using the migration utility for anything more complicated. An easy example: what is the site's benchmark, i.e. the maximum throughput to a cloud container? What is the maximum throughput of the existing storage? It is good to see those two values, then work backwards from MAX to ACTUAL: what is the delta? Using that same utility you can tweak batch sizes and worker counts, then rerun the benchmark. Once you are in the desired target range, focus on the DB and CPE. The SMU can use either API queries or predefined IDs in external tables, which gives much faster raw performance.
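
    To make that concrete, here is a rough sketch of the kind of batch move loop such a utility runs, using the standard CE Java API (the same family of calls the sweep framework uses). This is not Jay's actual SMU; the connection URL, credentials, object store name, storage area GUIDs, and the page/batch sizes are all placeholders you would tune while benchmarking:

    import com.filenet.api.admin.StorageArea;
    import com.filenet.api.collection.IndependentObjectSet;
    import com.filenet.api.constants.RefreshMode;
    import com.filenet.api.core.Connection;
    import com.filenet.api.core.Document;
    import com.filenet.api.core.Domain;
    import com.filenet.api.core.Factory;
    import com.filenet.api.core.ObjectStore;
    import com.filenet.api.core.UpdatingBatch;
    import com.filenet.api.query.SearchSQL;
    import com.filenet.api.query.SearchScope;
    import com.filenet.api.util.Id;
    import com.filenet.api.util.UserContext;
    import javax.security.auth.Subject;
    import java.util.Iterator;

    public class ContentMoveSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details -- adjust for your environment.
            Connection conn = Factory.Connection.getConnection("https://cpe-host:9443/wsi/FNCEWS40MTOM/");
            Subject subject = UserContext.createSubject(conn, "ceadmin", "password", null);
            UserContext.get().pushSubject(subject);
            try {
                Domain domain = Factory.Domain.fetchInstance(conn, null, null);
                ObjectStore os = Factory.ObjectStore.fetchInstance(domain, "OS1", null);

                // Target advanced storage area (e.g. the S3 area) -- placeholder GUID.
                StorageArea target = Factory.StorageArea.fetchInstance(
                        os, new Id("{F0000000-0000-0000-0000-000000000001}"), null);

                // Same style of query a bulk move sweep issues: documents still
                // sitting in the source storage area. Placeholder GUID again.
                // In production you would more likely drive this from a predefined
                // list of IDs (as Jay notes) so the result set doesn't shift under you.
                SearchSQL sql = new SearchSQL(
                    "SELECT Id FROM Document WHERE StorageArea = Object('{A0000000-0000-0000-0000-000000000002}')");
                SearchScope scope = new SearchScope(os);

                // Page through results; page size is one tuning knob for benchmarking.
                IndependentObjectSet results = scope.fetchObjects(sql, Integer.valueOf(500), null, Boolean.TRUE);

                UpdatingBatch batch = UpdatingBatch.createUpdatingBatchInstance(domain, RefreshMode.NO_REFRESH);
                int inBatch = 0;
                Iterator<?> it = results.iterator();
                while (it.hasNext()) {
                    Document doc = (Document) it.next();
                    doc.moveContent(target);          // queue the content move
                    batch.add(doc, null);
                    if (++inBatch == 50) {            // batch size is another tuning knob
                        long start = System.currentTimeMillis();
                        batch.updateBatch();          // one round trip to the CPE
                        System.out.println("Moved " + inBatch + " docs in "
                                + (System.currentTimeMillis() - start) + " ms");
                        batch = UpdatingBatch.createUpdatingBatchInstance(domain, RefreshMode.NO_REFRESH);
                        inBatch = 0;
                    }
                }
                if (inBatch > 0) {
                    batch.updateBatch();
                }
            } finally {
                UserContext.get().popSubject();
            }
        }
    }

    Running several such workers in parallel, each over its own slice of IDs or its own source area, is the "workers" knob mentioned above; comparing the per-batch timings against your raw S3 and on-premises storage throughput gives you the MAX versus ACTUAL delta to chase.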

    I look forward to hearing back on your storage moves and the benchmarks you see.



    ------------------------------
    Jay Bowen
    www.bowenecmsolutions.com
    Medina, OH
    ------------------------------



  • 4.  RE: Migrating massive amounts of documents to cloud

    Posted Mon October 28, 2024 10:50 AM

    Hi Lucky, 

    FileNet Content Manager 5.5.9 introduced a new option to skip deleting the object from the source storage area:

    https://www.ibm.com/docs/en/filenet-p8-platform/5.5.x?topic=v559-whats-new-administrators 

    https://www.ibm.com/docs/en/filenet-p8-platform/5.5.x?topic=sweeps-moving-content

    I understand your use case is pretty well covered by this improvement:

    When you perform a large content migration project to the new storage area and retire the old storage device, you might plan to manually destroy all the content in the old storage device. In this case, skipping deletion improves the overall migration performance because the deletion check and the deletion step are both skipped in the process.



    ------------------------------
    Mathias Korell
    ------------------------------



  • 5.  RE: Migrating massive amounts of documents to cloud

    Posted Thu October 31, 2024 12:31 AM
    Edited by Lucky Setiadarma Thu October 31, 2024 12:32 AM

    Jay,

    Thank you for your input. We are trying to limit the process to using FileNet only, but that could be an option we can consider in the future if the migration process does not go smoothly.

    Mathias,

    My concern with skipping deletion is that the job would try to re-query documents that have already been migrated. I am also unsure whether FileNet would create two copies of the same documents with different IDs if we implement that configuration, which would create potential user confusion.

    On another note, I'd like to clarify: the "Storage Policy" field in the sweep configuration refers to where the documents will be moved TO, not where they are CURRENTLY stored, right? In that case, what happens if I put the same value in that field as the documents' current storage policy? Will that even work?

    As a reference, my currently running bulk move sweep is using the following configuration:

    Target class: Document (include subclasses checkbox enabled)

    Filter expression: StorageArea = Object('{Insert storage area ID}')

    Storage policy: Current_storage_policy

    This resulted in some jobs running as expected, but others just won't pick up any documents at all.



    ------------------------------
    Lucky Setiadarma
    ------------------------------



  • 6.  RE: Migrating massive amounts of documents to cloud

    Posted Fri November 01, 2024 08:42 AM
    Edited by Eric Walk Fri November 01, 2024 08:42 AM

    Hi Lucky,

    Skip delete will not make a copy of the document with a different ID. It will just skip deleting from your on-prem storage, so the content will be left on the old storage device, although FileNet will not remember that it's there. If you're decommissioning the device completely, or are okay with a follow-up step to delete the folders where FileNet had legacy StorageAreas, there's no issue. If you need FileNet to actively free up storage on the old device, don't skip delete.

    The Filter Expression is the WHERE clause for the query that identifies which documents to move, so it should reference the ID of the Storage Area you're migrating FROM. The Storage Policy field should be the new Storage Policy to apply to the documents that meet the filter criteria.
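
    As a concrete example (placeholder values, and assuming your S3 advanced storage area sits behind its own storage policy), a sweep that empties one on-premises area into S3 would be configured along these lines:

    Target class: Document (include subclasses checkbox enabled)

    Filter expression: StorageArea = Object('{ID of the on-premises storage area being emptied}')

    Storage policy: the storage policy that resolves to the S3 advanced storage area

    If a job configured with the documents' current storage policy picks up nothing, that is consistent with this reading: the content already satisfies the target policy, so there is nothing for the sweep to move.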

    How many documents are you really dealing with here? How many bytes? There's always a point at which physics takes over as the limiting factor. Ruth's performance tuning document will get you a lot, but if you're dealing with billions of documents and a short timeline, please reach out privately and we can talk about some more creative approaches to get you migrated quickly.
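
    To put rough numbers on the physics point: if, say, 100 TB of content has to stream into S3 at a sustained 200 MB/s, the raw transfer alone is about 100e12 / 200e6 = 500,000 seconds, or just under six days of continuous movement before any database or CPE overhead. Those figures are purely illustrative, but dividing your actual byte count by your measured throughput is the first sanity check on any timeline.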

    Best,

    Eric



    ------------------------------
    Eric Walk
    Principal

    O: 617-453-9983 | Perficient.com
    ------------------------------