Content Management and Capture


 File path on NFS storage

Leonardo Lavarini posted Thu November 21, 2024 09:52 AM

Hello, I have several FileNet object stores, each with an NFS storage area mounted on it. I need to retrieve the paths of the physical files located in the storage area, e.g. /FN10/FN11/... So far I've used a scraper that navigates the file system, but it's very slow. Is it possible to obtain the file path through an API? Thank you.

Gerold Krommer IBM Champion

I had this request several times, but as far as I remember the path isn't stored anywhere, and if it is, it is so cleverly hidden and encrypted that I couldn't spot it.

In those cases (why we needed the paths is a whole different story) we had to parse the filenames out of the error logs and work with those.

I have something to extract the Clip ID of a Centera-stored document from the content referral blob; that doesn't help you, but if the path is stored anywhere, it is there :-(.

Doesn't help... I know,

/Gerold

Stephen Weckesser

Short answer is no. At one time the filename matched the docid, and the folders were calculated by bit shifting the id, a common technique whose implementation varies. Some folders are 2 levels deep and others 3, so you would have to decode both. However, the UUID parts of the filename are also swapped in later versions, so it's no longer a direct match; enumerating the files on disk will no longer work. Even if you can decode the path and name, if the storage uses encryption (shame on you if not), the documents will still need to be retrieved through FileNet to decrypt them. That renders MD5 and other approaches moot.
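The bit-shifting technique Stephen describes can be illustrated generically. A minimal sketch, assuming a purely hypothetical mask/shift layout; this is NOT FileNet's actual algorithm (which, as noted, also swaps GUID bytes in later versions), only a demonstration of how an id can be mapped to a two-level directory:

```java
// Illustrative only: derive a 2-level directory path from a numeric
// document id by bit shifting. The shift widths and masks here are
// hypothetical and do not match FileNet's real storage layout.
public class PathSketch {
    static String dirFor(long docId) {
        long level1 = (docId >>> 16) & 0xFF; // hypothetical top-level bucket
        long level2 = (docId >>> 8) & 0xFF;  // hypothetical second-level bucket
        return String.format("FN%d/FN%d", level1, level2);
    }

    public static void main(String[] args) {
        // 0x12345678 -> bucket bytes 0x34 (52) and 0x56 (86)
        System.out.println(dirFor(0x12345678L)); // prints FN52/FN86
    }
}
```

A real decoder would also have to handle the 2-level vs. 3-level variants Stephen mentions, which is part of why this approach is fragile.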

The ClipID Gerold mentions is used with Fixed Content Devices such as Centera or Dell ECS, which function as a "black box" rather than a filesystem; they use a CAS API to retrieve docs based on the ClipID. For filesystems, the consistency checker, xcheck, does log the path when there's a problem file. It can be handy to know whether any files are missing prior to a migration, but the XML is too verbose to be useful directly. You can build a parser to extract the path or use mine (see: https://www.applied-logic.com/reformat-consistency-check-reports/). I have used that approach just to account for missing docs in a storage area when migrating. I would not use it to enumerate the files for export: on a Fibre SAN we could check 80 docs/second, but over NFS it tends to time out and never finish, so it wouldn't work for you. Where there are several storage areas, I find running a separate process for each is the simplest way to get better performance.
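Extracting paths from an xcheck-style report is plain XML handling. A small sketch using the JDK's built-in DOM parser; note the element name `FilePath` is an assumption for illustration, so check an actual xcheck report (or use Stephen's parser at the link above) for the real tag names:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class XcheckPaths {
    // Collect the text of every <FilePath> element from report XML.
    // "FilePath" is a placeholder tag name, not the verified xcheck schema.
    static List<String> extractPaths(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("FilePath");
            List<String> paths = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                paths.add(nodes.item(i).getTextContent().trim());
            }
            return paths;
        } catch (Exception e) {
            throw new RuntimeException("could not parse report XML", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<Report><Error><FilePath> /FN10/FN11/doc1.dat "
            + "</FilePath></Error></Report>";
        System.out.println(extractPaths(sample)); // prints [/FN10/FN11/doc1.dat]
    }
}
```

For a multi-gigabyte report, a streaming parser (StAX/SAX) would be the better fit than DOM.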

Adeel Ali


Yes, it is possible to retrieve the physical file path of documents stored in a FileNet NFS storage area through the FileNet API. The `ContentElement` class in the FileNet P8 API provides metadata about stored content, including the storage area, but it does not directly expose the full physical path. However, you can use the `StorageArea` properties to determine the base path and construct the full file path based on the document's unique ID or content element details.  

Instead of a scraper, you could leverage the `CE API` or `Java/.NET API` to query storage locations more efficiently. Additionally, some configurations within the FileNet database might store references to the file paths, which you could extract with a direct query.  



Gerold Krommer IBM Champion

Hi Adeel,

it is exactly

 "...and construct the full file path based on the document's unique ID or content element details.  "

which baffles us all. Can you elaborate on that?

Kind regards,

Gerold

Stephen Weckesser

This is an old thread so I am sure it is resolved by now, but he was looking for the path to the file on the storage area. In the early versions the filename was the docid (+ element number). You could traverse the directories to get the docid and path and put it in a lookup table, which sounds like what he describes as a "scraper". That approach was used by some folks to export or update docs. For example, a few years back I was sent an offline backup of a filestore containing some FileNet proprietary TIFFs. I had no access to a working online environment, but I could crawl the files and read the headers. The "filename" gave me the docid, so I could report on which files were converted to G4, the converted size, etc., so they could merge them later.

That approach will no longer work. In later versions of P8, parts of the GUID name are swapped to obfuscate them. Could you decode them? Yes, and I looked at that approach before, but while it might work for some files on older versions, as a generic solution you would also have to deal with compression, deduplication, and encryption where keys change over time. It would break as soon as you deployed it in a real-world situation.

In order to export content and be version- and platform-agnostic, use the API to do a search, iterate the results, and get the content element list for each item. From that, get a content transfer object and a stream, and write out the bytes to a file. The original filename and MIME type can be had from the content transfer object using the RetrievalName and ContentType properties. That approach, however, requires a working system, and it will not give you the path or filename as written under the storage area, which I believe was the original question.
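The search-and-stream export Stephen outlines could look roughly like this with the P8 Java API. This is an untested sketch: it assumes Jace.jar on the classpath, an already-connected `ObjectStore`, an illustrative WHERE clause, and it ignores multi-element naming collisions and error handling:

```java
// Sketch only: requires the FileNet P8 Java API (Jace.jar) and an
// authenticated connection; it will not compile or run without them.
import com.filenet.api.collection.IndependentObjectSet;
import com.filenet.api.core.ContentTransfer;
import com.filenet.api.core.Document;
import com.filenet.api.core.Factory;
import com.filenet.api.core.ObjectStore;
import com.filenet.api.query.SearchSQL;
import com.filenet.api.query.SearchScope;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Iterator;

public class ExportSketch {
    static void export(ObjectStore os, File outDir) throws Exception {
        // Illustrative query; narrow the WHERE clause for a real export.
        SearchSQL sql = new SearchSQL(
            "SELECT Id FROM Document WHERE IsCurrentVersion = TRUE");
        IndependentObjectSet results =
            new SearchScope(os).fetchObjects(sql, 100, null, Boolean.TRUE);

        for (Iterator<?> it = results.iterator(); it.hasNext(); ) {
            // Re-fetch so the content element properties are populated.
            Document doc = Factory.Document.fetchInstance(
                os, ((Document) it.next()).get_Id(), null);
            for (Iterator<?> ci = doc.get_ContentElements().iterator(); ci.hasNext(); ) {
                Object el = ci.next();
                if (!(el instanceof ContentTransfer)) continue; // skip ContentReference
                ContentTransfer ct = (ContentTransfer) el;
                // RetrievalName = original filename; get_ContentType() would
                // give the MIME type if you need it.
                File target = new File(outDir, ct.get_RetrievalName());
                try (InputStream in = ct.accessContentStream();
                     OutputStream out = new FileOutputStream(target)) {
                    byte[] buf = new byte[8192];
                    for (int n; (n = in.read(buf)) > 0; ) out.write(buf, 0, n);
                }
            }
        }
    }
}
```

As Stephen says, this gives you the original filenames and bytes, not the obfuscated on-disk paths under the storage area.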