Linearized PDF support in IBM Daeja ViewONE

By Dhananjay Bhandarkar posted Thu December 15, 2022 07:03 AM

  

Need for PDF linearization

In a world dominated by high-speed internet connections, it’s fair to wonder whether PDF linearization is still necessary. For small PDFs of only a few pages, linearization may not be essential, but for larger documents it can still deliver substantial performance and user experience benefits.

Consider, for instance, a document that consists of several hundred, or even several thousand, pages. Loading that entire document and keeping it cached may be possible, but it’s an inefficient use of processing and bandwidth resources. With a linearized PDF, a reader encounters a linearization parameter dictionary and hint tables at the start of the file, which tell it where to locate the resources each page needs. After loading the hint tables and the first page, the reader can stop downloading rather than fetch the entire file. When the user navigates to another page, the reader consults the hint tables and requests only the bytes for that page.

This ensures that the reader is only ever loading the pages that actually need to be displayed, which helps to conserve memory, processing resources, and bandwidth. For mobile devices with limited file and cache storage, linearized PDFs are much easier to manage than their non-linearized counterparts. They also provide some protection against network interruptions, which could make it difficult to download and view an entire document.
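As a rough illustration of how a reader can tell that a file is linearized, the sketch below checks the leading bytes of a PDF for a /Linearized entry. It relies on the PDF specification's rule that the linearization parameter dictionary sits within the first 1024 bytes of the file. The function name and the sample headers are hypothetical, and the check is only a heuristic: it does not confirm that the recorded file length still matches the file, which a real reader would also verify.

```python
import re

# The linearization parameter dictionary must fit within the first 1024 bytes.
LINEARIZATION_WINDOW = 1024
_LINEARIZED_KEY = re.compile(rb"/Linearized\s+\d")


def looks_linearized(head: bytes) -> bool:
    """Heuristic: True if the leading bytes of a PDF contain a /Linearized entry."""
    window = head[:LINEARIZATION_WINDOW]
    return window.startswith(b"%PDF-") and _LINEARIZED_KEY.search(window) is not None


# Hypothetical file headers, for illustration only.
linearized_head = b"%PDF-1.7\n1 0 obj\n<< /Linearized 1 /L 54321 /O 3 /N 200 >>\nendobj\n"
plain_head = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(looks_linearized(linearized_head))  # True
print(looks_linearized(plain_head))       # False
```

Because only the first 1024 bytes are inspected, a check like this can run on the first partial response without downloading the rest of the document.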

Linearized PDF support in IBM Daeja ViewONE

Starting with Version 5.0.12, IBM Daeja ViewONE supports linearized PDFs for faster rendering and optimized viewing performance.

The following additions make range requests and multiple partial-request calls possible.

Range support in Daeja: Two new headers, "Accept-Ranges" and "Content-Length", are added to the request and response calls.

Partial requests in Daeja: The viewer now makes multiple calls to the server, fetching the data in chunks, and each call returns status code 206. Previously, a single call fetched the entire document content and returned status code 200.
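The difference between a single 200 response and a series of 206 responses can be sketched with a small model of standard HTTP byte ranges. This is not ViewONE's implementation: the function names, the chunk size, and the sample data are assumptions, and the "Content-Range" header comes from the HTTP specification rather than from the description above. Only the simple "bytes=start-end" form is handled.

```python
from typing import Dict, List, Optional, Tuple


def chunk_ranges(content_length: int, chunk_size: int) -> List[str]:
    """Split a document of content_length bytes into HTTP Range header values."""
    if content_length < 0 or chunk_size <= 0:
        raise ValueError("content_length must be >= 0 and chunk_size > 0")
    ranges = []
    for start in range(0, content_length, chunk_size):
        end = min(start + chunk_size, content_length) - 1  # byte ranges are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


def serve_range(body: bytes, range_header: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Toy server: 200 with the whole body, or 206 with the requested slice."""
    if range_header is None:
        return 200, {"Accept-Ranges": "bytes", "Content-Length": str(len(body))}, body
    if not range_header.startswith("bytes="):
        raise ValueError("only 'bytes=start-end' ranges are modelled")
    start_text, end_text = range_header[len("bytes="):].split("-", 1)
    start, end = int(start_text), min(int(end_text), len(body) - 1)
    if start > end:
        raise ValueError("unsatisfiable range (a real server would answer 416)")
    part = body[start:end + 1]
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Length": str(len(part)),
        "Content-Range": f"bytes {start}-{end}/{len(body)}",
    }
    return 206, headers, part


document = bytes(range(256)) * 10        # 2560 bytes of sample data
ranges = chunk_ranges(len(document), 1024)
print(ranges)  # ['bytes=0-1023', 'bytes=1024-2047', 'bytes=2048-2559']

parts = [serve_range(document, r) for r in ranges]
print([status for status, _, _ in parts])               # [206, 206, 206]
print(b"".join(p for _, _, p in parts) == document)     # True
```

The client learns the total size from "Content-Length" (or from the total in "Content-Range") and can then plan the remaining chunk requests.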

There are three modes for fetching and rendering PDF documents. You configure them by setting the Virtual ViewONE parameter pdfFetchType, which accepts the following three values:

fullDocument: This configuration makes multiple calls to the server to fetch the document. The rendering behavior is the same as before 5.0.12, but with better performance: the first page cannot render until the entire document is fetched.

progressive: This configuration makes multiple calls to the server to fetch the document. A page renders as soon as the bytes and content needed to render it have been fetched. In the background, calls continue to fetch the rest of the document.

ondemand: This configuration makes multiple calls to the server to render only the page that is requested (initially page 1). A different page can be requested either by typing its page number or by scrolling to it. No further server calls are made until the requested page changes.
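The three values can be compared with a toy model of the initial load. It assumes, purely for illustration, that each page corresponds to exactly one chunk and that page 1 is requested first. The function name and the returned structure are hypothetical, and real chunk boundaries will not line up with pages this neatly.

```python
from typing import Dict, List

FETCH_TYPES = ("fullDocument", "progressive", "ondemand")


def simulate_initial_load(fetch_type: str, page_count: int) -> Dict[str, List[int]]:
    """Toy model (one chunk per page): which pages are fetched before and
    after page 1 can render, with no user navigation."""
    if fetch_type not in FETCH_TYPES:
        raise ValueError(f"unknown pdfFetchType: {fetch_type!r}")
    if page_count < 1:
        raise ValueError("page_count must be at least 1")

    all_pages = list(range(1, page_count + 1))
    if fetch_type == "fullDocument":
        # Every chunk must arrive before the first page renders.
        return {"before_render": all_pages, "after_render": []}
    if fetch_type == "progressive":
        # Page 1 renders once its own chunk arrives; the rest follows in the background.
        return {"before_render": [1], "after_render": all_pages[1:]}
    # ondemand: only the requested page is fetched until the user navigates.
    return {"before_render": [1], "after_render": []}


for fetch_type in FETCH_TYPES:
    print(fetch_type, simulate_initial_load(fetch_type, 4))
# fullDocument {'before_render': [1, 2, 3, 4], 'after_render': []}
# progressive {'before_render': [1], 'after_render': [2, 3, 4]}
# ondemand {'before_render': [1], 'after_render': []}
```

In this model, progressive and ondemand show the first page equally early. They differ in whether the remaining pages are downloaded without being asked for, which is a trade-off between bandwidth and how quickly later pages appear.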

You can use these parameter values with both IBM® Content Navigator and stand-alone deployments:

For IBM Content Navigator: Set the HTML parameter pdfFetchType to the value that suits your use case in IBM Content Navigator > Daeja ViewONE Settings.

For stand-alone: Set the HTML parameter pdfFetchType to the value that suits your use case in the client-side parameters.

The following configuration is the default when pdfFetchType is not set in ViewONE:

<param name="pdfFetchType" value="fullDocument">

 
