February 23, 2017

Understanding Your Archive: Document Retrieval: Pt. 1


Document Retrieval

The previous article in this Technical Series on Advanced Function Presentation (AFP), Understanding Your Archive: Storage Requirements, Pt. 2, considered the storage options available for long-term archiving. This time we look at document retrieval and how it makes use of your archive.

The first point to consider when understanding document retrieval from a Customer Communications Archive is who might need to retrieve the documents. Perhaps it is a customer accessing historical documents via a web portal, a customer service representative handling a customer query on the phone, or someone who needs to view documents for audit or compliance purposes.

Whatever the use case, documents need to be retrieved from the archive quickly using their metadata. If, as discussed in the previous article, compression and storage reduction techniques have been employed within the archive, successful retrieval requires software to reassemble the output into a single document and convert it into a readable form, typically PDF, HTML or XML.
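To make the idea of metadata-driven retrieval concrete, here is a minimal sketch in Python. It is illustrative only and does not describe any particular archive product: the ArchiveEntry fields (account_id, doc_type, issued, location) and the ArchiveIndex class are hypothetical names chosen for this example, and a real archive would hold its index in a database, not in memory.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ArchiveEntry:
    """Metadata describing one archived document (all fields are hypothetical)."""
    account_id: str
    doc_type: str
    issued: date
    location: str  # assumed pointer to where the stored document lives


class ArchiveIndex:
    """In-memory metadata index, keyed by account for fast lookup."""

    def __init__(self) -> None:
        self._by_account: Dict[str, List[ArchiveEntry]] = {}

    def add(self, entry: ArchiveEntry) -> None:
        self._by_account.setdefault(entry.account_id, []).append(entry)

    def find(
        self,
        account_id: str,
        doc_type: Optional[str] = None,
        issued_from: Optional[date] = None,
        issued_to: Optional[date] = None,
    ) -> List[ArchiveEntry]:
        """Return matching entries, newest first. Unset filters match anything."""
        matches = []
        for entry in self._by_account.get(account_id, []):
            if doc_type is not None and entry.doc_type != doc_type:
                continue
            if issued_from is not None and entry.issued < issued_from:
                continue
            if issued_to is not None and entry.issued > issued_to:
                continue
            matches.append(entry)
        return sorted(matches, key=lambda e: e.issued, reverse=True)
```

A portal or call-centre application would call find() with the customer's account and a date range, then pass each result's location to the transformation software that produces the readable PDF, HTML or XML.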

On the Fly Document Transformations:

It is important to consider who might need to retrieve documents from the archive and to understand their individual requirements. A customer viewing their bank statement online using a mobile banking application will need the data reformatted for a smaller screen. A client service representative talking to a customer on the phone needs a rapid presentation of what the customer is seeing to service the query effectively. Additionally, in some industries parts of the document may need to be redacted for client confidentiality. Compliance officers or auditors will typically require an exact copy of what was originally sent to the customer.

Each of these use cases has different presentation requirements. Fortunately, with PDF output and a robust, scalable document transformation architecture, these requirements can be met at the point of document delivery.
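One way to picture transformation at the point of delivery is as a lookup from the type of requester to a rendering plan. The Python sketch below is an assumption-laden illustration, not a description of any specific product: the Requester roles, the RenderPlan fields and the choice of HTML for mobile and PDF elsewhere are all hypothetical. It also assumes that redaction never applies to an exact copy, since an auditor needs the document as it was originally sent.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Requester(Enum):
    """Hypothetical requester roles taken from the use cases above."""
    CUSTOMER_MOBILE = "customer_mobile"
    SERVICE_REP = "service_rep"
    AUDITOR = "auditor"


@dataclass(frozen=True)
class RenderPlan:
    """What the transformation step should produce (illustrative fields)."""
    output_format: str
    reflow_for_small_screen: bool
    redact_confidential: bool
    exact_copy: bool


# Assumed defaults: HTML reflowed for mobile, PDF for everyone else.
DEFAULT_PLANS = {
    Requester.CUSTOMER_MOBILE: RenderPlan("HTML", True, False, False),
    Requester.SERVICE_REP: RenderPlan("PDF", False, False, False),
    Requester.AUDITOR: RenderPlan("PDF", False, False, True),
}


def plan_for(requester: Requester, redaction_required: bool = False) -> RenderPlan:
    """Pick a rendering plan at delivery time.

    Redaction is only applied when the plan is not an exact copy
    (an assumption made for this sketch).
    """
    plan = DEFAULT_PLANS[requester]
    if redaction_required and not plan.exact_copy:
        plan = replace(plan, redact_confidential=True)
    return plan
```

Because the plan is chosen when the document is requested, the same archived document can serve all three audiences without storing three versions of it.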


When documents have been stored in the archive as PDF or PDF/A in burst mode (as discussed in the previous article), each document is managed as a separate entity, and font and resource data are replicated in each. When these documents are presented back to the user, the fonts can be included to ensure true document fidelity.

Next time we will consider the retrieval techniques used to meet your requirements when resources have not been stored within the document, in order to optimise storage or retrieval, or when additional features have been added to enhance document security.