March 23, 2016

EMC Archiving Strategy: From Documentum to InfoArchive

Where did we come from and where are we going? What should drive your archiving strategy with EMC?

Documentum is without a doubt one of the heavyweights of the content management and archiving market. Having been around since the early 1990s, it has earned a reputation as a sophisticated content management platform for large enterprises. From a very early stage in its development, Documentum adopted features supporting complex content management functions such as life cycles, check-in/check-out, virtual document management, and workflows, which were closely integrated with a sophisticated security model. As a result, Documentum’s sweet spot became associated with the complex workflows that document authors needed in industries such as creative marketing, life sciences, and engineering. What these had in common was a need to closely manage and audit changes to (in general) human-authored content.

When did Documentum really take hold and what are the benefits?

While complex document authoring was (and is) a core market for Documentum, it wasn't until 2003 that it began to seriously look at leveraging the platform for archiving. Since the 1980s, IBM had developed a market for high-volume content archiving with the Content Manager On Demand (CMOD) platform. EMC naturally saw this market as an opportunity, so from 2003 to 2008 EMC Documentum developed a comprehensive strategy around high-volume archiving that included print streams, scanned content, enterprise applications, and email. These offerings became known as Archive Services for Reports, Archive Services for Images, Archive Services for SAP, and Archive Services for Email.

EMC Documentum’s archiving applications became very popular, but highlighted the challenges of adapting what was essentially a strong content authoring platform into an archiving platform. In particular, Documentum had a rich meta-model - so rich that it overburdened archiving solutions with meta-data that was more appropriate to document authoring. As a result, in about 2007 Documentum introduced lightweight system objects that made it far more scalable as an archiving platform.

Despite these developments, by 2008 the pace of innovation had slowed and it seemed as if the opportunity to build a single enterprise archive platform was slipping away from EMC. On reflection, the slowdown in innovation allowed the product lines to mature at their own pace and tackle the distinct needs of their user bases. But at the same time, a new approach to archiving was being conceived by the EMC team in Europe.

InfoArchive was born out of an initiative at a large French bank to build a high-volume archive for transaction documents such as invoices, statements, and structured data from core banking applications. Eventually the solution needed to be adapted to support archiving of high-volume transactional data such as Single Euro Payments Area (SEPA) transactions. While transactional documents represented up to a billion objects, the latter need for structured data archiving required tens of billions of objects to be archived. EMC needed a new approach to this scale of problem.

As part of its regular series of acquisitions during the 2000s, EMC ECD had acquired X-Hive, the author of the market-leading XML database xDB. It was xDB that was to provide the key enabling technology for the envisaged enterprise archive solution.

Workflow & Case Management
Documentum: Documentum archives leverage products like xCP and Captiva, which make it easier to integrate with business processes and workflows.
InfoArchive: While InfoArchive has plenty of integration points with workflow and enterprise applications, it does not have a tightly integrated case management platform (yet).

Volumes
Documentum: We have seen Documentum successfully deployed with archives of up to 50,000 content objects. While this is by no means the limit, and is good enough for many document archives, Documentum is not as scalable as InfoArchive.
InfoArchive: InfoArchive has proven capable of archiving tens of billions of objects. Through its two-stage search mechanism, users have incredible control over search and retrieval of large archived data sets.

Structured Data
Documentum: It would be fair to say that, outside of SAP, this has never been Documentum's sweet spot.
InfoArchive: InfoArchive was built pretty much from day one to support structured data archiving and has the most elegant and effective model for structured data archiving we have seen.

Future Proof
Documentum: Version 7.0 is the latest release of the Documentum platform and it is certainly in no danger of dying out. We expect to continue seeing improvements over the coming years. However, as EMC has started to acknowledge, a 'next-gen' platform is in development and will have an architecture designed to support the cloud.
InfoArchive: InfoArchive was built leveraging xDB, a fundamental building block of IIG's 'next-gen' architecture. It seems reasonable to assume that InfoArchive will be more strategic to EMC's future archive platform. We are certainly investing our resources into InfoArchive at the moment.

Print-Stream Archiving (Report Archiving)
Documentum: From 2004 onward Documentum was successfully adopted by customers for print-stream archiving. While these customers remain well supported, we recommend that they evaluate InfoArchive as a platform for their archiving needs.
InfoArchive: InfoArchive has been very successfully deployed for print-stream archiving, and if you need both structured data archiving and print-stream archiving it is a must.


EMC’s InfoArchive solution cleverly combines the strengths of Documentum for protecting information at rest with the granularity and scalability of xDB to deliver a next-generation archiving platform based on open standards.

So our analysis is that EMC’s archiving strategy (some years in the making) looks set to return with renewed strength. Our own investments are nowadays associated with the InfoArchive solution, and I would heartily recommend this to customers with structured data archiving and print-stream archiving.

CrawfordTech is a long-term EMC ECD partner for digital customer communications archiving and e-presentment, and over 30 customers around the world use our joint solutions. Our market-leading print-stream archiving offerings for the Documentum and InfoArchive platforms are designed to support the high-volume archiving needs of the banking, insurance, and utility industries.

For more information about CrawfordTech’s print-stream archiving products for EMC, visit our Digital Archiving Solutions and be sure to see our Industry Solutions.