Best Practices in Content Migration
Have you ever wondered what happens to your data after a service provider begins the process of migrating it to a new system? There are several important steps that your document management partner should take when migrating content from an existing system to a new one (or to other formats). After all, you’re looking for much more than a simple file copy of your data, aren’t you? The steps below outline how service providers ensure and protect the quality of your data throughout conversion and migration.
A Step-by-Step Guide to How Your Content Is Handled During Migration
There is more to conversion than many people think. The DIY approach may not always save time and money: if conversion isn’t handled properly, the result can be a mess. It’s important to seek out an experienced service provider who understands the technicalities of the data, the process, and the requirements and safeguards for the final data and the destination system.
-
The service provider will begin by scanning the input platter, hard disk, tape, storage array, partition, or other media. All metadata will be read and collected for each identifiable object or file stored on the media.
-
All objects will be extracted from the media. Extractions are typically files in a content management system, but they may also be BLOBs (binary large objects), disk extents, or certain other structures (in the case of image management systems).
-
Next, the data of each extracted file object is examined to determine the type of file object. Service providers don’t rely on the source system database or the object’s metadata to guess the nature of the file objects because they can, on occasion, be misleading. For example, systems like FileNET may report that a data object is an image when it is actually a text file. Other times, these systems can misidentify the storage type of scanned images. Being thorough is the only way a service provider can resolve these system errors and determine the type of data by its characteristics.
-
The second part of the data determination step is to compare the type determined for each extracted object (by direct algorithmic examination of the data) with the classification given by the object’s metadata and with the classification given by the source system’s database. Any discrepancies should be logged.
-
Now conversion is ready to take place. Each extracted object is converted from its proprietary stored format to an appropriate open format, such as single-page TIFF, standard PDF, or a type of text file (whichever is applicable). Converting the data requires an understanding of the composition of each object: different formats have individual pieces which, when presented together, make up the image or the file. If conversion isn’t done carefully, some of those pieces can be left behind. Any conversion errors are logged throughout the process.
-
A final form document is created with the information extracted from the media. For image data, the format is commonly multi-page TIFF or standard PDF. The service provider will ensure that the output from this conversion is in compliance with the specifications and limitations of the intended use of the data, as well as the import requirements of any destination retrieval system.
-
Cross validation must now be performed. For each piece of input storage media, validation will occur between the source system database and the database constructed from the information gathered in the above steps. During this process, the service provider should discover and report any objects referenced in the source system database which weren’t found on the input media, and any objects found on the input media which were not listed in the source system database.
-
After all of the input media has been processed, all item mismatches between the discovered data and the source system database are logged. In the case of data discrepancies, the service provider may request that the client supply any available backup copies of the media for further examination and potential data recovery.
-
If additional collection data is available from an external database (not the source document management system), all of the converted data and database data will need to be compared to the external source. Examples of commonly available external source databases are accounting systems, master client lists, and mailing label lists. These sources can help to “clean” the converted data. Again, all discrepancies and mismatches will be logged.
-
Before importing data into a new retrieval system, the service provider will need to compare all source data with the database in the destination system. There should be very few discrepancies remaining at this stage of the process, but they will be logged if they are found.
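Two of the steps above lend themselves to a short illustration: determining an object’s type from its data rather than from what the source system claims, and cross validating the source system database against what was actually found on the media. The Python below is only a minimal sketch under stated assumptions, not a description of any particular provider’s tooling. The function names (`sniff_type`, `type_discrepancies`, `cross_validate`), the object IDs, and the tiny signature table are all hypothetical; a real migration would recognize far more formats and would read from actual media and databases.

```python
from typing import Dict, List, Optional, Set, Tuple

# Well-known leading bytes for two of the open formats named in the article.
SIGNATURES = [
    (b"II*\x00", "tiff"),  # TIFF, little-endian byte order
    (b"MM\x00*", "tiff"),  # TIFF, big-endian byte order
    (b"%PDF-", "pdf"),     # PDF file header
]


def sniff_type(data: bytes) -> str:
    """Guess an object's type from its content, ignoring any metadata."""
    if not data:
        return "unknown"
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind
    # Assumption for this sketch: NUL-free data that decodes as UTF-8 is text.
    if b"\x00" not in data:
        try:
            data.decode("utf-8")
            return "text"
        except UnicodeDecodeError:
            pass
    return "unknown"


def type_discrepancies(
    objects: Dict[str, bytes], declared: Dict[str, str]
) -> List[Tuple[str, Optional[str], str]]:
    """List (object_id, declared_type, detected_type) wherever they disagree."""
    mismatches = []
    for obj_id, data in sorted(objects.items()):
        detected = sniff_type(data)
        claimed = declared.get(obj_id)  # None if the source system has no entry
        if claimed != detected:
            mismatches.append((obj_id, claimed, detected))
    return mismatches


def cross_validate(source_db_ids: Set[str], media_ids: Set[str]) -> Dict[str, List[str]]:
    """Report objects that appear on only one side of the comparison."""
    return {
        "missing_from_media": sorted(source_db_ids - media_ids),
        "missing_from_source_db": sorted(media_ids - source_db_ids),
    }


# Hypothetical sample data: the source system wrongly calls doc2 an image.
objects = {
    "doc1": b"%PDF-1.4 sample",
    "doc2": b"plain text note",
    "doc3": b"II*\x00\x08\x00",
}
declared = {"doc1": "pdf", "doc2": "tiff", "doc3": "tiff"}

for mismatch in type_discrepancies(objects, declared):
    print("type mismatch:", mismatch)

print(cross_validate({"doc1", "doc2", "doc4"}, set(objects)))
```

In this made-up sample, doc2 is flagged because the source system labels it an image while its content is plain text, and the cross validation reports doc4 (listed in the database but absent from the media) and doc3 (present on the media but not listed). Entries like these would go into the discrepancy log rather than being corrected silently.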
Data migration is a complex process which shouldn’t be trusted to just anyone. If information security and data integrity are important to you, then choosing to do it yourself rather than calling an experienced partner could end in disaster. A reputable service provider will expose and mitigate a number of internal problems in addition to resolving discrepancies and preserving the quality of your data. None of the problems that may arise can be resolved from within the source system itself, and many of them cannot even be detected there. It is only through expert inspection that these problems are revealed and handled so that your system can be confidently converted.