
Datamartist gives you data profiling and data transformation in one easy-to-use visual tool. This is part of an ongoing series taking a look at data migration projects. In this part we're going to talk about how important it is to know where you are starting from, before you head off on a new application journey.

Understanding and mapping your legacy systems is a key success factor for a data migration project, but it can be a very difficult and time-consuming battle. Why are we spending so much time on this? That's the OLD system- we need to focus on the future! Two reasons:

Data location- You can't migrate data if you don't know what it is and where it is.

Data dependencies on other systems- All processes and interfaces that rely on the legacy systems need to be either replaced or shut off. Often this means that even if the new system is not involved, other systems may stop working because they get data from the legacy systems. The data migration project is not just about turning on the new system.
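Those downstream dependencies can be made concrete with a simple reachability check. Here's a minimal sketch (the system names and the `feeds` map are hypothetical, purely for illustration) that answers: if we switch off a legacy system, which other systems stop receiving data?

```python
# Hypothetical feed map: each system feeds data to the systems listed.
feeds = {
    "legacy_erp": ["billing", "warehouse_etl"],
    "billing": ["finance_reports"],
    "warehouse_etl": ["bi_dashboard"],
    "crm": ["finance_reports"],
}

def downstream(system, feeds):
    """Return every system that directly or indirectly consumes data from `system`."""
    affected, stack = set(), [system]
    while stack:
        for consumer in feeds.get(stack.pop(), []):
            if consumer not in affected:
                affected.add(consumer)
                stack.append(consumer)
    return affected

print(sorted(downstream("legacy_erp", feeds)))
# → ['bi_dashboard', 'billing', 'finance_reports', 'warehouse_etl']
```

Note that `finance_reports` and `bi_dashboard` are hit even though neither talks to the legacy system directly- exactly the indirect breakage the project has to plan for.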

The consequences of turning off the old system have to be known and managed. There may be legal requirements to keep legacy data available. Even if data is not migrated to the new system, there may be additional data migration requirements into data warehouses or documents that have nothing to do with the new application. And the infrastructure that the legacy systems run on might perform other tasks that, although not directly related to the legacy system, will cause issues when that infrastructure is removed. Often the first time the legacy system is documented is just before it's shut down. Despite our best intentions, sometimes documentation doesn't get updated. This is the reality for many systems, and particularly for legacy systems.

One of the first steps in a data migration project is to gather all the existing documentation for the legacy systems, and all the systems they talk to, and make sure it's accessible to the data migration project team. It is critical to have tight control over these documents, and to ensure that everyone works off a "live" version- because your mapping is going to update that documentation, and every developer, data modeler and application team member needs to know that they have the best and latest version. If you already have documentation like this, good for you, and you can stop reading. For the rest of us, let's talk practical methods of mapping what we have.

Scan the environment- catch the interfaces in the act.

Monitor network traffic to detect exchanges between applications. Scan file systems to find interface files and determine how often they are produced. Get out there and talk to people. Ask them: where is data from this system used?
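The file-system scan is easy to automate. Here's a minimal sketch (the directory layout, file extensions, and age buckets are assumptions- adjust them to whatever your interface drop folders actually look like) that walks a directory tree and buckets candidate interface files by age, as a rough proxy for how often each interface runs:

```python
import os
import time
from collections import Counter

def scan_interface_files(root, extensions=(".csv", ".xml", ".dat")):
    """Walk a drop directory and bucket candidate interface files by age,
    as a rough proxy for how often the interface runs."""
    buckets = Counter()
    now = time.time()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extensions):
                age_days = (now - os.path.getmtime(os.path.join(dirpath, name))) / 86400
                if age_days < 1:
                    buckets["daily"] += 1      # touched in the last day
                elif age_days < 8:
                    buckets["weekly"] += 1     # touched in the last week
                else:
                    buckets["older"] += 1
    return buckets
```

A folder full of files modified every night is a strong hint of a daily batch interface- even when nobody remembers who set it up.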

Look at management reports and trace backwards to find where the information is pulled from. Don't assume the interface is direct. My record so far is six hops from source to the Excel sheet used by the CEO, with the data passing through two of the systems twice. Hunt down people who were involved in the original installation.
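Once you have the feeds mapped out, counting those hops is a shortest-path question. A minimal sketch (the system names and the `feeds` map are invented for illustration) using breadth-first search:

```python
from collections import deque

# Hypothetical feed map: source system -> systems it sends data to.
feeds = {
    "mainframe": ["staging_db"],
    "staging_db": ["erp", "datamart"],
    "erp": ["datamart"],
    "datamart": ["report_extract"],
    "report_extract": ["ceo_excel"],
}

def hops(source, target, feeds):
    """Count the minimum number of interface hops between two systems,
    or return None if no path exists."""
    queue, seen = deque([(source, 0)]), {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in feeds.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

print(hops("mainframe", "ceo_excel", feeds))  # → 4
```

Every hop is an interface that has to be replaced, rerouted, or retired when the source system goes away.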

Often they’ll have key information that can save you time. If you don’t have a complex tool to do the mapping of all your systems, then one approach that is a step above the “lots of excel sheets and powerpoint slides” approach, is to use a tool like Microsoft Viso. I’ve used it successfully to map applications, by having the drawing and the interfaces BE the database. This ensures that everything in the drawing is on the interface list, and everything on the interface list shows up on the drawing.