June 10, 2020

Data Migration

It is all about data

What is the use of any software application if not to manage, process, analyze, transfer, and/or archive data? Data gives you visibility, understanding, information, and educated decisions. At White Mountain Technologies, we make sure that our services and products manage and process data as should be.

Some background on data

Before getting into the main topic of this blog post, which is school data migration and how Skoolee does it, let us introduce briefly how Skoolee handles school academic data. Skoolee implements a relational database storage system. Such databases contain tables that store historical school academic data about certain entities. A table, in turn, contains fields that store the details of this entity. Moreover, a table would be connected, or related, to one or more tables according to certain relationship criteria. For example, one table would store the names of parents. In this table, a field stores the first name, another field the family name, and so on. Another table would contain the names of students, stored in related fields. A relation between the parent and student tables identifies which students are the children of which parents. Such tables and relationships in relational databases are important to keep the data ordered, well stored, and easily accessible.

Now, let’s get to the topic of this post. School data migration, basically, is the process of moving the school's academic data from one system, storage medium, or format to another, and making it ready for further use. The need for this work comes naturally just after the school decides to purchase and implement Skoolee. Do not confuse data migration with backing up the data or making another copy of it, and stashing this copy safely away. The migrated data is dynamic in the sense that it will be used by the target system that received it, and this is the main purpose of data migration. In Skoolee, school data migration is carried out when the school has historical data, that is, data of many past years preserved in certain formats, and wants to move it into Skoolee. The source’s formats could be, among other things, spreadsheets, simple text files, or databases of applications other than Skoolee.

Handling the migration

School data migration goes through several steps in the context of the ETL process (Extract-Transform-Load):

First, you need to extract the source data from its source storage location. Here, we ask the schools to provide us with their historical academic data that they want to migrate into Skoolee. Such data is typically about the school’s curriculum, classes, courses, teachers, students, parents, grades, tuition fees, and payments of past years. We will receive this data in whichever format it comes, hoping that the school has it stored electronically. Otherwise, you can only imagine the effort required to process years and years of data stored on paper. Then we proceed with the school to make sure the data is clean, and to resolve any inconsistencies, conflicts, or duplicates that it might hold. For example, in the extracted data, you could have two entries for the name of the same student, one of them is miss-spelled. So you need to remove the incorrect name entry and merge whatever true data it has connected to it with that of the correct entry.

In the second step, we make sure that the source data will fit properly in the destination space. To get there, the structure and type of the source data needs to match that of the destination data. Technically speaking, if we are migrating school data from one database to another, we need to make sure that the database schemas and data types are compatible. Basically, this is about preparing the necessary tables, fields, and relationships. Here, you might need to resolve some points. For example, one school uses four names for its student; the first, father, grandfather, and family names. Skoolee uses only three. We went on and added a fourth field in the student table to store the grandfather’s name, and so the migration of this data became complete and successful. You might face other issues, such as what if a source field is numeric yet Skoolee stores it as alphanumeric? On a broader sense, in this phase we transform the data.

As a third step, and once the formats and structures of the source and target data are set and compatible, it is time now to go on and migrate the school data, copying it from the source space and loading it into the destination space. This is code-work in its entirety, where a software engine proceeds to read data from here and to write it there. Along the way, the engine will be recording statistics and tagging the data records, taking note of any data inconsistencies. Date migration can be launched automatically in cases were such migration is required periodically and quite often. Skoolee has such a migration engine.

The fourth and final step is to validate and verify the migrated academic data, which now resides in the destination database. We validate that this data is clean, correct, and meaningful, and we verify that it is a true copy the original data. The importance of this step is evident; we simply do not want to end up working with unreliable data. Of course, you do not expect to go through this phase manually. Skoolee’s software procedures run and compare the source data with that of the destination version. At the end of the run, a report will show some statistics as well as the validated and mismatched data records.

And there is more to data

School data migration is a delicate and challenging work that requires good planning. Proper communication between WMT and the school is essential for the success of the work. If planned well, it will proceed smoothly and finish in the minimal possible time. Otherwise, you would be facing regressions and repercussions for quite a while after migration. Keep in mind that school data migration is a complete process, with its rules, code engine, and quality control. And make sure not to confuse it with a process called Data Integration or another one called Data Conversion. These will be for other posts.

All things said, one cannot stress enough the importance of data back. Do this regularly, and put the backup copy in a safe and separate place.