Effective data migration

Much more than an academic exercise

Migrating information from one database to another is often a complex process. Simply automating it without first identifying the potential pitfalls is likely to cause significant problems.

Academic output is the lifeblood of every university. The research papers, reports, books and articles published by its staff are its product.

To measure individual staff members’ output, comply with government requirements and meet students’ educational needs, universities need to access, track and curate this ever-growing body of knowledge effectively.

At the University of Hull, librarians and archivists had previously done this using open source software Hydra. But this had limitations.

So, when the university wanted to better manage the entire research lifecycle, they needed a system with greater functionality. The decision was made to switch to Worktribe, a proprietary SaaS product.

How to connect different systems?

As well as creating a database of semantically-linked documents that could be searched and interrogated using keywords and tags, Worktribe could also automate the process of populating the database with the latest academic publications.

While this would greatly improve the system, there was one significant problem: how best to migrate the 4,500 or so records held on Hydra over to Worktribe.

As neither system ‘knew’ the other, there was no quick way to do this.

“As this was part of our larger Research Information Systems project for managing the university’s entire research lifecycle, it was critical we got things right,” says Alison Hudson, who was responsible for coordinating the migration.

Error-free migration required

While the university wanted to move the material quickly, they also wanted to ensure it happened without error. Even a single issue would call into question the integrity of the whole database. Was it just a one-off mistake, or the tip of a much larger iceberg of errors?

“We needed someone with a detailed understanding of both systems who could identify potential issues that we would never spot,” says Alison.

Web development firm Sauce, who were already working with the university, recommended Hull-based Lambda Functions and introduced Alison to the firm’s digital architect, Mike Clarke.

“The university needed to be sure that all project risks were identified,” says Mike. “And given our experience, I was confident we could spot even the less obvious pitfalls that others might miss.”

Working to a functional specification from Alison that detailed the university’s requirements, Mike developed a streamlined, three-step migration process.

“This was an important piece of work requiring a detailed understanding of both new and legacy systems.”

Alison Hudson, Business Systems Analyst, University of Hull

Validation essential

“Extracting records from the old system would be relatively straightforward,” says Mike. “The key question would be how would we know we had transferred everything over?”

To transfer the data, Mike first wrote a series of flexible scripts and then, most importantly, created a validation process to ensure every record was successfully migrated - precisely, with no errors or omissions.

“Even if nothing was done to a record, we would always know if this was through intent or error,” says Mike.
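The idea of accounting for every record can be illustrated with a small reconciliation sketch. It is not the software written for the project. The record identifiers, field names and the `reconcile` function are hypothetical, and it assumes both systems can be exported as simple mappings from record ID to record content. Every source record receives exactly one status, so a record that was not transferred is either deliberately skipped (with a recorded reason) or flagged as missing.

```python
from enum import Enum


class Status(Enum):
    MIGRATED = "migrated"   # present in target and identical to source
    SKIPPED = "skipped"     # deliberately not migrated; a reason is on record
    MISMATCH = "mismatch"   # present in target but content differs
    MISSING = "missing"     # absent from target with no recorded reason


def reconcile(source, target, skipped):
    """Assign one status to every source record.

    source, target: dict mapping record ID -> record content (hypothetical export)
    skipped: dict mapping record ID -> reason it was intentionally left out
    """
    report = {}
    for record_id, record in source.items():
        if record_id in skipped:
            report[record_id] = Status.SKIPPED
        elif record_id not in target:
            report[record_id] = Status.MISSING
        elif target[record_id] != record:
            report[record_id] = Status.MISMATCH
        else:
            report[record_id] = Status.MIGRATED
    return report


# Illustrative data only; these IDs and titles are invented.
source = {
    "rec-1": {"title": "Paper A"},
    "rec-2": {"title": "Paper B"},
    "rec-3": {"title": "Paper C"},
    "rec-4": {"title": "Paper D"},
}
target = {
    "rec-1": {"title": "Paper A"},
    "rec-3": {"title": "Paper c"},
}
skipped = {"rec-2": "duplicate of rec-1"}

report = reconcile(source, target, skipped)
problems = [rid for rid, status in report.items()
            if status in (Status.MISSING, Status.MISMATCH)]
```

In this sketch the migration would only be signed off once `problems` is empty, while the `skipped` mapping preserves the reason behind each intentional omission.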

With the university crosschecking random records to ensure accuracy, the project was completed without a problem in just eight weeks. Worktribe said it was the easiest and fastest migration they had seen.

“The reason that the whole process ran smoothly,” says Alison, “was because Mike had a real understanding of what we were trying to achieve.”

When the University of Hull needed to migrate its academic output archive from an in-house solution to a third-party software-as-a-service platform, they came to us to develop bespoke migration software and a workflow to ensure each record was migrated, validated, and accounted for.