With the Dubai Data Law in 2015, Dubai became the world’s first city where the sharing of data between public sector entities was mandated by law. That was the beginning of a journey towards preparing entities to share data and helping them build the capabilities to do so.
Yet the mere sharing of data wasn’t enough. Data quality was crucial. To drive the aims of Digital Dubai, data users and consumers needed confidence that the data they were using was fit for purpose, up to date, and maintained consistently over time.
Data of consistently good quality spurs investment in apps, dashboards, and novel ways to use this data. On the flip side, any suspicion that a dataset might not be maintained, or might cease to be published, would hinder this investment. There would be no point in building infrastructure around an unreliable data source, after all.
So, what does data quality mean? The good news is that it is not a complex, abstract concept. Instead, it can be assigned key dimensions that can be scored objectively and quantitatively. To support data quality, Dubai Data Establishment (now a part of Digital Dubai) created a data quality manual as part of its overall ecosystem of policies and standards to help partners subscribe to the spirit of the Dubai Data Law.
Our quality manual scores data across twelve axes, ranging from format and publishing schedule to provenance, timeliness, completeness, and validation. The full guidelines can be found here. But suffice it to say that data should be published in open, machine-readable formats. It should stick to a clear publication and update schedule. Datasets should contain metadata that gives some context to the data. Timely data scores higher, as does data with better granularity.
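To make the idea of objective, quantitative scoring concrete, here is a minimal sketch in Python. It is not the manual's actual scoring method. The four checks, the list of open formats, the `Dataset` fields, and the equal weighting are all illustrative assumptions, covering only a handful of the twelve axes.

```python
from dataclasses import dataclass, field
from datetime import date

# Assumption: an illustrative list of open, machine-readable formats.
OPEN_FORMATS = {"csv", "json", "xml"}


@dataclass
class Dataset:
    """Hypothetical description of a published dataset."""
    file_format: str
    last_updated: date
    update_interval_days: int  # declared publication schedule
    metadata: dict = field(default_factory=dict)


def score_dataset(ds: Dataset, today: date) -> tuple[float, dict[str, bool]]:
    """Return a 0-100 score and the pass/fail result of each check.

    Every check carries equal weight, which is an assumption made
    for this sketch rather than a rule taken from the manual.
    """
    age_days = (today - ds.last_updated).days
    checks = {
        "format": ds.file_format.lower() in OPEN_FORMATS,
        "schedule": ds.update_interval_days > 0,
        "metadata": bool(ds.metadata.get("description")),
        "timeliness": age_days <= ds.update_interval_days,
    }
    score = 100 * sum(checks.values()) / len(checks)
    return score, checks
```

A dataset published as CSV with a description and a 30-day schedule passes every check while it is fresh, and loses the timeliness point once it goes more than 30 days without an update.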
For us, and our partner entities, the road towards better data quality came with the realisation that we had to look at the bigger picture. Data quality is not just the responsibility of the organisation publishing it. It is about strengthening the entire value chain, from collection to use. And in that sense, true data quality comes not just from the data itself but also from its management, and the rules and responsibilities surrounding it.
Getting data quality right also calls for a changing of mindset. Government entities need to stop thinking of themselves as mere data producers. Instead, they need to think of themselves as publishers with an audience, and with customers to serve.
These realisations were thrust very much into the spotlight as we commenced work on our registers: high-quality, verified, and structured datasets that were legally recognised as reference data under the Dubai Data Law.
Building a network of accurate, reliable, and interconnected data registries meant publishing entities had to be brought up to speed, and rapidly. This led Digital Dubai to quickly create and operationalise a data quality framework. We took inspiration from the data quality manual, focusing on the most important elements for data quality to build a framework that could be easily deployed.
Then, we took this framework and put it to work. We ran a series of business analysis workshops and assessments with our partner entities. The aim was to clearly define data quality together. Then, using open dataset polling, we’d measure current data quality. And from there, we’d move to remedial action planning: understanding barriers that might weaken data quality and helping partner entities address them. Open communication was key, as was understanding organisational culture. We encouraged implementation, helping entities take action that would make a discernible difference to data quality.
We took this data quality programme to over 40 entities as we worked on our HR register for the public sector. During this time, we managed to enhance collected data quality scores from 70% to 98%.
And now, we’re ready for the next step. First, we’re extending this data quality programme to all the entities contributing to the HR register — some 100 of them.
Then, we’re going to make adherence to this data quality programme standard across all datasets available on Dubai Pulse, not just registers. We want to work towards a position where our data quality framework is embedded in everything we do, from inventory and pipelines to ingestion. Eventually, data will have to meet the framework’s criteria to even be considered for the ingestion pipeline.
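As a rough illustration of what such a gate could look like, the Python sketch below admits a dataset to ingestion only when every quality dimension meets a minimum score. The function name, the dimension names, and the 0.9 threshold are hypothetical. They are not taken from the framework itself.

```python
def admit_for_ingestion(
    scores: dict[str, float],
    threshold: float = 0.9,  # assumption: illustrative minimum score per dimension
) -> tuple[bool, list[str]]:
    """Decide whether a dataset may enter the ingestion pipeline.

    `scores` maps each quality dimension to a value between 0 and 1.
    Returns the decision and the dimensions that fell short, so the
    publishing entity knows what to remediate.
    """
    failing = sorted(name for name, value in scores.items() if value < threshold)
    return (not failing, failing)
```

Returning the failing dimensions alongside the decision keeps the gate tied to remedial action: a rejected dataset comes back with a list of what needs fixing, not just a refusal.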
By working with entities to build quality into the data value chain from the outset, we can build a solid foundation of high-quality data that is invaluable to Dubai as a smart city that wants to become one of the world’s best places to live and work.