At Digital Dubai, one of our key aims is to make citywide data available to individuals, companies and public sector entities so that we can improve decision-making. Data sharing is the foundation of creating a smart city. Dubai Pulse is our digital platform to make data shareable and searchable. 

However, effective data sharing is very much dependent on the process of data ingestion — i.e., how the data is requested from Dubai’s public sector entities and uploaded to the Dubai Pulse platform after quality checks. 

As our Data Sharing Toolkit notes, the challenges facing complex data projects are both human and technological in nature. Dubai Pulse is no different. In trying to build a rich picture of Dubai through data, Digital Dubai is tasked with gathering data from public sector entities Emirate-wide, which results in a complex ingestion process that must deal with discrepancies in definition and detail, verify that data is correct, and cope with variations in quality and completeness.

While the first steps of the acquisition process are interpersonal — requiring liaising, conversations and negotiations — the actual act of ingestion is a technological challenge.

Under the old approach, Digital Dubai relied on batch ingestion. First, data agreements with custodian entities would be finalised. Our service provider would then develop and ingest all collected data agreements once a month, customising fields and other parameters on Dubai Pulse as required.

In many ways, the process was reminiscent of the classical “waterfall model” of systems development, where several discrete linear steps lead to the final outcome. And just as a lack of agility was a valid criticism of the classic waterfall model, long lead times were often a hallmark of our ingestion process. Up to 60 days could elapse between data being finalised by a partner entity and its appearance on the Dubai Pulse platform.

Our ingestion process was ripe for disruption, so we disrupted it, working closely with our development and telecom partners to create a brand-new smart platform. Our Smart Ingestion Platform was the result of an intensive six-month development sprint. It is based on GitLab, the web-based open-source DevOps lifecycle tool. The idea was to build flexibility and rapid time-to-market into the ingestion process, and to cut ingestion times substantially.

There are other benefits too. The Smart Ingestion Platform is self-service, meaning that Digital Dubai teams can handle data ingestion internally. This frees up our service provider's and technology partners' time to focus on complex challenges as opposed to procedural tasks. In addition, moving capabilities in-house to Digital Dubai empowers our teams, and aligns with our goals of learning, development and building human capital.

Developing the Smart Ingestion Platform wasn’t without its challenges. We started with what we wanted to achieve in terms of uploading data to Dubai Pulse, and worked backwards. We explored the diverse data specifications our system would have to handle, and iterated rapidly to accommodate this diversity. Automatic validations were key, and these involved setting requirements for data input while catering for differences. Automating every validation rule was a challenge, but also very necessary to move away from a laborious manual validation process. 
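
To make the idea of automated validation concrete, here is a minimal Python sketch of rule-based checks that set requirements for each column while tolerating differences in how entities label their data. It is purely illustrative and is not the Smart Ingestion Platform's actual code: the `ColumnRule` class, the `validate_row` function, and the column names and aliases are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass
class ColumnRule:
    """One validation rule for one column (all names here are hypothetical)."""
    name: str                        # canonical column name
    parse: Callable[[str], object]   # must raise ValueError on bad input
    required: bool = True
    aliases: tuple[str, ...] = ()    # alternative headers different entities might use


def validate_row(row: dict[str, str], rules: list[ColumnRule]) -> list[str]:
    """Return a list of readable problems; an empty list means the row passed."""
    problems = []
    for rule in rules:
        # Cater for differences: accept the canonical header or any known alias.
        key = next((k for k in (rule.name, *rule.aliases) if k in row), None)
        value = row[key].strip() if key is not None else ""
        if not value:
            if rule.required:
                problems.append(f"{rule.name}: missing required value")
            continue
        try:
            rule.parse(value)
        except ValueError:
            problems.append(f"{rule.name}: cannot parse {value!r}")
    return problems


# Example rule set for an imaginary sensor dataset (assumed, not from the platform).
RULES = [
    ColumnRule("station_id", str, aliases=("StationID", "station")),
    ColumnRule("reading_date", date.fromisoformat),
    ColumnRule("value", float, required=False),
]

if __name__ == "__main__":
    print(validate_row({"StationID": "A1", "reading_date": "15/01/2023"}, RULES))
```

Keeping the rules as data rather than hand-written checks is one way such a system could accommodate new datasets without new code, though the real platform may work quite differently.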

At present, Digital Dubai is still acting as an intermediary in the data ingestion process. Government entities still send us their collated data, and we upload it using the Smart Ingestion Platform. A lot has changed, however: validation and verification, which were previously done manually, are now automated.

But we’re not stopping here. By automating the ingestion process through a capable smart system, we’re beginning a journey to a fully self-service platform. Eventually, the goal is to give partner entities direct access to the Smart Ingestion Platform so that they can validate and upload their data automatically. This will require adding further user interfaces once the system has proven itself.

Our development and adoption of the Smart Ingestion Platform brings a number of discussion points to the fore. First, it emphasises that any data sharing project is only as successful as the sum of its parts, with bottlenecks anywhere in the process slowing down the end product. Second, it emphasises the customer-centric nature of the work that Digital Dubai is doing. Not only do we want to make better data available faster to our external audiences, but we also want to communicate more rapidly with our internal customers and partner entities. With the Smart Ingestion Platform, we can ingest a dataset in just 3 days instead of 30 days. And finally, it offers a case study of how smart automation can deliver tangible speed, cost and ownership benefits while freeing up partner teams to grapple with problems of a more strategic nature.