Does the water industry need to collaborate more on big data?

Posted 31 July 2017

The water industry is drowning in big data, which means collaboration between utilities, research centres, universities, governments and businesses is required to make sense of what’s out there – and what’s missing.

The reservoir of information runs deep; for example, the Bureau of Meteorology alone holds more than 40 million files composed of more than 4 billion time-series observations, and each day there’s an influx of another 15,000 files into the Bureau’s system. 
 
And utility-generated data is rapidly proliferating as operating systems become smarter. But that’s just scratching the surface, said Professor of Big Data Ecosystems for Business and Society at the University of Amsterdam Dr Sander Klous.

“Big data isn’t only sensor or consumption data – it’s everything around it: machines, people, organisations,” said Klous, who is also partner in charge of data analytics at KPMG in the Netherlands.

“You need as many [data] windows as you can on a certain subject to understand behaviour. But the challenge is that these windows usually aren’t all owned by a single organisation.”

Australian Water Professional of the Year 2016 and Group Leader of CSIRO’s Data61 Dr Fang Chen agreed: “Everyone holds value, which is part of the jigsaw. It’s only through collaboration that we will achieve the best value, better user satisfaction, better efficiency and profits, a better quality of life and a better environment.” 

One of the first steps should be to map out exactly what data we have, Chen said. Efforts to pool data include the Australian National Data Service’s discovery portal and CSIRO’s Data Access Portal. The Bureau is also working on software to streamline data provision and sharing, said Assistant Director of Water Information Services Dr Robert Argent. 

“We’re working with urban water utilities to create what’s essentially a one-stop shop for them to meet multiple reporting requirements through one upload system. Then we do analysis and pass data onto others such as the Australian Bureau of Statistics,” he said.

Goyder Institute for Water Research Director Dr Michele Akeroyd noted that with so much data being generated and only so much capacity to store and process it, setting priorities was vital, as was investing time and money.

“There’s so much business to do – you get caught up responding to issues that need attention now, rather than having forethought into longer-term issues and strategic thinking,” she said.

Another barrier to effective collaboration is a lack of standardisation and quality controls. However, in recent years there have been concerted efforts to overcome these challenges, such as through the Water Information Research and Development Alliance, which improved data exchange standards, and the Water Monitoring Standardisation Technical Committee (WaMSTeC).

While these developments have gone a long way to improve trust in data, there are still reasons – commercial and legal – for keeping it to yourself, noted Akeroyd.    

“Some data has confidentiality issues, which limits its application and sharing. Other times it might be IP-associated constraints,” she said. 

That’s not to say that sharing expertise doesn’t happen, but if it’s the raw data you’re looking to interrogate, Klous recommended a ‘privacy by design’ approach, which allows you to analyse data owned by various sources.

“You run part of the data analysis in one digital container [controlled by the owner of that particular data set], part in another container and there’s no communication or interaction possible between the containers,” he said.

“The data is then combined for analysis inside what basically functions like a black box: it’s inaccessible for everybody. At a certain point that black box releases the results it has and then destroys itself.”
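The approach Klous describes can be sketched roughly in code. The Python below is a simplified illustration only, not Klous’s or KPMG’s implementation: the names `OwnerContainer`, `BlackBox` and `combined_mean` are hypothetical, the readings are made up, and it assumes each owner exports only a sum and a count rather than raw records.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Summary = Tuple[float, int]  # (sum of readings, number of readings)


@dataclass
class OwnerContainer:
    """Hypothetical container controlled by a single data owner."""
    owner: str
    _readings: List[float]  # raw data stays inside this container

    def local_summary(self) -> Summary:
        # Only an aggregate leaves the container, never the raw readings.
        return sum(self._readings), len(self._readings)


class BlackBox:
    """Hypothetical one-shot combiner: releases one result, then wipes itself."""

    def __init__(self) -> None:
        self._summaries: Optional[List[Summary]] = []

    def submit(self, summary: Summary) -> None:
        if self._summaries is None:
            raise RuntimeError("black box has been destroyed")
        self._summaries.append(summary)

    def release_and_destroy(self) -> float:
        if self._summaries is None:
            raise RuntimeError("black box has been destroyed")
        total = sum(s for s, _ in self._summaries)
        count = sum(n for _, n in self._summaries)
        self._summaries = None  # discard all inputs once the result is computed
        if count == 0:
            raise ValueError("no readings were submitted")
        return total / count


def combined_mean(containers: List[OwnerContainer]) -> float:
    """Average across owners without any owner seeing another's raw data."""
    box = BlackBox()
    for container in containers:
        box.submit(container.local_summary())
    return box.release_and_destroy()


if __name__ == "__main__":
    # Made-up daily consumption figures for two imaginary utilities.
    utility_a = OwnerContainer("utility_a", [120.0, 130.0])
    utility_b = OwnerContainer("utility_b", [100.0, 110.0, 140.0])
    print(combined_mean([utility_a, utility_b]))
```

In this sketch the containers never talk to each other, and the black box refuses further use after it has released its single result. A real deployment would need enforced isolation and access controls, which plain Python objects do not provide.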

Another example of an emerging tech-enabled collaboration tool is Siemens’ MindSphere, a cloud-based, open operating system for the industrial Internet of Things that records and analyses large volumes of production data.

Data collaboration in the era of big data, the Internet of Things and self-destructing black boxes sounds overwhelming and futuristic, so a word of warning: don’t try to go it alone, said Data61’s Chen.

“People often ask me if they should have a data science team within their utility versus contracting it out,” she said. 

“My view is rather than trying to build a small team to compete with others, it’s better to set up strategic alliances. Otherwise the expertise is spread too thin.”

Read more about big data collaboration in the latest issue of Current magazine.