DataSphere: An Interdisciplinary and User-Focused Digital Environment for Decentralised Data Mobility

Proposal for the Next Stage Digital Economy Research Centre

Submission 9 July 2019

Project Leadership

Professor Irene Ng, Professor Carsten Maple, Professor Roger Maull, Professor Glenn Parry, Dr Hamed Haddadi, Dr Ben Livshits, Dr Joo Hee Oh, Dr Jat Singh, Professor Jon Crowcroft


DataSphere is the digital “twin” of the real economy. It is an alternative reality space created to develop and test behavioural, economic, technology, governance, regulation, policy and ethical issues of decentralised data mobility in a safe yet scalable environment before artefacts (including laws, products or services) go live. Every partner entity has a twin in DataSphere, be it a company, a person or a regulator. The DataSphere Next Stage Digital Economy Research Centre (NSDERC) will consolidate expertise provided by leading UK and international researchers in the emerging field of personal data ecosystem stewardship to address the specific challenges that industry, government and third sector providers are likely to encounter as they develop policies, practices and business models governing the processing and synthesis of centralised and decentralised forms of personal data (Department for Digital, Culture, Media and Sport, 2018). Because decentralised data is a new concept, agencies and firms will likely need to investigate how introducing consumer decision-making into the data-analytic mix could affect ecosystems, innovation and behaviours. DataSphere will provide researchers and research partners with access to a ‘safe space’ within which to observe these interactions and any significant changes in consumer behaviour that may arise, building on proofs of concept already demonstrated by The University of Warwick (Ng and Maple), Imperial College (Haddadi, Livshits) and The University of Cambridge’s Lab (Crowcroft).
To remain successful in an environment of enhanced consumer choice and control, firms and institutions will need to reconfigure their interactions with customers holding decentralised data to take account of those customers’ heightened bargaining power. This in turn requires better heuristics and governance mechanisms to reduce control fatigue.
These same organisations may also need to review their terms of business, legal provisions, financial strategies and marketing functions to take into account the opportunities and risks created by this process of partial information devolution and, just as importantly, by the right to mobilise and port one’s data according to consumer preference (Department for Digital, Culture, Media and Sport, 2018).

User Engagement: Decentralisation represents the digital equivalent of an unending coal face (Hern, 2016) that could be mined in perpetuity from two potential sources: business personal data, which is collected centrally with user consent but over whose processing, storage and resale the individual does not necessarily make decisions; and decentralised personal data, over which the consumer exercises a degree of control and which is stored locally, for example in a Databox (Amir-Chaudhry, Haddadi, & McAuley, 2015) or in the form of a HAT (Hub-of-All-Things) (Ng, 2018), a personal microserver with data ownership rights controlled solely by its owner. Both HAT and Databox are DE Project technologies under the remit of the centre and serve as the anchoring technologies for DataSphere. DataSphere is to serve its user community principally by enabling its research partners to explore and model the technological, regulatory, behavioural, business and economic implications that data devolution processes will raise, through the concept of ‘digital data twins’ (Grieves & Vickers, 2016; Marr, 2017). DataSphere would provide an ecosystem replica that goes beyond standard economic modelling to address a range of risk factors and potential benefits, including the behavioural, technological, regulatory, policy, legal and ethical implications of data decentralisation (CIs: Parry, Maull, Singh, Maple). Data can be copied many times and split into fragments, while significant attributes of that data are separated and changed. Data mobility allows emergent needs to be met with personalised solutions. Edge-data is where the value lies for many firms, particularly those for whom behavioural prediction represents an opportunity to tailor and personalise recommendations and respond more precisely to customer preferences, be they for reading the news, shopping or making wellbeing decisions (Department for Digital, Culture, Media and Sport, 2018).

Impact: Our express task will be to provide users with a space within which researchers and companies can experiment with edge-data to identify the optimal, and most responsible, uses for real-time information systems that could, for example, help high street retailers to maintain brand loyalty or help a local council to mitigate traffic congestion. As a simulated world, DataSphere would be used to demonstrate how these data-flows interact so that users may learn and test different data mobility models and technologies at some scale. We propose to create such a “live-but-safe” environment in the form of a data “sandbox” where firms, individuals and researchers can interact and data mobility models can be tested. The space will have its own currency and an ecosystem of services in which the currency would be used to purchase personalised services such as news, journeys, music and wellbeing recommendations. We term this environment “DataSphere”. DataSphere is a sandbox not merely for decentralised technology, but for economic engineering, behavioural experiments, ethical deliberations and regulatory controls, as well as for longitudinal study of how perceptions and behaviours evolve. The sandbox helps mitigate the risks of scaling decentralised DE technology. DataSphere will be fashioned not merely as a “space”, but as a “world”, with a population of 10,000 “citizens” recruited and managed by a professional partner, each earning a monthly income of DATeS (the DataSphere currency). Citizens commit to using and testing applications and software, to provide an understanding of how the market for and exchange of personal data inform data mobility models in the decentralised space, working alongside centralised models. DataSphere is expected to generate insight through pseudo-real transactions involving personalisation and recommendation services that use personal data.
DataSphere is therefore likely to attract companies at the leading edge of decentralised data co-creation to a sandbox within which they will provide services that accept DATeS, as well as the cryptocurrency expertise to run them. DataSphere citizens will spend DATeS on personalised news, journeys and other services that may be both offline and online. Within DataSphere, researchers will analyse interaction data. DE Projects such as Databox, HAT and Rumpel (a data literacy tool) will be deployed at scale, with partners as users of the technology, but in a safe space, to understand how risks can be mitigated and impact maximised when partners’ usage of the technology goes live. The commercial application of HAT technologies among insurtech, HR and other organisations demonstrates that it is possible to comply with the relevant data protection and portability regulations by giving each customer a secure personal data account in the form of a ‘HAT’ microserver (within which to store portable copies of the data legally owned by individuals themselves). The HATLAB consortium has also demonstrated how a nascent exchange in this data might be simulated, along with the economic, behavioural and legal challenges businesses might face if they choose to release this data for use. The results of this research have proven highly valuable to the industrial partners we work with [examples], many of whom are currently contending with the particular challenge of managing edge-data, collated in real time from private individuals, as a constant feed of information between the user, devices and data-handlers. This research has also demonstrated that personal data does not move in the same way as objects or equipment move in a supply chain.
DataSphere will, therefore, undertake activities to develop committed routes for exploitation, so that work can grow out of the sandbox into the live environment. It will also grow and support the pool of DE-trained interdisciplinary researchers by providing crucial development, testing, technical assessment and socioeconomic evaluation of decentralised data technologies.