Publications | Science of Security Virtual Organization

Differential Privacy Techniques for Healthcare Data

This paper analyzes techniques to enable differential privacy by adding Laplace noise to healthcare data. First, as healthcare data contain natural constraints for data to take only integral values, we show that drawing only integral values does not provide differential privacy. In contrast, rounding randomly drawn values to the nearest integer provides differential privacy. Second, when a variable is constructed using two other variables, noise must be added to only one of them. Third, if the constructed variable is a fraction, then noise must be added to its constituent private variables, and not to the fraction directly. Fourth, the accuracy of analytics following noise addition increases with the privacy budget, ϵ, and the variance of the independent variable. Finally, the accuracy of analytics following noise addition increases disproportionately with an increase in the privacy budget when the variance of the independent variable is greater. Using actual healthcare data, we provide evidence supporting the two predictions on the accuracy of data analytics. Crucially, to enable accuracy of data analytics with differential privacy, we derive a relationship to extract the slope parameter in the original dataset using the slope parameter in the noisy dataset.

Authored by Rishabh Subramanian
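The first finding in the abstract above (draw continuous Laplace noise, then round to the nearest integer) can be illustrated with a minimal sketch. The function name `laplace_round`, the sensitivity of 1, and the sample ages are assumptions made for illustration and do not come from the paper. The sketch relies on the standard property that post-processing a differentially private output, here by rounding, does not weaken the guarantee.

```python
import numpy as np

def laplace_round(values, sensitivity, epsilon, rng):
    """Add continuous Laplace noise, then round each result to the nearest integer."""
    scale = sensitivity / epsilon  # Laplace scale b = sensitivity / epsilon
    values = np.asarray(values, dtype=float)
    noisy = values + rng.laplace(0.0, scale, size=values.shape)
    # Rounding is applied after the noise is drawn (post-processing)
    return np.rint(noisy).astype(int)

rng = np.random.default_rng(0)
ages = [34, 51, 67, 45]  # hypothetical integer-valued records
private_ages = laplace_round(ages, sensitivity=1.0, epsilon=0.5, rng=rng)
print(private_ages)
```

A larger privacy budget `epsilon` gives a smaller noise scale, which is consistent with the abstract's statement that accuracy increases with ϵ.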

Privacy-Preserving Cloud Data Model based on Differential Approach

With the variety of cloud services available, cloud service providers deliver machine learning services that are used in many applications, including risk assessment, product recommendation, and image recognition. The cloud service provider initiates a protocol for the classification service so that data owners can request an evaluation of their data. The owners may not entirely trust the cloud environment because third parties manage it, so protecting data privacy while sharing data is a significant challenge. A novel privacy-preserving model is proposed, which is based on differential privacy and machine learning approaches. The proposed model allows various data owners to store, share, and use data in the cloud environment. Experiments are conducted on the Blood Transfusion Service Center, Phoneme, and Wilt datasets to evaluate the proposed model's efficiency in terms of accuracy, precision, recall, and F1-score. The results show that the proposed model achieves accuracy, precision, recall, and F1-score of up to 97.72%, 98.04%, 97.72%, and 98.80%, respectively.

Authored by Rishabh Gupta, Ashutosh Singh

Localized Differential Location Privacy Protection Scheme in Mobile Environment

When users request location services, they can easily expose private information, and schemes that rely on a third-party server for location privacy protection place high demands on the credibility of that server. To solve these problems, a localized differential privacy protection scheme for the mobile environment is proposed. It uses a Markov chain model to generate a probability transition matrix and adds Laplace noise to construct a location confusion function that satisfies differential privacy. Location confusion is performed on the client, which constructs and uploads anonymous areas. Simulation experiments show that the scheme solves the problem of the untrusted third-party server and is highly efficient while ensuring high availability of the generated anonymous areas.

Authored by Liu Kai, Wang Jingjing, Hu Yanjing
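The two ingredients named in the abstract above, a Markov transition matrix and Laplace noise applied on the client, can be sketched roughly as follows. Everything specific here is an assumption for illustration: the grid-cell encoding of locations, the function names `transition_matrix` and `confuse_location`, and the way noise is combined with the transition row. The abstract does not describe the actual confusion function, and this sketch is not claimed to reproduce it or its privacy guarantee.

```python
import numpy as np

def transition_matrix(cells, n_cells):
    """Estimate a first-order Markov transition matrix from a sequence of visited cells."""
    counts = np.zeros((n_cells, n_cells))
    for src, dst in zip(cells[:-1], cells[1:]):
        counts[src, dst] += 1
    # Cells with no observed departures fall back to a uniform row
    P = np.full((n_cells, n_cells), 1.0 / n_cells)
    visited = counts.sum(axis=1) > 0
    P[visited] = counts[visited] / counts[visited].sum(axis=1, keepdims=True)
    return P

def confuse_location(current, P, epsilon, rng):
    """Hypothetical confusion step: perturb the transition row with Laplace noise, then sample a cell."""
    noisy = P[current] + rng.laplace(0.0, 1.0 / epsilon, size=P.shape[1])
    noisy = np.clip(noisy, 0.0, None)  # probabilities cannot be negative
    if noisy.sum() == 0:
        noisy = np.ones_like(noisy)    # degenerate case: sample uniformly
    return int(rng.choice(len(noisy), p=noisy / noisy.sum()))

rng = np.random.default_rng(1)
trajectory = [0, 1, 2, 1, 0, 1]        # hypothetical history of grid-cell indices
P = transition_matrix(trajectory, n_cells=4)
reported = confuse_location(current=1, P=P, epsilon=1.0, rng=rng)
print(P)
print(reported)
```

Both steps run on the device, so only the perturbed cell would leave the client, which matches the abstract's goal of not depending on a trusted third-party server.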

Modelling User Availability in Workflow Resiliency Analysis

Workflows capture complex operational processes and include security constraints limiting which users can perform which tasks. An improper security policy may prevent certain tasks being assigned and may force a policy violation. Deciding whether a valid user-task assignment exists for a given policy is known to be extremely complex, especially when considering user unavailability (known as the resiliency problem). Therefore tools are required that allow automatic evaluation of workflow resiliency. Modelling well-defined workflows is fairly straightforward; however, user availability can be modelled in multiple ways for the same workflow. Correct choice of model is a complex yet necessary concern as it has a major impact on the calculated resiliency. We describe a number of user availability models and their encoding in the model checker PRISM, used to evaluate resiliency. We also show how model choice can affect resiliency computation in terms of its value, memory and CPU time.

Authored by John Mace, Charles Morisset, Aad Van Moorsel
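To make the resiliency problem in the abstract above concrete, the following is a minimal brute-force sketch in Python rather than PRISM. It assumes one simple availability model, in which each user is independently available or unavailable for the whole workflow, and it considers only separation-of-duty constraints. The task names, user names, probabilities, and the functions `satisfiable` and `resiliency` are hypothetical and are not taken from the paper.

```python
from itertools import product

def satisfiable(tasks, auth, sod, available):
    """True if available, authorized users can cover all tasks without breaking a separation-of-duty pair."""
    options = [[u for u in auth[t] if u in available] for t in tasks]
    for choice in product(*options):  # yields nothing if any task has no candidate
        plan = dict(zip(tasks, choice))
        if all(plan[a] != plan[b] for a, b in sod):
            return True
    return False

def resiliency(tasks, auth, sod, p_avail):
    """Probability that a valid assignment exists, summed over every availability outcome."""
    users = sorted(p_avail)
    total = 0.0
    for mask in product([True, False], repeat=len(users)):
        available = {u for u, up in zip(users, mask) if up}
        prob = 1.0
        for u, up in zip(users, mask):
            prob *= p_avail[u] if up else 1.0 - p_avail[u]
        if satisfiable(tasks, auth, sod, available):
            total += prob
    return total

tasks = ["t1", "t2"]
auth = {"t1": ["alice", "bob"], "t2": ["bob", "carol"]}  # who may perform each task
sod = [("t1", "t2")]                                     # t1 and t2 need different users
p_avail = {"alice": 0.9, "bob": 0.9, "carol": 0.9}
r = resiliency(tasks, auth, sod, p_avail)
print(r)
```

The enumeration is exponential in the number of users and tasks, which reflects the complexity the abstract mentions. A different availability model, for example one where availability changes per task, would give a different value for the same workflow.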