Behavioral Privacy

Privacy has to be considered alongside the use of smartphones and mobile sensors when developing mobile health research platforms. Smartphones have onboard sensors and can interface with a variety of external body-worn sensors, so they can be used for continuous and unobtrusive monitoring of individuals as they go about their daily lives.

Data collected in this manner can be shared with healthcare providers, who can use it to better understand how the environment influences an individual’s health and provide more proactive healthcare as a result.

But the potential for more affordable and more proactive healthcare through the use of continuously collected individual data must be paired with the concern for individual privacy. [1]

MD2K has made privacy an ongoing aspect of its research and has developed mSieve, a privacy framework for sharing physiological sensor data. mSieve defines a new behavioral privacy metric based on differential privacy. It uses a novel data substitution mechanism to protect behavioral privacy, which is expressed in terms of a whitelist and a blacklist of inferences.
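
The whitelist/blacklist idea can be illustrated with a minimal Python sketch. This is not the mSieve algorithm: mSieve selects substitutes so that a differential-privacy-based bound holds, whereas this sketch only swaps out windows that reveal a blacklisted inference, choosing donors round-robin. The names `Segment`, `substitute_sensitive`, and `safe_pool` are hypothetical and do not come from the MD2K software.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Segment:
    """A window of sensor data plus the behaviors it would reveal (hypothetical type)."""
    start: int                      # window index
    inferences: FrozenSet[str]      # behaviors a recipient could infer from this window
    samples: Tuple[float, ...]      # raw sensor values


def substitute_sensitive(
    segments: List[Segment],
    blacklist: FrozenSet[str],
    safe_pool: List[Segment],
) -> List[Segment]:
    """Return a releasable series in which no window reveals a blacklisted inference.

    Assumption: donors are picked round-robin from safe_pool. A real mechanism
    would choose substitutes so that the released series stays plausible.
    """
    if any(donor.inferences & blacklist for donor in safe_pool):
        raise ValueError("safe_pool contains a segment with a blacklisted inference")

    released = []
    for i, seg in enumerate(segments):
        if seg.inferences & blacklist:
            if not safe_pool:
                raise ValueError("no safe segment available for substitution")
            donor = safe_pool[i % len(safe_pool)]
            # Keep the original position, but release the donor's data instead.
            released.append(Segment(seg.start, donor.inferences, donor.samples))
        else:
            released.append(seg)
    return released
```

In this sketch the whitelist is implicit: any window that carries only non-blacklisted inferences passes through unchanged, so those inferences remain available to the data recipient.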

mCerebrum, the software platform developed by MD2K, implements privacy controls that provide a form of dynamic consent: study participants can stop data collection for a while in private settings. It also implements the Cryptimg software, which uses encryption for privacy-preserving outsourcing of image processing functions to untrusted cloud services.
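
A pause-style consent control of this kind can be sketched as a gate in front of the data store. The example below is only an illustration under stated assumptions; it is not mCerebrum's API, and the class `ConsentGate` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ConsentGate:
    """Drops incoming samples while the participant has paused collection (hypothetical)."""
    paused_until: Optional[float] = None                       # end of the pause, in seconds
    stored: List[Tuple[float, float]] = field(default_factory=list)

    def pause(self, now: float, duration_s: float) -> None:
        """Participant asks to stop collection for duration_s seconds."""
        self.paused_until = now + duration_s

    def resume(self) -> None:
        """Participant ends the pause early."""
        self.paused_until = None

    def is_paused(self, now: float) -> bool:
        return self.paused_until is not None and now < self.paused_until

    def offer(self, timestamp: float, value: float) -> bool:
        """Store the sample unless collection is paused; return True if it was stored."""
        if self.is_paused(timestamp):
            return False  # discarded, never written to storage
        self.stored.append((timestamp, value))
        return True
```

The design choice illustrated here is that paused samples are discarded at the point of collection rather than filtered later, so data from a private setting is never retained.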

[1] Nazir Saleheen; Supriyo Chakraborty; Nasir Ali; Md Mahbubur Rahman; Syed Monowar Hossain; Rummana Bari; Eugene Buder; Mani Srivastava; Santosh Kumar: mSieve: Differential Behavioral Privacy in Time Series of Mobile Sensor Data. In: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 706-717, ACM, New York, NY, USA, 2016, ISBN: 978-1-4503-4461-6.

Publications

Moustafa Alzantot, Supriyo Chakraborty and Mani B Srivastava.
SenseGen: A Deep Learning Architecture for Synthetic Sensor Data Generation. 2017.

@inproceedings{alzantot2017sensegen,
	author = "Moustafa Alzantot and Supriyo Chakraborty and Mani B. Srivastava",
	title = "SenseGen: A Deep Learning Architecture for Synthetic Sensor Data Generation",
	year = 2017,
	publisher = "IEEE",
	abstract = "Our ability to synthesize sensory data that preserves specific statistical properties of the real data has had tremendous implications on data privacy and big data analytics. The synthetic data can be used as a substitute for selective real data segments – that are sensitive to the user – thus protecting privacy and resulting in improved analytics. However, increasingly adversarial roles taken by data recipients such as mobile apps, or other cloud-based analytics services, mandate that the synthetic data, in addition to preserving statistical properties, should also be “difficult” to distinguish from the real data. Typically, visual inspection has been used as a test to distinguish between datasets. But more recently, sophisticated classifier models (discriminators), corresponding to a set of events, have also been employed to distinguish between synthesized and real data. The model operates on both datasets and the respective event outputs are compared for consistency. Prior work on data synthesis have often focused on classifiers that are built for features explicitly preserved by the synthetic data. This suggests that an adversary can build classifiers that can exploit a potentially disjoint set of features for differentiating between the two datasets. In this paper, we take a step towards generating sensory data that can pass a deep learning based discriminator model test, and make two specific contributions: first, we present a deep learning based architecture for synthesizing sensory data. This architecture comprises of a generator model, which is a stack of multiple Long-Short-Term-Memory (LSTM) networks and a Mixture Density Network (MDN); second, we use another LSTM network based discriminator model for distinguishing between the true and the synthesized data. 
Using a dataset of accelerometer traces, collected using smartphones of users doing their daily activities, we show that the deep learning based discriminator model can only distinguish between the real and synthesized traces with an accuracy in the neighborhood of 50%.",
	eprint = "1701.08886v1",
	archiveprefix = "arXiv",
	booktitle = "IEEE BICA'17 (co-located with IEEE Percom 2017)",
	pubstate = "forthcoming",
	tppubtype = "article",
	url = "https://md2k.org/images/papers/privacy/SenseGen_Alzantot.pdf"
}

Nazir Saleheen, Supriyo Chakraborty, Nasir Ali, Md Mahbubur Rahman, Syed Monowar Hossain, Rummana Bari, Eugene Buder, Mani Srivastava and Santosh Kumar.
mSieve: Differential Behavioral Privacy in Time Series of Mobile Sensor Data. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. 2016, 706-717.

@inproceedings{Saleheen:2016:MDB:2971648.2971753b,
	author = "Nazir Saleheen and Supriyo Chakraborty and Nasir Ali and Md Mahbubur Rahman and Syed Monowar Hossain and Rummana Bari and Eugene Buder and Mani Srivastava and Santosh Kumar",
	title = "mSieve: Differential Behavioral Privacy in Time Series of Mobile Sensor Data",
	booktitle = "Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing",
	year = 2016,
	series = "UbiComp '16",
	pages = "706--717",
	address = "New York, NY, USA",
	publisher = "ACM",
	abstract = "Differential privacy concepts have been successfully used to protect anonymity of individuals in population-scale analysis. Sharing of mobile sensor data, especially physiological data, raise different privacy challenges, that of protecting private behaviors that can be revealed from time series of sensor data. Existing privacy mechanisms rely on noise addition and data perturbation. But the accuracy requirement on inferences drawn from physiological data, together with well-established limits within which these data values occur, render traditional privacy mechanisms inapplicable. In this work, we define a new behavioral privacy metric based on differential privacy and propose a novel data substitution mechanism to protect behavioral privacy. We evaluate the efficacy of our scheme using 660 hours of ECG, respiration, and activity data collected from 43 participants and demonstrate that it is possible to retain meaningful utility, in terms of inference accuracy (90%), while simultaneously preserving the privacy of sensitive behaviors.",
	doi = "10.1145/2971648.2971753",
	isbn = "978-1-4503-4461-6",
	keywords = "Behavioral Privacy, Differential Privacy, mobile health",
	pubstate = "published",
	tppubtype = "inproceedings",
	url = "https://md2k.org/images/papers/privacy/mSieve-UbiComp-2016.pdf"
}

Supriyo Chakraborty, Chenguang Shen, Kasturi Rangan Raghavan, Yasser Shoukry, Matt Millar and Mani Srivastava.
ipShield: A Framework for Enforcing Context-aware Privacy. In Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation. 2014, 143–156.

@inproceedings{Chakraborty:2014:IFE:2616448.2616463,
	author = "Chakraborty, Supriyo and Shen, Chenguang and Raghavan, Kasturi Rangan and Shoukry, Yasser and Millar, Matt and Srivastava, Mani",
	title = "ipShield: A Framework for Enforcing Context-aware Privacy",
	booktitle = "Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation",
	year = 2014,
	series = "NSDI'14",
	pages = "143--156",
	address = "Berkeley, CA, USA",
	publisher = "USENIX Association",
	abstract = "Smart phones are used to collect and share personal data with untrustworthy third-party apps, often leading to data misuse and privacy violations. Unfortunately, state-of-the-art privacy mechanisms on Android provide inadequate access control and do not address the vulnerabilities that arise due to unmediated access to so-called innocuous sensors on these phones. We present ipShield, a framework that provides users with greater control over their resources at runtime. ipShield performs monitoring of every sensor accessed by an app and uses this information to perform privacy risk assessment. The risks are conveyed to the user as a list of possible inferences that can be drawn using the shared sensor data. Based on user-configured lists of allowed and private inferences, a recommendation consisting of binary privacy actions on individual sensors is generated. Finally, users are provided with options to override the recommended actions and manually configure context-aware fine-grained privacy rules. We implemented ipShield by modifying the AOSP on a Nexus 4 phone. Our evaluation indicates that running ipShield incurs negligible CPU and memory overhead and only a small reduction in battery life.",
	acmid = 2616463,
	isbn = "978-1-931971-09-6",
	location = "Seattle, WA",
	numpages = 14,
	url = "https://md2k.org/images/papers/privacy/nsdi14-paper-chakraborty.pdf"
}