From a wireless network with hundreds of devices, to a city with thousands of Uber drivers, to a data center with tens of thousands of servers, the scale and complexity of the systems that must be managed and engineered today have exploded. Yet the design and analysis of stochastic systems remain challenging problems. A major focus of our group is developing novel analytical methods for the control and optimization of stochastic systems, drawing on theories from stochastic networks, control, information theory, and optimization. The topics we have studied include mean-field analysis for large-scale stochastic systems, stochastic modeling and dynamic resource allocation in cloud computing, and high-throughput, low-latency wireless networks.
Our world is increasingly connected. Many systems can be modeled as graphs, and many problems can be regarded as machine learning problems on graphs. Our group has been working on several problems related to learning on graphs, including (i) diffusion source localization, which aims to identify the source(s) of a diffusion process, such as the sources of fake news, patient zero of an epidemic, or the infusion hubs of human diseases in human gene regulatory networks; and (ii) the tracking, prediction, and control of complex networks and interdependent networks.
We look at this fundamental “privacy versus big data” tension from the perspective of cost-effective learning: the data collector uses an incentive mechanism to reward individuals for reporting informative data, and each individual controls their own data privacy by reporting noisy data, strategically choosing the level of privacy protection (that is, the level of noise added) to maximize their payoff. Intuitively, the higher the level of privacy protection, the noisier the data, which unfortunately reduces the effective sample size. One primary objective of this study is to rigorously characterize the trade-offs between privacy and statistical efficiency. We cast the problem in a game-theoretic setting, which allows us to quantify two fundamental trade-offs: between cost and accuracy from the data collector’s perspective, and between reward and privacy from an individual’s perspective. With the reward as the bridge between the two, this framework aims to resolve the apparent paradox between the data privacy that concerns an individual and the data usefulness that concerns the data collector.
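The link between added noise and effective sample size can be illustrated with a minimal sketch. It assumes, purely for illustration, that each individual adds independent zero-mean Gaussian noise to a Gaussian data point and that the collector estimates the population mean by the sample average; the text above does not specify the noise mechanism or the estimator, and the function names are hypothetical. Under these assumptions the variance of the sample mean is (data_var + noise_var) / n, so n noisy reports carry as much information about the mean as n * data_var / (data_var + noise_var) noise-free ones.

```python
import random
import statistics


def effective_sample_size(n, data_var, noise_var):
    """Noise-free sample count that gives the same variance of the sample mean.

    Assumes independent additive noise with variance noise_var per report
    (an illustrative assumption, not a mechanism taken from the text).
    """
    return n * data_var / (data_var + noise_var)


def simulate_mean_variance(n, data_std, noise_std, trials=2000, seed=0):
    """Empirical variance of the sample mean computed from noisy reports."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # Each report is a private value plus privacy noise chosen by the individual.
        reports = [
            rng.gauss(0.0, data_std) + rng.gauss(0.0, noise_std)
            for _ in range(n)
        ]
        means.append(sum(reports) / n)
    return statistics.pvariance(means)


if __name__ == "__main__":
    n, data_std = 100, 1.0
    for noise_std in (0.0, 1.0, 2.0):
        n_eff = effective_sample_size(n, data_std**2, noise_std**2)
        var_hat = simulate_mean_variance(n, data_std, noise_std)
        print(f"noise_std={noise_std}: n_eff={n_eff:.1f}, var(mean)~{var_hat:.4f}")
```

For example, with unit data variance and unit noise variance, 100 noisy reports are worth 50 noise-free ones for estimating the mean. Within this simplified model, that is the cost side of the trade-off: more privacy per individual means the collector must pay for more reports to reach the same accuracy.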