Pittsburgh July 09, 2025
Pitt professor Amin Rahimian and PhD scholar Yuxin Liu seek to find the delicate balance between individual privacy and collective learning

Who Sees Who?

Adobe Stock image of interconnected users with data points where some points are marked as blurred or distorted

In the comedy Home Alone, a burglar posing as a police officer walks door to door during the holidays to find out which families will be traveling, leaving their empty homes easy prey for a break-in. The movie was released in 1990, just as the internet was growing increasingly available. 

Today, in a world connected by networks and high-speed internet, bad actors can learn a lot more than whether a family will be home for the holidays, and they can do it from thousands of miles away. For companies and organizations that collect valuable personal data, this threat raises an important question: how can they ensure individual privacy while effectively using network data to guide decision making?

Questions like this one have inspired the research of Amin Rahimian, assistant professor of industrial engineering at the University of Pittsburgh Swanson School of Engineering. His recent work with PhD student Yuxin Liu and his collaborator Marios Papachristou, an assistant professor of information systems at Arizona State University, highlights the promise of adaptive models of differential privacy (DP), a widely used framework that protects individual data by adding controlled noise to shared information. This research helps ensure companies and organizations can find that delicate balance between safely sharing and learning from network data while keeping individuals’ privacy intact.

A mathematical approach to ensuring privacy

“Privacy has always appealed to me because of its mathematical foundations,” said Rahimian, who also runs Pitt’s Sociotechnical Systems Research Lab. “Take differential privacy, a mathematical tool that limits information leakage. In statistics, we have good tools to address this challenge.”

DP keeps individual data private by introducing noise into a data set. This noise creates randomness and plausible deniability, making it difficult if not impossible to identify with certainty any individual’s private data. Yet if too much noise is introduced—if the privacy budget is too low—the possibilities of collective learning diminish.
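The tradeoff between the privacy budget and accuracy can be illustrated with the Laplace mechanism, a standard textbook DP technique. The sketch below is not taken from the researchers' papers. The function names, the readings, and the assumed range of 0 to 10 are hypothetical and chosen only for illustration. It releases a noisy average and measures how far that average lands from the true one as the privacy budget (epsilon) changes.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with the Laplace mechanism (illustrative)."""
    clipped = [min(max(v, lower), upper) for v in values]
    # Assumption: neighboring data sets differ by replacing one record,
    # so the mean can move by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    return statistics.fmean(clipped) + laplace_noise(sensitivity / epsilon, rng)


def average_error(values, epsilon, trials=2000, seed=0):
    """Average absolute error of the private mean over repeated releases."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(values)
    errors = [
        abs(private_mean(values, 0.0, 10.0, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


if __name__ == "__main__":
    # Hypothetical household readings, all inside the assumed [0, 10] range.
    readings = [2.0, 3.5, 1.2, 7.8, 4.4, 6.1, 0.9, 5.3]
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: average error {average_error(readings, eps):.3f}")
```

A smaller epsilon means a larger noise scale, so the released average drifts further from the truth. That is the sense in which a privacy budget that is too low limits what can be learned.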

In their 2025 paper “Differentially private distributed estimation and learning” (DOI: 10.1080/24725854.2024.2337068), which was highlighted in a June 2025 feature article in ISE Magazine, Rahimian and Papachristou tested DP in net metering, a networked system that provides homeowners credit for their excess solar electricity that goes back into the power grid.

Energy companies need to collect usage data to determine pricing and energy storage; however, bad actors with access to this data could identify household patterns and learn when best to strike. By adding noise to the data, organizations obscure whose data is whose.

Rahimian and Papachristou developed algorithms to estimate statistical properties of private signals, in this case the net metering energy measurements, which were distributed across a network of agents—the different houses. The algorithms adjusted noise levels based on the variability of the private signals and the communication patterns among agents.
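As a rough illustration of the general idea, and not the authors' algorithm, the sketch below has each agent add Laplace noise to its own signal once and then repeatedly average its estimate with its neighbors' estimates. It makes several simplifying assumptions: the network is a ring in which every agent has exactly two neighbors, the noise scale is set from a fixed signal range rather than adapted to the signals or to the communication pattern, and all names and values are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ring_neighbors(n):
    """Neighbor lists for a ring network: agent i talks to i-1 and i+1."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def private_consensus(signals, neighbors, epsilon, signal_range, rounds=200, seed=0):
    """Noisy signals followed by neighbor averaging (illustrative sketch).

    Assumes every agent has the same number of neighbors, so equal-weight
    averaging converges to the plain mean of the noisy starting values.
    """
    rng = random.Random(seed)
    # Each agent perturbs its own signal before sharing anything.
    # Assumption: one agent's signal can change by at most `signal_range`.
    estimates = [s + laplace_noise(signal_range / epsilon, rng) for s in signals]
    for _ in range(rounds):
        # Every agent replaces its estimate with the average of its own
        # estimate and those of its neighbors.
        estimates = [
            (estimates[i] + sum(estimates[j] for j in neighbors[i]))
            / (1 + len(neighbors[i]))
            for i in range(len(estimates))
        ]
    return estimates


if __name__ == "__main__":
    signals = [1.0, 4.0, 2.5, 6.0, 3.0, 5.5]  # hypothetical net metering readings
    estimates = private_consensus(signals, ring_neighbors(6), epsilon=1.0, signal_range=10.0)
    print("true mean:", sum(signals) / len(signals))
    print("agents' shared estimate:", estimates[0])
```

No agent ever shares its raw reading, yet all agents end up agreeing on a common estimate. How close that estimate is to the true mean depends on how much noise each agent had to add.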

To test their approach, they applied the algorithms to real-world energy consumption datasets collected from power grids in the U.S. and households in Germany. They demonstrated that their novel methods maintain privacy while producing accurate results efficiently, outperforming conventional DP approaches.

In a follow-up paper, “Differentially Private Distributed Inference” (DOI: 10.48550/arXiv.2402.08156), Rahimian and Papachristou studied similar tradeoffs for inference in discrete spaces, particularly in distributed hypothesis testing settings such as multicenter clinical trials.

In clinical trials, a network of hospitals can recruit patients to ensure enough subjects participate. However, this process requires sharing personal data across the network, which increases privacy risks.

Without reliable privacy guarantees, data sharing and analytics would require complex legal agreements to comply with healthcare privacy regulations, and patients, worried that bad actors might access their data, could be less inclined to share personal details. With insufficient test subjects and inaccurate data, clinical trials lose their efficacy.

“Our DP algorithms allow centers to exchange aggregate statistics to collectively test a hypothesis without sharing individual data,” said first author Papachristou.

Information traps and the unexpected potential of privacy


The balance between privacy and shared knowledge hinges on the sweet spot where enough noise is introduced to ensure privacy without hindering collective learning. But is it possible that increased noise—more privacy—could enhance shared knowledge?

In their paper “Differentially Private Sequential Learning” (DOI: 10.48550/arXiv.2502.19525), Rahimian and Liu explore this question through the lens of social learning, and they reach the unexpected conclusion that it can.

Take a scenario where two bakeries open in a neighborhood. People have private information that could guide their decision of which to support, but after enough people begin going to one bakery, increasing the public, shared information, a herd mentality takes over, and people will disregard their own private information and follow the crowd. These people get caught in an information cascade—or an information trap.
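The bakery scenario can be sketched with the classic textbook model of an information cascade. This is a minimal illustration and not the model from the paper. It contains no privacy noise, and it assumes that every customer's private signal is equally reliable and that an indifferent customer follows their own signal. The function name and the signal encoding are hypothetical.

```python
def simulate_cascade(signals):
    """Simulate customers choosing between bakery A (+1) and bakery B (-1).

    `signals` holds each customer's private signal in arrival order:
    +1 favors bakery A, -1 favors bakery B.
    """
    lead = 0  # net private signals the public can infer from past choices
    actions = []
    for signal in signals:
        if lead >= 2:
            # Public evidence for A outweighs any single private signal,
            # so the customer follows the crowd and reveals nothing new.
            action = 1
        elif lead <= -2:
            action = -1
        else:
            # The private signal still decides the choice, so the choice
            # reveals the signal and updates the public evidence.
            action = signal
            lead += signal
        actions.append(action)
    return actions


if __name__ == "__main__":
    # Two early customers favor A; the next three privately favor B.
    print(simulate_cascade([1, 1, -1, -1, -1]))  # prints [1, 1, 1, 1, 1]
```

After the first two customers pick bakery A, the remaining three pick it too, even though each privately favors bakery B. Their choices no longer carry any information, which is what makes the cascade a trap.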

In cases like vaccination or health fads, information traps can be dangerous, and understanding how best to combat them is vital. Rahimian and Liu developed a learning model that uses differential privacy and adapts its noise levels to preserve privacy.

“We found that information traps get less sticky when you introduce noise,” said first author Liu, who has presented the paper at Stanford University’s Conference on Network Science in Economics, and who recently received three travel grants to present the research at The Workshop on Privacy-Preserving Artificial Intelligence (PPAI), The Symposium on Foundations of Responsible Computing (FORC), and the Applied Probability Society Conference. “When people are uncertain if another’s actions are based on private knowledge, they begin to put more faith in their own understanding and make the correct choice. Adding noise increases the probability of learning.”

“When balancing individual privacy with collective learning, the two sides can seem at odds,” said Rahimian. “But whether trying to thwart bad actors or improve learning, if you can develop algorithms that can adjust and adapt, they might in fact be aligned.”