The 2024-2025 edition of the LLM Privacy Project builds upon the findings of previous editions, focusing on enhanced privacy techniques, legal frameworks, and real-world applications of privacy-preserving AI.
This edition organizes its research into five phases:
This phase focuses on defining the core concepts and techniques related to privacy in large language models. It involves outlining key definitions, explaining the mechanisms of differential privacy, and analyzing the privacy challenges unique to LLMs, such as memorization and regurgitation of training data. The goal is to establish clear baseline standards that address the interplay between data utility and privacy.
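To make the mechanism concrete, the canonical building block of differential privacy is the Laplace mechanism: adding noise drawn from a Laplace distribution with scale sensitivity/epsilon to a numeric query yields epsilon-differential privacy. The sketch below is illustrative only; the function name and parameters are not drawn from the project.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release a noisy version of a numeric query answer.

    Adding Laplace noise with scale = sensitivity / epsilon satisfies
    epsilon-differential privacy for a query with the given L1 sensitivity.
    Smaller epsilon means stronger privacy but noisier answers.
    """
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Example: privately release a count of 1000 records.
# A counting query has sensitivity 1 (one person changes the count by at most 1).
noisy_count = laplace_mechanism(1000.0, sensitivity=1.0, epsilon=0.5)
```

The same principle underlies DP training of language models (e.g. DP-SGD), where noise is added to clipped gradients rather than to query answers.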
In this phase, practical experiments are conducted to test various privacy-preserving techniques across different datasets and use cases. The experiments assess how methods like differential privacy perform in terms of maintaining data utility while protecting sensitive information. This phase is crucial for understanding the trade-offs involved in applying these mechanisms to LLMs.
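The privacy-utility trade-off these experiments probe can be demonstrated with a small simulation: for the Laplace mechanism, the expected absolute error of a released answer grows as epsilon shrinks. This is a hedged sketch of the general phenomenon, not the project's actual experimental protocol.

```python
import numpy as np

def utility_error(epsilon: float, sensitivity: float = 1.0, trials: int = 10000) -> float:
    """Estimate the mean absolute error the Laplace mechanism adds
    to a query answer at a given privacy level epsilon."""
    noise = np.random.laplace(scale=sensitivity / epsilon, size=trials)
    return float(np.abs(noise).mean())

# Smaller epsilon (stronger privacy) yields larger expected error:
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:>4}: mean |error| ≈ {utility_error(eps):.2f}")
```

For the Laplace mechanism the mean absolute error equals the noise scale, sensitivity/epsilon, so halving epsilon doubles the expected error; experiments on LLMs face the same directional trade-off, though the utility metric (e.g. perplexity or downstream accuracy) is task-specific.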
Building on the findings from the first two phases, this phase is dedicated to developing a standardized methodology for measuring privacy guarantees in LLMs. It synthesizes theoretical insights and experimental results to create robust benchmarks. These benchmarks aim to provide a reliable framework for evaluating privacy in future LLM deployments.
This phase examines the existing legal and regulatory frameworks governing data privacy in the context of LLMs. It identifies gaps in current policies and forecasts future legal needs to better align privacy measures with emerging technologies. The objective is to ensure that the developed privacy standards are not only technically sound but also legally robust.
The final phase is focused on outreach and education, aiming to bridge the knowledge gap among the public and stakeholders regarding LLM privacy risks. Initiatives include developing courses, hosting workshops, and organizing speaking opportunities to disseminate best practices. This phase is designed to promote informed discussions and encourage the adoption of effective privacy-preserving strategies.
Principal Investigator: Rafal Kulik, PhD, Professor of Mathematics and Statistics at the University of Ottawa.
Co-investigator: Teresa Scassa, PhD, Canada Research Chair in Information Law and Policy.
Privacy Analytics continues to provide advisory support and coordinate access to relevant data and computing infrastructure.