arXiv:2407.10058

Learning to Refuse: Towards Mitigating Privacy Risks in LLMs

Published on Jul 14
· Submitted by Spico on Jul 16
#2 Paper of the day

Abstract

Large language models (LLMs) exhibit remarkable capabilities in understanding and generating natural language. However, these models can inadvertently memorize private information, posing significant privacy risks. This study addresses the challenge of enabling LLMs to protect specific individuals' private data without the need for complete retraining. We propose RETURN, a Real-world pErsonal daTa UnleaRNing dataset comprising 2,492 individuals from Wikipedia with associated QA pairs, to evaluate machine unlearning (MU) methods for protecting personal data in a realistic scenario. Additionally, we introduce the Name-Aware Unlearning Framework (NAUF) for privacy protection, which enables the model to learn which individuals' information should be protected without affecting its ability to answer questions about other individuals. Our extensive experiments demonstrate that NAUF achieves a state-of-the-art average unlearning score, surpassing the best baseline method by 5.65 points, effectively protecting target individuals' personal data while maintaining the model's general capabilities.
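The abstract does not spell out how NAUF constructs its training signal; as a rough, hypothetical illustration of the general "name-aware refusal" idea only (the function and refusal template below are assumptions, not the authors' implementation), one could build fine-tuning targets that refuse questions about protected individuals while keeping answers for everyone else:

```python
# Hypothetical sketch: replace answers about protected (forget-set) individuals
# with a name-aware refusal, and keep answers for retain-set individuals.
# This is NOT the paper's actual NAUF objective, just an illustration.

def build_unlearning_data(qa_pairs, forget_names):
    """qa_pairs: list of (person, question, answer) triples.
    forget_names: individuals whose data should be protected."""
    forget_names = set(forget_names)
    finetune_data = []
    for person, question, answer in qa_pairs:
        if person in forget_names:
            # Name-aware refusal: the target mentions the protected person,
            # so the model learns *whose* information to withhold.
            target = (f"I'm sorry, but I cannot share personal "
                      f"information about {person}.")
        else:
            # Retain set: keep the original answer unchanged.
            target = answer
        finetune_data.append({"question": question, "target": target})
    return finetune_data

qa = [
    ("Alice Example", "Where does Alice Example live?", "Springfield"),
    ("Bob Sample", "What is Bob Sample's profession?", "Engineer"),
]
data = build_unlearning_data(qa, forget_names=["Alice Example"])
```

Here the forget-set question is mapped to a refusal naming the protected individual, while the retain-set QA pair is left untouched, mirroring the split the benchmark is meant to evaluate.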

Community

Paper author and submitter:

This paper focuses on machine unlearning approaches in the field of privacy protection.

We propose a new benchmark together with a carefully designed method that helps models refuse to answer questions about specific individuals' private information while retaining their commonsense knowledge.


Kudos @zhliu and team. I've featured this paper in my AI research newsletter https://www.aitidbits.ai/p/july-18th-2024
Looking forward to more novel papers and methods.

Paper author:

Thanks for sharing! :)

