arXiv:2407.10058

Learning to Refuse: Towards Mitigating Privacy Risks in LLMs

Published on Jul 14
· Submitted by Spico on Jul 16
#2 Paper of the day

Abstract

Large language models (LLMs) exhibit remarkable capabilities in understanding and generating natural language. However, these models can inadvertently memorize private information, posing significant privacy risks. This study addresses the challenge of enabling LLMs to protect specific individuals' private data without the need for complete retraining. We propose RETURN, a Real-world pErsonal daTa UnleaRNing dataset comprising 2,492 individuals from Wikipedia with associated QA pairs, to evaluate machine unlearning (MU) methods for protecting personal data in a realistic scenario. Additionally, we introduce the Name-Aware Unlearning Framework (NAUF) for privacy protection, which enables the model to learn which individuals' information should be protected without affecting its ability to answer questions about other individuals. Our extensive experiments demonstrate that NAUF achieves a state-of-the-art average unlearning score, surpassing the best baseline method by 5.65 points, effectively protecting target individuals' personal data while maintaining the model's general capabilities.
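The abstract does not spell out how NAUF constructs its training signal; as a rough, hypothetical illustration of the general "name-aware refusal" idea only (the function and refusal template below are assumptions, not the authors' implementation), one could build fine-tuning targets that refuse questions about protected individuals while keeping answers for everyone else:

```python
# Hypothetical sketch: replace answers about protected (forget-set) individuals
# with a name-aware refusal, and keep answers for retain-set individuals.
# This is NOT the paper's actual NAUF objective, just an illustration.

def build_unlearning_data(qa_pairs, forget_names):
    """qa_pairs: list of (person, question, answer) triples.
    forget_names: individuals whose data should be protected."""
    forget_names = set(forget_names)
    finetune_data = []
    for person, question, answer in qa_pairs:
        if person in forget_names:
            # Name-aware refusal: the target mentions the protected person,
            # so the model learns *whose* information to withhold.
            target = (f"I'm sorry, but I cannot share personal "
                      f"information about {person}.")
        else:
            # Retain set: keep the original answer unchanged.
            target = answer
        finetune_data.append({"question": question, "target": target})
    return finetune_data

qa = [
    ("Alice Example", "Where does Alice Example live?", "Springfield"),
    ("Bob Sample", "What is Bob Sample's profession?", "Engineer"),
]
data = build_unlearning_data(qa, forget_names=["Alice Example"])
```

Here the forget-set question is mapped to a refusal naming the protected individual, while the retain-set QA pair is left untouched, mirroring the split the benchmark is meant to evaluate.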

Community

Paper author and submitter:

This paper focuses on machine unlearning approaches in the field of privacy protection.

We propose a new benchmark together with a carefully designed method that helps models refuse to answer questions about specific individuals' private information while retaining their commonsense knowledge.


Kudos @zhliu and team. I've featured this paper in my AI research newsletter https://www.aitidbits.ai/p/july-18th-2024
Looking forward to more novel papers and methods.

Paper author:

Thanks for sharing! :)

