“The Secret Sharer: Measuring Unintended Neural Network Memorization and Extracting Secrets”
Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, and Dawn Song, arXiv, February 22, 2018
Given access to a fully trained black-box decider, it is surprisingly easy to recover personally identifiable information (such as Social Security numbers and credit-card information) that was present in its training set. This paper works out some of the methods and suggests adding noise to the training data, as in differential-privacy schemes, as a solution.
The neural network's implicit memorization of information in its training data is not due to overfitting and occurs even if additional validation is carried out during the learning process specifically to stop the training before overfitting occurs.
The secrets of a black-box decider need not be extracted by brute-force testing of all possible secrets. The authors propose a more efficient algorithm that uses a priority queue of partially determined secrets to organize the search. (The priority is the total entropy of the posited components of the secret, accumulated as they are filled in during the search process; the search ends once all of the components have been filled in.)
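The priority-queue idea can be illustrated with a minimal sketch. This is not the authors' code: the names `extract_secret`, `next_char_probs`, and `toy_model` are hypothetical, the "model" is a stand-in function rather than a trained network, and the priority is taken to be the summed negative log-probability of the characters chosen so far, which is one reading of "total entropy" above.

```python
import heapq
import math


def extract_secret(next_char_probs, alphabet, length):
    """Best-first search for the most likely string of a given length.

    next_char_probs(prefix) is a hypothetical stand-in for querying the
    black-box model; it returns {char: probability} for the next character.
    """
    # Each queue entry is (cost so far, partial secret). The cost is the sum
    # of -log2(p) over the characters chosen so far, so lower is more likely.
    queue = [(0.0, "")]
    while queue:
        cost, prefix = heapq.heappop(queue)
        if len(prefix) == length:
            # Costs only grow as characters are added, so the first complete
            # secret popped from the queue has the lowest total cost.
            return prefix, cost
        probs = next_char_probs(prefix)
        for ch in alphabet:
            p = probs.get(ch, 0.0)
            if p > 0.0:
                heapq.heappush(queue, (cost - math.log2(p), prefix + ch))
    return None, math.inf


def toy_model(prefix, memorized="4271"):
    """Toy stand-in for a model that has memorized one 4-digit secret."""
    favored = memorized[len(prefix)]
    # The memorized digit gets probability 0.7; the other nine share 0.3.
    return {d: (0.7 if d == favored else 0.3 / 9) for d in "0123456789"}


secret, cost = extract_secret(toy_model, "0123456789", 4)
print(secret, round(cost, 3))
```

In this toy setting the search expands only the four prefixes along the memorized path before popping the complete secret, whereas brute force would have to score all 10,000 four-digit candidates. A real model's distribution would be far less sharply peaked, so the savings would be smaller.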