NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 8389 Deep Leakage from Gradients

### Reviewer 1

This paper is easy to read and well structured. It raises an important privacy issue in naive collaborative learning. The problem statement is well formalized and the solution is well explained. I place this paper among those with major novelty and high significance. Detailed comments follow:

1. The authors assume that F is twice differentiable. They touch on this briefly where they replace ReLU with Sigmoid, but I would like to see a deeper discussion of this constraint. In what other common scenarios would the attacker need to replace network layers?
2. In the noisy-gradient defense strategy, what is the tradeoff between information leakage and accuracy loss?
3. The meaning of "iteration" is sometimes unclear. For example, the experimental setup states: " ... max iterations 20 and optimize for 1200 iterations and 100 iterations for image and text task respectively." When does "iterations" mean the number of iterations n of the for-loop in the DLG algorithm, and when does it mean the number of iterations of the original distributed training?
4. Does lower-precision training help, for example 8-bit precision training? Does it protect the training data?
5. Line 4 of the DLG algorithm computes the dummy gradient but writes it as $\Delta W_t$. Should it be $\Delta W'_t$?
6. Section 3.2: "to chooses" -> "to choose"
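To make comments 2 and 4 concrete, the two gradient perturbations in question can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the noise scale `sigma=0.01`, and the uniform min/max 8-bit quantization scheme are all assumptions chosen for the example, and the gradient is a random stand-in rather than one from a trained model.

```python
import random


def add_gaussian_noise(grad, sigma, rng):
    """Return a copy of grad with zero-mean Gaussian noise (std sigma) added."""
    return [g + rng.gauss(0.0, sigma) for g in grad]


def quantize_8bit(grad):
    """Uniformly quantize grad to 256 levels spanning [min(grad), max(grad)].

    Assumed scheme for illustration only; real low-precision training
    formats may differ.
    """
    lo, hi = min(grad), max(grad)
    if hi == lo:
        return list(grad)
    step = (hi - lo) / 255
    return [lo + round((g - lo) / step) * step for g in grad]


def l2_distance(a, b):
    """Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


rng = random.Random(0)
# Hypothetical flattened gradient; stands in for the shared gradient.
grad = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

noisy = add_gaussian_noise(grad, sigma=0.01, rng=rng)
quantized = quantize_8bit(grad)
step = (max(grad) - min(grad)) / 255

# How far each perturbed gradient is from the true one.
print(l2_distance(grad, noisy), l2_distance(grad, quantized))
```

The distances printed here only measure how much the shared gradient is perturbed. The tradeoff asked about in comment 2 would additionally require reporting model accuracy and reconstruction quality at each perturbation level.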

### Reviewer 2

Strengths:

a. Relevant findings

- Given the recent surge of interest in federated/collaborative learning, the authors' finding that gradients do capture private information is insightful and relevant.

b. Elegant approach

- The approach, unlike [27], is much simpler and requires weaker assumptions to reconstruct the input data.

Major concerns:

a. Attack model / gradient computation

- The authors look at the specific case of reconstructing raw private inputs from gradients computed in a single iteration on a small batch of images. The attack model assumes these gradients are shared with the adversary.
- However, participants in collaborative/federated learning scenarios share gradients/updates computed over multiple batches and epochs (see [26], Alg. 1); after all, this is communication efficient. In this particular case, I'm skeptical of the effectiveness of the proposed attack.
- Consequently, I'm concerned that the attack model (where the attacker uses a single gradient) is a contrived setting.
- Moreover, while it could be argued that in some distributed computation models, e.g., [14, 19, 23], the gradients from each iteration are indeed communicated, these models seem to cater to distributed computation in a cluster, in which the raw data is possibly already available to the adversary, bypassing the need to use gradients.

b. Missing details / writing

I strongly recommend that the authors make more passes to fix typos/grammar and add the many missing details whose absence makes the findings unclear:

- Implementation:
  * L135: CIFAR = CIFAR10 or CIFAR100?
  * L138: What is the batch size $N$ used in the experiments?
  * L134 / Eq. 4: Do you use all trainable parameters of the ResNet as $\nabla W'$?
  * L134: How were these models trained? What are their train/test accuracies?
- Results:
  * Figure 5: Is the blue line "L2 distance" over all parameters and the other lines over parameters of specific layers? Assuming it is, the green and red lines (parameters of layers closer to FC) have lower losses, so why not use these? How do the leaked images look in this case?
  * Figure 3, 4, ..: Are the qualitative results from a held-out test set that was not used to train $W$?
  * Figure 3, 4, ..: How/why were these images chosen? What does the reconstruction look like on a set of randomly sampled gradients?
  * Figure 7: What are the accuracies of the model when defending with these strategies? After all, if the accuracy of $W$ is retained with a prune ratio of 30% (Fig 7d), the attack can be easily defended against.
- Some unclear statements:
  * L119: "... batched data can have many different permutations ... N! satisfactory solutions ..." How? Won't they still produce the same loss irrespective of the permutation?

Other concerns:

c. Experimental depth

Overall, I find the experimental section somewhat shallow, leaving many questions unanswered:

- Is the model $W$ trained to convergence? Isn't it more interesting to evaluate the attack at various stages of training of $W$? After all, the proposed attack seems relevant primarily at train-time.
- How do the reconstruction results vary with batch size?
- How does the size/complexity of $W$ affect the effectiveness of the attack?

d. Other simple defenses

- Given that "The deep leakage becomes harder when batch size increases." [Table 1], wouldn't this also make for a good defense?
- Extending this argument and connecting to the point I raised earlier in (a): wouldn't averaging updates/gradients (computed over multiple batches) instead of gradients on a single batch also prevent reconstruction to a large extent? After all, the former is what's done in federated learning.
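
The permutation question raised about L119 can be checked on a toy case. The sketch below is an assumption-laden illustration, not the paper's setup: it uses a hypothetical linear model with squared loss and a made-up batch of three samples, and verifies that the batch-averaged gradient is the same for every ordering of the batch. Under this reading, an attacker matching that gradient cannot distinguish the N! orderings of a recovered batch.

```python
import itertools
import math


def batch_gradient(w, batch):
    """Mean gradient of 0.5 * (w.x - y)^2 over the batch, with respect to w."""
    grad = [0.0] * len(w)
    for x, y in batch:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            grad[i] += residual * xi / len(batch)
    return grad


# Hypothetical weights and batch, chosen only for illustration.
w = [0.5, -1.0, 2.0]
batch = [
    ([1.0, 2.0, 3.0], 1.0),
    ([0.0, -1.0, 4.0], 0.0),
    ([2.0, 2.0, -1.0], 1.0),
]

reference = batch_gradient(w, batch)
orderings = list(itertools.permutations(batch))

# Compare with a small tolerance, since summation order can change
# floating-point rounding.
all_match = all(
    all(
        math.isclose(a, b, abs_tol=1e-12)
        for a, b in zip(batch_gradient(w, list(p)), reference)
    )
    for p in orderings
)
print(len(orderings), all_match)
```

With N = 3 there are 3! = 6 orderings, and all of them yield the same averaged gradient, which is consistent with the observation that the loss does not depend on the permutation.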