cs211-data-privacy

UVM CS 211: Data Privacy (Fall 2022)

Course Description

How can we learn from sensitive data collected from individuals, while protecting the privacy of those individuals?

This question is central to the study of data privacy, and is increasingly relevant with the widespread collection of our personal data. Analysis of this data can lead to important benefits for society, including advances in medicine and public infrastructure, but can also result in privacy breaches that expose our most closely-held secrets.

This course will explore both threats to privacy and solutions to the data privacy problem. We will demonstrate that traditional approaches to protecting privacy, such as anonymization, are subject to powerful attacks that reveal individuals’ sensitive data. We will see that while more recent approaches for protecting privacy, including k-anonymity and l-diversity, are more resistant to these attacks, they are not immune.

Then, we will explore recent formal notions of privacy, including differential privacy. Differential privacy provides a rigorous formal definition of individual privacy that enables a wide range of statistical analyses while protecting privacy. We will explore a number of differentially private algorithms for analytics and machine learning, and learn about the algorithmic building blocks and proof techniques used to develop them.
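As a small illustration of the kind of building block studied in the course, the sketch below answers a counting query using the Laplace mechanism, which appears in the schedule of topics. It is a minimal sketch and not code from the course materials: the function names (`laplace_noise`, `noisy_count`), the example data, and the choice of epsilon are all hypothetical. It assumes that adding or removing one record changes a count by at most 1 (sensitivity 1), and it ignores floating-point subtleties that a production implementation would need to handle.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution centered at 0.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Assumption: one individual's record changes the true count by at
    most 1, so the noise scale is sensitivity / epsilon = 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: ages of seven individuals.
ages = [23, 35, 41, 52, 67, 29, 38]
print(noisy_count(ages, lambda age: age >= 40, epsilon=1.0))
```

Smaller values of epsilon add more noise (stronger privacy, less accuracy); larger values add less. Each run prints a different value near the true count of 3.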

In addition to learning about the mathematical foundations of differential privacy, we will explore its practical implications. We will learn about existing practical systems for enforcing differential privacy and examine the challenges of building such systems. This course will include programming assignments and an end-of-semester project, in which students are expected to demonstrate both mastery of the concepts we explore and understanding of their practical implications by building their own systems that perform privacy-preserving analyses on real data.

Learning Objectives

By the end of this course, you will be able to:

Administrative

Resources

Textbook & Other References

Please do not buy any books for this course. All required reference material is available online for free.

The primary textbook we will use for this course is:

The following resources may also be useful for additional reading:

In addition to these, we will reference a number of academic papers throughout the semester (especially for the section on privacy-preserving machine learning).

Policies

Grading

Your grade for the course will be determined as follows:

Your final grade will be determined by summing the total number of points awarded and calculating the percentage of the total possible points. This percentage is translated into a letter grade as follows:

Undergraduate Students

Percent | Letter Grade
98-100 | A+
93-97 | A
90-92 | A-
87-89 | B+
83-86 | B
80-82 | B-
77-79 | C+
73-76 | C
70-72 | C-
67-69 | D+
63-66 | D
60-62 | D-
<60 | F

Graduate Students

Percent | Letter Grade
98-100 | A+
93-97 | A
90-92 | A-
87-89 | B+
83-86 | B
80-82 | B-
77-79 | C+
73-76 | C
70-72 | C-
<70 | F

Exams & Quizzes

There will be two exams: a midterm and a final. You will be allowed one page of notes for each exam. See the schedule below for the dates.

Homework Assignments and In-class Exercises

This course will use Python for examples and for programming assignments. Students are expected to be proficient in Python programming. Programming assignments will be distributed and turned in as Jupyter notebooks. Click here for instructions on installing Jupyter Notebook.

Assignment Submission: Homework and in-class exercises will be turned in via Blackboard.

To submit an assignment:

  1. Complete the released Jupyter Notebook by filling in answers to all the questions
  2. Submit the notebook file (the .ipynb file) as your solution on Blackboard

Please do not change the name of the .ipynb file; renaming it makes the grading process more difficult.

Please let me know if you have any questions about the submission process.

Late Work

Late work may be accepted, but you must make arrangements with me first. If you need to turn something in late, for any reason, please email me before the deadline. Depending on the circumstances, I may (or may not) impose a late penalty on your grade.

Collaboration & Allowed References

Collaboration on the high-level ideas and approach on assignments is encouraged. Copying someone else’s work is not allowed. Any collaboration, even at a high level, must be declared when you submit your assignment, in a note at the top of the assignment. E.g., “I discussed high-level strategies for solving problems 2 and 5 with Alex.”

The official references for the course are listed in the schedule below. Copying from references other than these is not allowed. In particular, code and proofs should not be copied from other sources, including Stack Overflow and other public sources.

Students caught copying work may fail the course immediately and may face disciplinary action by the University. All academic integrity misconduct will be treated according to UVM’s Code of Academic Integrity.

Final Projects

The course will include a final project, completed in groups of 1-3 students. The final project will demonstrate your mastery of the concepts covered in this course by implementing a practical system to perform privacy-preserving analysis of realistic data.

Click here for more complete information.

CS Student Research Day & Extra Credit

We will not hold class on Friday, September 23. I encourage you to attend CS Student Research Day and learn about the awesome research being done by CS students at UVM!

Schedule

Note that class will not be held on the following dates:

Important due dates:

Exam dates:

Homework dates:

Item | Due Date
Homework 1 | 9/12/22
Homework 2 | 9/19/22
Homework 3 | 9/26/22
Homework 4 | 10/3/22
Homework 5 | 10/17/22
Homework 6 | 10/24/22
Homework 7 | 10/31/22
Homework 8 | 11/7/22
Homework 9 | 11/14/22
Homework 10 | 12/5/22
Project proposals | 11/18/22
Final project writeup/video/implementation | 12/12/22

Schedule of topics:

Week of… | Topics | Reference
8/29/22 | Intro to data privacy; de-identification; re-identification (no exercise) | Ch. 1
9/5/22 | k-Anonymity and l-Diversity (no class Monday) | Ch. 2
9/12/22 | Intro to differential privacy | Ch. 3
9/19/22 | Sensitivity; Laplace mechanism; post-processing; composition & privacy budget (no class Friday) | Ch. 4, 5
9/26/22 | Sensitivity & clipping; approximate DP; advanced composition; Gaussian mechanism | Ch. 6
10/3/22 | Local sensitivity; propose-test-release; smooth sensitivity; sample-and-aggregate | Ch. 7
10/10/22 | Intermission. Review (exam Wednesday; no class Friday; no exercise) | None
10/17/22 | Recent variants of differential privacy | Ch. 8
10/24/22 | Exponential mechanism; sparse vector technique | Ch. 9, 10
10/31/22 | Privacy-preserving machine learning; differentially private SGD | Ch. 12
11/7/22 | Local differential privacy | Ch. 13
11/14/22 | Differentially private synthetic data | Ch. 14
11/21/22 | No class (Thanksgiving) |
11/28/22 | Privacy in deep learning; practical systems for privacy |
12/5/22 | Open challenges; review |

Accommodations

In keeping with University policy, any student with a documented disability interested in utilizing accommodations should contact SAS, the office of Disability Services on campus. SAS works with students and faculty in an interactive process to explore reasonable and appropriate accommodations, which are communicated to faculty in an accommodation letter. All students are strongly encouraged to meet with their faculty to discuss the accommodations they plan to use in each course. A student’s accommodation letter lists those accommodations that will not be implemented until the student meets with their faculty to create a plan. Contact SAS: A170 Living/Learning Center; 802-656-7753; access@uvm.edu; or www.uvm.edu/access

Religious Holidays

Students have the right to practice the religion of their choice. Each semester, students should submit their documented religious holiday schedule for the semester in writing to their instructors by the end of the second full week of classes. An arrangement can then be made to make up the missed work.

Student Athletes

To be excused from classes, student athletes should submit appropriate documentation of all scheduling conflicts to the professor within the first two weeks of class. Those missing class are expected to submit make-up assignments within a reasonable time period.