Data Suppression

LAST UPDATED:

In education, data suppression refers to the process of withholding or removing selected information, most commonly in public reports and datasets, to protect the identities, privacy, and personal information of individual students, teachers, or administrators. Data suppression is used whenever there is a chance that the information contained in a publicly available report could be used to reveal or infer the identities of specific individuals.

Data Suppression vs. Data Masking
The terms “data suppression” and “data masking” refer to similar yet distinct processes, although in some cases the terms may be used interchangeably. When data are suppressed, the information is entirely removed or deleted, most commonly in files and reports that are publicly shared. When data are masked, the information is concealed from view or encrypted in a file, but the masked data remain encoded in the file or database and can be accessed (or “re-identified”) by those with the proper authorization codes or passwords. For a more detailed discussion, see data masking.

When sharing data publicly, or with third parties such as contractors or researchers, state education agencies and school districts are generally required to take steps to protect individual privacy. In addition to suppressing data that would directly reveal the identity of individuals, such as names and Social Security numbers, education agencies will also modify datasets (e.g., by “suppressing” selected information) that may indirectly reveal the identities of specific students even when the data seemingly contain no personally identifiable information.

For example, some small, rural schools have very small minority student populations, perhaps only one or two students of color in the entire school. If state or school records contain, say, test scores or graduation rates for various racial subgroups, the identity of individual African American, Hispanic, or Asian students could be inadvertently revealed even though the data are otherwise “anonymous”: by looking at the data, those who are familiar with the school, or who know who the minority students are, may be able to deduce which students earned which test scores. For this reason, states, districts, and schools may suppress (i.e., not publicly report or share) certain data when subgroups are small enough to potentially connect otherwise anonymous data to specific students.

Suppression may also be needed when reporting percentages. If a report shows that 100 percent, or zero percent, of students in a particular grade at a school scored at a certain level on a test, for example, the result applies to every student in that grade, so any readers familiar with the school will have learned personal information about individual students.
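The percentage rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `report_percentage` and the `*` placeholder are hypothetical, and actual agencies may use different symbols, ranges (such as reporting a capped range instead of a placeholder), or rounding rules.

```python
SUPPRESSED = "*"  # hypothetical placeholder shown in place of a withheld value


def report_percentage(count_at_level: int, total: int) -> str:
    """Return a publishable percentage, or a placeholder for 0 or 100 percent.

    Assumption: the check is made on the rounded value, so a figure that
    would be displayed as "0%" or "100%" is never published.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of students")
    if count_at_level < 0 or count_at_level > total:
        raise ValueError("count_at_level must be between 0 and total")

    rounded = round(100 * count_at_level / total)
    if rounded <= 0 or rounded >= 100:
        # Every (or no) student is at this level, so the figure would
        # reveal each individual student's result.
        return SUPPRESSED
    return f"{rounded}%"
```

For instance, `report_percentage(15, 20)` yields a normal percentage string, while `report_percentage(20, 20)` and `report_percentage(0, 20)` both yield the placeholder.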

State education agencies and districts will typically have policies on data suppression that outline what types of data need to be suppressed in specific situations. For example, a policy may require the suppression of data in public reports when any subgroup represents fewer than five students. Most public data reports will explain why certain data have been suppressed.
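A minimum-size policy like the one in the example above might be applied as in the following sketch. The threshold of five comes from the example policy; the field names (`subgroup`, `n_students`, `avg_score`), the `*` placeholder, and the sample rows are all hypothetical and do not describe any real agency's data or procedure.

```python
MIN_GROUP_SIZE = 5   # from the example policy: suppress subgroups of fewer than five
SUPPRESSED = "*"     # hypothetical placeholder for withheld values


def suppress_small_groups(rows, min_size=MIN_GROUP_SIZE):
    """Return a copy of the report rows with small subgroups withheld.

    Each row is a dict with 'subgroup', 'n_students', and 'avg_score'.
    Rows below the threshold keep their label but lose their values, and
    a note explains why the data were suppressed.
    """
    published = []
    for row in rows:
        if row["n_students"] < min_size:
            published.append({
                "subgroup": row["subgroup"],
                "n_students": SUPPRESSED,
                "avg_score": SUPPRESSED,
                "note": f"Suppressed: fewer than {min_size} students",
            })
        else:
            published.append({**row, "note": ""})
    return published


# Made-up sample data for illustration only.
sample_rows = [
    {"subgroup": "Group A", "n_students": 42, "avg_score": 78.5},
    {"subgroup": "Group B", "n_students": 3, "avg_score": 91.0},
    {"subgroup": "Group C", "n_students": 5, "avg_score": 81.0},
]
public_report = suppress_small_groups(sample_rows)
```

In this sketch only Group B is withheld, because a subgroup of exactly five students is not "fewer than five." The sketch also does not cover cases where a withheld value could be recomputed from published totals, which a real policy would need to consider.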

For related discussions, see de-identified data, personally identifiable information, student-level data, and unique student identifier.
