The New ISO27002 Controls – Data Masking

 

Continuing our series of blog articles looking at some of the new controls in ISO27002:2022, we now turn our attention to control A.8.11 Data masking. This is an area where some of the increased emphasis on privacy as well as information security starts to come through in the new version of the control set.

What is data masking?

If you have a set of data, particularly personally identifiable information (PII), and you want either to protect it in situ or to send it to a third party for processing, you might want to take some action to reduce the amount of PII included, in order to lower your risk and reduce your legal obligations.

There are a number of data masking techniques that can be used to anonymize data so that it can’t be associated with a living individual. If this is done well enough that individuals can no longer be identified, the data is no longer PII and so is not subject to privacy legislation, such as the GDPR.

A data masking example

Let’s take a simple example. A hospital receives a request from a research unit for some data about people with a particular condition. This data is PII and, under the GDPR, is also special category data because it concerns health. The hospital could simply provide the data complete with names, addresses, health reference numbers and so on. But doing so is risky because the data is leaving the hospital organization, which has little control over it once it arrives at the research unit. The hospital consults the researchers, and they agree that, for the intended purpose, the data doesn’t need to include names and addresses but does need approximate dates of birth and other treatment-related information. So the hospital uses data masking techniques to remove some of the information and processes other data items to make them vaguer, such as using year of birth rather than exact date. This means that it’s not possible to tell who each record refers to, but the information the research unit needs is still included. Because the data is now anonymous, it is no longer counted as special category PII.
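As a rough illustration of the hospital example, the sketch below drops the direct identifiers from a record and keeps only the year of birth. The field names (name, address, health_ref, date_of_birth, treatment) and the sample record are hypothetical, invented purely for illustration; a real extract would need its own analysis of which fields identify a person.

```python
from datetime import date

# Hypothetical field names; a real extract would have its own schema.
DIRECT_IDENTIFIERS = {"name", "address", "health_ref"}


def mask_patient_record(record):
    """Return a copy without direct identifiers, with date of birth reduced to a year."""
    masked = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    dob = masked.pop("date_of_birth")
    masked["year_of_birth"] = dob.year
    return masked


# Invented sample record, not real data.
patient = {
    "name": "Jane Example",
    "address": "1 Sample Street",
    "health_ref": "REF-0001",
    "date_of_birth": date(1984, 6, 17),
    "treatment": "physiotherapy",
}

masked_patient = mask_patient_record(patient)
print(masked_patient)
```

Whether the result is truly anonymous still depends on the remaining fields and on what other data the recipient could combine them with.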

What kinds of techniques can be used?

Several techniques can be used to mask data and reduce, or completely remove, the PII within it. Some of them are:

  • Record suppression – this technique involves the removal of complete records that are not required for the purpose of the processing, for example people who are not within the age range of interest to the recipient. It may also be appropriate to remove “outlier” records which are unusual in their content and so may lead to re-identification.
  • Character masking – characters within a data value may be masked, for example showing an account number as 1234xxxx, or masking credit card numbers according to PCI-DSS (Payment Card Industry Data Security Standard) requirements. Issues such as whether the original length of the data attribute will be preserved must be considered.
  • Pseudonymization – this involves replacing a data attribute value with a different piece of data that does not identify the PII principal, for example replacing a name with a number. This is different to anonymization because the ability to re-identify a PII principal is preserved through the use of a reference file which maps the original data to the pseudonym.
  • Generalisation – data attributes may be generalised by replacing specific values with a range, for example an age of 26 with an age range of 20-30. This makes the data less precise, so it must be applied with a clear understanding of the intended purpose of the resulting dataset. Care must be taken in the choice of ranges so that they maintain the usefulness of the data.
  • Aggregation – where specific records are not required, it may be appropriate to aggregate them into a summary, for example using sums or averages. Where summarised ranges are used, care must be taken to ensure that each range contains sufficient records to disguise the source of the data, for example avoiding a range with a single entry which may allow recognition of the PII principal involved.
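To make the techniques above a little more concrete, here is a minimal Python sketch of each one. It is illustrative only: the function names, field names, thresholds and sample records are all assumptions made for this example, and the generalisation helper uses non-overlapping ten-year bands (so an age of 26 falls in 20-29) so that every age belongs to exactly one band.

```python
import secrets
from collections import Counter


def suppress_records(records, min_age, max_age):
    """Record suppression: keep only records inside the age range of interest."""
    return [r for r in records if min_age <= r["age"] <= max_age]


def mask_characters(value, visible=4, mask_char="x"):
    """Character masking: keep the first `visible` characters and preserve the length."""
    return value[:visible] + mask_char * max(len(value) - visible, 0)


def pseudonymize(records, field):
    """Pseudonymization: replace `field` with a random token.

    Also returns the reference mapping (original value to pseudonym). That
    mapping allows re-identification, so it must be stored securely.
    """
    mapping = {}
    masked = []
    for r in records:
        original = r[field]
        if original not in mapping:
            mapping[original] = secrets.token_hex(8)
        masked.append({**r, field: mapping[original]})
    return masked, mapping


def generalise_age(age, width=10):
    """Generalisation: replace an exact age with a non-overlapping band."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregate_counts(records, field, min_group_size=2):
    """Aggregation: count records per value, dropping groups that are too small."""
    counts = Counter(r[field] for r in records)
    return {k: n for k, n in counts.items() if n >= min_group_size}


# Invented sample data, not real people.
people = [
    {"name": "A. Person", "age": 26, "account": "12345678"},
    {"name": "B. Person", "age": 24, "account": "87654321"},
    {"name": "C. Person", "age": 71, "account": "11112222"},
]

in_scope = suppress_records(people, 18, 65)
pseudonymized, reference_file = pseudonymize(in_scope, "name")
masked_rows = [
    {
        "name": r["name"],
        "age_band": generalise_age(r["age"]),
        "account": mask_characters(r["account"]),
    }
    for r in pseudonymized
]
band_counts = aggregate_counts(masked_rows, "age_band")
print(masked_rows)
print(band_counts)
```

In practice the techniques are usually combined, as here, and the right combination depends on what the recipient actually needs from the data.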

The re-identification problem

The trick with data masking is to do it so that someone won’t be able to work out who the masked data refers to. Working out who it refers to is called “re-identification”. If re-identification is possible, then the data would still be counted as PII and the relevant legal obligations (and possible fines) remain. Re-identification may be achieved by comparing other data about a person with the anonymized data and using it to deduce the identity of one or more individuals.

Re-identification is a particular problem when anonymized data is made public, so the standard that the masking process must meet for such data is often much higher than for data whose publication is restricted.
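One rough way to gauge this risk is to count how many records share each combination of the attributes an outsider might already know (often called quasi-identifiers). A combination that occurs only once points to a single person. This is the idea behind the measure commonly known as k-anonymity. The sketch below is a simplified check only; the function name, field names, sample records and threshold are assumptions made for illustration.

```python
from collections import Counter


def risky_combinations(records, quasi_identifiers, k=2):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Invented sample data with hypothetical field names.
masked = [
    {"year_of_birth": 1984, "postcode_area": "AB1", "condition": "asthma"},
    {"year_of_birth": 1984, "postcode_area": "AB1", "condition": "asthma"},
    {"year_of_birth": 1951, "postcode_area": "CD2", "condition": "asthma"},
]

unique_rows = risky_combinations(masked, ["year_of_birth", "postcode_area"])
print(unique_rows)
```

Here the single record with year of birth 1951 and area CD2 stands out, so it would be a candidate for suppression or further generalisation. Passing a check like this does not prove the data is anonymous; that still depends on what other data is available to compare it with.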

Other controls are relevant too

One way to further reduce the risk of anonymized data being re-identified is to insist on additional controls around its use, such as restricting access to the files, either physically, electronically, or both. Thought needs to be put into the best way to achieve the intended result (that is, to enable the intended use of the data) whilst securing the data as well as possible. In some circumstances, allowing only queries to be run, or providing limited access via a portal, might be the way to go.

How to comply with the new control?

Data masking can be a big area, and many would say that it’s more of an art than a science. But it’s a useful tool in the right circumstances, and an organization will need to look at instances where it could be usefully employed and establish how it should be done. This is likely to involve a topic-specific policy document, supported by an overall process and more detailed procedures in particular cases. You’ll need to keep documentation of the techniques employed to mask specific data and, if you’ve used pseudonymization, the reference files will need to be adequately secured.

In summary

Data masking is a key part of a “data minimisation” strategy with regard to PII and can reduce an organization’s risk significantly in specific cases. But it requires a solid knowledge of the techniques used in order to manage the risk of re-identification, so some training may be in order for the people performing the role.

Written by Ken Holmes CISSP, CIPP/E. Ken is an ISO27001 Lead Auditor and has helped to implement, operate and audit ISO certifications over a varied 30-year career in the Information Technology industry. Creator of the award-winning ISO27001 toolkit, Ken is currently working on the new version of the toolkit to align to the upcoming ISO27001:2022 release. 

