Anonymisation of personal data in the UK, Korea, Japan, Hong Kong, Singapore and the USA
The article was first published in the October edition of Digital Health Legal.
Most data privacy laws globally allow for the use and disclosure of anonymised personal information without the need to obtain the data subject’s consent. E-/M-health services make extensive use of data analytics on sensitive patient and user data to provide valuable benefits to users, giving them more control over the management of their health. The use of anonymisation can be critical to enabling data to be shared and used freely on these applications without compromising users’ privacy. Pharmaceutical and device companies are at the same time granting greater access to patient-level anonymised clinical trial data to qualified researchers through platforms such as ClinicalStudyDataRequest.com and ClinicalTrials.gov.
This article looks at a recent decision of the UK Information Tribunal (the Tribunal) concerning a release of anonymised clinical trial data, Queen Mary University of London v Information Commissioner in August 2016 (the QMUL Decision), and surveys the contrasting approaches to anonymisation of personal information in Korea, Japan, Hong Kong, Singapore and the US.
UK
In the QMUL Decision, the UK Information Commissioner (ICO) had ordered disclosure under the Freedom of Information Act of patient-level data from QMUL’s PACE clinical trial on chronic fatigue syndrome. Each line of data comprised scoring data derived from the self-reporting of an individual trial participant but it did not directly identify the participant. QMUL sought to resist the order by arguing that the data remained personal data.
In reaching its decision, the ICO had considered the ‘motivated intruder’ standard laid down in its Code of Practice on Anonymisation issued in November 2012. This test looks at whether an intruder would be able to successfully re-identify an individual if they were motivated to do so. The motivated intruder is taken to be a person who:
- does not have any specialist skills (such as computer hacking skills), but is reasonably competent,
- has access to all publicly available information, and
- would employ investigative techniques to find out additional information, but would not resort to criminality.
The Tribunal confirmed that the ICO had applied itself correctly by using the ‘motivated intruder’ test and upheld the disclosure order. It held that data will be considered sufficiently anonymised when the risk of identification has been mitigated until it is remote. The Tribunal held that the assessment ‘must consider whether any individual is reasonably likely to have the means and the skill to identify any participant and also whether they are reasonably likely to use those skills for that purpose’.
On the facts, trial participants could have been re-identified by combining the anonymised dataset with other data held by the National Health Service (NHS). However, this could only occur if an NHS employee breached their professional, legal and ethical obligations. The likelihood of that happening was considered to be both remote and unquantifiable.
The ICO had also argued in its submissions to the Tribunal that generic references to social media and non-specific assertions that there is ‘so much information out there’ are not helpful in determining the risk of re-identification. Another point that arose in the decision is that the assessment is fully objective: it is not sufficient that an individual patient could self-identify from the anonymised dataset.
The ICO’s ‘motivated intruder’ test is generally aligned with the European Union’s Article 29 Data Protection Working Party Opinion 05/2014 on Anonymisation Techniques. The Opinion states that to achieve anonymisation, the data must be processed in such a way that it can no longer be used to identify a natural person by using ‘all the means reasonably likely to be used’ by either the data controller or a third party.
Further discussion of the requirements of the Opinion can be found in an earlier article in eHealth Law & Policy entitled ‘Anonymising health data is not the universal solution’. One notable feature of the EU Opinion is the requirement that the original dataset be permanently destroyed. This departs from English case law, under which the retention of the original information from which statistical data is derived does not prevent that statistical data from being further processed outside the scope of the Data Protection Act.
Korea
Korea introduced new Guidelines for De-identification of Personal Information (the Guidelines) on 1 July 2016. The Guidelines are a product of a collaborative effort between several government agencies, including the Ministry of Health and Welfare.
The Guidelines require a data controller to engage a partially independent evaluation committee to assess (using the ‘k-anonymity’ model) whether all personal identifiers have been successfully removed from a dataset. The committee must comprise at least three members with relevant expertise, and a majority of the committee’s members must be from outside the data controller’s organisation. For each sector or industry, a government-designated institution will identify a pool of eligible experts (for the health services industry, this is the Social Security Information Service). Each pool is divided into two sub-pools of legal experts and technical experts. The committee must include at least one expert from each sub-pool.
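The k-anonymity model referenced above can be illustrated with a short sketch. A dataset satisfies k-anonymity when every combination of quasi-identifier values (attributes that could be linked to external data, such as age band and postcode) is shared by at least k records. The field names and sample records below are illustrative assumptions, not anything prescribed by the Guidelines:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Check whether every combination of quasi-identifier values
    appears at least k times in the dataset."""
    combos = Counter(
        tuple(record[qi] for qi in quasi_identifiers) for record in records
    )
    return all(count >= k for count in combos.values())

# Hypothetical records: age band and postcode district as quasi-identifiers
records = [
    {"age_band": "30-39", "postcode": "E1", "score": 12},
    {"age_band": "30-39", "postcode": "E1", "score": 17},
    {"age_band": "40-49", "postcode": "E1", "score": 9},
]

# The (40-49, E1) group has only one member, so 2-anonymity fails
print(is_k_anonymous(records, ["age_band", "postcode"], 2))  # False
```

In practice an evaluator would generalise or suppress values (e.g. widening age bands) until the check passes at the chosen k.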
Data controllers are required to regularly assess the potential for re-identification and to take measures to prevent it. For example, they should:
- appoint dedicated personnel to manage de-identified information records,
- establish emergency response plans, and
- manage data access rights and implement security software.
If the dataset is re-identified, it must be destroyed.
A new Comprehensive Guide to Data Protection and Privacy Laws and Regulations (published as an appendix to the Guidelines) specifies that an assessment of the re-identification risk should only take into account the chances of combination with information that is legally obtainable by the data controller itself. Information that would involve unreasonable costs and effort to obtain should also be disregarded.
Japan
A pending amendment to the Personal Information Protection Law (the PIPL Bill) will introduce a new category of ‘anonymised information’ that can be freely disclosed under certain circumstances. The PIPL Bill was approved in September 2015 and is expected to become law by September 2017.
The PIPL Bill allows for personal information to be anonymised in one of two ways:
- deleting personal information or permanently replacing it with other descriptions, or
- deleting all personal identification codes or permanently replacing them with other descriptions.
The meaning of ‘personal identification codes’ will be laid down in a future ordinance, but early indications are that it will include official identification numbers (such as those on driving licences, passports and insurance cards), biometric data (including genetic data) and fingerprint and face recognition data, among other things.
The data controller is required to anonymise the data in such a way that the original information cannot be retrieved or restored. What this means in practice is also due to be clarified by a newly established ‘Personal Information Protection Committee’. The Committee was established in January 2016, but there is no indication yet of when it will publish its rules or whether these rules will include specific guidance on the anonymisation of health data. The notes of the Committee’s meeting in June 2016 indicate that it intends to limit its rules to stipulating only minimum standards for commonly used methods of anonymisation and data security measures.
Anonymised information can be disclosed to third parties without the consent of the data subject, provided that the data controller first makes a public statement detailing what data has been anonymised and by which method(s). It must also inform transferees that the data is anonymised personal information when providing it.
Both the data controller and any third-party transferee will be required not to attempt to re-identify the anonymised information and to put in place appropriate security and other measures. However, it is not clear whether the data controller will be required to pass these requirements on to the third party under contract in order to satisfy its own statutory obligations.
Singapore and Hong Kong
The Privacy Commissioners in both Singapore and Hong Kong have published general guidance on anonymisation, but neither country has specific guidance on the anonymisation of health data.
In Singapore, the Personal Data Protection Commission accepts that a ‘trivial’ possibility of re-identification is acceptable. The question of whether a dataset is sufficiently anonymised is to be considered from the perspective of the entity holding the data. In that light, the Singapore Commissioner recognises the UK ICO’s ‘motivated intruder’ test as a viable method for assessing re-identification risk, but requires that account also be taken of the particular characteristics of the data holder, such as known motivations or particular skills that would increase its ability to re-identify a specific individual.
The Hong Kong guidance approaches anonymisation from the perspective of whether an individual could reasonably and practicably be re-identified from the anonymised dataset. The Commissioner has however questioned in separate guidance whether it is possible to effectively anonymise biometric data: ‘DNA samples or sequences, even when they are not associated with any names, may still reveal such information as race, physical or mental disability, family relationship with one another etc. that may allow individuals to be re-identified under certain circumstances’. The guidance instead advocates encrypting biometric data while it is being stored or transmitted.
USA
The Department of Health and Human Services’ Standard for Privacy of Individually Identifiable Health Information (known as the Privacy Rule) is one of the primary standards for the de-identification of health data in the US. The Privacy Rule was issued under the Health Insurance Portability and Accountability Act of 1996 (HIPAA) to support ‘the secondary use of data for comparative effectiveness studies, policy assessment, life sciences research, and other endeavors’.
The Privacy Rule sets out two alternative methods to de-identify protected health information. The Office for Civil Rights has published guidance on how entities can apply the two methods.
1. The expert determination method
The expert determination method requires covered entities and their business associates to engage a suitably qualified and experienced expert to remove all personally identifying features from protected health information. The expert must determine that the risk of an anticipated recipient using the information to identify an individual, either alone or in combination with other reasonably available information, is ‘very small’. The expert is also required to document the methods and analysis they use to reach this determination.
There are no prescribed standards for the expert’s qualifications beyond having ‘appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods’. Neither have specific standards been laid down for what constitutes a ‘very small’ risk of re-identification due to the contextual nature of this assessment. It is accordingly left to the expert to define an acceptable risk level.
Since the question of whether a dataset is de-identified is considered from the perspective of a particular anticipated recipient of the data, it is possible for covered entities to create a limited de-identified dataset for the purposes of sharing with third parties and to retain the original dataset. The covered entity would be required to keep the means of re-identification secure and separate.
2. The safe harbour method
Under the safe harbour method, de-identification is achieved by removing 18 specific types of data related to an individual and that individual’s relatives, employer and household members. The covered entity must also be satisfied that it has no ‘actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information’. These 18 types of data include names, biometric identifiers and full face photographs, along with other obvious personal identifiers. Possible methods of removing the data include deleting or replacing it with generic names, symbols, random values or pseudonyms.
A covered entity is permitted to retain a key to enable re-identification, provided that it is kept secure and separate. Re-identified information is again subject to the Privacy Rule.
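A minimal sketch of the safe harbour approach might look as follows. The field names, and the choice of which fields to pseudonymise rather than delete, are illustrative assumptions; the full safe harbour list of 18 identifier types also covers dates, geographic subdivisions, device identifiers and more:

```python
import uuid

# Illustrative subset of the 18 HIPAA safe-harbour identifier types
IDENTIFIER_FIELDS = {"name", "email", "phone", "ssn", "medical_record_number"}

def safe_harbour_redact(record, pseudonymise=("name",)):
    """Return a copy of the record with identifier fields removed or,
    where listed in `pseudonymise`, replaced with a random pseudonym."""
    out = {}
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            if field in pseudonymise:
                # Random pseudonym; a covered entity wishing to re-identify
                # later would instead keep a secure, separate key table
                out[field] = f"subject-{uuid.uuid4().hex[:8]}"
            # otherwise drop the identifier field entirely
        else:
            out[field] = value
    return out

record = {"name": "Jane Doe", "ssn": "123-45-6789",
          "diagnosis": "CFS", "score": 12}
print(safe_harbour_redact(record))
```

Note that mechanically stripping these fields does not by itself complete the method: the covered entity must still have no actual knowledge that the remaining information could identify the individual.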
A separate federal regulation, known as the Common Rule, applies to regulated clinical research involving human subjects. In contrast to the Privacy Rule, clinical trial data must be made fully anonymous and incapable of being linked back to the trial subject. One consequence of this is that any key used to anonymise the data must be destroyed.
The UK approach to the anonymisation of patient data, confirmed in the recent QMUL Decision, is relatively pragmatic when viewed alongside other jurisdictions. The survey above reveals substantial disparity in the approaches taken elsewhere.
Differences include whether de-identification must be performed under the supervision of an expert, whether the original raw data can be retained, and whether statistical standards are to be used in assessing whether data has been sufficiently anonymised. Under some approaches, anonymisation is considered a feature of the dataset itself, whereas under others the determination is context-specific (e.g. it depends on who holds the data). Of the jurisdictions covered in this article, only the USA has issued a specific regulation for the health industry. But with the growth of e-/m-health, it is likely that other jurisdictions will develop their own specific guidance for the anonymisation of health data in due course.
The authors wish to thank Jennifer Keh and Michael Kim of Kim & Chang, Korea for their contribution to this article.