With the rapid pace of digitalization, data protection has come to the fore, and health information is rightly subject to especially stringent requirements. Recently developed protocols already meet these requirements. But what about long-in-the-tooth systems, dinosaurs from the days when no one imagined that patient data would interest hackers? Developers working with such systems cannot always use the latest protocol versions; however, this does not mean that data protection is impossible.
Currently, the Auriga team is developing an SDK for the HL7 v2.x healthcare protocol. During our preliminary research and SDK architecture design, we faced and overcame various security issues. In this article, using HL7 v2.x as an example, we discuss data transmission risks and ways to address security vulnerabilities.
HL7 Versions
Health Level 7 (HL7) is one of the main medical device interoperability protocols. It is widely used at various levels, from communication between laboratory equipment and a database to interaction between different medical institutions.
There are several HL7 versions.
- HL7 v1.x – an old version, little used at present;
- HL7 v2.x – a plain text protocol, easy to implement and use;
- HL7 v3.x – a much more complex protocol with flexible and rich semantics, encryption, and electronic signature support;
- HL7 FHIR (Fast Healthcare Interoperability Resources) – a simple protocol presently being developed, based on open standards.
Because HL7 FHIR is still under development and HL7 v3.x adds considerable complexity, HL7 v2.x remains the most widely adopted version. However, this version has serious drawbacks, including security issues at almost every level.
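To make the "plain text" point concrete, here is a minimal sketch of how little effort it takes to read patient data out of an HL7 v2.x message. The sample message and the parse_segments helper are hypothetical and written only for illustration; a real parser would also have to handle escape sequences, repetitions, and components.

```python
# Hypothetical sample message; all names and values are made up for illustration.
SAMPLE_MESSAGE = "\r".join([
    "MSH|^~\\&|LAB|HOSPITAL|EHR|CLINIC|20240101120000||ORU^R01|MSG00001|P|2.5",
    "PID|1||123456^^^HOSPITAL^MR||DOE^JOHN||19800115|M",
    "OBX|1|NM|GLU^Glucose||5.4|mmol/L|3.9-6.1|N|||F",
])


def parse_segments(message: str) -> dict:
    """Split an HL7 v2.x message into segments keyed by segment ID.

    Simplification: only the first segment of each type is kept, and
    escape sequences, repetitions, and components are ignored.
    """
    segments = {}
    for line in message.split("\r"):      # segments are separated by <CR>
        if line:
            fields = line.split("|")      # fields are separated by "|"
            segments.setdefault(fields[0], fields)
    return segments


pid = parse_segments(SAMPLE_MESSAGE)["PID"]
patient_name = pid[5]   # PID-5, patient name
birth_date = pid[7]     # PID-7, date of birth
print(patient_name, birth_date)
```

Anyone who can capture the message on the wire or find it in a log file can do the same with a text editor, which is why the protection has to come from somewhere other than the protocol itself.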
Data Protection
Imagine that you have developed a communication interface between health systems. The next step is to conduct a series of tests to verify that the technical and business requirements have been met. Testing is impossible without test data, which test engineers usually receive from the production system. At this point, data protection becomes critical because developers' laptops, hard drives, and flash drives can be lost or stolen, including by hackers, either from within or outside the company.
De-identification/anonymization helps to protect patient data. There are different approaches to de-identification, such as:
- deleting directly identifying data (name, social security number (SSN))
- replacing identifying data with artificial identifiers, or pseudonyms (pseudonymization)
- suppressing or generalizing quasi-identifiers (date of birth, zip code)
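The three approaches above can be sketched on a single PID segment. This is a minimal illustration, not a complete de-identification tool: the field positions follow the usual PID layout (PID-3 patient ID, PID-5 name, PID-7 date of birth, PID-11 address, PID-19 SSN), while the key, the "PSN-" prefix, and the function names are assumptions made for the example.

```python
import hashlib
import hmac

# Assumption: the secret key is stored separately from the test data set.
PSEUDONYM_KEY = b"replace-with-a-secret-key"


def pseudonymize(value: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Replace an identifier with a keyed, repeatable pseudonym."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "PSN-" + digest[:16]


def deidentify_pid(segment: str) -> str:
    """Apply deletion, pseudonymization, and generalization to a PID segment."""
    fields = segment.split("|")
    fields += [""] * (20 - len(fields))   # pad so that index 19 (PID-19) exists
    fields[3] = pseudonymize(fields[3])   # PID-3 patient ID: pseudonym
    fields[5] = pseudonymize(fields[5])   # PID-5 name: pseudonym
    fields[7] = fields[7][:4]             # PID-7 birth date: keep the year only
    fields[11] = ""                       # PID-11 address incl. zip: suppressed
    fields[19] = ""                       # PID-19 SSN: deleted
    return "|".join(fields).rstrip("|")   # drop trailing empty fields


# Hypothetical input; every value here is made up.
raw = ("PID|1||123456^^^HOSPITAL^MR||DOE^JOHN||19800115|M|||"
       "1 MAIN ST^^SPRINGFIELD^IL^62701||||||||123-45-6789")
clean = deidentify_pid(raw)
print(clean)
```

Because the pseudonym is derived with a keyed HMAC, the same patient maps to the same pseudonym across messages, so test scenarios that depend on record linkage keep working.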
At the same time, the anonymization process is associated with a number of problems. It may be impossible to delete some important data (for example, a patient ID, age, and date of examination) or extremely difficult to remove identifying data from pictures (screenshots), and it may be necessary to maintain data integrity. In these situations, it is advisable to apply encryption – i.e. reversible encoding of data into an unreadable format.
In the absence of robust encryption, so-called re-identification attacks can restore the link between the data and an individual using the information that remains. The main threats are:
- Identity disclosure occurs when attackers can tie a specific data element to a specific individual because of insufficient de-identification, using re-identification by linking or pseudonym reversal.
- Attribute disclosure occurs when an attacker infers additional information about an individual without necessarily linking it to a specific item in a dataset.
- Inferential disclosure may occur when the statistical properties of published data allow re-identification with a high degree of reliability.
- Attacks on cryptography occur when attackers break, or exploit weaknesses in, the cryptographic algorithms protecting the data in order to re-identify it.
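Pseudonym reversal is easy to underestimate, so here is a small sketch of it. It assumes a weak, hypothetical scheme in which the pseudonym is a plain, unkeyed SHA-256 hash of the SSN; because the set of possible SSNs is small, an attacker can simply hash every candidate and compare. The search space is shrunk to 10,000 values to keep the demo fast, and all function names are illustrative.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def naive_pseudonym(ssn: str) -> str:
    """Weak scheme (assumed for the demo): unkeyed SHA-256 of the identifier."""
    return hashlib.sha256(ssn.encode("ascii")).hexdigest()


def keyed_pseudonym(ssn: str, key: bytes) -> str:
    """Stronger variant: HMAC-SHA-256 with a key the attacker does not have."""
    return hmac.new(key, ssn.encode("ascii"), hashlib.sha256).hexdigest()


def reverse_by_enumeration(target: str, candidates: Iterable[str]) -> Optional[str]:
    """Hash every candidate and return the one that matches the pseudonym."""
    for candidate in candidates:
        if naive_pseudonym(candidate) == target:
            return candidate
    return None


# Toy search space: the first five digits are assumed known to the attacker.
candidates = [f"123-45-{serial:04d}" for serial in range(10_000)]

leaked = naive_pseudonym("123-45-6789")
recovered = reverse_by_enumeration(leaked, candidates)
print(recovered)
```

Running the same enumeration against keyed_pseudonym output finds nothing, because the attacker cannot reproduce the hash without the key. The protection then depends on keeping that key away from the data set.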
Data Transmission Channels Protection
The simplest and cheapest way to secure a connection over open networks is a VPN (virtual private network). It is sufficient to send LLP (Lower Layer Protocol, used to transfer HL7 messages) data through the VPN. Today, many cloud platforms, such as Amazon, offer VPN connections as part of their services. Data transmission channels can also be secured with HTTPS, SFTP, FTPS, or S/MIME. Moreover, there are HL7 drafts specifically for working with these protocols. However, these standards appeared long after HL7 became widely adopted, and in many cases changing established practice would require too much investment.
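To show why the lower layer protocol needs a protected channel underneath it, here is a minimal sketch of its framing as commonly implemented (often called MLLP): the message is wrapped in a start byte and two end bytes, and nothing else. The function names and the UTF-8 encoding are assumptions for the example; real systems negotiate the character set.

```python
START_BLOCK = b"\x0b"      # <VT>, marks the start of a frame
END_BLOCK = b"\x1c\x0d"    # <FS><CR>, marks the end of a frame


def wrap_llp(message: str) -> bytes:
    """Frame an HL7 v2.x message for transfer over a TCP stream."""
    return START_BLOCK + message.encode("utf-8") + END_BLOCK


def unwrap_llp(frame: bytes) -> str:
    """Strip the framing bytes and return the HL7 message text."""
    if not (frame.startswith(START_BLOCK) and frame.endswith(END_BLOCK)):
        raise ValueError("not a complete LLP frame")
    return frame[len(START_BLOCK):-len(END_BLOCK)].decode("utf-8")


# Hypothetical message fragment; the values are made up.
frame = wrap_llp("PID|1||123456||DOE^JOHN||19800115|M")
print(frame)
```

The frame contains the patient data in readable form, so confidentiality comes entirely from whatever carries the frame: a VPN tunnel or another secured channel.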
Protection for servers inside a hospital network is also critical. Installation of a VPN connection within the local network is impractical and often costly. Different levels of the OSI model provide different protocols to secure data in transmission.
The basic security protocols at the application level are HTTPS, FTPS (FTP over TLS), SFTP (SSH File Transfer Protocol), and S/MIME. The main problem with application-level protection is that it is difficult to implement: application-specific passwords, audit logs, and rules must be stored, and application-specific access rights must be defined. In addition, an excess of application-level security features can be costly to implement and annoying for users.
Data transmission channels can also be protected at the transport level, with protocols such as SSH (common on Unix systems), SSL, and TLS. However, transport-level protection requires work on both ends of the channel, including a client agent: HL7 is a two-way protocol, and you do not always control both sides.
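For the side of the channel you do control, transport-level protection can be sketched with Python's standard ssl module. This is a minimal client-side illustration under stated assumptions: the helper names, the TLS 1.2 minimum, the timeout, and the single recv call are choices made for the example, and the remote system must already accept TLS connections.

```python
import socket
import ssl
from typing import Optional


def make_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context that verifies the server certificate and host name."""
    context = ssl.create_default_context(cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse legacy SSL/TLS
    return context


def send_over_tls(host: str, port: int, payload: bytes,
                  context: ssl.SSLContext) -> bytes:
    """Send one framed message over TLS and return the first reply chunk.

    Simplification: a real client would keep reading until it sees the
    end-of-frame bytes instead of relying on a single recv call.
    """
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(payload)
            return tls_sock.recv(4096)


context = make_tls_context()
```

If the other side cannot be changed, a common workaround is to terminate TLS in a separate proxy placed next to the legacy system, which keeps the unencrypted hop as short as possible.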
Protect Data Properly
Based on Auriga’s extensive experience of working with the HL7 healthcare protocol and analysis of the major mistakes in building security, we have developed the following guidelines for data protection:
- Do not use social security numbers (SSNs) or other personal data as IDs
- Do not encrypt important information selectively
- Protect the whole network, not a separate application
- Assign access rights for all users
- Maintain a proper password policy (sufficiently complex passwords and adequate frequency of password change)
- Avoid excessive protection – it is slower (cryptographic functions take time; logging each step takes a lot of resources), annoying for users (frequent password requests, delays), and unproductive (absolute protection is never possible)
These recommendations are general in nature and can be adapted to a variety of projects not necessarily related to software development for medical devices.