Improperly configured DICOM servers
Feedback is welcome; this is an early draft to capture my thoughts.
There are improperly configured DICOM systems that expose patient data on the Internet. This is a problem. There should be none.
How big is this problem and how can it be remedied?
Internet-wide scans routinely find DICOM servers, typically reporting exposed systems in the thousands. Is that a serious problem? In one sense, yes. Having even one system exposing private data is a problem. But how does the DICOM problem compare with the thousands of medical data breaches reported every year? Is it 10% of the problem? 1%? 0.01%?
For the actual DICOM exposures, do we understand why these DICOM systems are on the Internet exposing data? Is it accidental? Is it misunderstanding of instructions? Is it intended or unintended? I’ve seen no useful statistical data. For other similar breaches, e.g., unprotected RDP access to industrial systems, there is some data: RDP exposures result from a combination of all of these reasons. There are no equivalent analyses of DICOM exposures.
The nature of a DICOM exposure is also very difficult to evaluate from basic scans of the Internet. In some analyses of automated scan results I’ve found:
Inappropriately configured equipment that is revealing private patient information. This should never happen. It is not lawful, and these people are risking substantial fines and other penalties. Despite that, it continues to happen. This is frustrating.
Appropriately configured equipment that is being used for marketing, sales, and training purposes. These are using the DICOM protocols, and the data that is being exposed is synthetic or scrubbed. Sometimes the images are of calibration targets or non-human subjects. No private information is being exposed.
Appropriately configured equipment that is being used to offer research and teaching files for academic purposes. These are using the DICOM protocols and the data that is being exposed has been scrubbed. No private information is being exposed. You can debate whether they should expose this data without protection, but it is not a private information issue.
Experimental and student software attached to the Internet. This software is associated with educational activities, not with the practice of medicine. Sometimes the data that is exposed is suitably scrubbed. Sometimes the data exposed is private and should not be exposed. Sometimes the associated educational activity is finished and the student or researcher has forgotten to clean up and remove their work.
Appropriately configured equipment being used in the study or care for animals and plants. Veterinarians are not subject to the same data privacy laws and they often find it convenient to share imaging of animals without worrying about access controls. Unless you examine the images you cannot easily tell whether the data is human or veterinary.
Appropriately configured equipment that reveals that the server offers DICOM, but refuses to allow further access without proper authorization. Some people allow limited DICOM access to permit connectivity checking during installation or setup, and do not require authorization verification until an attempt is made to access images.
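For readers unfamiliar with that last category, the connectivity check in question is typically a DICOM C-ECHO (the Verification service). The following is a minimal sketch of such a probe, assuming the pynetdicom library; the address, port, and AE title are placeholders, not values from any real system.

```python
# Minimal sketch of a DICOM connectivity check (C-ECHO), the kind of
# unauthenticated "ping" some sites leave enabled during installation or setup.
# Assumes pynetdicom; address, port, and AE title below are placeholders.
from pynetdicom import AE

ae = AE(ae_title="PROBE")
# '1.2.840.10008.1.1' is the Verification SOP Class (C-ECHO)
ae.add_requested_context("1.2.840.10008.1.1")

assoc = ae.associate("192.0.2.10", 104)  # 104 and 11112 are common DICOM ports
if assoc.is_established:
    status = assoc.send_c_echo()
    if status:
        print(f"C-ECHO succeeded, status 0x{status.Status:04x}")
    assoc.release()
else:
    print("Association rejected or aborted")
```

A server in that last category would accept this association and answer the echo, but refuse any query or retrieve request that follows without proper authorization.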
This makes it very hard to interpret the general results from researchers who scan the Internet and find exposed DICOM servers. The last time I did a spot check of one of those scans I found perhaps four of those six categories in the first 25 results. I could not easily determine what most of the servers were doing, but I found marketing, training/educational, veterinary, and probable private data exposure.
It’s also hard to follow up. I found two systems that appeared likely to be exposing patient data. They were active systems whose IP-address geo-location and DNS registration agreed, and the registered owners were operational radiology clinics. I attempted to notify them but received no reply to my contact attempts. There are ethical and legal limits to how much you can do to determine the exposure from the outside; unauthorized penetration attempts are not a good practice.
The one that was an advertised marketing evaluation system was easy to identify, and the training and testing systems were not hard to track down. I don’t know whether the veterinary school systems were exposing animal patients or educational images, but it was not private human information. Similarly, the veterinary practice was probably exposing only animal patient data.
This all says that yes, there is a problem, but the scans provide little information to help determine its size and little guidance on how to improve things.
The motivation to get DICOM server configuration right is strong. System owners have both ethical and legal reasons, and in the US, Europe, and many other countries there are regulatory and criminal sanctions for improper configuration. That’s a powerful combination of motivations.
But when a spot check of the first 25 exposed servers finds two likely privacy exposures, a problem clearly remains despite those motivations. It’s very hard to assess the magnitude of this problem from the available data.
This all makes it very hard to determine what mitigations and responses are appropriate.
(Update 2023-08-27)
The original recommendation for securing DICOM, made about 20 years ago, was the use of mTLS. It’s unfortunate that there is no way for Internet surveillance research to report on how many DICOM servers are properly secured this way. Surveillance could determine that an attempted TLS connection had failed and that the server was asking for client certificate authentication, but that could be a DICOM server, a bank, a content management system, etc. mTLS does not reveal anything about the server until the TLS negotiation succeeds.
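To make that concrete, here is a minimal sketch of what an outside scanner sees when probing such a server, assuming Python's standard ssl module; the host is a placeholder, and 2762 is the port registered for DICOM over TLS. The exact alert, and the point at which the rejection surfaces, vary by TLS version and server implementation.

```python
# Minimal sketch of what an outside scanner can learn from a TLS-protected
# server. The host is a placeholder; 2762 is the registered DICOM-TLS port.
# Exact alerts vary by TLS version and stack, and a TLS 1.3 rejection may
# only surface on the first read after the handshake.
import socket
import ssl

def probe_tls(host: str, port: int = 2762, timeout: float = 5.0) -> None:
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE   # probing, not trusting the server
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                print(f"handshake completed ({tls.version()}); "
                      "no client certificate was required")
    except ssl.SSLError as exc:
        # A rejection such as "certificate required" shows the server wants
        # client-certificate authentication, but reveals nothing about whether
        # it is a DICOM server, a bank, or anything else.
        print(f"handshake rejected: {exc}")
    except OSError as exc:
        print(f"connection failed: {exc}")

probe_tls("example.org")
```

Either way, the scanner learns only that a TLS endpoint exists and whether it demands a client certificate; nothing identifies the service behind it as DICOM.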
So we don’t have any decent estimate of the number of DICOM servers that are on the Internet and protected using the original recommendation.