Convert DICOM to JPG and extract all patient information using Python

Vivek Kumar
Sep 4, 2018 · 4 min read

Convert all DICOM (.dcm) images in a folder to JPG/PNG and extract all patient information into a ‘.csv’ file in one go using Python.

For image processing and image classification, the most widely supported file formats are JPG and PNG, so working directly with ‘.dcm’ images can be difficult. The following code converts all the ‘.dcm’ images in a folder to JPG or PNG once you specify the folder path, and also extracts the information stored in each DICOM file.

So what exactly is DICOM?

Digital Imaging and Communications in Medicine (DICOM) is the standard for the communication and management of medical imaging information and related data. DICOM is used worldwide to store, exchange, and transmit medical images. (Source: Wikipedia)

Sample .dcm image

DICOM images are highly informative. Each file stores many key data attributes, such as the patient’s name, age, and sex, the doctor’s name, etc. For a non-medical professional (like me), it’s hard to extract DICOM information from an image without first knowing what information it provides. The following code extracts 35 key data attributes from a DICOM image. It runs in a loop over the folder, so you get a CSV file containing these 35 attributes for every DICOM image.

I’ll be using ‘pydicom’ for this. It’s a Python package for inspecting and modifying DICOM files, and any modifications can be written to a new file.

Install the following Python packages with pip from the command prompt if they are not yet installed.

pip install pydicom
pip install opencv-python
pip install pillow # optional
pip install pandas
pip install matplotlib

The supporting files and code are available on GitHub: https://github.com/vivek8981/DICOM-to-JPG

The Python code is broken down into three parts:

Convert to JPG/PNG and extract all information in one go.

import pydicom as dicom
import os
import cv2
import pandas as pd
import csv

# make it True if you want in PNG format
PNG = False
# Specify the .dcm folder path
folder_path = "stage_1_test_images"
# Specify the .jpg/.png folder path
jpg_folder_path = "JPG_test"
# cv2.imwrite does not create missing folders, so make sure it exists
os.makedirs(jpg_folder_path, exist_ok=True)
images_path = os.listdir(folder_path)
# list of attributes available in dicom image
# download this file from the given link # https://github.com/vivek8981/DICOM-to-JPG
dicom_image_description = pd.read_csv("dicom_image_description.csv")

with open('Patient_Detail.csv', 'w', newline='') as csvfile:
    fieldnames = list(dicom_image_description["Description"])
    writer = csv.writer(csvfile, delimiter=',')
    writer.writerow(fieldnames)
    for n, image in enumerate(images_path):
        ds = dicom.dcmread(os.path.join(folder_path, image))
        rows = []
        pixel_array_numpy = ds.pixel_array
        if not PNG:
            image = image.replace('.dcm', '.jpg')
        else:
            image = image.replace('.dcm', '.png')
        cv2.imwrite(os.path.join(jpg_folder_path, image), pixel_array_numpy)
        if n % 50 == 0:
            print('{} image converted'.format(n))
        for field in fieldnames:
            # ds.get returns the attribute's value, or '' when the file lacks it
            value = ds.get(field, '')
            rows.append('' if value is None else str(value))
        writer.writerow(rows)
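The conversion above passes ds.pixel_array straight to cv2.imwrite. That suits 8-bit images, but many DICOM files store 12-bit or 16-bit pixels, which JPG generally cannot hold. Below is a minimal sketch of one possible workaround, assuming NumPy is available: a simple min-max rescale to 0-255. The helper name `to_uint8` and the sample array are hypothetical, and this simple scaling ignores DICOM-specific settings such as window center/width.

```python
import numpy as np

def to_uint8(pixels: np.ndarray) -> np.ndarray:
    """Linearly rescale an array to the 0-255 range and cast to uint8."""
    pixels = pixels.astype(np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:
        # Flat image: avoid dividing by zero and return all zeros.
        return np.zeros(pixels.shape, dtype=np.uint8)
    scaled = (pixels - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Hypothetical 12-bit sample standing in for ds.pixel_array.
sample = np.array([[0, 1024], [2048, 4095]], dtype=np.uint16)
converted = to_uint8(sample)
print(converted.dtype, converted.min(), converted.max())
```

In the loop, the result of `to_uint8(ds.pixel_array)` could be handed to cv2.imwrite in place of the raw array.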

Only convert to JPG/PNG.

import pydicom as dicom
import os
import cv2

# make it True if you want in PNG format
PNG = False
# Specify the .dcm folder path
folder_path = "stage_1_test_images"
# Specify the output jpg/png folder path
jpg_folder_path = "JPG_test"
# cv2.imwrite does not create missing folders, so make sure it exists
os.makedirs(jpg_folder_path, exist_ok=True)
images_path = os.listdir(folder_path)
for n, image in enumerate(images_path):
    ds = dicom.dcmread(os.path.join(folder_path, image))
    pixel_array_numpy = ds.pixel_array
    if not PNG:
        image = image.replace('.dcm', '.jpg')
    else:
        image = image.replace('.dcm', '.png')
    cv2.imwrite(os.path.join(jpg_folder_path, image), pixel_array_numpy)
    if n % 50 == 0:
        print('{} image converted'.format(n))
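Two details of the loop above are worth a closer look: os.listdir returns every entry in the folder, not only DICOM files, and str.replace swaps every occurrence of '.dcm' in a name, not just the extension. The sketch below shows one way to handle both using only the standard library. The helper names `list_dicom_files` and `output_name` and the sample file names are hypothetical.

```python
import os

def list_dicom_files(names):
    """Keep only names ending in .dcm (case-insensitive), in sorted order."""
    return sorted(n for n in names if n.lower().endswith(".dcm"))

def output_name(dicom_name, png=False):
    """Swap only the final extension for .png or .jpg."""
    stem, _ = os.path.splitext(dicom_name)
    return stem + (".png" if png else ".jpg")

# Hypothetical folder listing, as os.listdir might return it.
names = ["0001.dcm", "notes.txt", "scan.DCM", "a.dcm.dcm"]
dicom_names = list_dicom_files(names)
print(dicom_names)
print([output_name(n) for n in dicom_names])
```

With these helpers, the loop would iterate over `list_dicom_files(os.listdir(folder_path))` and call `output_name(image, png=PNG)` for each file.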

Only extract the patient’s information to a CSV file.

import pydicom as dicom
import os
import pandas as pd
import csv

# list of attributes available in dicom image
# download this file from the given github link
dicom_image_description = pd.read_csv("dicom_image_description.csv")
# Specify the .dcm folder path
folder_path = "stage_1_test_images"
images_path = os.listdir(folder_path)
# Patient's information will be stored in working directory
# 'Patient_Detail.csv'
with open('Patient_Detail.csv', 'w', newline='') as csvfile:
    fieldnames = list(dicom_image_description["Description"])
    writer = csv.writer(csvfile, delimiter=',')
    writer.writerow(fieldnames)
    for n, image in enumerate(images_path):
        ds = dicom.dcmread(os.path.join(folder_path, image))
        rows = []
        for field in fieldnames:
            # ds.get returns the attribute's value, or '' when the file lacks it
            value = ds.get(field, '')
            rows.append('' if value is None else str(value))
        writer.writerow(rows)
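To make the resulting CSV layout concrete, here is a small standard-library sketch of the same idea: one header row, one row per image, and an empty cell wherever an attribute is missing. It uses csv.DictWriter with plain dictionaries standing in for DICOM datasets. The field names and patient records are invented for illustration and are not taken from dicom_image_description.csv.

```python
import csv
import io

# Hypothetical subset of attribute names used as CSV columns.
fieldnames = ["PatientID", "PatientSex", "PatientAge"]

# Plain dicts standing in for the attributes read from two DICOM files.
records = [
    {"PatientID": "0001", "PatientSex": "F", "PatientAge": "034Y"},
    {"PatientID": "0002", "PatientSex": "M"},  # PatientAge is missing
]

# Write to an in-memory buffer so the example leaves no file behind.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(records)  # restval fills missing columns with ''

csv_text = buffer.getvalue()
print(csv_text)
```

The `restval=""` argument plays the same role as appending an empty string for a missing attribute in the loop above.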

If you just want to preview a DICOM image in Python, run the following code.

import pydicom as dicom
import matplotlib.pyplot as plt

# specify your image path
image_path = 'xray.dcm'
ds = dicom.dcmread(image_path)
plt.imshow(ds.pixel_array)
plt.show()

