A Python script to sort DICOM files

This script will help you understand and organize your dataset of medical images. In this tutorial, I’ll share some Python code that reads a set of DICOM files, extracts the header information, and copies the files to a tiered folder structure that can be easily loaded for data science tasks.

This article is a follow-up to my previous introduction to DICOM files. Special thanks to my good friend Dr. Gian Marco Conte for helping write this.

As a brief recap, DICOM files are the primary format for storing medical images, and any clinical algorithm must be able to read and write them. These files can be challenging to organize, though. Each file stores information about the image in a header, which can be extensive. The files are organized in four tiers:

  1. Patient
  2. Study
  3. Series
  4. Instance
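To make the four tiers concrete, here is a minimal sketch of how a header could be mapped to a Patient/Study/Series/Instance folder path. It is not the utility described in this article. The helper names (`clean`, `tiered_path`, `sort_file`) are hypothetical, and using `PatientID` and the three UIDs as folder names is an assumption made for illustration. Only `pydicom.dcmread` and the standard DICOM keywords are taken from PyDicom itself.

```python
import re
import shutil
from pathlib import Path


def clean(value, default="NA"):
    """Turn a header value into a safe folder name (hypothetical helper)."""
    if value is None:
        return default
    text = re.sub(r"[^A-Za-z0-9._-]+", "_", str(value)).strip("_")
    return text or default


def tiered_path(root, ds):
    """Build root/Patient/Study/Series/Instance.dcm from header attributes.

    `ds` is any object exposing DICOM keywords as attributes, such as a
    pydicom Dataset. Missing attributes fall back to "NA".
    """
    patient = clean(getattr(ds, "PatientID", None))
    study = clean(getattr(ds, "StudyInstanceUID", None))
    series = clean(getattr(ds, "SeriesInstanceUID", None))
    instance = clean(getattr(ds, "SOPInstanceUID", None))
    return Path(root) / patient / study / series / f"{instance}.dcm"


def sort_file(src, root):
    """Read one file's header and copy the file into the tiered structure."""
    import pydicom  # imported lazily so the path logic has no third-party dependency

    # stop_before_pixels=True reads only the header, which is all we need here
    ds = pydicom.dcmread(src, stop_before_pixels=True)
    dest = tiered_path(root, ds)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)
    return dest
```

Calling `sort_file("scan.dcm", "sorted")` would copy the file to something like `sorted/<PatientID>/<StudyInstanceUID>/<SeriesInstanceUID>/<SOPInstanceUID>.dcm`. Human-readable fields such as `StudyDescription` could be substituted for the UIDs if they are reliably filled in for your data.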


There are many great resources available for parsing DICOM using Python or other languages. DicomSort has a flexible GUI which can organize files based on any field in the header (DicomSort is also available as a Python package with “pip install dicomsort”). I also want to credit this repo for getting me started with code for reading a DICOM pixel dataset. Finally, this great paper includes a section on image compression which I briefly mention here.

Ultimately, I decided to write my own utility because I like knowing exactly what my code is doing. Writing it also provides an introduction to the DICOM header, which is essential knowledge for any data scientist who works on medical imaging projects.

I’ve verified this code on both CT and MRI exams. It should work for any modality, because Patient, Study, and Series information is reported for all DICOM files.

Required Code Packages

This code uses the Python package PyDicom for reading and writing DICOM files.

I want to briefly mention the GDCM package. DICOM files may have image compression applied either during storage or during transfer via the DICOM receiver. For example, at our institution, all DICOMs have JPEG2000 compression. GDCM is a C++-based package that allows PyDicom to read these compressed files. It’s available as a conda package (“conda install gdcm”) or can be built from source using cmake. I snuck a few lines into my code below that decompress the pixel data using GDCM, so I don’t have to worry about it in the future.
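As an illustration of the decompression step, here is a minimal sketch and not necessarily the lines used in the script itself. The function names `decompress_if_needed` and `read_decompressed` are hypothetical. The sketch relies on PyDicom's `Dataset.decompress()` and on the `is_compressed` property of the transfer syntax UID, and it assumes that a handler able to decode the file's compression (GDCM, for example) is installed. PyDicom chooses the handler, so this code does not force GDCM specifically.

```python
def decompress_if_needed(ds):
    """Decompress pixel data in place when the transfer syntax is compressed.

    `ds` is expected to be a pydicom Dataset read from a file, so that
    ds.file_meta.TransferSyntaxUID is present. Returns True if the
    dataset was decompressed, False if it was already uncompressed.
    """
    syntax = ds.file_meta.TransferSyntaxUID
    if syntax.is_compressed:
        # Raises if no installed handler can decode this transfer syntax
        ds.decompress()
        return True
    return False


def read_decompressed(path):
    """Read a DICOM file and make sure its pixel data is uncompressed."""
    import pydicom  # imported lazily; requires pydicom to be installed

    ds = pydicom.dcmread(path)
    decompress_if_needed(ds)
    return ds
```

Saving the result with `ds.save_as(...)` after this step would write an uncompressed copy, so later reads would no longer need a decompression handler.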
