Understanding DICOMs

An in-depth, hands-on approach to viewing and manipulating DICOM images using fastai2's medical imaging module, and getting them ready for machine learning.

Fastai2 will officially be released in July 2020

What are DICOMs?

DICOM (Digital Imaging and COmmunications in Medicine) is the de-facto standard that establishes rules allowing medical images (X-ray, MRI, CT) and associated information to be exchanged between imaging equipment from different vendors, computers, and hospitals. The DICOM format provides a suitable means of meeting health information exchange (HIE) standards for transmitting health-related data among facilities, as well as HL7 standards, the messaging standards that enable clinical applications to exchange data.

[Figure: Typical radiology workflow]

DICOM files typically have a .dcm extension and provide a means of storing data in separate 'tags', such as patient information, image/pixel data, the machine used, and a lot more information (explained below).

A DICOM file predominantly consists of a header and image pixel intensity data packed into a single file. The information within the header is organized as a standardized series of tags. By extracting data from these tags one can access important information regarding the patient demographics, study parameters and a lot more.

[Figure: Parts of a DICOM file]

16-bit DICOM images have values ranging from -32768 to 32767, while 8-bit grey-scale images store values from 0 to 255. The value ranges in DICOM images are useful because they correlate with the Hounsfield scale, a quantitative scale for describing radio-density (or a way of viewing different tissue densities; more on this below).
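
For instance, converting stored pixel values to Hounsfield units only needs the RescaleSlope and RescaleIntercept tags; here is a minimal sketch with pydicom (the file name is a stand-in for one of your own CT files):

import pydicom

#read a CT slice (placeholder path) and map stored pixels to Hounsfield units
ds = pydicom.dcmread('ct_slice.dcm')
hu = ds.pixel_array * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
print(hu.min(), hu.max())  #air is around -1000 HU, water 0 HU, bone is positive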

Installation Requirements

These are the dependencies you will need installed on your computer to be able to go through this tutorial:

fastai2, which will officially be released in July 2020; installation instructions can be viewed on its GitHub page: fastai2

You will also need pydicom (a Python package for parsing DICOM files that makes it easy to convert DICOM files into pythonic structures for easier manipulation):

  • pip install pydicom

and scikit-image (a collection of algorithms for image processing):

  • pip install scikit-image

and kornia (a library of packages containing operators that can be inserted within neural networks to train models to perform image transformations, epipolar geometry, depth estimation, and low-level image processing such as filtering and edge detection, operating directly on tensors):

  • pip install kornia

For more information on how to use fastai2's medical imaging module, head over to my [github](https://github.com/asvcode/MedicalImaging) page or my tutorial blog on medical imaging (which is better for viewing notebook tutorials :))

Datasets

Here is a list of 3 DICOM datasets that you can play around with. Each of these 3 datasets has different attributes and shows the vast diversity of information that can be contained within different DICOM datasets.

  • The SIIM_SMALL dataset (250 DICOM files, ~30 MB) is conveniently provided in the fastai library, but is limited in some of its attributes; for example, it does not have RescaleIntercept or RescaleSlope, and its pixel values are limited to the range 0 to 255.
  • Kaggle has an easily accessible (437 MB) CT medical image dataset from the Cancer Imaging Archive. The dataset consists of 100 images (512px by 512px) with pixel values ranging from -2000 to +2000.
  • The Thyroid Segmentation in Ultrasonography Dataset provides low-quality DICOM images (around 253px by 253px) where each DICOM file contains multiple frames (1,000 on average); a quick way to check this is sketched below.
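
For multi-frame files like those in the thyroid dataset, pydicom stacks the frames along the first axis of pixel_array; a quick sketch (the file name is a placeholder):

import pydicom

#each file holds many frames: pixel_array is (frames, rows, cols)
ds = pydicom.dcmread('thyroid_ultrasound.dcm')
print(ds.NumberOfFrames)     #number of frames stored in this one file
print(ds.pixel_array.shape)  #e.g. (1000, 253, 253)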

Let’s load the dependencies:

#Load the dependencies
from fastai2.basics import *
from fastai2.callback.all import *
from fastai2.vision.all import *
from fastai2.medical.imaging import *
import pydicom
import seaborn as sns
import matplotlib  #needed for rcParams below
matplotlib.rcParams['image.cmap'] = 'bone'
from matplotlib.colors import ListedColormap, LinearSegmentedColormap

Some knowledge of fastai2 is assumed, and teaching it is beyond the scope of this tutorial; the fastai2 docs page has some excellent tutorials to get you started quickly.

…a bit more about how data is stored in DICOM files

DICOM files are opened using pydicom.dcmread, for example using the SIIM_SMALL dataset:

#get dicom files from the SIIM_SMALL pneumothorax dataset
pneumothorax_source = untar_data(URLs.SIIM_SMALL)
items = get_dicom_files(pneumothorax_source, recurse=True, folders='train')

#now let's read a file:
img = items[10]
dimg = pydicom.dcmread(img)

You can now view all the information contained within the DICOM file. Explaining each element is beyond the scope of this tutorial, but this site has some excellent information about each of the entries. Information is listed by DICOM tag (e.g. (0008,0005)) or DICOM keyword (e.g. Specific Character Set).
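
For example, once dimg is loaded, individual elements can be read either by keyword or by tag; a short sketch (which tags are present varies by dataset):

#access header elements by DICOM keyword or by tag
print(dimg.SpecificCharacterSet)   #by keyword
print(dimg[0x0008, 0x0005].value)  #the same element, by tag (0008,0005)
print(dimg.PatientID)              #patient-level metadata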

#artificial-intelligence #medicine #dicom #data-science #data-analysis

Guide to Understanding Generics in Java

Introduction

Java is a type-safe programming language. Type safety ensures a layer of validity and robustness in a programming language. It is a key part of Java’s security to ensure that operations done on an object are only performed if the type of the object supports it.

Type safety dramatically reduces the number of programming errors that might occur during runtime, involving all kinds of errors linked to type mismatches. Instead, these types of errors are caught during compile-time which is much better than catching errors during runtime, allowing developers to have less unexpected and unplanned trips to the good old debugger.

Type safety is also interchangeably called strong typing.

Java Generics is a solution designed to reinforce the type safety that Java was designed to have. Generics allow types to be parameterized onto methods and classes and introduce a new layer of abstraction for formal parameters. This will be explained in detail later on.

There are many advantages to using generics in Java. Implementing generics in your code can greatly improve its overall quality by preventing unexpected runtime errors involving data types and typecasting.

This guide will demonstrate the declaration, implementation, use-cases, and benefits of generics in Java.

#java #generics #generics-in-java

Queenie Davis

Understand Artificial Intelligence (AI)

Have you been dreaming about understanding AI and machine learning? Well, this article is made for you. We are going to demystify AI.

What is machine learning?

Before we go further in this article, we need to define what machine learning is.

To summarize, machine learning is the practice of solving a problem without telling a computer how to solve it. What I mean by that is that in classic programming, you would write code explaining to the computer how to solve a problem and the different steps to take. With machine learning, the computer uses statistical algorithms to solve a problem by itself based on input data. It does this by finding the patterns between the input and the output of the problem.

What do I need to do machine learning?

To do machine learning, you will need data, a lot of data.

Once you have this data, you will need to split it into two datasets:

  • The test dataset: the data you will use to test whether your machine learning model (algorithm) is working properly.
  • The training dataset: the data you will use to train your machine learning model.

So remember, data is key: you need a proper amount of data, and you need to clean it (we will talk about how to clean your data for your dataset in another article).
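
As a minimal sketch of the split (scikit-learn is just one common choice here, and load_my_data is a hypothetical stand-in for your own loading code):

from sklearn.model_selection import train_test_split

#keep 80% of the data for training and hold out 20% for testing
X, y = load_my_data()  #hypothetical: your inputs and their labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)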

What can I use machine learning for?

You can actually use machine learning for solving a lot of problems. Here are a few examples:

  • Recommendations of products on e-commerce websites (Amazon, eBay, …).
  • Recommendations on search engine websites (Google, Facebook search, …).
  • Netflix uses it to recommend movies and TV series depending on what you actually like.
  • YouTube uses it to put subtitles under your videos, …

How can I teach machines to learn?

There are different ways for machines to learn, here are the four most popular ways:

  • Supervised learning: your model learns from labeled input data that you provide (your data is already tagged with the correct labels), meaning we show the correct answers to the machine. It can be used for classifying data, for example, classifying cats by breed (see the sketch after this list).
  • Unsupervised learning: your model learns by observing the data on its own. In this case, we are not working with labeled data, so we don't show the machine the correct answer. It can be used for clustering data, for example, grouping loyal customers.
  • Semi-supervised learning: your model starts with a small dataset and applies supervised learning (labeled data). Then we feed the rest of the data to our model, which it explores using unsupervised learning (non-labeled data). This allows the computer to expand its vocabulary based on what it learned and classified during the supervised learning stage.
  • Reinforcement learning: we train our model by rewarding it every time it produces the correct output. The computer then tries to get as many rewards as possible and learns by itself. It can be used to create an AI for video games.
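
To make the first two concrete, here is a tiny sketch with scikit-learn, reusing the hypothetical split from above:

from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

#supervised: we show the machine the correct answers (the labels y)
model = DecisionTreeClassifier().fit(X_train, y_train)
print(model.score(X_test, y_test))  #accuracy on data it has never seen

#unsupervised: no labels, the model groups similar samples by itself
clusters = KMeans(n_clusters=3).fit_predict(X_train)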

#data-science #supervised-learning #artificial-intelligence #ai

Art Lind

A Python script to sort DICOM files

This article is a follow-up to my previous introduction to DICOM files. Special thanks to my good friend Dr. Gian Marco Conte for helping write this.

As a brief recap, DICOM files are the primary format for storing medical images. All clinical algorithms must be able to read and write DICOM. But these files can be challenging to organize. DICOM files have information associated with the image saved in a header, which can be extensive. Files are structured in 4 tiers:

  1. Patient
  2. Study
  3. Series
  4. Instance

In this tutorial, I’ll share some python code that reads a set of DICOM files, extracts the header information, and copies the files to a tiered folder structure that can be easily loaded for data science tasks.
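
As a rough sketch of the idea (not the exact utility, and with placeholder folder names), the header fields map directly onto the tiers:

import shutil
from pathlib import Path
import pydicom

src, dst = Path('dicom_in'), Path('dicom_sorted')  #placeholder folders
for f in src.rglob('*.dcm'):
    ds = pydicom.dcmread(f, stop_before_pixels=True)  #header only, much faster
    #Patient / Study / Series tiers come straight from the header
    out = dst / str(ds.PatientID) / str(ds.StudyInstanceUID) / str(ds.SeriesInstanceUID)
    out.mkdir(parents=True, exist_ok=True)
    shutil.copy(f, out / f.name)  #the file itself is the Instance tier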

There are many great resources available for parsing DICOM using Python or other languages. DicomSort has a flexible GUI which can organize files based on any field in the header (DicomSort is also available as a Python package with “pip install dicomsort”). I also want to credit this repo for getting me started with code for reading a DICOM pixel dataset. Finally, this great paper includes a section on image compression which I briefly mention here.

Ultimately I decided to write my own utility because I like knowing exactly what my code is doing, and it also provides an introduction to the DICOM header which is essential knowledge for any data scientist who works on medical imaging projects.

I’ve verified this code for both CT and MRI exams; it should work for any modality — Patient, Study, and Series information is reported for all DICOM files.

Required Code Packages

This code uses the Python package PyDicom for reading and writing DICOM files.

I want to briefly mention the GDCM package. DICOM files may have image compression applied to them either during storage or during transfer via the DICOM receiver. For example, at our institution, all DICOMs have JPEG2000 compression. GDCM is a C++-based package that allows PyDicom to read these compressed files. It's available as a conda package ("conda install gdcm") or can be built from source using cmake. I snuck a few lines into my code below which decompress the pixel data using GDCM, so I don't have to worry about it in the future.
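
For reference, decompression in pydicom is a one-liner once a handler like GDCM is installed; a small sketch with placeholder file names:

import pydicom

ds = pydicom.dcmread('compressed.dcm')
ds.decompress()                 #rewrites PixelData as uncompressed, via an available handler such as GDCM
ds.save_as('uncompressed.dcm')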

#dicom #data-science #medical-imaging

Angela Dickens

Understand Big O notation in 7 minutes

The Big O notation is a notion often completely ignored by developers. It is, however, a fundamental notion: more than useful, and simple to understand. Contrary to popular belief, you don't need to be a math nerd to master it. I bet you that in 7 minutes, you'll understand everything.

What is the Big O notation?

The Big O notation (or algorithm complexity) is a standard way to measure the performance of an algorithm. It is a mathematical way of judging the effectiveness of your code. I said the word mathematics and scared everyone away. Again, you don't need to have a passion for math to understand and use this notation.

This notation will allow you to measure the growth rate of your algorithm in relation to the input data. It describes the worst possible case for the performance of your code. Today, we are not going to talk about space complexity, but only about time complexity.

And it’s not about putting a timer before and after a function to see how long it takes.

The problem is that the timer technique is anything but reliable and accurate. With a simple timer, the performance of your algorithm will vary greatly depending on many factors:

  • Your machine and processors
  • The language you use
  • The load on your machine when you run your test

The Big O notation solves all these problems and gives us a reliable measure of the efficiency of all the code you produce. "Big O" is short for "order of magnitude".
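
To make this concrete, here is a small illustrative sketch: doubling the input roughly doubles the work of the first function, but quadruples the work of the second.

#O(n): the loop runs at most once per element
def contains(items, target):
    for x in items:
        if x == target:
            return True
    return False

#O(n^2): a full inner pass for each element of the outer loop
def has_duplicate(items):
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False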

#technical #big-o-notation #big-data