When Microsoft shipped .NET 4.6 last summer, they also released a new 64-bit JIT compiler named RyuJIT. Its main goal was to improve the load times of 64-bit applications, but it also lets developers get more performance out of modern processors via SIMD intrinsics. This post looks at what SIMD intrinsics are, how RyuJIT enables .NET developers to take advantage of them, some useful patterns for using SIMD in C#, and what sort of gains you can expect to see. A follow-up post takes a more detailed, low-level look at how it works.

Note: all the code from this post is available on GitHub.

The Basics of SIMD (Single Instruction Multiple Data)

A CPU carries out its job by executing instructions, and the specific instructions that a CPU knows how to execute are defined by the instruction set (e.g. x86, x86_64) and instruction set extensions (e.g. SSE, AVX, AVX-512) that it implements. Many SIMD instructions are available via these instruction set extensions.

SIMD instructions allow multiple calculations to be carried out simultaneously on a single core by using a register that is several times wider than the data being processed. For example, a 256-bit register holds eight 32-bit values, so a single machine code instruction can perform eight 32-bit calculations at once.
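To make the register-width idea concrete, here is a minimal sketch of an element-wise array addition using `Vector<T>` from `System.Numerics`. The class and method names (`SimdSketch`, `Add`) are hypothetical and chosen for illustration; this is not taken from the post's repository. It assumes a runtime and JIT that accelerate `Vector<T>` (which you can check with `Vector.IsHardwareAccelerated`); without acceleration the code still runs, just without the speed-up.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration only.
public static class SimdSketch
{
    // Adds two equal-length float arrays element by element.
    public static float[] Add(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Arrays must have the same length.");

        var result = new float[a.Length];

        // Number of floats that fit in one hardware vector,
        // e.g. 8 when 256-bit registers are used, 4 with 128-bit registers.
        int width = Vector<float>.Count;
        int i = 0;

        // Vectorized part: each iteration adds `width` pairs of floats at once.
        for (; i <= a.Length - width; i += width)
        {
            var va = new Vector<float>(a, i);
            var vb = new Vector<float>(b, i);
            (va + vb).CopyTo(result, i);
        }

        // Scalar tail: handle the elements that did not fill a whole vector.
        for (; i < a.Length; i++)
            result[i] = a[i] + b[i];

        return result;
    }
}
```

Calling `SimdSketch.Add(x, y)` gives the same result as a plain scalar loop. The difference is that, where the hardware supports it, the main loop needs roughly `Length / Vector<float>.Count` additions instead of `Length`.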

Parallelism on a Single Core - SIMD with C#