Friday, November 14, 2008

volatile field and memory barrier: look inside

I've seen a lot of discussions on the web regarding volatile fields. I've performed my own small investigation into the subject, and here are some thoughts on it:

The two main purposes of C# volatile fields are the following:

1. Introduce memory barriers for all access operations to these fields. To improve performance, CPUs store frequently accessed data in the CPU cache. In multi-threaded applications this can cause problems. For instance, imagine a situation where one thread constantly reads some boolean value (the read thread) and another one is responsible for updating this field (the write thread). Now, if the OS decides to run these two threads on different CPUs, it is possible that the write thread will change the value of the field in the CPU1 cache while the read thread continues reading the value from the CPU2 cache. In other words, the read thread will not see the write thread's change until the stale copy in the CPU2 cache is invalidated. The situation can be even worse if two threads update this value.
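A volatile field introduces memory barriers: in C#, a volatile read has acquire semantics and a volatile write has release semantics. This means that neither the compiler nor the CPU may keep serving the field from a register or a stale copy, and surrounding memory accesses may not be reordered across the volatile access.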
Nowadays CPU architectures such as x86 and x64 have cache coherency, which means that any change in the cache of one processor is propagated to the other CPUs' caches. In turn, this means that the JIT compiler for the x86 and x64 platforms emits the same memory access instructions for volatile and non-volatile fields (except as stated in item 2). Also, multicore CPUs usually have several levels of cache: the first level is private to each core, while a higher level is typically shared between the cores.
But CPU architectures with a weak memory model, such as Itanium, are free to reorder ordinary loads and stores, and therefore the volatile keyword and memory barriers play a significant role when designing a multi-threaded application.
Therefore, I'd recommend always using volatile and memory barriers, even on x86 and x64 CPUs, because otherwise you tie your application to a particular CPU architecture.

Note: you can also introduce memory barriers by using Thread.VolatileRead/Thread.VolatileWrite (these two methods can replace the volatile keyword), Thread.MemoryBarrier, or even the C# lock keyword, etc.
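
To make the note above more concrete, here is a minimal sketch of two of these alternatives applied to a plain (non-volatile) int field. The SharedValue class and its member names are hypothetical and exist only for illustration. Thread.VolatileRead, Thread.VolatileWrite and Thread.MemoryBarrier are the real framework methods mentioned in the note; newer .NET versions also offer System.Threading.Volatile.Read/Volatile.Write and may report an obsolescence warning for the Thread.Volatile* pair.

```csharp
using System.Threading;

// Hypothetical holder for one int shared between threads.
public sealed class SharedValue
{
    private int _value; // deliberately NOT declared volatile

    // Variant 1: the Thread.VolatileWrite / Thread.VolatileRead pair.
    public void WriteVolatile(int v)
    {
        Thread.VolatileWrite(ref _value, v);
    }

    public int ReadVolatile()
    {
        return Thread.VolatileRead(ref _value);
    }

    // Variant 2: plain accesses combined with explicit full fences.
    public void WriteWithBarrier(int v)
    {
        Thread.MemoryBarrier(); // earlier memory accesses cannot move below the store
        _value = v;
    }

    public int ReadWithBarrier()
    {
        int v = _value;
        Thread.MemoryBarrier(); // later memory accesses cannot move above the load
        return v;
    }
}
```

A writer thread would call WriteVolatile or WriteWithBarrier, and a reader thread the matching read method. Thread.MemoryBarrier is a full fence, so variant 2 is stronger than strictly needed for this pattern.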

Below are displayed two CPU architectures: Itanium and AMD (Direct Connect Architecture). As we can see, in AMD's Direct Connect Architecture all processors are connected with each other, so we have memory coherence. In the Itanium architecture the CPUs are not connected with each other and communicate with RAM through the system bus.


2. Prevent instruction reordering. For instance, suppose we have a loop:
while(true)
{
    if(myField)
    {
        //do something
    }
}
For a non-volatile field, the JIT compiler can, for performance reasons, read the field only once and keep the value in a register, as if the code had been written in the following manner:
bool cached = myField; //read once, before the loop
while(true)
{
    if(cached)
    {
        //do something
    }
}

If you plan to change myField from a separate thread, this is a significant difference, isn't it?
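
As a rough illustration of this point, here is a small sketch of a worker loop controlled by a volatile flag. The StopFlagDemo class, the _stop field and the Run method are hypothetical names chosen for this example, and the flag is inverted compared to myField above (the loop runs until the flag is set). The timings are arbitrary assumptions.

```csharp
using System.Threading;

// Hypothetical demo: a worker spins until another thread sets a volatile flag.
public static class StopFlagDemo
{
    // volatile: the loop condition must re-read the field on every iteration,
    // so the read cannot be hoisted out of the loop.
    private static volatile bool _stop;

    // Returns true if the worker observed the flag change and exited in time.
    public static bool Run()
    {
        _stop = false;

        var worker = new Thread(() =>
        {
            while (!_stop)
            {
                Thread.SpinWait(1); // stand-in for "do something"
            }
        });
        worker.IsBackground = true; // do not keep the process alive if the loop never ends
        worker.Start();

        Thread.Sleep(50);           // let the worker spin for a moment
        _stop = true;               // volatile write from the calling thread

        return worker.Join(5000);   // wait up to 5 seconds for the worker to finish
    }
}
```

With the volatile modifier in place, StopFlagDemo.Run() should return true. Without it, the JIT compiler is allowed to read _stop once, and the loop may then never terminate, depending on the runtime and build configuration.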


Usually it is recommended to use the lock statement (Monitor.Enter and Monitor.Exit), but if the block only reads or writes a single field, then a volatile field will perform significantly better than the Monitor class.
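
The two options can be sketched side by side as follows. LockedFlag and VolatileFlag are hypothetical classes written only for this comparison. They show the shape of the code, not a benchmark.

```csharp
// Hypothetical flag guarded by a lock (Monitor.Enter/Monitor.Exit under the hood).
public sealed class LockedFlag
{
    private readonly object _gate = new object();
    private bool _ready;

    public bool Ready
    {
        get { lock (_gate) { return _ready; } }
        set { lock (_gate) { _ready = value; } }
    }
}

// Hypothetical flag that relies on a volatile field instead of a lock.
public sealed class VolatileFlag
{
    private volatile bool _ready;

    public bool Ready
    {
        get { return _ready; }   // volatile read
        set { _ready = value; }  // volatile write
    }
}
```

Both classes expose the same Ready property, so callers can switch between them freely. The volatile version only covers a single read or a single write. A compound operation such as toggling the flag (read, negate, write back) is not atomic with volatile alone and still needs a lock or an Interlocked operation.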
