Thursday 15 August 2019

memory - peterson algorithm about data modification in lock visible to second thread with mfence/lfence/sfence

https://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html

I wrote comments and asked two questions and have another question about Anthony's reply.
here is the reply:



"1. The acquire/release on the flag0 and flag1 variables are necessary to ensure that it acts as a lock: the release store in the unlock synchronizes with the acquire-load in the next lock, to ensure that the data modified while the lock was held is now visible to the second thread."



I have written a peterson lock in C



typedef struct {
volatile bool flag[2];
volatile int victim;
} peterson_lock_t;


void peterson_lock_init(peterson_lock_t &lock) {
lock.flag[0] = lock.flag[1] = false;
lock.victim = 0;
}

void peterson_lock(peterson_lock_t &lock, int id) {
lock.flag[id] = true;
lock.victim = id;
asm volatile ("mfence" : : : "memory");

while (lock.flag[1 - id] && lock.victim == id) {
};
}

void peterson_unlock(peterson_lock_t &lock, int id) {
lock.flag[id] = false;
}


I tested it and I think it's correct, right?




If it's right, my question is do I need to add sfence and lfence to “make sure the data modified while the lock was held is now visible to the second thread” ?
like this,



  void peterson_lock(peterson_lock_t &lock, int id) {
lock.flag[id] = true;
lock.victim = id;
asm volatile ("mfence" : : : "memory");
asm volatile ("lfence" : : : "memory"); // here, I think this is unnecessary, since mfence will flush load buffer
while (lock.flag[1 - id] && lock.victim == id) {

};
}

void peterson_unlock(peterson_lock_t &lock, int id) {
asm volatile ("sfence" : : : "memory"); // here
lock.flag[id] = false;
}


I think no need to do this.

My understanding is that on x86/64 'store' has a release semantics, and 'load' has a acquire semantics(the root reason is on x86/64 there is only store load reorder),
and 'lock.flag[id]= false' is a 'store', 'lock.flag[1 - id] ' is a 'load',
so there is no need to do things like the acquire/release on the flag0 and flag1 in Dmitriy's implementation



EDIT @Anthony
very appreciate your replay.
Yes, I need to avoid compiler reorder.
So, modification like below, is it correct?
Because for x86, only need to forbidden compiler reorder in 'peterson_unlock'




void peterson_lock(peterson_lock_t &lock, int id) {
lock.flag[id] = true;
lock.victim = id;
asm volatile ("mfence" : : : "memory");
while (lock.flag[1 - id] && lock.victim == id) {
};
}

void peterson_unlock(peterson_lock_t &lock, int id) {
asm volatile ("" : : : "memory"); // here, forbidden compiler reorder

lock.flag[id] = false;
}

No comments:

Post a Comment

php - file_get_contents shows unexpected output while reading a file

I want to output an inline jpg image as a base64 encoded string, however when I do this : $contents = file_get_contents($filename); print &q...