https://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html
I commented on the article above and asked two questions, and I now have a follow-up question about Anthony's reply.
Here is the reply:
"1. The acquire/release on the flag0 and flag1 variables are necessary to ensure that it acts as a lock: the release store in the unlock synchronizes with the acquire-load in the next lock, to ensure that the data modified while the lock was held is now visible to the second thread."
I have written a Peterson lock in C-style C++ (it takes the lock by reference and uses GCC inline assembly for x86):
typedef struct {
    volatile bool flag[2];  // flag[i] is true while thread i wants the lock
    volatile int victim;    // the thread that waits when both want the lock
} peterson_lock_t;

void peterson_lock_init(peterson_lock_t &lock) {
    lock.flag[0] = lock.flag[1] = false;
    lock.victim = 0;
}

void peterson_lock(peterson_lock_t &lock, int id) {
    lock.flag[id] = true;
    lock.victim = id;
    // Full barrier: the two stores above must not be reordered with the loads below.
    asm volatile ("mfence" : : : "memory");
    while (lock.flag[1 - id] && lock.victim == id) {
        // spin
    }
}

void peterson_unlock(peterson_lock_t &lock, int id) {
    lock.flag[id] = false;
}
I tested it and I think it is correct. Is that right?
If it is, my question is: do I need to add sfence and lfence to "make sure the data modified while the lock was held is now visible to the second thread"?
Like this:
void peterson_lock(peterson_lock_t &lock, int id) {
    lock.flag[id] = true;
    lock.victim = id;
    asm volatile ("mfence" : : : "memory");
    asm volatile ("lfence" : : : "memory"); // here; I think this is unnecessary, since mfence already orders loads as well as stores
    while (lock.flag[1 - id] && lock.victim == id) {
        // spin
    }
}

void peterson_unlock(peterson_lock_t &lock, int id) {
    asm volatile ("sfence" : : : "memory"); // here
    lock.flag[id] = false;
}
I don't think this is necessary.
My understanding is that on x86/64 a store has release semantics and a load has acquire semantics (the root reason being that the only reordering x86/64 allows is a later load passing an earlier store).
'lock.flag[id] = false' is a store and 'lock.flag[1 - id]' is a load,
so there should be no need for anything like the acquire/release on flag0 and flag1 in Dmitriy's implementation.
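For comparison, here is a minimal sketch of the same lock written with C++11 atomics, where the release store in unlock and the acquire side of the loads in lock are spelled out instead of relying on x86 behaviour. The names peterson_atomic_t, peterson_atomic_lock, peterson_atomic_unlock and run_counter_test are hypothetical and not taken from the article; the lock path simply uses the default sequentially consistent ordering, which is one safe choice and not necessarily the cheapest.

```cpp
#include <atomic>
#include <thread>

// Hypothetical atomic counterpart of peterson_lock_t.
struct peterson_atomic_t {
    std::atomic<bool> flag[2];
    std::atomic<int> victim;
    peterson_atomic_t() : victim(0) {
        flag[0].store(false);
        flag[1].store(false);
    }
};

void peterson_atomic_lock(peterson_atomic_t &lock, int id) {
    // Default ordering is memory_order_seq_cst, which also rules out the
    // store/load reordering that the mfence guards against on x86.
    lock.flag[id].store(true);
    lock.victim.store(id);
    while (lock.flag[1 - id].load() && lock.victim.load() == id) {
        std::this_thread::yield();  // be polite while spinning
    }
}

void peterson_atomic_unlock(peterson_atomic_t &lock, int id) {
    // Release store: writes made inside the critical section become
    // visible to the thread whose load in lock() observes this store.
    lock.flag[id].store(false, std::memory_order_release);
}

// Two threads increment a plain (non-atomic) counter under the lock.
long run_counter_test(int iterations) {
    peterson_atomic_t lock;
    long counter = 0;
    auto worker = [&](int id) {
        for (int i = 0; i < iterations; ++i) {
            peterson_atomic_lock(lock, id);
            ++counter;
            peterson_atomic_unlock(lock, id);
        }
    };
    std::thread t0(worker, 0);
    std::thread t1(worker, 1);
    t0.join();
    t1.join();
    return counter;
}
```

If the lock works, run_counter_test(n) returns 2 * n. A passing run is only evidence, not a proof, since a broken lock can also pass by luck.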
EDIT @Anthony
I very much appreciate your reply.
Yes, I need to prevent compiler reordering.
So is the modification below correct?
My reasoning is that on x86 the only thing I need to prevent in 'peterson_unlock' is compiler reordering.
void peterson_lock(peterson_lock_t &lock, int id) {
    lock.flag[id] = true;
    lock.victim = id;
    asm volatile ("mfence" : : : "memory");
    while (lock.flag[1 - id] && lock.victim == id) {
        // spin
    }
}

void peterson_unlock(peterson_lock_t &lock, int id) {
    asm volatile ("" : : : "memory"); // here, to prevent compiler reordering
    lock.flag[id] = false;
}
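As a rough sketch of how the same unlock could be expressed with C++11 atomics instead of an empty asm statement, assuming the flags are changed to std::atomic<bool>: the type unlock_demo_t and both function names below are hypothetical. The first variant is the portable one. The second mirrors the compiler-only barrier above, but the C++ standard only defines std::atomic_signal_fence for synchronization with a signal handler on the same thread, so treating it as sufficient between threads is an x86-specific assumption.

```cpp
#include <atomic>

// Hypothetical stand-in for the flag array of the lock.
struct unlock_demo_t {
    std::atomic<bool> flag[2];
};

// Variant A: release store. On x86 compilers typically emit a plain mov
// for this, and they may not move earlier writes past it.
inline void unlock_release(unlock_demo_t &lock, int id) {
    lock.flag[id].store(false, std::memory_order_release);
}

// Variant B: compiler-only fence followed by a relaxed store.
// No fence instruction is emitted; cross-thread ordering then depends
// on the hardware (assumed here to be x86).
inline void unlock_signal_fence(unlock_demo_t &lock, int id) {
    std::atomic_signal_fence(std::memory_order_release);
    lock.flag[id].store(false, std::memory_order_relaxed);
}
```

Variant A is probably the better default, since it states the intent directly and stays correct on architectures with weaker ordering than x86.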