href="https://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html"
rel="nofollow
noreferrer">https://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html
I
wrote comments and asked two questions and have another question about Anthony's
reply.
here is the reply:
"1. The
acquire/release on the flag0 and flag1 variables are necessary to ensure that it acts as
a lock: the release store in the unlock synchronizes with the acquire-load in the next
lock, to ensure that the data modified while the lock was held is now visible to the
second thread."
I have written a
peterson lock in C
typedef struct
{
volatile bool flag[2];
volatile int victim;
}
peterson_lock_t;
void peterson_lock_init(peterson_lock_t
&lock) {
lock.flag[0] = lock.flag[1] = false;
lock.victim = 0;
}
void peterson_lock(peterson_lock_t
&lock, int id) {
lock.flag[id] = true;
lock.victim =
id;
asm volatile ("mfence" : : : "memory");
while (lock.flag[1 -
id] && lock.victim == id) {
};
}
void peterson_unlock(peterson_lock_t &lock, int
id) {
lock.flag[id] = false;
}
I tested it and I
think it's correct, right?
If it's right, my
question is do I need to add sfence and lfence to “make sure the data modified while the
lock was held is now visible to the second thread” ?
like this,
void
peterson_lock(peterson_lock_t &lock, int id) {
lock.flag[id] =
true;
lock.victim = id;
asm volatile ("mfence" : : :
"memory");
asm volatile ("lfence" : : : "memory"); // here, I think this is
unnecessary, since mfence will flush load buffer
while (lock.flag[1 - id]
&& lock.victim == id) {
};
}
void peterson_unlock(peterson_lock_t &lock, int
id) {
asm volatile ("sfence" : : : "memory"); // here
lock.flag[id] = false;
}
I think no need to
do this.
My understanding is that on x86/64 'store' has a release semantics,
and 'load' has a acquire semantics(the root reason is on x86/64 there is only store load
reorder),
and 'lock.flag[id]= false' is a 'store', 'lock.flag[1 - id] ' is a
'load',
so there is no need to do things like the acquire/release on the flag0
and flag1 in Dmitriy's
implementation
EDIT
@Anthony
very appreciate your replay.
Yes, I need to avoid compiler
reorder.
So, modification like below, is it correct?
Because for
x86, only need to forbidden compiler reorder in
'peterson_unlock'
void
peterson_lock(peterson_lock_t &lock, int id) {
lock.flag[id] =
true;
lock.victim = id;
asm volatile ("mfence" : : :
"memory");
while (lock.flag[1 - id] && lock.victim == id)
{
};
}
void peterson_unlock(peterson_lock_t
&lock, int id) {
asm volatile ("" : : : "memory"); // here, forbidden
compiler reorder
lock.flag[id] =
false;
}
No comments:
Post a Comment