Can the Intel PMU be used
to measure per-core read/write memory bandwidth usage? Here "memory" means DRAM
(i.e., accesses that do not hit in any cache level).
Sunday 15 October 2017
x86 - Can the Intel performance monitor counters be used to measure memory bandwidth?
Yes, this
is possible, although it is not necessarily as straightforward as programming the usual
PMU counters.
One approach is to use the
programmable memory controller counters, which are accessed via PCI space. A good place
to start is by examining Intel's own implementation in pcm-memory, at
pcm-memory.cpp (https://github.com/opcm/pcm/blob/master/pcm-memory.cpp). This app shows
you the per-socket or per-memory-controller throughput, which is suitable for some uses.
Note that these counters see the traffic of all cores combined, so they cannot attribute
it to a single core. On a quiet machine, however, you can assume most of the bandwidth
is associated with the process under test, and if you want to monitor at the socket
level it's exactly what you want.
The
other alternative is careful programming of the "offcore response" counters.
These, as far as I know, relate to traffic between the L2 (the last core-private cache)
and the rest of the system. You can filter by the result of the offcore response, so you
can use a combination of the various "L3 miss" events and multiply by the cache line
size to get read and write bandwidth. The events are quite fine-grained, so you can
further break the traffic down by what caused the access in the first place: instruction
fetch, demand data requests, prefetching, etc.
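The "multiply by the cache line size" step can be sketched as below. This is only an illustration of the arithmetic, not a way to read the counters: the function name, the sample counts, and the 64-byte line size are assumptions made for the example, and the counts would have to come from whatever tool you use to sample the "L3 miss" offcore response events.

```python
# Assumption: 64-byte cache lines, as on current Intel x86 cores.
CACHE_LINE_BYTES = 64


def bandwidth_bytes_per_sec(count_before, count_after, interval_sec,
                            line_bytes=CACHE_LINE_BYTES):
    """Convert two samples of an event counter into bytes per second.

    count_before / count_after are hypothetical readings of an
    "L3 miss" offcore response event taken interval_sec apart.
    Each counted event is assumed to move one full cache line.
    """
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    lines_transferred = count_after - count_before
    return lines_transferred * line_bytes / interval_sec


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    read_bw = bandwidth_bytes_per_sec(1_000_000, 3_000_000, 0.5)
    print(f"read bandwidth: {read_bw / 1e6:.1f} MB/s")
```

Applying the same function separately to the read-type and write-type events gives the read and write bandwidth figures.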
The offcore response counters generally lag
behind in support by tools like perf and likwid,
but at least recent versions seem to have reasonable
support, even for client parts like SKL.