PMC.P6(3)              BSD Library Functions Manual              PMC.P6(3)
NAME
pmc.p6 -- measurement events for Intel Pentium Pro, P-II, P-III family
CPUs
LIBRARY
Performance Counters Library (libpmc, -lpmc)
SYNOPSIS
#include <pmc.h>
DESCRIPTION
Intel P6 PMCs are present in Intel Pentium Pro, Pentium II, Celeron,
Pentium III and Pentium M processors.
They are documented in "Volume 3: System Programming Guide", IA-32
Intel(R) Architecture Software Developer's Manual, Order Number
245472-012, Intel Corporation, 2003.
Some of these events are affected by processor errata described in
Intel(R) Pentium(R) III Processor Specification Update, Document Number:
244453-054, Intel Corporation, April 2005.
PMC Features
These CPUs have two counters, each 40 bits wide. Some events may only be
used on specific counters and some events are defined only on specific
processor models. These PMCs support the following capabilities:
Capability           Support
PMC_CAP_CASCADE      No
PMC_CAP_EDGE         Yes
PMC_CAP_INTERRUPT    Yes
PMC_CAP_INVERT       Yes
PMC_CAP_READ         Yes
PMC_CAP_PRECISE      No
PMC_CAP_SYSTEM       Yes
PMC_CAP_TAGGING      No
PMC_CAP_THRESHOLD    Yes
PMC_CAP_USER         Yes
PMC_CAP_WRITE        Yes
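Because each counter is 40 bits wide, code that compares two raw counter
values has to allow for wrap-around. The following is a minimal sketch of
that arithmetic, not libpmc code: the names P6_PMC_WIDTH, P6_PMC_MASK and
p6_pmc_delta() are hypothetical, and the sketch assumes the counter wrapped
at most once between the two readings.

```c
#include <assert.h>
#include <stdint.h>

/* Counter width as stated in this manual page. */
#define P6_PMC_WIDTH	40
#define P6_PMC_MASK	((UINT64_C(1) << P6_PMC_WIDTH) - 1)

/*
 * Hypothetical helper (not part of libpmc): number of events between two
 * raw readings of the same 40-bit counter.  Assumes at most one wrap.
 */
static uint64_t
p6_pmc_delta(uint64_t before, uint64_t after)
{
	/* Both readings must fit in 40 bits. */
	assert(before <= P6_PMC_MASK && after <= P6_PMC_MASK);

	/* Unsigned subtraction modulo 2^40 handles the wrapped case. */
	return ((after - before) & P6_PMC_MASK);
}
```

For example, a reading of P6_PMC_MASK followed by a reading of 4 yields a
delta of 5 under this assumption.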
Event Qualifiers
Event specifiers for Intel P6 PMCs can have the following common quali‐
fiers:
cmask=value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to value.
edge Configure the PMC to count the number of deasserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks dur‐
ing which the condition remains true.
inv Invert the sense of comparison when the “cmask” qualifier is
present, making the counter increment when the number of events
per cycle is less than the value specified by the “cmask” quali‐
fier.
os Configure the PMC to count events happening at processor privi‐
lege level 0.
umask=value
This qualifier is used to further qualify the event selected (see
below).
usr Configure the PMC to count events occurring at privilege levels
1, 2 or 3.
If neither the “os” nor the “usr” qualifier is specified, both are
enabled by default.
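The interaction of “cmask”, “inv”, “os” and “usr” can be restated as a small
software model. This is only an illustrative sketch of the rules described
above, not how libpmc or the hardware is implemented; struct p6_quals and
both functions are hypothetical names. Treating any event as countable when
“cmask” is absent is likewise an assumption of the model.

```c
#include <stdbool.h>

/* Hypothetical model of the common qualifiers; not a libpmc structure. */
struct p6_quals {
	bool		has_cmask;	/* "cmask=value" was given */
	unsigned	cmask;		/* the threshold value */
	bool		inv;		/* "inv" was given */
	bool		os;		/* "os" was given */
	bool		usr;		/* "usr" was given */
};

/* Would the counter increment in a cycle that saw nevents events? */
static bool
p6_cycle_counts(const struct p6_quals *q, unsigned nevents)
{
	/* Assumption: without "cmask" no threshold test applies. */
	if (!q->has_cmask)
		return (nevents > 0);

	/* "inv" flips ">= cmask" into "< cmask". */
	return (q->inv ? nevents < q->cmask : nevents >= q->cmask);
}

/* Does this configuration count at privilege level cpl (0..3)? */
static bool
p6_counts_at_cpl(const struct p6_quals *q, unsigned cpl)
{
	bool os = q->os, usr = q->usr;

	/* Neither qualifier given: both are enabled. */
	if (!os && !usr)
		os = usr = true;

	/* "os" covers level 0; "usr" covers levels 1, 2 and 3. */
	return (cpl == 0 ? os : usr);
}
```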
The event specifiers supported by Intel P6 PMCs are:
p6-baclears
(Event E6H) Count the number of times a static branch prediction
was made by the branch decoder because the BTB did not have a
prediction.
p6-br-bac-missp-exec
(Event 8AH, Pentium M) Count the number of branch instructions
executed that were mispredicted at the Front End (BAC).
p6-br-bogus
(Event E4H) Count the number of bogus branches.
p6-br-call-exec
(Event 92H, Pentium M) Count the number of call instructions exe‐
cuted.
p6-br-call-missp-exec
(Event 93H, Pentium M) Count the number of call instructions exe‐
cuted that were mispredicted.
p6-br-cnd-exec
(Event 8BH, Pentium M) Count the number of conditional branch
instructions executed.
p6-br-cnd-missp-exec
(Event 8CH, Pentium M) Count the number of conditional branch
instructions executed that were mispredicted.
p6-br-ind-call-exec
(Event 94H, Pentium M) Count the number of indirect call instruc‐
tions executed.
p6-br-ind-exec
(Event 8DH, Pentium M) Count the number of indirect branch
instructions executed.
p6-br-ind-missp-exec
(Event 8EH, Pentium M) Count the number of indirect branch
instructions executed that were mispredicted.
p6-br-inst-decoded
(Event E0H) Count the number of branch instructions decoded.
p6-br-inst-exec
(Event 88H, Pentium M) Count the number of branch instructions
executed but not necessarily retired.
p6-br-inst-retired
(Event C4H) Count the number of branch instructions retired.
p6-br-miss-pred-retired
(Event C5H) Count the number of mispredicted branch instructions
retired.
p6-br-miss-pred-taken-ret
(Event CAH) Count the number of taken mispredicted branches
retired.
p6-br-missp-exec
(Event 89H, Pentium M) Count the number of branch instructions
executed that were mispredicted at execution.
p6-br-ret-bac-missp-exec
(Event 91H, Pentium M) Count the number of return instructions
executed that were mispredicted at the Front End (BAC).
p6-br-ret-exec
(Event 8FH, Pentium M) Count the number of return instructions
executed.
p6-br-ret-missp-exec
(Event 90H, Pentium M) Count the number of return instructions
executed that were mispredicted at execution.
p6-br-taken-retired
(Event C9H) Count the number of taken branches retired.
p6-btb-misses
(Event E2H) Count the number of branches for which the BTB did
not produce a prediction.
p6-bus-bnr-drv
(Event 61H) Count the number of bus clock cycles during which
this processor is driving the BNR# pin.
p6-bus-data-rcv
(Event 64H) Count the number of bus clock cycles during which
this processor is receiving data.
p6-bus-drdy-clocks [,umask=qualifier]
(Event 62H) Count the number of clocks during which DRDY# is
asserted. An additional qualifier may be specified, and com‐
prises one of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-hit-drv
(Event 7AH) Count the number of bus clock cycles during which
this processor is driving the HIT# pin.
p6-bus-hitm-drv
(Event 7BH) Count the number of bus clock cycles during which
this processor is driving the HITM# pin.
p6-bus-lock-clocks [,umask=qualifier]
(Event 63H) Count the number of clocks during which LOCK# is
asserted on the external system bus. An additional qualifier may
be specified and comprises one of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-req-outstanding
(Event 60H) Count the number of bus requests outstanding in any
given cycle.
p6-bus-snoop-stall
(Event 7EH) Count the number of clock cycles during which the bus
is snoop stalled.
p6-bus-tran-any [,umask=qualifier]
(Event 70H) Count the number of completed bus transactions of any
kind. An additional qualifier may be specified and comprises one
of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-brd [,umask=qualifier]
(Event 65H) Count the number of burst read transactions. An
additional qualifier may be specified and comprises one of the
following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-burst [,umask=qualifier]
(Event 6EH) Count the number of completed burst transactions. An
additional qualifier may be specified and comprises one of the
following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-def [,umask=qualifier]
(Event 6DH) Count the number of completed deferred transactions.
An additional qualifier may be specified and comprises one of the
following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-ifetch [,umask=qualifier]
(Event 68H) Count the number of completed instruction fetch
transactions. An additional qualifier may be specified and com‐
prises one of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-inval [,umask=qualifier]
(Event 69H) Count the number of completed invalidate transac‐
tions. An additional qualifier may be specified and comprises
one of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-mem [,umask=qualifier]
(Event 6FH) Count the number of completed memory transactions.
An additional qualifier may be specified and comprises one of the
following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-pwr [,umask=qualifier]
(Event 6AH) Count the number of completed partial write transac‐
tions. An additional qualifier may be specified and comprises
one of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-tran-rfo [,umask=qualifier]
(Event 66H) Count the number of completed read-for-ownership
transactions. An additional qualifier may be specified and com‐
prises one of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-trans-io [,umask=qualifier]
(Event 6CH) Count the number of completed I/O transactions. An
additional qualifier may be specified and comprises one of the
following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-trans-p [,umask=qualifier]
(Event 6BH) Count the number of completed partial transactions.
An additional qualifier may be specified and comprises one of the
following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-bus-trans-wb [,umask=qualifier]
(Event 67H) Count the number of completed write-back transac‐
tions. An additional qualifier may be specified and comprises
one of the following keywords:
any Count transactions generated by any agent on the bus.
self Count transactions generated by this processor.
The default is to count operations generated by this processor.
p6-cpu-clk-unhalted
(Event 79H) Count the number of cycles during which the processor
was not halted.
(Pentium M) Count the number of cycles during which the processor
was not halted and not in a thermal trip.
p6-cycles-div-busy
(Event 14H) Count the number of cycles during which the divider
is busy and cannot accept new divides. This event is only allo‐
cated on counter 0.
p6-cycles-int-pending-and-masked
(Event C7H) Count the number of processor cycles for which inter‐
rupts were disabled and interrupts were pending.
p6-cycles-int-masked
(Event C6H) Count the number of processor cycles for which inter‐
rupts were disabled.
p6-data-mem-refs
(Event 43H) Count all loads and all stores using any memory type,
including internal retries. Each part of a split store is
counted separately.
p6-dcu-lines-in
(Event 45H) Count the total lines allocated in the data cache
unit.
p6-dcu-m-lines-in
(Event 46H) Count the number of M state lines allocated in the
data cache unit.
p6-dcu-m-lines-out
(Event 47H) Count the number of M state lines evicted from the
data cache unit.
p6-dcu-miss-outstanding
(Event 48H) Count the weighted number of cycles while a data
cache unit miss is outstanding, incremented by the number of out‐
standing cache misses at any time.
p6-div (Event 13H) Count the number of integer and floating-point
divides including speculative divides. This event is only allo‐
cated on counter 1.
p6-emon-esp-uops
(Event D7H, Pentium M) Count the total number of micro-ops.
p6-emon-est-trans [,umask=qualifier]
(Event 58H, Pentium M) Count the number of Enhanced Intel
SpeedStep transitions. An additional qualifier may be specified,
and can be one of the following keywords:
all Count all transitions.
freq Count only frequency transitions.
The default is to count all transitions.
p6-emon-fused-uops-ret [,umask=qualifier]
(Event DAH, Pentium M) Count the number of retired fused micro-
ops. An additional qualifier may be specified, and may be one of
the following keywords:
all Count all fused micro-ops.
loadop Count only load and op micro-ops.
stdsta Count only STD/STA micro-ops.
The default is to count all fused micro-ops.
p6-emon-kni-comp-inst-ret [,umask=qualifier]
(Event D9H, Pentium III) Count the number of SSE computational
instructions retired. An additional qualifier may be specified,
and comprises one of the following keywords:
packed-and-scalar
Count packed and scalar operations.
scalar Count scalar operations only.
The default is to count packed and scalar operations.
p6-emon-kni-inst-retired [,umask=qualifier]
(Event D8H, Pentium III) Count the number of SSE instructions
retired. An additional qualifier may be specified, and comprises
one of the following keywords:
packed-and-scalar
Count packed and scalar operations.
scalar Count scalar operations only.
The default is to count packed and scalar operations.
p6-emon-kni-pref-dispatched [,umask=qualifier]
(Event 07H, Pentium III) Count the number of SSE prefetch or
weakly ordered instructions dispatched (including speculative
prefetches). An additional qualifier may be specified, and com‐
prises one of the following keywords:
nta Count non-temporal prefetches.
t1 Count prefetches to L1.
t2 Count prefetches to L2.
wos Count weakly ordered stores.
The default is to count non-temporal prefetches.
p6-emon-kni-pref-miss [,umask=qualifier]
(Event 4BH, Pentium III) Count the number of prefetch or weakly
ordered instructions that miss all caches. An additional quali‐
fier may be specified, and comprises one of the following key‐
words:
nta Count non-temporal prefetches.
t1 Count prefetches to L1.
t2 Count prefetches to L2.
wos Count weakly ordered stores.
The default is to count non-temporal prefetches.
p6-emon-pref-rqsts-dn
(Event F8H, Pentium M) Count the number of downward prefetches
issued.
p6-emon-pref-rqsts-up
(Event F0H, Pentium M) Count the number of upward prefetches
issued.
p6-emon-simd-instr-retired
(Event CEH, Pentium M) Count the number of retired MMX instruc‐
tions.
p6-emon-sse-sse2-comp-inst-retired [,umask=qualifier]
(Event D9H, Pentium M) Count the number of computational SSE
instructions retired. An additional qualifier may be specified
and can be one of the following keywords:
sse-packed-single
Count SSE packed-single instructions.
sse-scalar-single
Count SSE scalar-single instructions.
sse2-packed-double
Count SSE2 packed-double instructions.
sse2-scalar-double
Count SSE2 scalar-double instructions.
The default is to count SSE packed-single instructions.
p6-emon-sse-sse2-inst-retired [,umask=qualifier]
(Event D8H, Pentium M) Count the number of SSE instructions
retired. An additional qualifier can be specified, and can be
one of the following keywords:
sse-packed-single
Count SSE packed-single instructions.
sse-packed-single-scalar-single
Count SSE packed-single and scalar-single instructions.
sse2-packed-double
Count SSE2 packed-double instructions.
sse2-scalar-double
Count SSE2 scalar-double instructions.
The default is to count SSE packed-single instructions.
p6-emon-synch-uops
(Event D3H, Pentium M) Count the number of sync micro-ops.
p6-emon-thermal-trip
(Event 59H, Pentium M) Count the duration or occurrences of ther‐
mal trips. Use the “edge” qualifier to count occurrences of
thermal trips.
p6-emon-unfusion
(Event DBH, Pentium M) Count the number of unfusion events in the
reorder buffer.
p6-flops
(Event C1H) Count the number of computational floating point
operations retired. This event is only allocated on counter 0.
p6-fp-assist
(Event 11H) Count the number of floating point exceptions handled
by microcode. This event is only allocated on counter 1.
p6-fp-comps-ops-exe
(Event 10H) Count the number of computational floating point
operations executed. This event is only allocated on counter 0.
p6-fp-mmx-trans [,umask=qualifier]
(Event CCH, Pentium II, Pentium III) Count the number of transi‐
tions between MMX and floating-point instructions. An additional
qualifier may be specified, and comprises one of the following
keywords:
mmxtofp
Count transitions from MMX instructions to floating-point
instructions.
fptommx
Count transitions from floating-point instructions to MMX
instructions.
The default is to count MMX to floating-point transitions.
p6-hw-int-rx
(Event C8H) Count the number of hardware interrupts received.
p6-ifu-ifetch
(Event 80H) Count the number of instruction fetches, both
cacheable and non-cacheable.
p6-ifu-ifetch-miss
(Event 81H) Count the number of instruction fetch misses (i.e.,
those that produce memory accesses).
p6-ifu-mem-stall
(Event 86H) Count the number of cycles instruction fetch is
stalled for any reason.
p6-ild-stall
(Event 87H) Count the number of cycles the instruction length
decoder is stalled.
p6-inst-decoded
(Event D0H) Count the number of instructions decoded.
p6-inst-retired
(Event C0H) Count the number of instructions retired.
p6-itlb-miss
(Event 85H) Count the number of instruction TLB misses.
p6-l2-ads
(Event 21H) Count the number of L2 address strobes.
p6-l2-dbus-busy
(Event 22H) Count the number of cycles during which the L2 cache
data bus was busy.
p6-l2-dbus-busy-rd
(Event 23H) Count the number of cycles during which the L2 cache
data bus was busy transferring read data from L2 to the proces‐
sor.
p6-l2-ifetch [,umask=qualifier]
(Event 28H) Count the number of L2 instruction fetches. An addi‐
tional qualifier may be specified and comprises a list of the
following keywords separated by ‘+’ characters:
e Count operations affecting E (exclusive) state lines.
i Count operations affecting I (invalid) state lines.
m Count operations affecting M (modified) state lines.
s Count operations affecting S (shared) state lines.
The default is to count operations affecting all (MESI) state
lines.
p6-l2-ld [,umask=qualifier]
(Event 29H) Count the number of L2 data loads. An additional
qualifier may be specified and comprises a list of the following
keywords separated by ‘+’ characters:
both (Pentium M) Count both hardware-prefetched lines and non-
hardware-prefetched lines.
e Count operations affecting E (exclusive) state lines.
hw (Pentium M) Count hardware-prefetched lines only.
i Count operations affecting I (invalid) state lines.
m Count operations affecting M (modified) state lines.
nonhw (Pentium M) Exclude hardware-prefetched lines.
s Count operations affecting S (shared) state lines.
The default on processors other than Pentium M processors is to
count operations affecting all (MESI) state lines. The default
on Pentium M processors is to count both hardware-prefetched and
non-hardware-prefetch operations on all (MESI) state lines.
(Errata) This event is affected by processor erratum E53.
p6-l2-lines-in [,umask=qualifier]
(Event 24H) Count the number of L2 lines allocated. An addi‐
tional qualifier may be specified and comprises a list of the
following keywords separated by ‘+’ characters:
both (Pentium M) Count both hardware-prefetched lines and non-
hardware-prefetched lines.
e Count operations affecting E (exclusive) state lines.
hw (Pentium M) Count hardware-prefetched lines only.
i Count operations affecting I (invalid) state lines.
m Count operations affecting M (modified) state lines.
nonhw (Pentium M) Exclude hardware-prefetched lines.
s Count operations affecting S (shared) state lines.
The default on processors other than Pentium M processors is to
count operations affecting all (MESI) state lines. The default
on Pentium M processors is to count both hardware-prefetched and
non-hardware-prefetch operations on all (MESI) state lines.
(Errata) This event is affected by processor erratum E45.
p6-l2-lines-out [,umask=qualifier]
(Event 26H) Count the number of L2 lines evicted. An additional
qualifier may be specified and comprises a list of the following
keywords separated by ‘+’ characters:
both (Pentium M) Count both hardware-prefetched lines and non-
hardware-prefetched lines.
e Count operations affecting E (exclusive) state lines.
hw (Pentium M) Count hardware-prefetched lines only.
i Count operations affecting I (invalid) state lines.
m Count operations affecting M (modified) state lines.
nonhw (Pentium M) Exclude hardware-prefetched lines.
s Count operations affecting S (shared) state lines.
The default on processors other than Pentium M processors is to
count operations affecting all (MESI) state lines. The default
on Pentium M processors is to count both hardware-prefetched and
non-hardware-prefetch operations on all (MESI) state lines.
(Errata) This event is affected by processor erratum E45.
p6-l2-m-lines-inm
(Event 25H) Count the number of modified lines allocated in L2
cache.
p6-l2-m-lines-outm [,umask=qualifier]
(Event 27H) Count the number of L2 M-state lines evicted.
(Pentium M) On these processors an additional qualifier may be
specified and comprises a list of the following keywords sepa‐
rated by ‘+’ characters:
both Count both hardware-prefetched lines and non-hardware-
prefetched lines.
hw Count hardware-prefetched lines only.
nonhw Exclude hardware-prefetched lines.
The default is to count both hardware-prefetched and
non-hardware-prefetch operations. (Errata) This event is affected by
processor erratum E53.
p6-l2-rqsts [,umask=qualifier]
(Event 2EH) Count the total number of L2 requests. An additional
qualifier may be specified and comprises a list of the following
keywords separated by ‘+’ characters:
e Count operations affecting E (exclusive) state lines.
i Count operations affecting I (invalid) state lines.
m Count operations affecting M (modified) state lines.
s Count operations affecting S (shared) state lines.
The default is to count operations affecting all (MESI) state
lines.
p6-l2-st [,umask=qualifier]
(Event 2AH) Count the number of L2 data stores. An additional
qualifier may be specified and comprises a list of the following
keywords separated by ‘+’ characters:
e Count operations affecting E (exclusive) state lines.
i Count operations affecting I (invalid) state lines.
m Count operations affecting M (modified) state lines.
s Count operations affecting S (shared) state lines.
The default is to count operations affecting all (MESI) state
lines.
p6-ld-blocks
(Event 03H) Count the number of load operations delayed due to
store buffer blocks.
p6-misalign-mem-ref
(Event 05H) Count the number of misaligned data memory references
(crossing a 64-bit boundary).
p6-mmx-assist
(Event CDH, Pentium II, Pentium III) Count the number of MMX
assists executed.
p6-mmx-instr-exec
(Event B0H, Celeron, Pentium II) Count the number of MMX
instructions executed, except MOVQ and MOVD stores from register
to memory.
p6-mmx-instr-ret
(Event CEH, Pentium II) Count the number of MMX instructions
retired.
p6-mmx-instr-type-exec [,umask=qualifier]
(Event B3H, Pentium II, Pentium III) Count the number of MMX
instructions executed. An additional qualifier may be specified
and comprises a list of the following keywords separated by ‘+’
characters:
pack Count MMX pack operation instructions.
packed-arithmetic
Count MMX packed arithmetic instructions.
packed-logical
Count MMX packed logical instructions.
packed-multiply
Count MMX packed multiply instructions.
packed-shift
Count MMX packed shift instructions.
unpack Count MMX unpack operation instructions.
The default is to count all operations.
p6-mmx-sat-instr-exec
(Event B1H, Pentium II, Pentium III) Count the number of MMX sat‐
urating instructions executed.
p6-mmx-uops-exec
(Event B2H, Pentium II, Pentium III) Count the number of MMX
micro-ops executed.
p6-mul (Event 12H) Count the number of integer and floating-point multi‐
plies, including speculative multiplies. This event is only
allocated on counter 1.
p6-partial-rat-stalls
(Event D2H) Count the number of cycles or events for partial
stalls.
p6-resource-stalls
(Event A2H) Count the number of cycles there was a resource
related stall of any kind.
p6-ret-seg-renames
(Event D6H, Pentium II, Pentium III) Count the number of segment
register rename events retired.
p6-sb-drains
(Event 04H) Count the number of cycles the store buffer is drain‐
ing.
p6-seg-reg-renames [,umask=qualifier]
(Event D5H, Pentium II, Pentium III) Count the number of segment
register renames. An additional qualifier may be specified, and
comprises a list of the following keywords separated by ‘+’ char‐
acters:
ds Count renames for segment register DS.
es Count renames for segment register ES.
fs Count renames for segment register FS.
gs Count renames for segment register GS.
The default is to count operations affecting all segment regis‐
ters.
p6-seg-rename-stalls [,umask=qualifier]
(Event D4H, Pentium II, Pentium III) Count the number of segment
register renaming stalls. An additional qualifier may be speci‐
fied, and comprises a list of the following keywords separated by
‘+’ characters:
ds Count stalls for segment register DS.
es Count stalls for segment register ES.
fs Count stalls for segment register FS.
gs Count stalls for segment register GS.
The default is to count operations affecting all the segment reg‐
isters.
p6-segment-reg-loads
(Event 06H) Count the number of segment register loads.
p6-uops-retired
(Event C2H) Count the number of micro-ops retired.
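An event specifier is the event name followed by comma-separated qualifiers,
with “umask” keyword lists joined by ‘+’ where the event allows a list. The
sketch below shows one way to assemble such a string in C before handing it
to pmc_allocate(3). The helper p6_build_spec() is hypothetical and not part
of libpmc; it performs no validation of event names or keywords.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libpmc): join an event name, an
 * optional umask keyword list and optional extra qualifiers into a
 * single specifier.  Returns 0 on success, -1 on bad input or if the
 * buffer is too small.
 */
static int
p6_build_spec(char *buf, size_t len, const char *event,
    const char *umask, const char *extra)
{
	int n;

	if (event == NULL || strlen(event) == 0)
		return (-1);

	/* Optional parts are emitted only when supplied. */
	n = snprintf(buf, len, "%s%s%s%s%s", event,
	    umask != NULL ? ",umask=" : "", umask != NULL ? umask : "",
	    extra != NULL ? "," : "", extra != NULL ? extra : "");

	/* snprintf reports the length it wanted; detect truncation. */
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

Called with "p6-l2-ld", "m+e" and "usr", the helper produces the specifier
"p6-l2-ld,umask=m+e,usr".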
Event Name Aliases
The following table shows the mapping between the PMC-independent aliases
supported by Performance Counters Library (libpmc, -lpmc) and the under‐
lying hardware events used.
Alias                 Event
branches              p6-br-inst-retired
branch-mispredicts    p6-br-miss-pred-retired
dc-misses             p6-dcu-lines-in
ic-misses             p6-ifu-ifetch-miss
instructions          p6-inst-retired
interrupts            p6-hw-int-rx
unhalted-cycles       p6-cpu-clk-unhalted
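The library resolves these aliases itself, so applications normally pass the
alias directly. For tools that want to display the underlying event name, the
table can be restated as a lookup, as in the sketch below. The array
p6_aliases and the function p6_resolve_alias() are hypothetical and not part
of libpmc; the instruction-cache entry uses the event name as spelled in the
event list of this page.

```c
#include <stddef.h>
#include <string.h>

/* Alias table restated from this manual page; not a libpmc structure. */
static const struct {
	const char *alias;
	const char *event;
} p6_aliases[] = {
	{ "branches",		"p6-br-inst-retired" },
	{ "branch-mispredicts",	"p6-br-miss-pred-retired" },
	{ "dc-misses",		"p6-dcu-lines-in" },
	{ "ic-misses",		"p6-ifu-ifetch-miss" },
	{ "instructions",	"p6-inst-retired" },
	{ "interrupts",		"p6-hw-int-rx" },
	{ "unhalted-cycles",	"p6-cpu-clk-unhalted" },
};

/* Hypothetical helper: return the P6 event for an alias, or NULL. */
static const char *
p6_resolve_alias(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(p6_aliases) / sizeof(p6_aliases[0]); i++) {
		if (strcmp(name, p6_aliases[i].alias) == 0)
			return (p6_aliases[i].event);
	}
	return (NULL);
}
```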
SEE ALSO
pmc(3), pmc.atom(3), pmc.core(3), pmc.core2(3), pmc.iaf(3), pmc.k7(3),
pmc.k8(3), pmc.p4(3), pmc.p5(3), pmc.tsc(3), pmclog(3), hwpmc(4)
HISTORY
The pmc library first appeared in FreeBSD 6.0.
AUTHORS
The Performance Counters Library (libpmc, -lpmc) was written by
Joseph Koshy ⟨jkoshy@FreeBSD.org⟩.
BSD October 4, 2008 BSD