hg: hg_busywait(), hg_context_switch_involuntary(), hg_con‐
text_switch_tries(), hg_context_switch_voluntary(), hg_gethrcycles(),
hg_gethrtime(), hg_getspu(), hg_nano_to_cycle_ratio(), hg_pub‐
lic_init(), hg_public_is_onRunQ(), hg_public_is_reporting(), hg_pub‐
lic_is_running(), hg_public_nMailboxes(), hg_public_nMailboxesInUse(),
hg_public_remove(), hg_setcrit() - Mercury Library Interfaces to trans‐
fer data between user and kernel space in a lightweight manner
Mercury Public Interfaces
Mercury Private Interfaces
The Mercury Library (HG) provides a high performance interface between
user programs and the kernel, making it possible to transfer key pieces
of information back and forth at high speed.
Communication between the user space and kernel is relatively slow due
to the overhead of making a system call. The Mercury APIs help by
avoiding most or all of this overhead.
The information exchanged can be divided into two broad categories:
· Public data:
Information about specific kernel threads likely to be of interest
to other threads. This is provided by the hg_public_*() functions.
The information returned is the current run state of the threads
being queried.
· Private data:
Information likely to be of interest to the same thread alone. The
private interfaces allow the kernel to keep a thread informed of a
number of events usually of use or interest only to that thread.
A thread needs to register before any other thread can view its public
information. There is no need to register to view the information of
other threads.
If a thread does register itself, it is given a handle, which may be
communicated to other threads. The Mercury Library provides APIs that
query the handle for the underlying thread's run state, so the thread
is identified by the handle assigned to it.
The run state information of any thread is stored in a publicly
available user-mapped mailbox. The kernel continually posts and updates
the run state of the registered kernel thread in the mailbox, when
possible. Keep in mind that all information is to be regarded as hints
about the state. None of the APIs providing run state information
implies any guarantee, although a good faith effort is made to keep it
accurate.
If you are using MxN threads, the use of the Mercury library public
interface is not recommended. This is because the state shown by the
interface is always that of the underlying kernel thread. A thread of
scope PTHREAD_SCOPE_PROCESS, which moves from kernel thread to kernel
thread, would therefore have confusing information reported about it.
Always use PTHREAD_SCOPE_SYSTEM for all threads if you are using the
Mercury public interface. The default scope of threads on systems which
can use Mercury is PTHREAD_SCOPE_SYSTEM, so Mercury can be used on such
systems without changing the scope. For more information, see
pthread_attr_setscope(3T).
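The following is a minimal sketch of requesting system contention scope
through pthread_attr_setscope(), assuming the scope meant above is the
POSIX PTHREAD_SCOPE_SYSTEM constant. The helper name
init_system_scope_attr() is hypothetical and is not part of libhg.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Initialize *attr and request system contention scope, so that a
 * thread created with it is bound to a single kernel thread.
 * Returns 0 on success, or the error number of the failing pthread call.
 * Hypothetical helper for illustration; not a libhg interface.
 */
int init_system_scope_attr(pthread_attr_t *attr)
{
    int rc;

    assert(attr != NULL);

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM);
    if (rc != 0) {
        /* Do not leak the attribute object if the scope is rejected. */
        pthread_attr_destroy(attr);
        return rc;
    }
    return 0;
}
```

The initialized attribute object would then be passed to
pthread_create() for every thread that registers a mailbox, and released
with pthread_attr_destroy() afterwards.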
A context switch is the act of a thread ceasing to run on a CPU for some
reason and then later starting to run again; this pair of actions is one
context switch ("switching out" then "switching in"). A "voluntary"
context switch is one that is caused by something the thread does that
requires it to stop running. Examples are a thread making a call that
blocks, or a thread blocking for a page fault. An "involuntary" context
switch is one that is imposed on the thread by the operating system for
policy reasons. For example, the operating system may switch a thread
out to run a higher priority thread.
libhg Public Interfaces
These are the public interfaces provided by libhg:
hg_public_init() allocates and initializes the
mailbox, where the kernel can update the run state of the calling
kernel thread. Updating begins immediately. The mailbox is mapped
into the address space of every thread in the system so any thread
that's been given the address of this mailbox will be able to
check at any time to see if the thread is running or not.
hg_public_remove() terminates the updating of run state
information and deallocates the mailbox. Removed mailboxes remain
mapped, so user programs that continue to examine them will not
fault, but the contents will be marked (until the kernel reuses
the mailbox). A well-written application will keep its threads
informed when it removes a mailbox.
hg_public_is_running() checks the provided
mailbox and returns a boolean indicator of whether the thread is
running or not. A thread may be not running because it has
switched out, because it has terminated, or because it has removed
its mailbox.
hg_public_is_reporting() checks the provided
mailbox and returns a boolean indicator of whether a thread is
reporting to the mailbox or not.
hg_public_is_onRunQ() checks the provided
mailbox and returns a boolean indicator of whether a thread is
currently ready to run, but not yet running.
hg_public_nMailboxes() returns the maximum number of
attachments the kernel supports.
hg_public_nMailboxesInUse() returns the current number of
attachments in service system-wide.
libhg Private Interfaces
These are the private interfaces provided by libhg:
hg_gethrcycles() returns the number of machine cycles since
the system was booted. This time is unaffected by calls that set or
adjust the system clock and can be used without regard for which
processor it is called from.
This call will always return times in a monotonically increasing
sequence when called on the same processor or when called by the
same thread, even when that thread moves from processor to
processor. Under some circumstances, discussed in more detail under
hg_gethrtime() below, communicating threads may occasionally find
their times slightly out of sequence.
hg_gethrtime() is identical to hg_gethrcycles()
but returns the time since boot in nanoseconds instead of machine
cycles.
This call can generally be expected to return times in a
monotonically increasing sequence too, but there is an exception. The
exact clock rate each processor runs at can drift back and forth
by a small amount around the central, rated value. If two threads
conspire to trade time values back and forth through a very high
speed conduit (like shared memory) this drift can be observed as
timestamps occurring out of sequence. Order inversions of a few
hundred cycles happen occasionally. The high speed exchange and a
high speed time interface are required, however. A slower inter‐
face like is less capable of observing this clock jitter.
exhibits no such problems at all because the calling processor
always synchronizes the request with processor zero. This is one
of the reasons is so much slower than and
Nevertheless, threads sticking to more mundane uses of the time,
such as measuring intervals and generating (more widely spaced)
timestamps, should have no difficulty.
hg_nano_to_cycle_ratio() returns the number of
nanoseconds per machine cycle for the processor it is called on.
hg_busywait() spins in user mode until the requested number of
seconds has elapsed. The actual elapsed time will be equal to or
greater than the seconds specified.
hg_getspu() returns the number of the SPU (processor) the thread is
running on. Remember that the thread can conceivably switch to
another processor between the time of the call and when the infor‐
mation is used.
hg_context_switch_tries() returns the total number of times
the kernel thread has switched out (or at least attempted to
switch out). This includes both voluntary and involuntary
switch-outs.
hg_context_switch_involuntary() returns the number of times
the kernel thread has involuntarily switched out (that is, the
number of times it has been preempted by another thread).
hg_context_switch_voluntary() returns the number of times the
kernel thread has voluntarily switched out (such as when taking a
page fault). The number of voluntary switch-outs is given by the
difference between total switch-outs and involuntary switch-outs.
hg_setcrit() informs the kernel when the thread is entering or
leaving a critical region of code. on_or_off should be non-zero
when entering a critical region and 0 when leaving.
Although any non-zero value will work for on_or_off, passing in
the lock address or some other key piece of information such as
the address of the calling routine, may be useful.
The kernel uses this information to help schedule the thread -
when possible, the kernel will avoid switching the thread to a
non-runnable state when this critical region state is set. How‐
ever, threads that spend too much time in regions marked critical
will eventually be switched out because the kernel must fairly
balance the needs of different threads on the system.
The willing_to_block parameter indicates that the thread is will‐
ing to block if necessary to set up this critical region. Threads
that are willing to block before entry to a critical region are
less likely to be involuntarily blocked in the critical region. A
guideline is to set willing_to_block to 1 (or some other non-zero
value) upon entry to the outermost critical region and to set it
to zero in the inner ones.
hg_setcrit() does not have any notion of nesting, so the decision
about calling hg_setcrit() in nested critical regions is the
responsibility of the caller.
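As a rough illustration of how the cycle counter, the
nanoseconds-per-cycle ratio, and the context switch counters described
above relate to one another, the sketch below computes the difference
between two snapshots. The struct and function names are hypothetical,
the integer and floating point types are assumptions (the real return
types are not stated here), and the snapshot values would in practice
come from hg_gethrcycles(), hg_nano_to_cycle_ratio(),
hg_context_switch_tries() and hg_context_switch_involuntary().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of per-thread counters; not a libhg type. */
struct demo_sample {
    uint64_t cycles;       /* machine cycles since boot */
    uint64_t tries;        /* total switch-outs (attempted or completed) */
    uint64_t involuntary;  /* involuntary switch-outs */
};

/* Hypothetical result of comparing two snapshots. */
struct demo_delta {
    double   elapsed_ns;   /* elapsed cycles times nanoseconds per cycle */
    uint64_t involuntary;  /* involuntary switch-outs in the interval */
    uint64_t voluntary;    /* total minus involuntary in the interval */
};

struct demo_delta demo_sample_delta(const struct demo_sample *before,
                                    const struct demo_sample *after,
                                    double ns_per_cycle)
{
    struct demo_delta d;
    uint64_t tries;

    assert(before != NULL && after != NULL);

    d.elapsed_ns = (double)(after->cycles - before->cycles) * ns_per_cycle;

    tries = after->tries - before->tries;
    d.involuntary = after->involuntary - before->involuntary;

    /* The counters are hints, so guard against an inconsistent pair. */
    d.voluntary = (tries >= d.involuntary) ? tries - d.involuntary : 0;
    return d;
}
```

Because the thread may move between processors during the interval, and
the ratio is specific to one processor, the nanosecond figure is best
treated as an estimate.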
hg_public_init() returns the address of the mailbox if the call is
successful, NULL otherwise.
hg_public_remove() returns the value 1, in case of success.
hg_public_is_reporting() will return zero if not reporting and non-zero
if reporting.
hg_public_is_running() will return zero if not running and non-zero if
running.
hg_public_is_onRunQ() will return 0 if not on run queue and non-zero if
on run queue.
hg_public_nMailboxes() returns the total number of mailboxes available
for attachment.
hg_public_nMailboxesInUse() returns the number of mailboxes available
for attachment at the time of the call.
hg_gethrcycles() returns the number of machine cycles since the machine
was booted.
hg_gethrtime() returns the number of nanoseconds since the machine was
booted.
hg_nano_to_cycle_ratio() returns the number of nanoseconds per machine
cycle for the processor it is called on.
hg_busywait() returns 0 on successful completion.
hg_getspu() returns the number of the processor the thread is running
on.
hg_context_switch_tries() returns the total number of times the kernel
thread has switched out (or at least attempted to switch out).
hg_context_switch_involuntary() returns the number of times the kernel
thread has involuntarily switched out.
hg_context_switch_voluntary() returns the number of times the kernel
thread has voluntarily switched out.
hg_setcrit() returns the number of times the thread has yielded the
processor.
The only error returned by the APIs is This will be returned if any of
the following calls fail:
Below is a scenario on how an application may use Mercury APIs achieve
Since threads may remove mailboxes at any time, hg_public_is_reporting()
can help detect that. If the mailbox is still in use, the run state of
the thread associated with it can be obtained. This check matters
because a removed mailbox remains mapped, so a probing thread may
otherwise obtain incorrect information about the thread being probed.
Note also that even if the mailbox is in use, it may have been given up
by the previous thread and reassigned to a completely new thread.
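The probing order described above can be sketched as follows: check
whether any thread is reporting before trusting the run state. The
struct demo_mailbox type and the demo_*() accessors are hypothetical
stand-ins so that the example is self-contained. Real code would pass
the mailbox address returned by hg_public_init() to
hg_public_is_reporting(), hg_public_is_running() and
hg_public_is_onRunQ(), whose exact prototypes are assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Mercury mailbox; not the real layout. */
struct demo_mailbox {
    int reporting;  /* non-zero if a thread is reporting to the mailbox */
    int running;    /* non-zero if that thread is running */
    int on_runq;    /* non-zero if that thread is ready but not running */
};

/* Stand-ins for the hg_public_is_*() queries (assumed shape). */
static int demo_is_reporting(const struct demo_mailbox *m) { return m->reporting; }
static int demo_is_running(const struct demo_mailbox *m)   { return m->running; }
static int demo_is_onrunq(const struct demo_mailbox *m)    { return m->on_runq; }

enum demo_state {
    DEMO_NOT_REPORTING,  /* mailbox removed: run state is meaningless */
    DEMO_RUNNING,
    DEMO_ON_RUNQ,
    DEMO_NOT_RUNNING
};

/*
 * Return a hint about the thread behind the mailbox. This cannot detect
 * a mailbox that was removed and later reassigned to another thread;
 * the application has to communicate removals itself.
 */
enum demo_state demo_probe(const struct demo_mailbox *m)
{
    assert(m != NULL);

    if (!demo_is_reporting(m))
        return DEMO_NOT_REPORTING;
    if (demo_is_running(m))
        return DEMO_RUNNING;
    if (demo_is_onrunq(m))
        return DEMO_ON_RUNQ;
    return DEMO_NOT_RUNNING;
}
```

The result is only a hint: the state may change between the individual
checks and before the caller acts on it.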
The kernel is less likely to preempt a thread when it's running in a
critical region than when it's not. Therefore, code regions where
locks are held are good places to mark as critical since application
throughput can drop dramatically if a thread holding critical resources
stops running while other threads are waiting to acquire those
resources.
Since hg_setcrit() does not enter the kernel, it is very fast and does
not significantly increase the length of critical paths. If
willing_to_block is set to 1 (instead of 0), hg_setcrit() will yield the
processor, if necessary, prior to beginning the critical region to
reduce the likelihood of being preempted within the critical region.
Here is one recommended sequence:
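As a hedged sketch of such a sequence: mark the region critical, with
willing_to_block set, before taking the lock, and clear the mark after
releasing it. demo_setcrit() below is a hypothetical stand-in so that
the example runs without libhg. Real code would call hg_setcrit(), whose
exact prototype (argument types and order) is an assumption here.
demo_locked_add() and the demo_* variables are likewise illustrative
names only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Bookkeeping for the stand-in, so its calls can be observed. */
static const void *demo_last_key;   /* last on_or_off value passed */
static int demo_setcrit_calls;      /* number of stand-in calls */

/* Stand-in for hg_setcrit(on_or_off, willing_to_block); assumed shape. */
static int demo_setcrit(const void *on_or_off, int willing_to_block)
{
    (void)willing_to_block;
    demo_last_key = on_or_off;
    demo_setcrit_calls++;
    return 0;
}

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Add amount to *counter inside a region marked as critical. */
long demo_locked_add(long *counter, long amount)
{
    long result;

    assert(counter != NULL);

    /* Outermost region: pass the lock address as the non-zero key and
     * declare willingness to block now, before the lock is held. */
    demo_setcrit(&demo_lock, 1);
    pthread_mutex_lock(&demo_lock);

    *counter += amount;
    result = *counter;

    pthread_mutex_unlock(&demo_lock);
    /* Leaving the critical region: on_or_off is 0. */
    demo_setcrit(NULL, 0);

    return result;
}
```

Keeping the marked region as short as the locked region matters, since
threads that stay critical for too long are switched out anyway.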
It is never absolutely necessary to set willing_to_block, but it reduces
your chances of being switched out during the critical region. Remember
that willing_to_block means that you are willing to block during the
call, not during the critical region.
All information returned by the Mercury interfaces should be considered
hints. By their very nature, and the nature of scheduling user processes
and threads, much of this information can theoretically be stale
immediately after its delivery to the user, or at any time after that.
User code must take this into account in its design.
The pthread library links with the Mercury Library in order to decide on
the scheduling of threads at any point of time. Hence, applications
which use both the pthread library and the Mercury Library need to be
aware that the pthread library is already using Mercury for scheduling
decisions. Applications may still use the Mercury APIs if they need to.
Remember that this thread information may be stale any time after the
kernel posts it. Be careful how you rely on it; accounting for this is
entirely the responsibility of user code.
The Mercury APIs were developed by Hewlett-Packard Company.
SEE ALSO
pthread(3T), pthread_attr_setscope(3T).
Mercury Library hg(3)