pthread(3T)

NAME
pthread - introduction to POSIX.1c threads
The POSIX.1c library developed by HP enables the creation of processes
that can exploit application and multiprocessor platform parallelism.
The pthread library libpthread consists of over 90 standardized inter‐
faces for developing concurrent applications and synchronizing their
actions within processes or between them. This manual page presents an
overview of libpthread including terminology and how to compile and
link programs which use threads.
A multithreaded application must define the appropriate POSIX revision
level (199506) at compile time and link against the pthread library
with For example:
All program sources must also include the header file
Note: If is explicitly specified in the link line, then it must be
after the Refer to pthread_stubs(5) for more details.
Note: When explicitly specifying ANSI compilation (with "-Aa"), defin‐
ing the POSIX revision level restricts the program to using interfaces
within the POSIX namespaces. If interfaces in the larger X/Open names‐
pace are to be called, either of the compiler options, or must be spec‐
ified in addition to Alternatively, compiling with -Ae (or not specify‐
ing "-A") will implicitly specify
Note: Some documentation will recommend the use of for compilation.
Although this also functions properly, it is considered an obsolescent
practice.
A thread is an independent flow of control within a process, composed
of a context (which includes a register set and a program counter) and
a sequence of instructions to execute.
All processes consist of at least one thread. Multi-threaded processes
contain several threads. All threads share the common address space
allocated for the process. A program using the POSIX pthread APIs cre‐
ates and manipulates what are called user threads. A kernel thread is
a kernel-schedulable entity which may support one or more user threads.
At HP-UX release 11i Version 1.6 and forward, the HP-UX threads imple‐
mentation supports many-to-any as well as one-to-one mapping between
user and kernel threads.
Each thread is assigned a unique identifier of type pthread_t upon cre‐
ation. The thread id is a process-private value and implementation-
dependent. It is considered to be an opaque handle for the thread.
Its value should not be used by the application.
NOTES ON INTERFACES
The HP-UX system provides some non-standard extensions to the pthread
API. These will always have a distinguishing suffix of _np or _NP
(non-portable).
The programmer should always consult the manpages for the functions
being used. Some standard-specified functions are not available or may
have no effect in some implementations.
A program creates a thread using the pthread_create() function. When
the thread has completed its work, it may optionally call the
pthread_exit() function, or simply return from its initial function. A
thread can detect the completion of another by using the pthread_join()
function.
Creates a thread and assigns a unique identifier,
pthread_t. The caller provides a function
which will be executed by the thread. Option‐
ally, the call may explicitly specify some
attributes for the thread (see below).
Called by a thread when it completes.
This function does not return.
This is analogous to but for pthreads. Any thread may join any
other thread in the process, there is no par‐
ent/child relationship. It returns when a
specified thread terminates, and the thread
resources have been reaped.
Makes it unnecessary to "join" the thread.
Thread resources are reaped by the system at
the time the thread terminates.
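The create/exit/join life cycle described above can be sketched as
follows. This is a minimal illustration rather than a reference
implementation: struct sum_job, sum_worker() and sum_in_thread() are
hypothetical names, and the POSIX revision level is assumed to be
supplied at compile time as described earlier in this page.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work item: the thread sums the integers 1..n. */
struct sum_job {
    int  n;       /* input */
    long total;   /* output, written by the worker thread */
};

/* Start routine: runs in the new thread. */
static void *sum_worker(void *arg)
{
    struct sum_job *job = arg;
    int i;

    job->total = 0;
    for (i = 1; i <= job->n; i++)
        job->total += i;

    /* Has the same effect as returning job from this function. */
    pthread_exit(job);
    return NULL;  /* not reached */
}

/* Create one thread, wait for it, and return its result (-1 on error). */
long sum_in_thread(int n)
{
    struct sum_job job;
    pthread_t tid;
    void *result = NULL;

    job.n = n;
    job.total = 0;
    if (pthread_create(&tid, NULL, sum_worker, &job) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;

    /* result points at the same job the worker was given. */
    return ((struct sum_job *)result)->total;
}
```

The job lives on the caller's stack, which is safe here only because
the caller joins the thread before returning.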
A set of thread attributes may be provided to Any changes from default
values must be made to the attribute set before the call to is made.
Subsequent changes to the attribute set do not affect the created
thread. However, the attribute set may be used in multiple calls.
Note that only the "detachstate", "schedparam", "schedpolicy", and
"processor" attributes of a thread may be changed subsequent to thread
creation. However, this is done by the and functions, respectively.
Initializes an attribute set for use in the
Destroys the content of an attribute set.
These functions get/set the associated attribute in the
attribute set. See the manpages for these functions for
descriptions of the attributes.
This is used to set the default stacksize for threads created in
subsequent attribute set initializations (calls to or in where
no attributes are supplied.
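As a rough sketch of the attribute workflow described above, the
following creates a thread from an attribute set with an explicit stack
size. The names record_ran() and run_with_stack_size() are hypothetical,
and the stack size is whatever the caller supplies; only standard
attribute calls are used.

```c
#include <pthread.h>
#include <stddef.h>

static void *record_ran(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/*
 * Create a joinable thread with an explicit stack size.
 * Returns 0 on success or the first non-zero pthread error code.
 */
int run_with_stack_size(size_t stack_bytes, int *ran)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* All changes must be made before the thread is created. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, record_ran, ran);

    /* Destroying the set does not affect a thread already created. */
    pthread_attr_destroy(&attr);

    if (rc == 0)
        rc = pthread_join(tid, NULL);
    return rc;
}
```

The same initialized attribute set could be passed to several creation
calls before it is destroyed.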
Certain applications may desire to terminate a particular thread with‐
out causing the entire process to exit. A thread may be canceled by
another thread in the same process while the cancellation target thread
executes a system call or particular library routine.
When a thread issues a cancel request against another thread, the tar‐
get thread can check to see if a request is pending against it by the
interface. When called with a request pending, the target thread ter‐
minates after executing any cleanup handlers which may have been
installed. Cleanup handlers may be used to delete any dynamic storage
allocated by the canceled thread, to unlock a mutex, or to perform
other operations.
Typically, the cancellation type for a thread is deferred. That is,
cancellation requests are held pending until the thread reaches a
cancellation point, which is simply one of a list of library functions
and system calls (see lists below).
The thread may set its cancellation type to asynchronous. In this case
cancellation requests are acted upon at any time. This can be used
effectively in compute-bound threads which do not call any functions
that are cancellation points.
Cancel execution of a given thread.
Called by a thread to process pending cancel requests.
Set the characteristics of cancellation for the thread.
Cancellation may be enabled or disabled, and its type may be
deferred or asynchronous.
Register or remove cancellation cleanup handlers.
Refer to thread_safety(5) for the list of cancellation points in the
pthread library, system functions, and libc.
For libc functions, whether the thread is cancelled depends upon what
action is performed while executing the function. If the thread blocks
while inside the function, a cancellation point is created (i.e., the
thread may be cancelled). Other libraries may have cancellation
points. Check the associated documentation for details.
The list of cancellation points will vary from release to release. In
general, if a function can return with an error, chances are that it is
a cancellation point.
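Deferred cancellation with a cleanup handler might look like the sketch
below. The target thread never blocks, so it polls for requests with
pthread_testcancel(). The names mark_cleanup(), spin_until_canceled()
and cancel_demo() are hypothetical, chosen only for illustration.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Set by the cleanup handler so the caller can see that it ran. */
static int cleanup_ran = 0;

static void mark_cleanup(void *arg)
{
    (void)arg;
    cleanup_ran = 1;
}

/* Compute-style loop with an explicit cancellation point. */
static void *spin_until_canceled(void *arg)
{
    (void)arg;
    pthread_cleanup_push(mark_cleanup, NULL);
    for (;;) {
        pthread_testcancel();   /* act on any pending cancel request */
        sched_yield();
    }
    pthread_cleanup_pop(0);     /* not reached; must pair with the push */
    return NULL;
}

/* Returns 1 if the thread was canceled and its handler ran, else 0. */
int cancel_demo(void)
{
    pthread_t tid;
    void *status = NULL;

    cleanup_ran = 0;
    if (pthread_create(&tid, NULL, spin_until_canceled, NULL) != 0)
        return 0;
    if (pthread_cancel(tid) != 0)
        return 0;
    if (pthread_join(tid, &status) != 0)
        return 0;

    /* A canceled thread reports PTHREAD_CANCELED as its exit status. */
    return status == PTHREAD_CANCELED && cleanup_ran == 1;
}
```

In real code the handler would typically release a mutex or free
storage owned by the canceled thread instead of setting a flag.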
Threads may individually control their scheduling policy and priori‐
ties. Threads may also suspend their own execution, or that of other
threads. Finally, threads are given some control over allocation of
This function is used to temporarily stop the execution of a thread.
These functions cause a previously suspended thread to continue
These functions are used to interrogate processor configuration
and to bind a thread to a specific processor.
These functions are used to control the actual concurrency for
These functions are used to manipulate the scheduling policy and
priority for a thread.
These functions are used to interrogate the priority range for a
given scheduling policy.
This function is used by a thread to yield the processor to other
threads of equal or greater priority.
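A small sketch of interrogating the scheduling policy and its priority
range is shown here. It uses only the standard
pthread_getschedparam(), sched_get_priority_min() and
sched_get_priority_max() calls; current_priority_range() is a
hypothetical helper, and it treats a return of -1 from the range
queries as an error.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Report the calling thread's policy and the priority range of that
 * policy. Returns 0 on success, non-zero on failure.
 */
int current_priority_range(int *policy, int *min_prio, int *max_prio)
{
    struct sched_param param;
    int rc;

    rc = pthread_getschedparam(pthread_self(), policy, &param);
    if (rc != 0)
        return rc;

    *min_prio = sched_get_priority_min(*policy);
    *max_prio = sched_get_priority_max(*policy);
    if (*min_prio == -1 || *max_prio == -1)
        return -1;
    return 0;
}
```

A priority passed to the scheduling functions would normally be chosen
from within the reported range.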
COMMUNICATION & SYNCHRONIZATION
Multi-threaded applications concurrently execute instructions. Access
to process-wide (or interprocess) shared resources (memory, file
descriptors, etc.) requires mechanisms for coordination or synchroniza‐
tion among threads. The libpthread library offers synchronization
primitives necessary to create a deterministic application. A multi‐
threaded application ensures determinism by forcing asynchronous thread
contexts to synchronize, or serialize, access to data structures and
resources managed and manipulated during run-time. These are mutual-
exclusion (mutex) locks, condition variables, and read-write locks.
The HP-UX operating system also provides POSIX semaphores (see next
section).
Mutexes furnish the means to exclusively guard data structures from
concurrent modification. Once a thread has locked the mutex, no other
thread that follows the locking protocol can change the contents of the
protected structure until the locker performs the matching unlock. A mutex
can be initialized in two ways: by a call to or by assignment of
Condition Variables are used by a thread to wait for the occurrence of
some event. A thread detecting or causing such an event can signal or
broadcast that occurrence to the waiting thread or threads.
Read-Write locks permit concurrent read access by multiple threads to
structures guarded by a read-write lock, but write access by only a
single thread.
Initialize/destroy contents of a mutex lock.
Lock/unlock a mutex.
Manipulate mutex locking priorities.
Manage mutex attributes used for Only the "prioceiling"
attribute can be changed for an existing mutex.
These functions, together with the spin attributes, are used to
tune mutex performance to the specific application.
Initialize/destroy contents of a condition variable.
Wait upon or signal occurrence of a condition variable.
Manage condition variable attributes used for
Initialize/destroy contents of a read-write lock.
Lock/unlock a read-write lock.
Manage read-write lock attributes used for
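To illustrate mutual exclusion, the sketch below has several threads
increment one shared counter under a statically initialized mutex. The
thread and iteration counts, and the names bump_counter() and
run_counter_demo(), are arbitrary choices made for this example.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS  10000

/* Static initialization; a dynamic init call is the alternative. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

static void *bump_counter(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                        /* protected update */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Run the workers and return the final count (-1 on error). */
long run_counter_demo(void)
{
    pthread_t tids[NUM_THREADS];
    int i, started = 0;

    counter = 0;
    for (i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&tids[i], NULL, bump_counter, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return started == NUM_THREADS ? counter : -1;
}
```

Without the lock, the increments from different threads could
interleave and some updates would be lost.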
POSIX.1b SEMAPHORES
The semaphore functions specified in the POSIX.1b standard can also be
used for synchronization in a multithreaded application.
Initialize/destroy contents of a semaphore.
Increment/decrement semaphore value (possibly blocking).
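A semaphore used as a simple "value is ready" signal between two
threads could be sketched like this. It assumes an unnamed semaphore
shared only within the process (second argument of sem_init() is 0);
publisher() and wait_for_value() are hypothetical names, and the value
42 is arbitrary.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t ready;          /* counts values made available */
static int shared_value;

static void *publisher(void *arg)
{
    (void)arg;
    shared_value = 42;       /* arbitrary illustrative value */
    sem_post(&ready);        /* increment: releases one waiter */
    return NULL;
}

/* Returns the value published by the other thread, or -1 on error. */
int wait_for_value(void)
{
    pthread_t tid;
    int value;

    shared_value = 0;
    if (sem_init(&ready, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, publisher, NULL) != 0) {
        sem_destroy(&ready);
        return -1;
    }

    /* Decrement: blocks while the value is zero; retry if interrupted. */
    while (sem_wait(&ready) == -1 && errno == EINTR)
        continue;

    value = shared_value;
    pthread_join(tid, NULL);
    sem_destroy(&ready);
    return value;
}
```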
In a multithreaded process, all threads share signal actions. That is,
a signal handler established by one thread is used in all threads.
However, each thread has a separate signal mask, by which it can selec‐
tively block signals.
Signals can be sent to other threads within the same process, or to
other processes. When a signal is sent to the process, exactly one
thread which does not have that signal blocked will handle the signal.
When sent to a thread within the same process, that thread will handle
the signal, perhaps later if the signal is blocked. Signals whose
action is to terminate, stop, or continue will terminate, stop, or
continue the entire process, respectively, even if directed at a
particular thread.
Sends a signal to the given thread.
Blocks selected signals for the thread.
These functions synchronously wait for given signals.
Thread-specific data (TSD) is global data that is private or specific
to a thread. Each thread has a different value for the same thread-
specific data variable. The global errno is a perfect example of
thread-specific global data.
Each thread-specific data item is associated with a key. The key is
shared by all threads. However, when a thread references the key, it
references its own private copy of the data.
These functions manage the thread-specific data keys.
These functions retrieve and assign the data value associated
with a key.
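The key mechanism can be sketched as follows: two threads store
different pointers under one shared key, and each reads back only its
own. The names slot_key, store_and_check() and tsd_demo() are
hypothetical, and no destructor is registered for the key.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t slot_key;   /* one key, shared by all threads */

static void *store_and_check(void *arg)
{
    /* arg becomes this thread's private value for the key. */
    if (pthread_setspecific(slot_key, arg) != 0)
        return NULL;
    /* Yields this thread's value, never another thread's. */
    return pthread_getspecific(slot_key);
}

/* Returns 1 if both threads saw their own value, 0 otherwise. */
int tsd_demo(void)
{
    pthread_t t1, t2;
    int a = 1, b = 2;
    void *r1 = NULL, *r2 = NULL;

    if (pthread_key_create(&slot_key, NULL) != 0)
        return 0;

    if (pthread_create(&t1, NULL, store_and_check, &a) == 0) {
        if (pthread_create(&t2, NULL, store_and_check, &b) == 0)
            pthread_join(t2, &r2);
        pthread_join(t1, &r1);
    }

    pthread_key_delete(slot_key);
    return r1 == &a && r2 == &b;
}
```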
The HP-UX compiler supports a thread local storage (TLS) storage class.
(This is not a POSIX standard feature.) TLS is identical to TSD,
except functions are not required to create/set/get values. TLS vari‐
ables are accessed just like normal global variables. TLS variables
can be declared using the following syntax:
The keyword tells the compiler that is a TLS variable. Now each thread
can set or get TLS with statements such as:
Each thread will have a different value associated with
TLS variables can be statically initialized. Uninitialized TLS vari‐
ables will be set to zero. Dynamically loaded libraries (with can
declare and use TLS variables.
TLS does have a cost in thread creation/termination operations, as TLS
space for each thread must be allocated and initialized, regardless of
whether it will ever use the variables. This is true for modules
linked statically at startup. In the case of dynamically loaded libraries
(with TLS space for a thread will be allocated when the TLS variables
are accessed by it. If few threads actually use a large TLS area, it
may be wise to use the POSIX TSD instead (above).
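The behavior described above can be sketched with a TLS variable as
follows. This example assumes the __thread storage-class keyword as
accepted by GCC and Clang; the exact keyword expected by a given HP-UX
compiler should be confirmed in its documentation. The names
tls_counter, bump_private() and tls_demo() are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Assumption: __thread is the TLS keyword (GCC/Clang spelling). */
static __thread int tls_counter = 0;

static void *bump_private(void *arg)
{
    int i, times = *(int *)arg;

    for (i = 0; i < times; i++)
        tls_counter++;            /* touches only this thread's copy */
    *(int *)arg = tls_counter;    /* report the private total */
    return NULL;
}

/* Returns 1 if each thread counted independently, 0 otherwise. */
int tls_demo(void)
{
    pthread_t t1, t2;
    int a = 3, b = 5;

    tls_counter = 100;            /* the calling thread's own copy */
    if (pthread_create(&t1, NULL, bump_private, &a) != 0)
        return 0;
    if (pthread_create(&t2, NULL, bump_private, &b) != 0) {
        pthread_join(t1, NULL);
        return 0;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Each new thread started from the initializer value, zero. */
    return a == 3 && b == 5 && tls_counter == 100;
}
```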
REENTRANT LIBC & STDIO
Because they return pointers to library-internal static data, a number
of libc functions cannot be used safely in multithreaded programs:
calling one of these functions in a thread can overwrite the results
of previous calls in other threads. Alternate functions, having the
suffix _r (for reentrant), are provided within libc for threaded
programming. Also, some primitives for synchronization of standard I/O
operations are provided.
Provide reentrant versions of previously existing libc func‐
Provide explicit synchronization for standard I/O streams.
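As one illustration of a reentrant variant, the sketch below uses the
standard strtok_r() in place of strtok(). The caller supplies the
pointer in which the scan position is kept, so no library-internal
static data is involved. count_fields() is a hypothetical helper.

```c
#include <string.h>

/*
 * Count the fields in a colon-separated list. The position is kept in
 * the caller-supplied save pointer, so concurrent calls in other
 * threads do not interfere. The buffer is modified in place.
 */
int count_fields(char *list)
{
    char *save = NULL;
    char *tok;
    int n = 0;

    for (tok = strtok_r(list, ":", &save);
         tok != NULL;
         tok = strtok_r(NULL, ":", &save))
        n++;
    return n;
}
```

Each thread must still pass its own buffer and its own save pointer;
the function is reentrant, not a substitute for locking shared data.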
This section summarizes some miscellaneous pthread-related functions not
covered in the preceding sections.
Establish special functions to be called just prior to and
just subsequent to a operation.
Tests whether two pthread_t values represent the same pthread.
Executes given function just once in a process, regardless of
how many threads make the same call. (Useful
for one-time data initialization.)
Returns identifier (pthread_t) of calling thread.
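One-time initialization, as mentioned above, might be used like this.
Several threads request the same initialization through
pthread_once(), and the routine runs a single time. The names
init_tables(), use_tables() and once_demo() are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static int init_calls = 0;

static void init_tables(void)
{
    init_calls++;            /* executed once per process */
}

static void *use_tables(void *arg)
{
    (void)arg;
    pthread_once(&init_once, init_tables);
    return NULL;
}

/* Start several threads that all request initialization. */
int once_demo(void)
{
    pthread_t tids[4];
    int i, started = 0;

    for (i = 0; i < 4; i++) {
        if (pthread_create(&tids[i], NULL, use_tables, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return init_calls;       /* number of times init_tables() ran */
}
```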
Debugging of multithreaded programs is supported in the standard HP-UX
debugger, dde. When any thread is to be stopped due to a debugger
event, the debugger will stop all threads. The register state, stack,
and data for any thread can be interrogated and manipulated.
See the dde(1) manpage and built-in graphical help system for more
information.
HP-UX provides a tracing facility for pthread operations. To use it,
you must link your application using the tracing version of the
When the application is executed, it produces a per-thread file of
pthread events. This is used as input to the thread trace visualizer
facility available in the
There are environment variables defined to control trace data files:
Where to place the trace data files.
If this is not defined, the files go to the current working
directory.
By default, trace records are buffered and only written
to the file when the buffer is full. If this variable is set to
any non-NULL value, data is immediately written to the trace
file.
By default, all pthread events are traced.
If this variable is defined, only the categories defined will be
traced. Each category is separated by a ':'. The possible
trace categories are:
For example, to only trace thread and mutex operations set the
Details of the trace file record format can be found in
See the ttv(1) manpage and built-in graphical help system for more
information on the use of the trace information.
Often, an application is designed to be multithreaded to improve per‐
formance over its single-threaded counterparts. However, the multi‐
threaded approach requires some attention to issues not always of con‐
cern in the single-threaded case. These are issues traditionally asso‐
ciated with the programming of multiprocessor systems.
The design must employ a lock granularity appropriate to the data
structures and access patterns. Coarse-grained locks, which protect
relatively large amounts of data, can lead to undesired lock con‐
tention, reducing the potential parallelism of the application. On the
other hand, employing very fine-grained locks, which protect very small
amounts of data, can consume processor cycles with too much locking
activity.
The use of thread-specific data (TSD) or thread-local storage (TLS) must
be traded off, as described above.
Mutex spin and yield frequency attributes can be used to tune mutex
behavior to the application. See pthread_mutexattr_setspin_np(3T) and
pthread_mutex_setyieldfreq_np(3T) for more information.
The default stacksize attribute can be set to improve system thread
caching behavior. See pthread_default_stacksize_np(3T) for more
information.
Because multiple threads are actually running simultaneously, they can
be accessing the same data from multiple processors. The hardware pro‐
cessors coordinate their caching of data such that no processor is
using stale data. When one processor accesses the data (especially for
write operations), the other processors must flush the stale data from
their caches. If multiple processors repeatedly read/write the same
data, this can lead to cache-thrashing which slows execution of the
instruction stream. This can also occur when threads access separate
data items which just happen to reside in the same hardware-cachable
unit (called a cache line). This latter situation is called false
sharing, which can be avoided by spacing data such that frequently
accessed items are not stored close together.
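Spacing data to avoid false sharing can be sketched by padding
per-thread items to the size of a cache line. The 64-byte line size
below is an assumption made for illustration; the real value is
platform specific. struct padded_counter and counter_stride() are
hypothetical names.

```c
#include <stddef.h>

/* Assumption: 64-byte cache lines; the real size is platform specific. */
#define CACHE_LINE 64

/* One counter per thread, padded to fill a whole cache line. */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

/* Neighboring elements are CACHE_LINE bytes apart. */
static struct padded_counter per_thread[4];

/* Distance in bytes between the counters of two neighboring threads. */
size_t counter_stride(void)
{
    return (size_t)((char *)&per_thread[1].value -
                    (char *)&per_thread[0].value);
}
```

Padding alone fixes the spacing between items; for each item to sit in
its own line, the array would also have to start on a cache-line
boundary.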
The following definitions were extracted from the text ThreadTime by
Scott J. Norton and Mark D. DiPasquale, Prentice-Hall, ISBN
0-13-190067-6, 1996.
Application Programming Interface (API)
An interface is the conduit that provides access to an entity or commu‐
nication between entities. In the programming world, an interface
describes how access (or communication) with a function should take
place. Specifically, the number of parameters, their names and purpose
describe how to access a function. An API is the facility that pro‐
vides access to a function.
A function that may be called by a thread with the cancelability state
set to and the cancelability type set to If a thread is canceled in one
of these functions, no state is left in the function. These functions
generally do not acquire resources to perform the function's task.
An async-signal safe function is a function that may be called by a
signal handler. Only a restricted set of functions may safely be
called by a signal handler. These functions are listed in section
3.3.1.3 of the POSIX.1c standard.
An asynchronous signal is a signal that has been generated due to an
external event. Signals sent by and signals generated due to timer
expiration or asynchronous I/O completion are all examples of asyn‐
chronously generated signals. Asynchronous signals are delivered to
the process. All signals can be generated asynchronously.
Application-provided and registered functions that are called before
and after a operation. These functions generally acquire all mutex
locks before the and release these mutex locks in both the parent and
child processes after the
An operation or sequence of events that is guaranteed to complete as if
it were one instruction.
A synchronization primitive that causes a certain number of threads to
wait or rendezvous at specified points in an application. Barriers are
used when an application needs to ensure that all threads have completed
some operation before proceeding onto the next task.
A user thread that is directly bound to a kernel-scheduled entity.
These threads contain a system scheduling scope and are scheduled
directly by the kernel.
Cache thrashing is a situation in which a thread executes on different
processors, causing cached data to be moved to and from the different
processor caches. Cache thrashing can cause severe performance
degradation.
Cancellation Cleanup Handler
An application-provided and registered function that is called when a
thread is canceled. These functions generally perform thread cleanup
actions during thread cancellation. These handlers are similar to
signal handlers.
A condition variable is a synchronization primitive used to allow a
thread to wait for an event. Condition variables are often used in
producer-consumer problems where a producer must provide something to
one or more consumers.
The act of removing the currently running thread from the processor and
running another thread. A context switch saves the register state of
the currently running thread and restores the register state of the
thread chosen to execute next.
A section of code that must complete atomically and uninterrupted. A
critical section of code is generally one in which some global resource
(variables, data structures, linked lists, etc.) is modified. The
operation being performed must complete atomically so that other
threads do not see the critical section in an inconsistent state.
A deadlock occurs when one or more threads can no longer execute. For
example, thread A holds lock 1 and is blocked on lock 2. Meanwhile,
thread B holds lock 2 and is blocked on lock 1. Threads A and B are
permanently deadlocked. Deadlocks can occur with any number of
resource holding threads. An interactive deadlock involves two or more
threads. A recursive (or self) deadlock involves only one thread.
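The two-lock deadlock in the example above is commonly avoided by
having every thread acquire the locks in the same order. The sketch
below orders two distinct mutexes by address. This is one conventional
technique, not something this page prescribes, and lock_pair() and
unlock_pair() are hypothetical helpers.

```c
#include <pthread.h>
#include <stdint.h>

/*
 * Lock two distinct mutexes in a fixed global order (here: by
 * address), so that no two threads can each hold one of them while
 * waiting for the other.
 */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release both mutexes; the order of release does not matter. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

With this helper, thread A calling lock_pair(&m1, &m2) and thread B
calling lock_pair(&m2, &m1) both lock the same mutex first.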
A thread whose resources are automatically released by the system when
the thread terminates. A detached thread cannot be joined by another
thread. Consequently, detached threads cannot return an exit status.
A thread whose termination can be waited for by another thread. Join‐
able threads can return an exit status to a joining thread. Joinable
threads maintain some state after termination until they are joined by
another thread.
A mode of operation where all operations are allowed. While a thread
is executing a system call it is executing in kernel mode.
The kernel program exists in this space. Kernel code is executed in
this space at the highest privilege level. In general, there are two
privilege levels: one for user code (user mode) and the other for ker‐
nel code (kernel mode).
When a thread makes a system call, it executes in kernel mode. While
in kernel mode, it does not use the stack allocated for use by the
application. Instead, a separate kernel stack is used while in the
system call. Each kernel-scheduled entity, whether a process, kernel
thread or lightweight process, contains a kernel stack. See Stack for
a generic description of a stack.
Kernel threads are created by the thread functions in the threads
library. Kernel threads are kernel-scheduled entities that are visible
to the operating system kernel. A kernel thread typically supports one
or more user threads. Kernel threads execute kernel code or system
calls on behalf of user threads. Some systems may call the equivalent
of a kernel thread a lightweight process. See Thread for a generic
description of a thread.
A kernel-scheduled entity. Some systems may call the equivalent of a
lightweight process a kernel thread. Each process contains one or more
lightweight processes. How many lightweight processes a process contains
depends on whether and how the process is multithreaded. See Thread
for a generic description of a thread.
A system with two or more processors (CPUs). Multiprocessors allow
multithreaded applications to obtain true parallelism.
A programming model that allows an application to have multiple threads
of execution. Multithreading allows an application to have concurrency
and parallelism (on multiprocessor systems).
A mutex is a mutual exclusion synchronization primitive. Mutexes pro‐
vide threads with the ability to regulate or serialize access to
process shared data and resources. When a thread locks a mutex, other
threads trying to lock the mutex block until the owning thread unlocks
it.
Portable Operating System Interface. POSIX defines a set of standards
that multiple vendors conform to in order to provide for application
portability. The Pthreads standard (POSIX 1003.1c) provides a set of
portable multithreading APIs to application developers.
A situation where a low-priority thread has acquired a resource that is
needed by a higher priority thread. As the resource cannot be
acquired, the higher priority thread must wait for the resource. The
end result is that a low-priority thread blocks a high-priority thread.
A process can be thought of as a container for one or more threads of
execution, an address space, and shared process resources. All pro‐
cesses have at least one thread. Each thread in the process executes
within the process' address space. Examples of process-shared
resources are open file descriptors, message queue descriptors,
mutexes, and semaphores.
Process Control Block (PCB)
This structure holds the register context of a process.
The operating system maintains a process structure for each process in
the system. This structure represents the actual process internally in
the system. A sample of process structure information includes the
process ID, the process' set of open files, and the signal vector. The
process structure and the values contained within it are part of the
context of a process.
Program Counter (PC)
The program counter is part of the register context of a process. It
holds the address of the current instruction to be executed.
When the result of two or more threads performing an operation depends
on unpredictable timing factors, this is a race condition.
A read-write lock is a synchronization primitive. Read-write locks
provide threads with the ability to regulate or serialize access to
process-shared data and resources. Read-write locks allow multiple
readers to concurrently acquire the read lock whereas only one writer
at a time may acquire the write lock. These locks are useful for
shared data that is mostly read and only rarely written.
A reentrant function is one that when called by multiple threads,
behaves as if the function was called serially, one after another, by
the different threads. These functions may execute in parallel.
Scheduling Allocation Domain
The set of processors on which a thread is scheduled. The size of this
domain may dynamically change over time. Threads may also be moved
from one domain to another.
Scheduling Contention Scope
The scheduling contention scope defines the group of threads that a
thread competes with for access to resources. The contention scope is
most often associated with access to a processor. However, this scope
may also be used when threads compete for other resources. Threads
with the system scope compete for access to resources with all other
threads in the system. Threads with the process scope compete for
access to resources with other process scope threads in the process.
A scheduling policy is a set of rules used to determine how and when
multiple threads are scheduled to execute. The scheduling policy also
determines how long a thread is allowed to execute.
A scheduling priority is a numeric priority value assigned to threads
in certain scheduling policies. Threads with higher priorities are
given preference when scheduling decisions are made.
A semaphore is similar to a mutex. A semaphore regulates access to one
or more shared objects. A semaphore has a value associated with it.
The value is generally set to the number of shared resources regulated
by the semaphore. When a semaphore has a value of one, it is a binary
semaphore. A mutex is essentially a binary semaphore. When a semaphore
has a value greater than one, it is known as a counting semaphore.
A counting semaphore can be locked by multiple threads simultaneously.
Each time the semaphore is locked, the value is decremented by one.
After the value reaches zero, new attempts to lock the semaphore cause
the locking thread to block until the semaphore is unlocked by another
thread.
A shared object is a tangible entity that exists in the address space
of a process and is accessible by all threads within the process. In
the context of multithreaded programming, "shared objects" are global
variables, file descriptors, and other such objects that require access
by threads to be synchronized.
A signal is a simplified IPC mechanism that allows a process or thread
to be notified of an event. Signals can be generated synchronously and
asynchronously.
A signal mask determines which signals a thread accepts and which ones
are blocked from delivery. If a synchronous signal is blocked from
delivery, it is held pending until either the thread unblocks the sig‐
nal or the thread terminates. If an asynchronous signal delivered to
the process is blocked from delivery by a thread, the signal may be
handled by a different thread in the process that does not have the
signal blocked.
A signal vector is a table contained in each process that describes the
action that should be taken when a signal is delivered to a thread
within the process. Each signal has one of three potential behaviors:
ignore the signal, execute a signal-handling function, or perform the
default action of the signal (usually process termination).
means that there is only one flow of control (one thread) through the
program code; only one instruction is executed at a time.
A synchronization primitive similar to a mutex. If the lock cannot be
acquired, instead of blocking, the thread wishing to acquire the lock
spins in a loop until the lock can be acquired. Spinlocks can be eas‐
ily used improperly and can severely degrade performance if used on a
single processor system.
A spurious wakeup occurs when a thread is incorrectly unblocked, even
though the event it was waiting for has not occurred. A condition wait
that is interrupted and returns because the blocked thread received a
normal signal is an example of a spurious wakeup.
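Because of spurious wakeups, a condition wait is normally placed in a
loop that re-tests the awaited condition. A minimal sketch, with
hypothetical names (data_ready, producer(), consume_one()) and an
arbitrary data value, follows.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready_cv = PTHREAD_COND_INITIALIZER;
static int data_ready = 0;       /* the condition being waited for */
static int data = 0;

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    data = 7;                    /* arbitrary illustrative value */
    data_ready = 1;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Wait for the producer; returns the data, or -1 on error. */
int consume_one(void)
{
    pthread_t tid;
    int value;

    pthread_mutex_lock(&lock);
    data_ready = 0;
    pthread_mutex_unlock(&lock);

    if (pthread_create(&tid, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    while (!data_ready)          /* re-test after every wakeup */
        pthread_cond_wait(&ready_cv, &lock);
    value = data;
    pthread_mutex_unlock(&lock);

    pthread_join(tid, NULL);
    return value;
}
```

The loop also covers the case where the producer signals before the
consumer starts waiting, since the flag is tested before the first
wait.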
A stack is used by a thread to make function calls (and return from
those calls), to pass arguments to a function call, and to create the
space for local variables when in that function call. Bound threads
have a user stack and a kernel stack. Unbound threads have only a user
stack.
A synchronous signal is a signal that has been generated due to some
action of a specific thread. For example, when a thread does a divide
by zero, causes a floating point exception, or executes an illegal
instruction, a signal is generated synchronously. Synchronous signals
are delivered to the thread that caused the signal to be sent.
This is a single-threaded entity that can be scheduled to execute on a
processor.
A thread is an independent flow of control within a process, composed
of a context (which includes a register set and program counter) and a
sequence of instructions to execute.
Thread Local Storage (TLS)
Thread local storage is essentially thread-specific data requiring sup‐
port from the compilers. With TLS, an application can allocate the
actual data as thread-specific data rather than using thread-specific
data keys. Additionally, TLS does not require the thread to make a
function call to obtain thread-specific data. The thread can access
the data directly.
A thread-safe function is one that may be safely called by multiple
threads at the same time. If the function accesses shared data or
resources, this access is regulated by a mutex or some other form of
synchronization.
Thread-Specific Data (TSD)
Thread-specific data is global data that is specific to a thread. All
threads access the same data variable. However, each thread has its
own thread-specific value associated with this variable. errno is an
example of thread-specific data.
The operating system maintains a thread structure for each thread in
the system. This structure represents the actual thread internally in
the system. A sample of thread structure information includes the
thread ID, the scheduling policy and priority, and the signal mask.
The thread structure and the values contained within it are part of the
context of a thread.
A mode of operation where a subset of operations are allowed. While a
thread is executing an application's code, it is executing in user mode.
When the thread makes a system call, it changes modes and executes in
kernel mode until the system call completes.
The user code exists in this space. User code is executed in this
space at the normal privilege level. In general, there are two privi‐
lege levels: one for user code (user mode) and the other for kernel
code (kernel mode).
When a thread is executing code in user space, it needs to use a stack
to make function calls, pass parameters, and create local variables.
While in user mode, a thread does not use the kernel stack. Instead, a
separate user stack is allocated for use by each user thread. See
Stack for a generic description of a stack.
When is called, a user thread is created. Whether a kernel-scheduled
entity (kernel thread or lightweight process) is also created depends
on the user thread's scheduling contention scope. When a bound thread
is created, both a user thread and a kernel-scheduled entity are cre‐
ated. When an unbound thread is created, generally only a user thread
is created. See Thread for a generic description of a thread.
SEE ALSO
pthread_stubs(5), thread_safety(5).
ThreadTime by Scott J. Norton and Mark D. DiPasquale, Prentice-Hall,
ISBN 0-13-190067-6, 1996.