ibd(7D) Devices ibd(7D)
NAME
ibd - Infiniband IPoIB device driver
SYNOPSIS
/dev/ibd*
DESCRIPTION
The ibd driver implements the IETF IP over Infiniband protocol and pro‐
vides IPoIB service for all IBA ports present in the system.
The ibd driver is a multi-threaded, loadable, clonable, STREAMS hardware
driver supporting the connectionless Data Link Provider Interface,
dlpi(7P).
By default, each ibd instance uses datagram mode unless enable_rc is set
to 1 for that instance in the .conf file. The setting is made on a per
instance basis: the Nth value of enable_rc controls the Nth instance of
ibd, with the first value applying to instance 0. Any value other than
1, or the absence of a .conf file, is equivalent to specifying datagram
mode.
Because ibd over connected mode attempts to use a large MTU (65520
bytes), applications should adapt to the large MTU to get better
performance, for example by using a large TCP window size.
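As a rough illustration of the tuning suggested above, an application can request larger socket buffers, which bound the TCP window the stack can offer. The sketch below uses Python's standard socket module. The 1 MiB figure and the helper name request_large_buffers are assumptions for illustration, not values recommended by the driver, and the operating system may clamp or adjust the requested size.

```python
import socket

# Hypothetical buffer size; tune it for the actual bandwidth-delay product.
REQUESTED_BYTES = 1024 * 1024


def request_large_buffers(sock, size=REQUESTED_BYTES):
    """Ask the OS for larger send/receive buffers and report what it granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return granted_rcv, granted_snd


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # The granted sizes can differ from the request on some systems.
        print(request_large_buffers(s))
```

Reading the values back with getsockopt is deliberate: it shows what the system actually granted rather than what was requested.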
Use the cloning, character-special device /dev/ibd to access all ibd
devices installed within the system.
The ibd driver is dependent on GLD, a loadable kernel module that pro‐
vides the ibd driver with the DLPI and STREAMS functionality required
of a LAN driver. Except as noted in the Application Programming Inter‐
face section of this man page, see gld(7D) for more details on the
primitives supported by the driver. The GLD module is located at
/kernel/misc/sparcv9/gld on 64 bit systems and at /kernel/misc/gld on 32
bit systems.
The ibd driver expects certain configuration of the IBA fabric prior to
operation (which also implies the SM must be active and managing the
fabric). Specifically, the IBA multicast group representing the IPv4
limited broadcast address 255.255.255.255 (also defined as broadcast-GID
in IETF documents) should be created prior to initializing the device.
IBA properties (including mtu, qkey and sl) of this group are used by
the driver to create any other IBA multicast group as instructed by
higher level (IP) software. The driver probes for the existence of this
broadcast-GID during attach(9E).
APPLICATION PROGRAMMING INTERFACE (DLPI)
The values returned by the driver in the DL_INFO_ACK primitive in
response to your DL_INFO_REQ are:
o Maximum SDU is the MTU associated with the broadcast-GID
group, less the 4 byte IPoIB header.
o Minimum SDU is 0.
o dlsap address length is 22.
o MAC type is DL_IB.
o The sap length value is -2, meaning the physical address
component is followed immediately by a 2-byte sap component
within the DLSAP address.
o Broadcast address value is the MAC address consisting of the
4 bytes of QPN 00:FF:FF:FF prepended to the IBA multicast
address of the broadcast-GID.
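The arithmetic behind these DL_INFO_ACK values can be sketched as follows. This is an illustrative Python model, not driver code: the function names are hypothetical, the GID used in the demonstration is a placeholder rather than a real broadcast-GID, and the sap is passed as two raw bytes because its byte order is not stated here.

```python
IPOIB_HEADER_LEN = 4   # IPoIB header subtracted from the group MTU
GID_LEN = 16           # an IBA multicast address (GID) is 16 bytes
PHYS_ADDR_LEN = 20     # 4-byte QPN field followed by the 16-byte GID
SAP_LEN = 2            # sap length of -2: the sap follows the physical address
BROADCAST_QPN = bytes([0x00, 0xFF, 0xFF, 0xFF])


def max_sdu(broadcast_group_mtu):
    """Maximum SDU: the broadcast-GID group MTU less the IPoIB header."""
    return broadcast_group_mtu - IPOIB_HEADER_LEN


def broadcast_address(broadcast_gid):
    """Prepend the QPN 00:FF:FF:FF to the broadcast-GID."""
    if len(broadcast_gid) != GID_LEN:
        raise ValueError("a GID must be 16 bytes")
    return BROADCAST_QPN + broadcast_gid


def dlsap_address(phys_addr, sap):
    """Physical address followed immediately by the 2-byte sap."""
    if len(phys_addr) != PHYS_ADDR_LEN or len(sap) != SAP_LEN:
        raise ValueError("expected a 20-byte address and a 2-byte sap")
    return phys_addr + sap


if __name__ == "__main__":
    placeholder_gid = bytes(GID_LEN)  # placeholder, NOT a real broadcast-GID
    bcast = broadcast_address(placeholder_gid)
    print(max_sdu(2048), len(bcast), len(dlsap_address(bcast, b"\x00\x00")))
```

The lengths add up as the list states: 4 + 16 bytes give the 20-byte physical address, and the 2-byte sap brings the dlsap address to 22 bytes.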
Due to the nature of link address definition for IPoIB, the
DL_SET_PHYS_ADDR_REQ DLPI primitive is not supported.
In the transmit case for streams that have been put in raw
mode via the DLIOCRAW ioctl, the DLPI application must
prepend the 20 byte IPoIB destination address to the data it
wants to transmit over-the-wire. In the receive case, appli‐
cations receive the IP/ARP datagram along with the IETF
defined 4 byte header.
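The raw-mode framing described above can be modeled with a short Python sketch. The helper names are hypothetical. The layout of the 4 byte header (a 16-bit type followed by 16 reserved bits, in network byte order) is taken from the IETF IPoIB encapsulation in RFC 4391 and is an assumption relative to this page. Whether the transmitted data must already carry that header is not stated here, so the sketch leaves it to the caller.

```python
import struct

IPOIB_ADDR_LEN = 20    # destination address prepended on transmit
IPOIB_HEADER_LEN = 4   # IETF-defined header seen on receive


def build_raw_tx(dest_addr, data):
    """Prepend the 20-byte IPoIB destination address to outgoing data."""
    if len(dest_addr) != IPOIB_ADDR_LEN:
        raise ValueError("IPoIB destination address must be 20 bytes")
    return dest_addr + data


def split_raw_rx(frame):
    """Split a received frame into (type, datagram).

    Assumes the RFC 4391 layout: 16-bit type, then 16 reserved bits.
    """
    if len(frame) < IPOIB_HEADER_LEN:
        raise ValueError("frame shorter than the IPoIB header")
    ptype, _reserved = struct.unpack("!HH", frame[:IPOIB_HEADER_LEN])
    return ptype, frame[IPOIB_HEADER_LEN:]


if __name__ == "__main__":
    # 0x0800 is the type value for IPv4; the payload is a stand-in.
    rx = struct.pack("!HH", 0x0800, 0) + b"payload"
    print(split_raw_rx(rx))
    print(len(build_raw_tx(bytes(IPOIB_ADDR_LEN), b"payload")))
```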
WARNING
This section describes warning messages that might be generated by the
driver. Although the format of these messages may change in future
versions, the same general information will be provided.
While joining IBA multicast groups corresponding to IP multicast groups
as part of multicast promiscuous operations as required by IP multicast
routers, or as part of running snoop(1M), it is possible that joins to
some multicast groups can fail due to inherent resource constraints in
the IBA components. In such cases, warning messages similar to the
following appear in the system log, indicating the interface on which
the failure occurred:
NOTICE: ibd0: Could not get list of IBA multicast groups
NOTICE: ibd0: IBA promiscuous mode missed multicast group
NOTICE: ibd0: IBA promiscuous mode missed new multicast gid
Also, if the IBA SM indicates that multicast trap support is suspended
or unavailable, the system log contains a message similar to:
NOTICE: ibd0: IBA multicast support degraded due to
unavailability of multicast traps
And when the SM indicates trap support is restored:
NOTICE: ibd0: IBA multicast support restored due to
availability of multicast traps
Additionally, if the IBA link transitions to an unavailable state (that
is, the IBA link state becomes Down, Initialize or Armed) and then
becomes active again, the driver tries to rejoin previously joined
groups if required. Failure to rejoin multicast groups triggers mes‐
sages such as:
NOTICE: ibd0: Failure on port up to rejoin multicast gid
If the corresponding HCA port is in the unavailable state defined above
when initializing an ibd interface using ifconfig(1M), a message is
emitted by the driver:
NOTICE: ibd0: Port is not active
Further, as described above, if the broadcast-GID is not found, or the
associated MTU is higher than what the HCA port can support, the fol‐
lowing messages are printed to the system log:
NOTICE: ibd0: IPoIB broadcast group absent
NOTICE: ibd0: IPoIB broadcast group MTU 4096 greater than port's
maximum MTU 2048
If any of these problems is reported when running ifconfig(1M), verify
that IBA cabling is intact, an SM is running on the fabric, and the
broadcast-GID with appropriate properties has been created in the IBA
partition.
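When triaging these messages in a system log, a small script can map each notice to the check suggested above. The following Python sketch is only an illustration: the hint strings and the function name diagnose are hypothetical, the message formats may change between driver versions, and a message wrapped across two log lines must be joined by the caller first.

```python
import re

NOTICE_RE = re.compile(r"NOTICE: (ibd\d+): (.*)")
MTU_RE = re.compile(
    r"IPoIB broadcast group MTU (\d+) greater than port's maximum MTU (\d+)"
)

# Substring of the driver message -> suggested check (hints are illustrative).
HINTS = [
    ("Port is not active", "check cabling and link state"),
    ("IPoIB broadcast group absent",
     "check that an SM is running and the broadcast-GID exists"),
]


def diagnose(line):
    """Return (instance, hint) for an ibd notice, or None for other lines."""
    match = NOTICE_RE.search(line)
    if match is None:
        return None
    instance, text = match.group(1), match.group(2)
    mtu = MTU_RE.search(text)
    if mtu:
        group_mtu, port_mtu = int(mtu.group(1)), int(mtu.group(2))
        return instance, (
            f"broadcast group MTU {group_mtu} exceeds port maximum {port_mtu}"
        )
    for needle, hint in HINTS:
        if needle in text:
            return instance, hint
    return instance, "unrecognized ibd notice"


if __name__ == "__main__":
    print(diagnose("NOTICE: ibd0: Port is not active"))
```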
The MTU of Reliable Connected mode can be larger than the MTU of Unre‐
liable Datagram mode.
When Reliable Connected mode is enabled, ibd still uses Unreliable
Datagram mode to transmit and receive multicast packets. If the payload
size (excluding 4 byte IPoIB header) of a multicast packet is larger
than the IP link MTU specified by the broadcast group, ibd drops it. A
message appears in the system log when drops occur:
NOTICE: ibd0: Reliable Connected mode is on. Multicast packet
length (<packet length> > <IP_LINK_MTU>) is too long to send
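The drop rule for multicast packets can be expressed as a one-line check. This Python sketch is a model of the rule as described above, not the driver's code. It assumes that packet_len counts the 4 byte IPoIB header, and the function name is hypothetical.

```python
IPOIB_HEADER_LEN = 4  # excluded from the payload size, per the text above


def multicast_would_drop(packet_len, ip_link_mtu):
    """True if the payload (packet less IPoIB header) exceeds the IP link MTU."""
    payload = packet_len - IPOIB_HEADER_LEN
    return payload > ip_link_mtu


if __name__ == "__main__":
    # A payload equal to the link MTU is sent; one byte more is dropped.
    print(multicast_would_drop(2048, 2044))
    print(multicast_would_drop(2049, 2044))
```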
If only one side has enabled Reliable Connected mode, communication
falls back to datagram mode. The connected mode instance uses Path MTU
discovery to automatically adjust the MTU of a unicast packet if an MTU
difference exists. Before Path MTU discovery reduces the MTU for a
specific destination, several packets whose size exceeds the MTU of
Unreliable Datagram mode are dropped.
CONFIGURATION
The IPoIB service comes preconfigured on all HCA ports in the system.
To turn the service off, or back on after turning it off, refer to doc‐
umentation in cfgadm_ib(1M).
EXAMPLES
Example 1 Enabling Connected Mode
The following example driver .conf file enables Connected Mode for ibd
instances 0 and 1. Instances 2 and 3 use datagram mode.
# 1: unicast packets are sent over Reliable Connected Mode
# 0: unicast packets are sent over Unreliable Datagram Mode
#
# Each element in the list below maps to the corresponding ibd
# instance; the first element is for ibd instance 0, the second
# element is for instance 1 and so on.
#
enable_rc=1,1,0,0
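To make the per-instance mapping concrete, the following Python sketch reads an enable_rc line and reports the mode each instance would use. It is a simplified reader for illustration, not the system's .conf parser. The function names are hypothetical, and tolerating a trailing semicolon is an assumption about how the line may be written.

```python
def parse_enable_rc(conf_text):
    """Return the enable_rc values as a list of strings, or [] if not set."""
    for raw in conf_text.splitlines():
        # Drop comments, surrounding whitespace, and an optional trailing ';'.
        line = raw.split("#", 1)[0].strip().rstrip(";").strip()
        name, sep, value = line.partition("=")
        if sep and name.strip() == "enable_rc":
            return [v.strip() for v in value.split(",") if v.strip()]
    return []


def mode_for_instance(values, instance):
    """Connected mode only when the instance's value is exactly 1."""
    if 0 <= instance < len(values) and values[instance] == "1":
        return "connected"
    return "datagram"


if __name__ == "__main__":
    conf = "# example from this page\nenable_rc=1,1,0,0\n"
    values = parse_enable_rc(conf)
    for n in range(5):
        print(f"ibd{n}: {mode_for_instance(values, n)}")
```

An instance with no corresponding value, such as instance 4 here, falls back to datagram mode, matching the default described in DESCRIPTION.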
FILES
/dev/ibd* Special character device
/kernel/drv/ib.conf Configuration file to start IPoIB service
/kernel/drv/ibd.conf Configuration file for IPoIB driver
/kernel/drv/sparcv9/ibd 64-bit SPARC device driver
/kernel/drv/amd64/ibd 64-bit x86 device driver
/kernel/drv/ibd 32-bit x86 device driver
SEE ALSO
cfgadm(1M), cfgadm_ib(1M), ifconfig(1M), syslogd(1M), gld(7D), ib(7D),
kstat(7D), streamio(7I), dlpi(7P), attributes(5), attach(9E)
NOTES
IBD is a GLD-based driver and provides the statistics described by
gld(7D). Note that the count of valid received packets not accepted by
any stream (unknowns) increases when IBD transmits broadcast IP packets.
This happens because the InfiniBand hardware copies and loops back the
transmitted broadcast packets to the source. These packets are discarded
by GLD and are recorded as 'unknowns'.
SunOS 5.11 19 Jan 2010 ibd(7D)