Plugin_ibnet
- Date:
19 May 2020
NAME
Plugin_ibnet - man page for the LDMS ibnet plugin
SYNOPSIS
DESCRIPTION
The ibnet plugin provides port info from InfiniBand equipment supporting extended hardware counters. Each port is handled in a separate data set. Overall timing of the data collection process is handled in another optional data set. Plugins for the ldmsd (ldms daemon) are configured via ldmsd_controller or a configuration file.
CONFIGURATION ATTRIBUTE SYNTAX
- config
- name=<plugin_name> port-name=<hca> source-list=<lidfile> [port-number=<num>] [metric-conf=<metricfile>] [node-name-map=<nnmap>] [timing=<timeopt>] [millis=<ms>] [producer=<name>] [instance=<name>] [component_id=<uint64_t>] [schema=<name_base>] [uid=<user-id>] [gid=<group-id>] [perm=<mode_t permission bits>] [debug]configuration line
- name=<plugin_name>
- This MUST be ibnet.
- producer=<pname>.
- The producer string value for the timing set. Default is the result of gethostname().
- instance=<set_name>
- The name of the timing metric set. Default is $producer/ibnet_timing.
- source-list=<lidfile>
- Lidfile is the name of a file of LID/port specifications. See PORT FILE for format details.
- port-name=<hca> [port-number=<num>]
- Hca is the name of the local IB interface to access the network. Num is the number of the port on the interface used to access the network. The default is 1.
- schema=<name_base>
- Optional schema base name. The default is ibnet. The name base is suffixed to create uniquely defined schema names based on the plugin options specified.
- component_id=<compid>
- Optional component identifier for the timing set. Defaults to zero.
- metric-conf=<metricfile>
- The file listing the metric groups to collect. See METRIC GROUPS below.
- ca_port=<port>
- The port number to use, which must be active.
- millis=<millisecond timeout>
- The number of milliseconds of the timeout on the MAD calls. Default 0, which will use the mad library timeout of 1 second.
- timing=<T>
- Disable timing (T=0), enable aggregate timing (T=1), or enable individual port timing(T=2) or enable port offset timing(T=3). The metric set will contain sampling process timing metrics if T > 0.
- node-name-map=<nnmap>
- The file name nnmap, as used by ibnetdiscover and opensm, of a mapping from IB GUIDs to short names of IB hardware items (switch, node, etc) suitable for use in populating names of sets.
PORT FILE
The lid/port file format is
lid, hexguid, nports, plist
* where hexguid is 0x....,
* nports is int,
* plist is ints nports long or * if range is 1-nports,
* if not using a name map, names will be GUID_hex.
The portrange will be an integer expression in the style 1,5,7-9,13, without repeats, whitespace, reversed ranges, or overlapping ranges. LID is an integer in the range 0-65535. The same LID may be on multiple lines so long as the ports listed for it are not repeated.
The lid file can be generated with ldms-gen-lidfile.sh.
METRIC GROUPS
The metric groups file contains a list of items, one per line, naming groups of metrics to collect. The groups are named corresponding to groups in the infiniband-diags perfquery utility options. The correspondence is not exact. To disable a listed metric group, delete its name from the file or comment it out by prepending a # to the group, e.g. ‘#xmtsl’. ‘#’ followed by whitespace is not allowed. Carriage returns are optional.
INTERNAL METRICS
- port_query_time
- Time in seconds spend in the single port MAD call.
- port_query_offset
- Time in microseconds from start of all MAD calls in the current update to the end of the mad call for the specific port.
- ib_query_time
- Time in seconds making all MAD calls in the update.
- ib_data_process_time
- Time in seconds decoding all MAD data in the update
BUGS
The perfquery extended_speeds option is not supported.
EXAMPLES
Within ldmsd_controller or a configuration file:
load name=ibnet
config name=ibnet producer=compute1 instance=compute1/ibnet component_id=1 port-name=mlx5_0 source-list=/path/lidfile
start name=ibnet interval=1000000
NOTES
The exact schema name that will be generated can be determined using the ldms_ibnet_schema_name utility. The subsets available from the fabric depend on the hardware, firmware, and in some cases the subnet manager versions.
SEE ALSO
ldmsd(8), ldms_quickstart(7), ldmsd_controller(8), ldms_ibnet_schema_name(1), ldms-ibnet-sampler-gen(1).