json_stream_sampler

Man page for the LDMSD json_stream_sampler plugin

Date: 5 Aug 2023

Manual section: 7

Manual group: LDMS sampler

SYNOPSIS

Within ldmsd_controller or a configuration file:

config name=json_stream_sampler producer=PRODUCER instance_fmt=INST_FMT [ component_id=COMP_ID ] [ stream=NAME ] [ uid=UID ] [ gid=GID ] [ perm=PERM ] [ heap_sz=BYTES ]

DESCRIPTION

The json_stream_sampler plugin monitors JSON object data presented on a configured set of streams and encodes it in LDMS metric sets. The intent is that these metric sets are then stored through a storage plugin using decomposition.

The INST_FMT is a string expressing set instance format, where:

  • %P refers to the producer name (PRODUCER given to the config),

  • %S refers to the schema name (obtained from stream data),

  • %U refers to the uid (UID given to the config),

  • %G refers to the gid (GID given to the config).

For example, `%P/jss_%S` generates the instance name “node-1/jss_sch0” if the producer is “node-1” and the schema from the stream data is “sch0”.
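The substitution described above can be sketched in a few lines of Python. This is only an illustration of the format rules, not the plugin's code: the function name `expand_instance_fmt` is hypothetical, and leaving unknown `%` sequences unchanged is an assumption of this sketch.

```python
def expand_instance_fmt(fmt, producer, schema, uid, gid):
    """Expand %P, %S, %U and %G in an instance format string."""
    subs = {"P": producer, "S": schema, "U": str(uid), "G": str(gid)}
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt) and fmt[i + 1] in subs:
            # Known specifier: substitute its value and skip both characters.
            out.append(subs[fmt[i + 1]])
            i += 2
        else:
            # Assumption: anything else is copied through verbatim.
            out.append(ch)
            i += 1
    return "".join(out)


print(expand_instance_fmt("%P/jss_%S", "node-1", "sch0", 1002, 1002))
# prints: node-1/jss_sch0
```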

When publishing JSON dictionary data to json_stream_sampler, there are fields in the JSON dictionary that have special meaning. These fields are shown in the table below:

Attribute Name   Data Type   Description
--------------   ---------   -----------
schema           string      The name of a Metric Set schema for JSON dictionaries received on this stream.
NAME_max_len     integer     For a string, list, or array named NAME, the maximum length of that string, list, or array.

Schema Management

The value of the schema attribute in the top-level JSON dictionary is maintained in a tree. The first time the schema name is seen, an LDMS Schema is created based on the value of the JSON dictionary. Once created, the schema is used to create the metric set. Each time a stream message is received, the metric set is updated.

The schema attribute is mandatory. If it is not present in the top-level JSON dictionary, an error is logged and the message is ignored.
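The create-once, update-afterwards behavior can be pictured with the following Python sketch. It is a simplified model, not the plugin's implementation: a plain dict stands in for the tree, lists and dicts stand in for LDMS schemas and metric sets, the name `on_stream_message` is hypothetical, and ignoring attributes that were absent from the first message is an assumption of the sketch.

```python
import json

schemas = {}  # schema name -> attribute names (stand-in for an LDMS schema)
sets = {}     # schema name -> current values (stand-in for a metric set)


def on_stream_message(text):
    """Handle one stream message; return False if it was ignored."""
    msg = json.loads(text)
    name = msg.get("schema")
    if name is None:
        # The schema attribute is mandatory: log an error and drop the message.
        print("error: 'schema' attribute missing; message ignored")
        return False
    if name not in schemas:
        # First time this schema name is seen: derive the schema from the
        # message and create the (stand-in) metric set.
        schemas[name] = list(msg.keys())
        sets[name] = {}
    # Every message updates the existing set.
    for key in schemas[name]:
        if key in msg:
            sets[name][key] = msg[key]
    return True


on_stream_message('{"schema": "example", "job_id": 2048}')
on_stream_message('{"schema": "example", "job_id": 2049}')
print(sets["example"]["job_id"])  # prints: 2049
```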

Encoding Types

Primitive types are encoded as attributes in the LDMS metric set with their associated LDMS type. The table below shows how the JSON attributes are mapped to LDMS metric types.

JSON Type        LDMS Type           Example JSON Value
---------        ---------           ------------------
Integer          LDMS_V_S64          45
Floating Point   LDMS_V_D64          3.1415
String           LDMS_V_BYTE_ARRAY   "hello", 'world'
List             LDMS_V_LIST         [ 1, 2, 3 ]
Dictionary       LDMS_V_RECORD       { "attr1" : 1, "attr2" : 2, "attr3" : 3 }

The encoding of all JSON types except strings, lists, and dictionaries is straightforward. The encoding of strings, lists, and dictionaries has additional limitations, as described below.
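The mapping in the table above can be expressed as a small Python lookup over values decoded with the standard `json` module. The function name `ldms_type_for` is hypothetical, the result is only the LDMS type name as a string, and the mapping applies to top-level attributes; JSON values the table does not list (such as booleans and null) are rejected here rather than guessed at.

```python
import json


def ldms_type_for(value):
    """Return the LDMS type name the table associates with a decoded JSON value."""
    # bool is a subclass of int in Python, so it must be checked first.
    if isinstance(value, bool) or value is None:
        raise TypeError("JSON type not covered by the table")
    if isinstance(value, int):
        return "LDMS_V_S64"
    if isinstance(value, float):
        return "LDMS_V_D64"
    if isinstance(value, str):
        return "LDMS_V_BYTE_ARRAY"
    if isinstance(value, list):
        return "LDMS_V_LIST"
    if isinstance(value, dict):
        return "LDMS_V_RECORD"
    raise TypeError("JSON type not covered by the table")


msg = json.loads('{"schema": "example", "job_id": 2048, "seq": [1, 2, 3]}')
for name, value in msg.items():
    print(name, ldms_type_for(value))
```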

Stream Meta-data

Stream events do not include the user-id and group-id of the application publishing the stream data. The application may declare its uid, gid, and permissions using the attributes S_uid (int), S_gid (int), and S_perm (int), respectively. The intention is that this data can be stored in rows as configured by the user with a decomposition configuration.
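As an illustration, a publishing application could add these attributes to its dictionary before serializing it. The helper name `with_stream_metadata` and the sample values are hypothetical; only the attribute names S_uid, S_gid, and S_perm and their integer type come from the text above.

```python
import json


def with_stream_metadata(payload, uid, gid, perm):
    """Return payload as JSON text with S_uid, S_gid and S_perm added as integers."""
    msg = dict(payload)  # copy so the caller's dictionary is not modified
    msg["S_uid"] = int(uid)
    msg["S_gid"] = int(gid)
    msg["S_perm"] = int(perm)
    return json.dumps(msg)


print(with_stream_metadata({"schema": "example", "job_id": 2048}, 1002, 1002, 0o440))
```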

Encoding Strings

Strings are encoded as LDMS_V_BYTE_ARRAY. By default, the length of the array is 255. If an attribute with the name NAME_max_len is present in the dictionary along with the string value, its value is used to size the string array instead.

For example:

{ "my_string" : "this is a string", "my_string_max_len" : 4096 }

will result in an LDMS metric with the name “my_string”, type LDMS_V_BYTE_ARRAY, and length of 4096 being created in the metric set.
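The sizing rule amounts to a lookup with a default. A minimal Python sketch, where the function name `string_array_len` is hypothetical and the default of 255 is the value stated above:

```python
DEFAULT_STR_LEN = 255  # default array length stated in the text


def string_array_len(msg, name):
    """Return the array length for string attribute `name` in dictionary `msg`."""
    # NAME_max_len, when present alongside the string, overrides the default.
    return int(msg.get(name + "_max_len", DEFAULT_STR_LEN))


msg = {"my_string": "this is a string", "my_string_max_len": 4096}
print(string_array_len(msg, "my_string"))                  # prints: 4096
print(string_array_len({"other": "value"}, "other"))       # prints: 255
```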

Encoding Arrays

Any list present in the top-level dictionary is encoded as a list; however, lists present in a 2nd-level dictionary are encoded as arrays. This is because LDMS_V_LIST inside an LDMS_V_RECORD is not supported. The length of the array is determined by the initial value of the array in the record, but can be overridden with the NAME_max_len attribute as described above for strings.

Lists of strings in a 2nd-level dictionary are treated as a JSON-formatted string of a list. That is, they are encoded as LDMS_V_CHAR_ARRAY because LDMS does not support arrays of LDMS_V_CHAR_ARRAY. The length of the array is determined by the length of the JSON-formatted string of the initial list.
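To make the string-list case concrete, the Python sketch below serializes a nested list of strings to compact JSON text and reports its length. The function name is hypothetical, the compact separators are an assumption chosen to resemble the `ldms_ls` output shown later in this page, and whether the plugin reserves extra room (for example for a terminator) is not stated in the text.

```python
import json


def encode_nested_str_list(items):
    """Return the JSON text for a list of strings and the text's length."""
    text = json.dumps(items, separators=(",", ":"))
    return text, len(text)


text, length = encode_nested_str_list(["foo", "bar"])
print(text)    # prints: ["foo","bar"]
print(length)  # prints: 13
```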

Encoding Dictionaries

The attributes in the top-level JSON dictionary are encoded in the metric set directly. For example the JSON dictionary:

{
  "schema" : "example",
  "component_id" : 10001,
  "job_id" : 2048,
  "seq" : [ 1, 2, 3 ]
}

results in a metric set as follows:

$ ldms_ls -h localhost -p 10411 -a munge -E example -l
ovs-5416_example: consistent, last update: Sat Aug 05 11:38:26 2023 -0500 [281178us]
D s32        S_uid                                      1002
D s32        S_gid                                      1002
D s64        component_id                               10001
D s64        job_id                                     2048
D list<>     seq                                        [1,2,3]
D char[]     schema                                     "example"

Dictionaries inside the top-level dictionary are encoded as an LDMS_V_RECORD inside a single-element LDMS_V_RECORD_ARRAY. This is necessary because an LDMS_V_RECORD is only allowed inside an LDMS_V_LIST or LDMS_V_ARRAY.

The JSON below:

{
  "schema" : "dict",
  "a_dict" : { "attr_1" : 1, "attr_2" : 2 },
  "b_dict" : { "attr_3" : 3, "attr_4" : 4 }
}

results in the following LDMS metric set.

ovs-5416_dict: consistent, last update: Sat Aug 05 21:14:38 2023 -0500 [839029us]
D s32         S_uid                                      1002
D s32         S_gid                                      1002
M record_type  a_dict_record                             LDMS_V_RECORD_TYPE
D record[]     a_dict
  attr_2 attr_1
       2      1
M record_type  b_dict_record                             LDMS_V_RECORD_TYPE
D record[]     b_dict
  attr_4 attr_3
       4      3
D char[]     schema                                     "dict"

A list of JSON dictionaries results in each dictionary being encoded as an element in an LDMS_V_LIST. Note that all elements in the list must be of the same type.

The JSON below:

{ "schema" : "dict_list",
  "a_dict_list" : [
    { "attr_1" : 1, "attr_2" : 2 },
    { "attr_1" : 3, "attr_2" : 4 }
  ]
}

results in the following LDMS metric set.

ovs-5416_dict_list: consistent, last update: Sat Aug 05 21:23:11 2023 -0500 [52659us]
D s32         S_uid                                      1002
D s32         S_gid                                      1002
M record_type a_dict_list_record                         LDMS_V_RECORD_TYPE
D list<>      a_dict_list
  attr_2 attr_1
       2      1
       4      3
D char[]     schema                                     "dict_list"
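Because all elements of a list must be of the same type, a publisher may want to check a list before sending it. The Python sketch below shows one such client-side check; the function name is hypothetical, and comparing only the Python type of each decoded element is an assumption, since the text does not define how the plugin compares element types.

```python
def list_is_homogeneous(items):
    """Return True if every element of the list has the same Python type."""
    return len({type(item) for item in items}) <= 1


print(list_is_homogeneous([{"attr_1": 1, "attr_2": 2}, {"attr_1": 3, "attr_2": 4}]))  # prints: True
print(list_is_homogeneous([1, "two", 3]))                                            # prints: False
```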

The JSON below:

{ 'schema'  : 'json_dict',
  'dict'    : { 'int'         : 10,
                'float'       : 1.414,
                'char'        : 'a',
                'str'         : 'xyz',
                'int_array'   : [5, 7, 9],
                'float_array' : [3.14, 1.414, 1.732],
                'str_array'   : ['foo', 'bar'],
                'inner_dict'  : { 'This': 'is',
                                  'a' : 'string'
                                }
              }
}

results in the following LDMS metric set.

ovis-5416_lists_inside_a_dict: consistent, last update: Mon Sep 25 16:21:35 2023 -0500 [310003us]
D s32          S_uid                                      1000
D s32          S_gid                                      1000
M record_type  dict_record                                LDMS_V_RECORD_TYPE
D record[]     dict
  int_array char       str_array    float                   inner_dict                float_array   str int
      5,7,9  "a" "["foo","bar"]" 1.414000 "{"This":"is","a":"string"}" 3.140000,1.414000,1.732000 "xyz"  10
D char[]       schema                                     "json_dict"

Set Security

The metric sets’ UID, GID, and permission can be configured using the configuration attributes uid, gid, and perm, respectively. If one is not given, the corresponding value from the received stream data is used at set creation. Once a metric set has been created, its UID, GID, and permission are not changed automatically when the stream data’s security data changes. However, they can be modified with the LDMSD configuration command set_sec_mod; see ldmsd_controller(8).

Note that the UID, GID, and permission values given on the configuration line do not affect the S_uid and S_gid metric values. The S_uid and S_gid metric values are always obtained from the stream data.
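The precedence between configured values and stream values at set creation can be summarized by the Python sketch below. The function name and the dictionary layout are hypothetical; the sketch models only the selection rule, not how the plugin applies it to a set.

```python
def resolve_set_security(config, stream):
    """Pick uid, gid and perm for a new set: configured value first, else stream value."""
    resolved = {}
    for key in ("uid", "gid", "perm"):
        if config.get(key) is not None:
            resolved[key] = config[key]   # given on the configuration line
        else:
            resolved[key] = stream[key]   # fall back to the stream data
    return resolved


config = {"uid": 2000, "perm": 0o440}               # gid not configured
stream = {"uid": 1002, "gid": 1002, "perm": 0o400}  # values carried by the stream data
print(resolve_set_security(config, stream))
# prints: {'uid': 2000, 'gid': 1002, 'perm': 288}
```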

CONFIG OPTIONS

name=json_stream_sampler

This must be json_stream_sampler (the name of the plugin).

producer=NAME

The NAME of the data producer (e.g. hostname).

instance_fmt=INST_FMT

The format of the set instance name produced by this plugin, as described in DESCRIPTION. This option is required.

component_id=INT

An integer identifying the component (default: 0).

stream=NAME

The name of the LDMSD stream to register for JSON object data.

uid=UID

The user-id to assign to the metric set.

gid=GID

The group-id to assign to the metric set.

perm=OCTAL

An octal number specifying the read-write permissions for the metric set. See open(3).

heap_sz=BYTES

The number of bytes to reserve for the metric set heap.

BUGS

Not all JSON objects can be encoded as metric sets. Support for records nested inside other records is accomplished by encoding the nested records as strings.

EXAMPLES

Plugin configuration example:

load name=js plugin=json_stream_sampler
config name=js producer=${HOSTNAME} instance_fmt=%P/js_%S \
       component_id=2 stream=darshan_data heap_sz=1024
start name=js interval=1000000

SEE ALSO

ldmsd(8), ldmsd_controller(8), store_avro_kafka(8)