LU-13642 lnet: Allow dynamic IP specification Currently you can setup an NI only using the device interface. It is possible that a device interface has more than one IP address. This change updates lnet_net_cmd() to setup an NI using a specific network address. For further reference please read IP specification in LNet https://wiki.whamcloud.com/display/LNet/IP+specification+in+LNet Test-Parameters: trivial testlist=sanity-lnet Change-Id: I2c456790fe9534bbfe02b0330cce73e80318cc1c Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53605 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10003 lnet: migrate lnet setup and tear down to Netlink Migrate the LNet setup and tear down functionality from ioctls to Netlink. This change now means lnet_ioctl() is no longer needed but we will keep it for now to support older tools. The work here will be used in a follow on patch to tell lnet to setup large NIDs by default for testing. Test-Parameters: trivial testlist=sanity-lnet Change-Id: Id69810e114818d423102d6e85ff93529f04c337f Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54359 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-14391 utils: handle very large YAML data sets. Some functionality for Lustre and even LNet can return huge amounts of Netlink data that can overwhelm the internal libyaml buffers. To resolve this we can create a resizable internal buffer to collect all the Netlink data that is formatted into YAML. After the message has been completed we can feed this data in chunk sizes that the libyaml library's smaller internal buffer can handle. The libyaml library internal buffer is a rolling buffer so it will be updated when we exceed its internal size. This allows collecting every single type of Lustre stat in one go and supports sites that have very large LNet router setups. Test-Parameters: trivial Change-Id: I20fdbb19b0f3de3ab52e8ad568c6926f61f627b9 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54132 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
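The buffering scheme described in LU-14391 can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: struct grow_buf, grow_buf_append() and grow_buf_read() are hypothetical names, and the initial size of 64 bytes is arbitrary. Only the shape of the read callback follows libyaml's yaml_read_handler_t.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not the structure used in liblnetconfig. */
struct grow_buf {
    unsigned char *data;  /* collected YAML text */
    size_t len;           /* bytes stored so far */
    size_t cap;           /* bytes allocated */
    size_t off;           /* next byte to hand to the parser */
};

/* Append n bytes, doubling the allocation as needed. Returns 0 or -1. */
static int grow_buf_append(struct grow_buf *b, const void *src, size_t n)
{
    assert(b && src);
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 64;
        unsigned char *tmp;

        while (cap < b->len + n)
            cap *= 2;
        tmp = realloc(b->data, cap);
        if (!tmp)
            return -1;
        b->data = tmp;
        b->cap = cap;
    }
    if (n)
        memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/*
 * Same shape as libyaml's yaml_read_handler_t: copy at most 'size' bytes
 * into the parser's buffer, report the count, and return 1 on success.
 * A count of 0 signals end of input.
 */
static int grow_buf_read(void *ctx, unsigned char *out, size_t size,
                         size_t *size_read)
{
    struct grow_buf *b = ctx;
    size_t left = b->len - b->off;
    size_t n = left < size ? left : size;

    if (n)
        memcpy(out, b->data + b->off, n);
    b->off += n;
    *size_read = n;
    return 1;
}
```

In this sketch the caller appends each formatted Netlink message with grow_buf_append() and, once the message is complete, lets the parser pull the data through grow_buf_read() in whatever chunk size it requests.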
LU-10391 lnet: support updating LNet local NI settings The LNet API allows updating specific settings instead of a full new configuration for NIs. We can accomplish this using NLM_F_REPLACE with the LNET_CMD_NETS command. The only change for the user land tools is that you can now use large NID addresses. Another change in the user land tools is increasing the intf_name field in size from IFNAMSIZ to LNET_MAX_STR_LEN which requires increasing err_str handling. This is because we use struct lnet_dlc_intf_descr to store network addresses and/or network interfaces. Test-Parameters: trivial testlist=sanity-lnet Change-Id: Id334ed3a73ac6ec7a342d4616e32dcfef46907a7 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53560 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-9680 utils: fix nested attribute handling in liblnetconfig Testing with several different YAML layouts revealed several limitations. The first breakage discovered while porting LNet export to Netlink was that for a nested list, if the first attribute processed was another nested list, the YAML generated was missing the needed '-'. Now we insert it manually. The second problem was that updating an individual key didn't work, which was discovered while testing lustre stats. We moved the printing of the new key to under the NLA_NESTED case directly. This required creating yaml_nested_header() which handles both empty nested lists and ones containing data. The comments added to the library should make this clear. Sending Netlink packets also had some bugs that have been resolved. The function yaml_fill_scalar_data() is used to parse out simple scalar values and key value pairs. The original code's parsing of the input string altered the string. This broke the do while loop over entry since entry dropped the rest of the configuration data. Instead of altering the string we carefully parse the string without altering it. Handle the case when nla_nest_start() fails to create a nlattr in lnet_genl_parse_list() which prevents a node crash when we run out of space in the skbuff. Make sure the skbuff is large enough for LNet NI Netlink data collection by setting cb->min_dump_alloc to U16_MAX. Test-Parameters: trivial testlist=sanity-lnet Fixes: d137e9823ca ("LU-10003 lnet: use Netlink to support LNet ping commands") Change-Id: I2d702c9211abffc051db3203ec3811ceaedb2376 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53889 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
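The idea of parsing a key value pair without altering the input, as LU-9680 describes for yaml_fill_scalar_data(), might look like the sketch below. It is an assumption-laden illustration rather than the library's code: split_key_value() is a hypothetical helper, and the "key: value" line layout is assumed for the example. The point is that the input is const and the results are copied out, unlike strtok()-style parsing which writes terminators into the source string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split one "key: value" line into caller-provided
 * buffers. The input is never written to, so a caller that keeps walking
 * the same string sees it unchanged.
 * Returns 0 on success, -1 if there is no ':' or a buffer is too small.
 */
static int split_key_value(const char *line, char *key, size_t key_sz,
                           char *val, size_t val_sz)
{
    const char *sep, *end;
    size_t klen, vlen;

    assert(line && key && val);
    sep = strchr(line, ':');
    if (!sep)
        return -1;

    klen = (size_t)(sep - line);
    sep++;                          /* step past ':' */
    sep += strspn(sep, " \t");      /* skip blanks before the value */
    end = sep + strcspn(sep, "\n"); /* value stops at end of line */
    vlen = (size_t)(end - sep);

    if (klen >= key_sz || vlen >= val_sz)
        return -1;

    memcpy(key, line, klen);
    key[klen] = '\0';
    memcpy(val, sep, vlen);
    val[vlen] = '\0';
    return 0;
}
```

Because the source string is left intact, a loop can call the helper on the first line and then continue from the next newline without losing the rest of the configuration data.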
LU-17479 utils: Update lnet tools to support PyYAML format The current cYAML implementation can't handle PyYAML indentation style. The reason is the underlying libyaml library creates different yaml events / tokens for the PyYAML format. I attempted to inject the missing yaml tokens from the PyYAML format but that failed to work. Also the tokens produced for the PyYAML format contained the wrong scalar strings. I looked at moving to yaml events instead of tokens but that required a large change. The simplest change was to capture the YAML config input and place it into a locally allocated buffer, then alter the location of '-', which changes the YAML config from PyYAML to something cYAML can handle. For this to work I needed to move the YAML config data handling out of cYAML_build_tree() to jt_import(). The reason was that lustre_yaml_cb_helper() is called more than once and stdin can only be read once. Test-Parameters: trivial testlist=sanity-lnet Change-Id: Ic8529ae264c9cbe6872da9a9e3421db78f8ea371 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53845 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17414 lnet: Use POSIX error number for libnetconfig Currently liblnetconfig.c returns custom-defined LUSTRE_CFG_RC_* numbers, which can be confusing to users. This patch redefines LUSTRE_CFG_RC_* to use POSIX error numbers for consistency. Test-Parameters: trivial testlist=sanity-lnet Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: I585d1dfd80d07160e5cdeef784920414132bcaf8 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53657 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17000 utils: Check return value of yaml_parser_initialize This patch adds return value checks to functions yaml_parser_initialize() and fopen() under lustre_cfg.c, and to function cYAML_build_tree() under cyaml.c. Test-Parameters: trivial CoverityID: 410239 ("Unchecked return value") CoverityID: 410238 ("Unchecked return value") Fixes: 65062463 (LU-14359 hsm: support a flatter HSM archive format) Fixes: 8961f2d8 (LU-4939 utils: allow configuration through yaml files) Change-Id: I67a34adee3e4d25f97244487684a613426637a70 Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53331 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17054 lnet: Change cpt-of-nid to get result from kernel The lnetctl cpt-of-nid command leverages a userspace implementation of the kernel hash_long() function to compute the CPT for a given NID. However, the kernel hash_long() function has changed over time such that the userspace version may give a different result than the kernel version. Since Lustre supports such a wide range of kernels we cannot simply update the userspace implementation of hash_long() to match newer kernel. Address this by re-implementing lnetctl cpt-of-nid to call into kernel space to compute the CPT and return the result to userspace. lnetctl cpt-of-nid now works with extended NIDs (e.g., IPv6). lnetctl cpt-of-nid no longer accepts the --ncpt argument because the kernel functions for computing the cpt do not support this. lnetctl cpt-of-nid no longer accepts the --nid argument. Instead, the command now takes a space separated list of nids. Example: $ lnetctl cpt-of-nid 867@kfi 5.3.0.9@tcp cpt-of-nid: - nid: 867@kfi cpt: 0 - nid: 5.3.0.9@tcp cpt: 1 $ Because the old implementation could return a wrong result it is completely removed. HPE-bug-id: LUS-11785 Test-Parameters: trivial Signed-off-by: Chris Horn <chris.horn@hpe.com> Change-Id: I7c2bc48c5c0da7da8a4425d319c0b99207814ae1 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52502 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 lnet: filter out white spaces For the libyaml library two methods exist to construct an internal YAML document. One is with the creation of yaml_event_t and submitting it to the emitter with yaml_emitter_emit(). The second method is using some source like a file. In both cases the input is processed and placed into an internal buffer which is passed to the read handler, yaml_netlink_read_handler(). This buffer ends up looking like the raw text of the configuration file passed in, and this includes all the various white spaces. Due to an internal processing bug both creation methods don't yield the same exact internal buffer contents. In the sequence case, the buffer for sources from a file will contain extra white space. Our current Netlink implementation doesn't filter off that extra white space so the values packed into the Netlink packet contain leading white spaces, which is seen as an error. The fix is to skip those extra white spaces if they exist. Change-Id: I7445ffb486d6d39c681ab4e5a85e0b835509c9ee Test-Parameters: trivial testlist=sanity-lnet Fixes: 70149f4ea89 ("LU-9680 utils: fix Netlink / YAML library handling") Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53020 Reviewed-by: Feng Lei <flei@whamcloud.com> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
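Skipping leading white space before a value is packed, as this commit describes, can be illustrated with a very small C sketch. The helper name skip_leading_space() is hypothetical and the actual patch may perform the skip inline; this only shows the technique.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: return a pointer past any leading white space. */
static const char *skip_leading_space(const char *s)
{
    assert(s);
    /* The cast avoids undefined behaviour for negative char values. */
    while (*s && isspace((unsigned char)*s))
        s++;
    return s;
}
```

The input is not modified; the caller simply starts reading the value from the returned pointer.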
LU-17000 coverity: Fix Logically dead code under liblnetconfig.c This patch fixes Logically dead code check reported by coverity run. CoverityID: 404752 ("Logically dead code") Test-Parameters: trivial testlist=sanity-lnet Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: I5a435324a19e04805c2a7c555ac2a0c1433ce2d0 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52950 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 lnet: migrate peer NI control to Netlink Move peer creation and deletion to the Netlink API. This change enables the creation of peers with large NID addresses. Test-Parameters: trivial testlist=sanity-lnet Change-Id: I7f2f75e73e3f39856751f65e240f2172f703d0bc Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49574 Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-10391 lnet: migrate full LNet NI information collection Fill in the rest of the LNet NI state to report to user land. This mostly covers the stats information of the NI and LND specific tunables. To handle the LND specific tunables we need to reorder the code to send an updated key table. With the additional information I found status wasn't properly set and the nesting wasn't properly set for multiple NIs per NET. This is now fixed. Test-Parameters: trivial testlist=sanity-lnet Fixes: 8f8f6e2f36e ("LU-10003 lnet: use Netlink to support old and new NI APIs") Change-Id: I32b06b1ce8cb049a33f45f2310d31897ffa7dc90 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50255 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17000 lnet: remove redundant errno check in liblnetconfig.c Variable root is assigned NULL at the beginning of lustre_lnet_show_stats(). If l_ioctl() fails, its return value stored in rc will take the True path in the following conditional. This conditional currently contains a redundant check for errno, despite the fact that rc would equal -errno in this case. If errno had changed between the l_ioctl() call and this subsequent read, errno could be 0, which would, from the out: label, lead to a NULL root being used as a parameter in cYAML_insert_sibling() and dereferencing the NULL root pointer. Replaced l_errno's use as a parameter in strerror with -rc, and removed the declaration and other references to l_errno. Addresses-Coverity-ID: 397850 ("Explicit null dereferenced") Signed-off-by: Jake McManus <jacobpmcmanus@gmail.com> Change-Id: I78f080837b60c8216c52bda8562d4c0f9f45a132 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51846 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com>
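The reasoning in this commit, that a saved negative return code is more reliable than a later read of errno, can be sketched as follows. This is a generic POSIX illustration and not the liblnetconfig code: try_open() and report_open() are hypothetical names, and open() stands in for l_ioctl().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: open a file and return 0, or a negative POSIX
 * error number captured immediately after the failing call.
 */
static int try_open(const char *path)
{
    int fd;

    assert(path);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;  /* save it now; later calls may clobber errno */
    close(fd);
    return 0;
}

/* Build the message from the saved return code, not from errno. */
static int report_open(const char *path, char *msg, size_t msg_sz)
{
    int rc = try_open(path);

    if (rc < 0)
        snprintf(msg, msg_sz, "cannot open %s: %s", path,
                 strerror(-rc));
    else
        snprintf(msg, msg_sz, "opened %s", path);
    return rc;
}
```

Since rc already carries the error, the error path depends on a single value and cannot be skipped because errno happened to be reset in between.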
LU-8191 lnet: convert functions in utils to static Static analysis shows that a number of functions could be made static. This patch declares several functions in various LNet utils and lnetconfig as static. In LNet selftest (lst), one unused function was removed entirely. Some declarations were moved so that they could be made static. Test-Parameters: trivial Signed-off-by: Timothy Day <timday@amazon.com> Change-Id: Ia4528281b3c87d77e46abb95f47ab0bdc72168c0 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51427 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: Neil Brown <neilb@suse.de> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-16980 build: fix gcc-12 [-Werror=format-truncation=] error This patch fixes the following [-Werror=format-truncation=] errors detected by gcc 12: liblnetconfig.c: In function 'open_sysfs_file': liblnetconfig.c:106:49: error: '%s' directive output may be truncated writing up to 127 bytes into a region of size between 1 and 128 [-Werror=format-truncation=] 106 | snprintf(filename, sizeof(filename), "%s%s", | ^~ lfs_project.c: In function 'lfs_project_handle_dir': lfs_project.c:324:50: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 1 and 4095 [-Werror=format-truncation=] 324 | snprintf(fullname, PATH_MAX, "%s/%s", pathname, | ^~ statx.c: In function 'do_dir_list': statx.c:1427:58: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 1 and 4095 [-Werror=format-truncation=] 1427 | snprintf(fullname, PATH_MAX, "%s/%s", | ^~ Change-Id: I514a1022d879f8b7af89f6ded68e9b453cd11408 Signed-off-by: Jian Yu <yujian@whamcloud.com> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51765 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
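One common way to address this class of warning is to check the value returned by snprintf() and treat a too-long result as an error. The commit message does not say which approach the patch takes, so the sketch below is only an assumption; join_path() is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: join dir and name into out. Returns 0 on success
 * or -1 if the result did not fit (or snprintf() failed).
 */
static int join_path(char *out, size_t out_sz, const char *dir,
                     const char *name)
{
    int n;

    assert(out && dir && name);
    /* snprintf() returns the length it would have written if unbounded. */
    n = snprintf(out, out_sz, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= out_sz)
        return -1;
    return 0;
}
```

With this shape a truncated path is reported to the caller instead of being silently used.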
LU-9680 utils: add updating the key table for Netlink. Currently the lnetconfig implementation only sends the key table once to construct a YAML document. Add the ability to update the key table at a later time. New keys will be used by the YAML document. Test-Parameters: trivial Change-Id: Ie2201f91eb24d06c7e2a2d4abe3da3805f74e5a7 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51715 Tested-by: Oleg Drokin <green@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-16709 lnet: fix locking multiple NIDs of the MR peer If Lustre identifies the same peer with multiple NIDs, as a result of peer discovery it is possible that the discovered peer is found to contain a NID which is locked as primary by a different existing peer record. In this case it is safe to merge the peer records, but the NID which got locked the earliest should be kept as primary. This allows for the first of the two locked NIDs to stay primary as intended for the purpose of communicating with Lustre even if peer discovery succeeded using a different NID of MR peer. Fixes: aacb16191a ("LU-14668 lnet: Lock primary NID logic") Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com> Change-Id: Iec9f8b70053fe24cddee552358500dfad0234b7f Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50530 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-16574 udsp: lnetctl udsp improvements lnet_udsp_del_policy() did not previously return non-zero, but its single caller would check for a non-zero and call lnet_udsp_apply_policies(). This code is removed. lnet_udsp_del_policy() will now return non-zero but only in the case where there is no matching policy index. In this case the policies are not modified and thus we needn't re-apply them. Modify some error checking for lnetctl udsp commands to provide better error messages. Correct typos in lustre_lnet_add_udsp() error messages. Correct lustre_lnet_del_udsp()'s handling of errno. Update help text of udsp commands. Use parse_long() in jt_show_udsp() to parse the idx argument. Sanity check priority and idx arguments for udsp add/del rather than silently modifying them when the user passes in bad values. Implement 'lnetctl udsp del --all' to provide easy way for admin to delete all configured policies (this is equivalent to 'lnetctl udsp del --idx -1', but it is more user friendly). The udsp del command now requires either --all or --idx be specified. Test-Parameters: trivial HPE-bug-id: LUS-11490 Signed-off-by: Chris Horn <chris.horn@hpe.com> Change-Id: Ie5e91d8ac1c810473768566593e993e47070e14c Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51087 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-6142 utils: use list_first/list_entry() on list heads This patch changes list_entry(foo.next, ...) to list_first_entry(&foo, ...) and list_entry(foo.prev, ...) to list_last_entry(&foo, ...) in cases where 'foo' is a list head - not a list member. Test-Parameters: trivial Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I9daaaed044af596f6407801259cfb672150bfc34 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50830 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
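The substitution described in LU-6142 can be illustrated with a self-contained C sketch. The list helpers below are minimal stand-ins written for this example in the style of the kernel's list API, not the definitions shipped with Lustre, and struct nid_item is a hypothetical element type.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for kernel-style list helpers (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_first_entry(head, type, member) \
    list_entry((head)->next, type, member)
#define list_last_entry(head, type, member) \
    list_entry((head)->prev, type, member)

/* Insert item just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *item, struct list_head *head)
{
    assert(item && head);
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Hypothetical element type that embeds a list link. */
struct nid_item {
    int id;
    struct list_head link;
};
```

Given a list head named head, list_entry(head.next, struct nid_item, link) and list_first_entry(&head, struct nid_item, link) yield the same element; the second form states the intent and makes it clear that head is a list head rather than a member.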