LU-17638 util: remove newer lnetctl export handling On the current Maloo VMs lnetctl export ends up segfaulting. For now, go back to the original code until we figure out why it fails on this setup yet works elsewhere. The reason for a partial revert is that other important work is ready to land that would be delayed by a full revert. Fixes: d3ef8f6993 ("LU-9680 lnet: add NLM_F_DUMP_FILTERED support") Test-Parameters: trivial testlist=sanity-lnet Signed-off-by: James Simmons <jsimmons@infradead.org> Change-Id: Ibd3437ee619cde9667d049455d641a602ea50174 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54436 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
LU-17600 lnet: delete lbstats and lnetunload It's not likely that anyone still uses these scripts. Test-Parameters: trivial Signed-off-by: Timothy Day <timday@amazon.com> Change-Id: I418bdf2a1428905d598fdffdf27dff80831350d0 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54250 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-9859 lnet: move CPT handling to LNet The CPT code is used by LNet and by ptlrpc, which is Lustre's LNet interface. Move this work to LNet and merge in the lib-mem.c code as well, since the two work closely together. Move CPT debugfs handling from libcfs to lnet. Now all remaining debugfs handling in libcfs is for debugging only. Test-Parameters: trivial Change-Id: I016a90520bd7c6428b45bafff8618bc864e9112b Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52923 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17578 lnet: fix &the_lnet.ln_mt_peerNIRecovq race To avoid a race, &the_lnet.ln_mt_peerNIRecovq must always be accessed with lnet_net_lock(0) protection. Test-Parameters: trivial Fixes: da23037 ("LU-16563 lnet: use discovered ni status to set initial health") Change-Id: Ic5e0194020200afdecba4cbf5afed274b14da388 Signed-off-by: Bruno Faccini <bfaccini@nvidia.com> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54163 Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com>
LU-9680 lnet: add NLM_F_DUMP_FILTERED support In addition to the different API levels for Netlink packets, we can also filter the data sent back when userland sets the NLM_F_DUMP_FILTERED flag. Support this across the various Netlink dumpit functions. This work is needed for proper support of the lnetctl export command. Update export to work with the Netlink API, which gives the export command proper IPv6 support. Test-Parameters: trivial testlist=sanity-lnet Change-Id: I0e8993b1f9a08199f282965601781aa6fd0e4844 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53004 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-16011 lnet: use preallocated bulks for server The server side wants preallocated bulks to avoid heavy lock contention on the page cache. Without them, LST was limited to 35Gb/s on a 3-rail host (HDR each) due to high CPU usage. Preallocated bulks increase memory consumption for small bulks, but performance improves dramatically, up to 74Gb/s, with very low CPU usage. Test-Parameters: testgroup=review-ldiskfs-arm testlist=sanity-lnet,lnet-selftest Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com> Change-Id: Icf396ba2ecfbded807b5722bb2c4cbe4d0084300 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50276 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-14391 utils: handle very large YAML data sets. Some functionality for Lustre, and even LNet, can return huge amounts of Netlink data that overwhelm the internal libyaml buffers. To resolve this, create a resizable internal buffer to collect all the Netlink data that is formatted into YAML. After the message is complete, we can feed this data to libyaml in chunk sizes its smaller internal buffer can handle. The libyaml internal buffer is a rolling buffer, so it is updated whenever we exceed its size. This allows collecting every single type of Lustre stat in one go, and supports sites that have very large LNet router setups. Test-Parameters: trivial Change-Id: I20fdbb19b0f3de3ab52e8ad568c6926f61f627b9 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54132 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
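The collect-then-drain scheme described above can be sketched in userspace C. This is only an illustration of the resizable-buffer idea, not the liblnetconfig implementation; the growbuf type and function names are hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer: collect all output first, then
 * hand it to a consumer in fixed-size chunks, as one might feed
 * a small rolling buffer like libyaml's. */
struct growbuf {
	char *data;
	size_t len;
	size_t cap;
};

static int growbuf_append(struct growbuf *b, const char *src, size_t n)
{
	if (b->len + n > b->cap) {
		size_t ncap = b->cap ? b->cap * 2 : 256;
		char *ndata;

		while (ncap < b->len + n)
			ncap *= 2;
		ndata = realloc(b->data, ncap);
		if (!ndata)
			return -1;
		b->data = ndata;
		b->cap = ncap;
	}
	memcpy(b->data + b->len, src, n);
	b->len += n;
	return 0;
}

/* Feed the collected data to 'consume' in pieces no larger than
 * 'chunk' bytes; returns the number of chunks handed over.  A NULL
 * consumer just counts the chunks. */
static size_t growbuf_drain(struct growbuf *b, size_t chunk,
			    void (*consume)(const char *, size_t))
{
	size_t off = 0, calls = 0;

	while (off < b->len) {
		size_t n = b->len - off;

		if (n > chunk)
			n = chunk;
		if (consume)
			consume(b->data + off, n);
		off += n;
		calls++;
	}
	return calls;
}
```

Because appends only grow the one staging buffer, the consumer's fixed chunk size never limits how much total data can be collected.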
LU-17476 lnet: use bits only to match ME in all cases If the NIDs belong to the same peer and the match bits match, declare a match even if the match bits were matched as not available or ignored. Test-Parameters: testlist=sanity env=ONLY=350,ONLY_REPEAT=10 Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com> Change-Id: I394c492381a2d069b34516c473220192df05fbd2 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54082 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17545 lnet: use unsafe_memcpy() when flexible array To avoid the <memcpy: detected field-spanning write (size 64) of single field "&lp->lp_data->pb_info" at .../lnet/lnet/peer.c:2456 (size 16)> false-positive messages/errors. Signed-off-by: Bruno Faccini <bfaccini@nvidia.com> Change-Id: I4e2fc58e31f60b434a9050393cd65b89c54f0798 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54069 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
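The warning comes from the kernel's fortified memcpy() seeing a destination that names a single struct field while the copy size spans into the flexible array that follows; unsafe_memcpy() annotates such copies as intentional. A userspace sketch of the layout involved (the struct and function names here are illustrative, not the real Lustre definitions):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a ping buffer: a fixed header followed by
 * a flexible array member.  One copy that fills header + array
 * "spans" the pi_hdr field; in the kernel, writing it as
 * memcpy(&pi->pi_hdr, src, total) trips FORTIFY_SOURCE because
 * total exceeds sizeof(pi_hdr), even though the allocation is big
 * enough.  unsafe_memcpy() (or copying through the struct pointer,
 * as below) expresses that the spanning write is deliberate. */
struct ping_info {
	unsigned int pi_hdr;	/* fixed part: the "single field" */
	unsigned int pi_ni[];	/* flexible array member */
};

/* Allocate a ping_info with nnis array slots and fill header and
 * array from a flat source buffer in a single copy. */
static struct ping_info *ping_info_fill(const unsigned int *src, int nnis)
{
	size_t total = sizeof(struct ping_info) +
		       nnis * sizeof(unsigned int);
	struct ping_info *pi = malloc(total);

	if (!pi)
		return NULL;
	memcpy(pi, src, total);	/* fills pi_hdr and pi_ni[] together */
	return pi;
}
```

The copy is safe because the allocation covers the whole span; the fortify check only objects to the destination expression naming one field.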
LU-17476 lnet: prefer to use bits only to match ME In some cases, it has been observed that a reply will arrive at the portal with the correct match bits, but is dropped by lnet_parse_put(). This appears to happen with LNet Multi-Rail peers, each having two separate NIDs. If a reply arrives with matchbits available and matching, but the NIDs don't match, confirm the match if the NIDs are found to belong to the same peer. This will only happen in cases where the reply would be dropped entirely, causing hundreds of seconds of delay until the RPC is resent, so the extra overhead of checking for a peer match before dropping the request is only in the error path and minimal compared to the alternative. Add CFS_FAIL_CHECK() for exercising the match NIDs code. That is in a hot codepath, but CFS_FAIL_CHECK() is marked unlikely() and this check is in the error case and _should_ only be hit when the message would have been dropped anyway, so it seems unlikely to impact performance in any meaningful way. Test-Parameters: testlist=sanity env=ONLY=350,ONLY_REPEAT=10 Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com> Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Change-Id: I10e1a2142539ddf5dabc26ce962cec1f2cfcf3db Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53843 Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-6142 lnet: SPDX for lnet/utils/ Convert from verbose license text to SPDX. Test-Parameters: trivial Signed-off-by: Timothy Day <timday@amazon.com> Change-Id: I0568f692c6799834794ed9c565bdac7ec9aef1d3 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54173 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 lnet: support updating LNet local NI settings The LNet API allows updating specific settings instead of applying a full new NI configuration. We can accomplish this using NLM_F_REPLACE with the LNET_CMD_NETS command. One change for the userland tools is that large NID addresses can now be used. Another is increasing the intf_name field size from IFNAMSIZ to LNET_MAX_STR_LEN, which requires expanding the err_str handling. This is because struct lnet_dlc_intf_descr is used to store both network addresses and network interfaces. Test-Parameters: trivial testlist=sanity-lnet Change-Id: Id334ed3a73ac6ec7a342d4616e32dcfef46907a7 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53560 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-9680 utils: fix nested attribute handling in liblnetconfig Testing with several different YAML layouts revealed several limitations. The first breakage, discovered while porting LNet export to Netlink, was that for a nested list whose first processed attribute was itself another nested list, the generated YAML was missing the needed '-'. Now we insert it manually. The second problem, discovered while testing Lustre stats, was that updating an individual key didn't work. We moved the printing of the new key directly under the NLA_NESTED case. This required creating yaml_nested_header(), which handles both empty nested lists and ones containing data. The comments added to the library should make this clear. Sending Netlink packets also had some bugs that have been resolved. The function yaml_fill_scalar_data() is used to parse out simple scalar values and key-value pairs. The original code's parsing altered the input string, which broke the do-while loop over entry, since entry dropped the rest of the configuration data. Instead of altering the string, we now carefully parse it without modification. Handle the case when nla_nest_start() fails to create a nlattr in lnet_genl_parse_list(), which prevents a node crash when we run out of space in the skbuff. Make sure the skbuff is large enough for LNet NI Netlink data collection by setting cb->min_dump_alloc to U16_MAX. Test-Parameters: trivial testlist=sanity-lnet Fixes: d137e9823ca ("LU-10003 lnet: use Netlink to support LNet ping commands") Change-Id: I2d702c9211abffc051db3203ec3811ceaedb2376 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53889 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17258 socklnd: stop connecting on too many retries If a peer repeatedly rejects connection requests with EALREADY, assume that it doesn't support as many connections as we're trying to create. Stop connecting to the peer altogether and either continue with the already-created connections, if there's at least one of each type, or fail. This helps avoid the assertion: "ASSERTION( (wanted & ((((1UL))) << (3))) != 0 ) failed" Test-Parameters: trivial testlist=sanity-lnet Fixes: 5afe3b053 ("LU-17258 socklnd: ensure connection type established upon race") Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com> Change-Id: I6072e91cc36544fc2f56c91cd78f6637cf82ecbc Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53955 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
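A minimal sketch of the retry-cap idea, assuming a hypothetical per-peer counter, connection-type flags, and cap value (this is not the socklnd code):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative cap on consecutive EALREADY rejections per peer. */
#define MAX_EALREADY_RETRIES 3

struct peer_conn_state {
	int  retries;		/* consecutive EALREADY rejections */
	bool have_bulk_in;	/* an established connection of each type */
	bool have_bulk_out;
	bool have_control;
};

/* Decide what to do after a connection attempt returned 'rc'.
 * Returns 1 to retry, 0 to proceed with existing connections,
 * or -ECONNABORTED to fail the peer outright. */
static int conn_retry_decision(struct peer_conn_state *p, int rc)
{
	if (rc != -EALREADY) {
		p->retries = 0;		/* any other outcome resets the count */
		return rc == 0 ? 0 : 1;
	}
	if (++p->retries < MAX_EALREADY_RETRIES)
		return 1;		/* keep trying for now */
	/* Peer likely supports fewer connections than requested: stop
	 * connecting and use what we already have, if it is usable. */
	if (p->have_bulk_in && p->have_bulk_out && p->have_control)
		return 0;
	return -ECONNABORTED;
}
```

The key design point matches the commit: give up before the retry loop can run forever, but only fail hard when no complete set of connection types exists.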
LU-17505 socklnd: return NETWORK_TIMEOUT to LNet on ETIMEDOUT Returning LNET_MSG_STATUS_LOCAL_TIMEOUT to LNet on ETIMEDOUT causes LNet to only decrement the local NI health score, while the issue may actually be with the remote NI. Changing this to return LNET_MSG_STATUS_NETWORK_TIMEOUT causes LNet to decrement both local NI and peer NI health. If the local NI is OK, it will recover its health score quickly, but the affected peer NI's health stays lowered until the peer NI is recovered. This helps LNet select healthy NIs of the same peer in the meantime. Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com> Change-Id: I916772477d1fd63571447262880a33830746f002 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53930 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17379 lnet: add LNetPeerDiscovered to LNet API LNetPeerDiscovered is added to allow Lustre to check whether a peer has been successfully discovered by LNet before attempting to open a connection to it. For example, given a mount command with a list of NIDs, Lustre can use the LNetAddPeer API to initiate discovery on every candidate first, and later use LNetPeerDiscovered to select a reachable peer to connect to. Test-Parameters: trivial testlist=sanity-lnet Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com> Change-Id: I7c9964148a5a2a24d7889b8b4c2e488a433ca258 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53926 Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-17081 build: compatibility for 6.5 kernels Linux commit v6.4-rc2-29-gc6585011bc1d splice: Remove generic_file_splice_read() Prefer filemap_splice_read and provide alternates for older kernels. Linux commit v6.4-rc2-30-g3fc40265ae2b iov_iter: Kill ITER_PIPE ITER_PIPE and iov_iter_is_pipe() are removed, provide a replacement for iov_iter_is_pipe Linux commit v6.4-rc4-53-g54d020692b34 mm/gup: remove unused vmas parameter from get_user_pages() Use vma_lookup() to acquire the vma following get_user_pages() Linux commit v6.4-rc7-1884-gdc97391e6610 sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) Use sendmsg when MSG_SPLICE_PAGES is defined. Provide a wrapper using sendpage() for older kernels. HPE-bug-id: LUS-11811 Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com> Change-Id: I95a0954a602c8db08d30b38a50dcd50107c8f268 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52258 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Jian Yu <yujian@whamcloud.com> Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com> Reviewed-by: xinliang <xinliang.liu@linaro.org> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-9680 lnet: Convert net_fault.c to work with large NIDs Modify the lnet fault injection to handle large NIDs. Test-Parameters: trivial testlist=sanity-lnet Signed-off-by: Chris Horn <chris.horn@hpe.com> Change-Id: I0d57d3bf562444250b10fd83437107e2e3fe5a1b Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53731 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17467 build: Expand CUDA source detection logic Fix the configure logic, which did not handle disabling a package (variable set to 'no') for the CUDA and GDS source paths. Signed-off-by: Jean-Baptiste Skutnik <jb.skutnik@gmail.com> Change-Id: Icb96274a6df2508f8e3010daef0ba1d17b4471dc Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53832 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com> Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-17271 kfilnd: Allocate tn_mr_key before kfilnd_peer A race exists between kfilnd_peer and tn_mr_key allocation that could result in RKEY re-use and data corruption. Thread 1: Posts tagged receive with RKEY based on peerA::kp_local_session_key X and tn_mr_key Y Thread 2: Fetches peerA with kp_local_session_key X Thread 1: Cancels tagged receive, marks peerA for removal, and releases tn_mr_key Y Thread 2: allocates tn_mr_key Y At this point, thread 2 has the same RKEY used by thread 1. The fix is to always allocate the tn_mr_key before looking up the peer, and always mark peers for removal before releasing tn_mr_key. This commit modifies the TN allocation to ensure the tn_mr_key is allocated before looking up the target peer. HPE-bug-id: LUS-11972 Test-Parameters: trivial Signed-off-by: Chris Horn <chris.horn@hpe.com> Change-Id: I2e0948ae4fe7c5dfb86e297a3437213f193bf67c Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53029 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com> Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>