Whamcloud - gitweb
LU-10391 lnet: use GFP_ATOMIC for alloc under spinlock 97/53597/5
author Andreas Dilger <adilger@whamcloud.com>
Thu, 4 Jan 2024 21:07:43 +0000 (14:07 -0700)
committer Oleg Drokin <green@whamcloud.com>
Thu, 18 Jan 2024 06:14:39 +0000 (06:14 +0000)
Use genradix_ptr_alloc(GFP_ATOMIC) when allocating under a spinlock
(in this case lnet_net_lock_current() acquires the per-CPT lock)
to avoid "BUG: sleeping function called from invalid context" in
lnet_discover() and lnet_scan_route() when memory debugging is enabled.

 BUG: sleeping function called from invalid context at page_alloc.c:3423
 in_atomic(): 1, irqs_disabled(): 0, pid: 22268, name: lnetctl
 CPU: 3 PID: 22268 Comm: lnetctl  3.10.0-7.9-debug #1
 Call Trace:
   dump_stack+0x19/0x1b
   __might_sleep+0xd9/0x100
   __alloc_pages_nodemask+0x313/0xca0
   alloc_pages_current+0x98/0x110
   __get_free_pages+0xe/0x50
   __genradix_ptr_alloc+0xa2/0x1a0 [libcfs]
   lnet_discover+0x16e/0x220 [lnet]
   lnet_ping_cmd+0x6ab/0x1160 [lnet]
   genl_family_rcv_msg+0x1fa/0x420
   genl_rcv_msg+0x5b/0xc0
   netlink_rcv_skb+0xa9/0xc0
   genl_rcv+0x28/0x40
   netlink_unicast+0x16a/0x210
   netlink_sendmsg+0x308/0x420
   sock_sendmsg+0xb0/0xf0
   ___sys_sendmsg+0x401/0x410
   __sys_sendmsg+0x51/0x90
   SyS_sendmsg+0x12/0x20
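
The rule behind the trace above can be sketched with a small userspace
model. This is not Lustre or kernel code: every model_* name below is
hypothetical and only mimics the idea that an allocation which may sleep
(GFP_KERNEL) is invalid while a spinlock is held, whereas a non-sleeping
allocation (GFP_ATOMIC) is allowed.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's GFP_KERNEL and GFP_ATOMIC. */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

/* Greater than zero while a modelled spinlock is held (atomic context). */
static int model_atomic_depth;
/* Counts allocations that could sleep while in atomic context. */
static int model_violations;

static void model_lock(void)   { model_atomic_depth++; }
static void model_unlock(void) { model_atomic_depth--; }

/* Allocate memory, recording a violation where the kernel's
 * __might_sleep() check would report "sleeping function called
 * from invalid context". */
static void *model_alloc(size_t size, enum model_gfp gfp)
{
	if (gfp == MODEL_GFP_KERNEL && model_atomic_depth > 0)
		model_violations++;
	return calloc(1, size);
}

/* Allocate once under the modelled lock with the given flag and
 * return how many violations were recorded. */
static int model_violations_for(enum model_gfp gfp)
{
	void *p;

	model_violations = 0;
	model_lock();
	p = model_alloc(64, gfp);
	model_unlock();
	free(p);
	return model_violations;
}
```

In this model, model_violations_for(MODEL_GFP_KERNEL) records one
violation and model_violations_for(MODEL_GFP_ATOMIC) records none, which
mirrors the flag change made by this patch.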

Fixes: 68254c484a ("LU-10391 lnet: handle discovery with Netlink")
Fixes: 4ccac8297d ("LU-9680 lnet: collect data about routes using Netlink")
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I96f1fd6f6273a7720d661526e58a94210f3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53597
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lnet/lnet/api-ni.c

index d177a81..aebc11c 100644
@@ -6572,7 +6572,7 @@ static int lnet_scan_route(struct lnet_genl_route_list *rlist,
 
                                prop = genradix_ptr_alloc(&rlist->lgrl_list,
                                                          rlist->lgrl_count++,
-                                                         GFP_KERNEL);
+                                                         GFP_ATOMIC);
                                if (!prop)
                                        GOTO(failed_alloc, rc = -ENOMEM);
 
@@ -9050,7 +9050,7 @@ lnet_discover(struct lnet_processid *pid, u32 force,
                struct lnet_processid *id;
 
                id = genradix_ptr_alloc(&dlist->lgpl_list,
-                                       dlist->lgpl_list_count++, GFP_KERNEL);
+                                       dlist->lgpl_list_count++, GFP_ATOMIC);
                if (!id) {
                        rc = -ENOMEM;
                        goto out_decref;