LU-12991 lnet: lnet response entries leak
author Alexey Lyashkov <c17817@cray.com>
Fri, 29 Nov 2019 10:43:15 +0000 (13:43 +0300)
committer Oleg Drokin <green@whamcloud.com>
Tue, 28 Jan 2020 06:02:14 +0000 (06:02 +0000)
LNetPut is called with the ACK flag, but LNetMDUnlink is issued before
the ACK arrives. This can happen because of a timeout or because the
application unlinks the MD itself (ldiskfs commit for difficult replies
on the MDT).
The MD is freed but the response tracker (rsp) is not detached, because
the ACK does not hold a reference to the MD between the time the request
is sent and the time the ACK arrives.
The monitor thread detects this situation and moves the RSP entry onto
the zombie list, where it is never freed: no message is processed for
it because the MD no longer exists.

Remove the response tracker when nobody wants the reply any more, i.e.
when LNetMDUnlink is called.
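
The ownership problem described above can be sketched with a toy model.
This is not LNet code: every toy_* name, the detach_first switch and the
toy_live_trackers counter are hypothetical and exist only for
illustration. Only the md_rspt_ptr field name is taken from the patch.
The sketch assumes the tracker is reachable solely through the MD, so
that freeing the MD without detaching leaves the tracker orphaned.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the LNet structures; only the fields needed here. */
struct toy_md;

struct toy_rspt {
	struct toy_md *rspt_md;        /* back pointer, NULL once the MD is gone */
};

struct toy_md {
	struct toy_rspt *md_rspt_ptr;  /* tracker for an outstanding ACK */
};

/* Hypothetical counter: trackers allocated and not yet freed. */
static int toy_live_trackers;

static struct toy_md *toy_md_alloc(void)
{
	return calloc(1, sizeof(struct toy_md));
}

/* Model of a PUT that requests an ACK: attach a tracker to the MD. */
static int toy_put_with_ack(struct toy_md *md)
{
	struct toy_rspt *rspt = calloc(1, sizeof(*rspt));

	if (rspt == NULL)
		return -1;
	rspt->rspt_md = md;
	md->md_rspt_ptr = rspt;
	toy_live_trackers++;
	return 0;
}

/* Model of detaching the tracker: clear the MD pointer and free it. */
static void toy_detach_rsp_tracker(struct toy_md *md)
{
	struct toy_rspt *rspt = md->md_rspt_ptr;

	md->md_rspt_ptr = NULL;
	free(rspt);
	toy_live_trackers--;
}

/*
 * Model of unlinking the MD before the ACK arrives.
 * detach_first == 0 mimics the behaviour described as buggy,
 * detach_first != 0 mimics the behaviour after the fix.
 * Returns the orphaned tracker, or NULL if nothing was left behind.
 */
static struct toy_rspt *toy_md_unlink(struct toy_md *md, int detach_first)
{
	struct toy_rspt *orphan;

	if (detach_first && md->md_rspt_ptr != NULL)
		toy_detach_rsp_tracker(md);

	orphan = md->md_rspt_ptr;
	if (orphan != NULL)
		orphan->rspt_md = NULL;    /* the MD is about to disappear */
	free(md);
	return orphan;
}
```

Calling toy_md_unlink(md, 0) after toy_put_with_ack(md) returns a
tracker that nothing else owns, which corresponds to the entry stuck on
the zombie list. With detach_first set, the function returns NULL and
toy_live_trackers drops back to zero.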

Test-parameters: trivial

Cray-bug-id: LUS-8188
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I90ad88cea41bb28b29f909c85b8273d41464ce81
Reviewed-on: https://review.whamcloud.com/36896
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/include/lnet/lib-lnet.h
lnet/lnet/lib-md.c

index 75387fb..1f3c36e 100644
@@ -256,6 +256,8 @@ lnet_md_free(struct lnet_libmd *md)
 {
        unsigned int  size;
 
+       LASSERTF(md->md_rspt_ptr == NULL, "md %p rsp %p\n", md, md->md_rspt_ptr);
+
        if ((md->md_options & LNET_MD_KIOV) != 0)
                size = offsetof(struct lnet_libmd, md_iov.kiov[md->md_niov]);
        else
index c1c1192..ced8a77 100644
@@ -553,6 +553,9 @@ LNetMDUnlink(struct lnet_handle_md mdh)
                lnet_eq_enqueue_event(md->md_eq, &ev);
        }
 
+       if (md->md_rspt_ptr != NULL)
+               lnet_detach_rsp_tracker(md, cpt);
+
        lnet_md_unlink(md);
 
        lnet_res_unlock(cpt);