From b7035222bd649d66beaa8dc8774ff53623fa54dd Mon Sep 17 00:00:00 2001
From: Alexey Lyashkov
Date: Fri, 29 Nov 2019 13:43:15 +0300
Subject: [PATCH] LU-12991 lnet: lnet response entries leak

LNetPut is called with the ACK flag set, but LNetMDUnlink is issued
before the ACK arrives. This can happen because of a timeout or an
explicit application call (ldiskfs commit for difficult replies on
the MDT). The MD is freed but the response tracker (rsp) is not
detached, because the ACK does not hold a reference to the MD between
the time the request is sent and the time the ACK arrives. The monitor
thread detects this situation and moves the RSP entry to the zombie
list, where it is never freed: no message is processed because the MD
is absent.

Remove the response tracking when nobody wants the reply, that is,
when LNetMDUnlink is called.

Test-parameters: trivial
Cray-bug-id: LUS-8188
Signed-off-by: Alexey Lyashkov
Change-Id: I90ad88cea41bb28b29f909c85b8273d41464ce81
Reviewed-on: https://review.whamcloud.com/36896
Reviewed-by: Amir Shehata
Reviewed-by: Chris Horn
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Neil Brown
Reviewed-by: Oleg Drokin
---
 lnet/include/lnet/lib-lnet.h | 2 ++
 lnet/lnet/lib-md.c           | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/lnet/include/lnet/lib-lnet.h b/lnet/include/lnet/lib-lnet.h
index 75387fb..1f3c36e 100644
--- a/lnet/include/lnet/lib-lnet.h
+++ b/lnet/include/lnet/lib-lnet.h
@@ -256,6 +256,8 @@ lnet_md_free(struct lnet_libmd *md)
 {
 	unsigned int size;
 
+	LASSERTF(md->md_rspt_ptr == NULL, "md %p rsp %p\n", md, md->md_rspt_ptr);
+
 	if ((md->md_options & LNET_MD_KIOV) != 0)
 		size = offsetof(struct lnet_libmd, md_iov.kiov[md->md_niov]);
 	else
diff --git a/lnet/lnet/lib-md.c b/lnet/lnet/lib-md.c
index c1c1192..ced8a77 100644
--- a/lnet/lnet/lib-md.c
+++ b/lnet/lnet/lib-md.c
@@ -553,6 +553,9 @@ LNetMDUnlink(struct lnet_handle_md mdh)
 		lnet_eq_enqueue_event(md->md_eq, &ev);
 	}
 
+	if (md->md_rspt_ptr != NULL)
+		lnet_detach_rsp_tracker(md, cpt);
+
 	lnet_md_unlink(md);
 
 	lnet_res_unlock(cpt);
-- 
1.8.3.1