LU-13309 osd: use per-cpu counters for brw_stats 15/37915/14
author Andrew Perepechko <andrew.perepechko@hpe.com>
Thu, 2 Dec 2021 07:26:32 +0000 (10:26 +0300)
committer Oleg Drokin <green@whamcloud.com>
Tue, 11 Jan 2022 06:18:23 +0000 (06:18 +0000)
commit 787c1884e6451ae764568ade3658e537dcc19097
tree 4e48f6f9273baa569341dc49e161549c3cf7c342
parent 4a4d888a07da2eaec5723473cdc5440768b4a9e3

Based on perf reports, oh_lock is highly contended
when running IOR with NVMe storage, so we need to
move to per-cpu counters.
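
The idea behind per-cpu counters can be sketched in plain userspace C. This is only an illustration of the technique, not the code from this change: the names (demo_hist, DEMO_NR_CPUS, DEMO_HIST_MAX) are hypothetical, and the kernel's per-cpu allocation is modelled here as one array row per cpu. Each cpu increments only its own row, so the update path takes no shared lock; a reader sums the rows when the statistics are printed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NR_CPUS  8   /* assumption: an 8-cpu system */
#define DEMO_HIST_MAX 32  /* assumption: 32 buckets per histogram */

/* One row of 4-byte buckets per cpu; hypothetical layout for illustration. */
struct demo_hist {
	uint32_t buckets[DEMO_NR_CPUS][DEMO_HIST_MAX];
};

static void demo_hist_init(struct demo_hist *h)
{
	memset(h, 0, sizeof(*h));
}

/*
 * Update path: touches only the row that belongs to the calling cpu,
 * so no lock shared between cpus is needed here.
 */
static void demo_hist_tally(struct demo_hist *h, unsigned int cpu,
			    unsigned int bucket)
{
	assert(cpu < DEMO_NR_CPUS && bucket < DEMO_HIST_MAX);
	h->buckets[cpu][bucket]++;
}

/* Read path: the slow side, sums one bucket across all cpus. */
static uint64_t demo_hist_read(const struct demo_hist *h, unsigned int bucket)
{
	uint64_t sum = 0;
	unsigned int cpu;

	assert(bucket < DEMO_HIST_MAX);
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += h->buckets[cpu][bucket];
	return sum;
}
```

The trade-off is that updates become cheap and contention-free while reads cost one pass over all cpus, which suits statistics that are updated on every I/O but read rarely.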

struct brw_stats becomes larger: from 3872 to 18208 bytes.
Also, 4 bytes are allocated on each cpu for every counter.
With an 8-cpu system and 32 4-byte per-cpu counters,
there are 448 per-cpu counters or 1792 bytes per cpu.
These counters will either reuse already
allocated per-cpu pages or allocate a new page on each cpu
(8 pages total).
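
The figures above can be checked with a few lines of C. This is a sketch of the arithmetic only: the 4096-byte page size is an assumption not stated in the commit message, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the commit message. */
#define CALC_COUNTERS_PER_CPU 448u
#define CALC_COUNTER_BYTES    4u
#define CALC_NR_CPUS          8u
#define CALC_OLD_STRUCT_BYTES 3872u
#define CALC_NEW_STRUCT_BYTES 18208u
/* Assumption: 4 KiB pages; the commit message does not give a page size. */
#define CALC_PAGE_BYTES       4096u

/* Bytes of per-cpu storage that one cpu needs for all counters. */
static size_t calc_percpu_bytes(size_t counters, size_t counter_bytes)
{
	return counters * counter_bytes;
}

/* Whole pages needed to hold the given number of bytes (rounds up). */
static size_t calc_pages_needed(size_t bytes, size_t page_bytes)
{
	assert(page_bytes > 0);
	return (bytes + page_bytes - 1) / page_bytes;
}
```

With these inputs, calc_percpu_bytes(448, 4) gives 1792 bytes per cpu, which fits in a single page, so the worst case of one new page on each of 8 cpus is 8 pages. The struct itself grows by 18208 - 3872 = 14336 bytes. Note also that 448 / 32 = 14, which would be consistent with 14 histograms of 32 counters each, although the commit message does not spell that out.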

Change-Id: I24536a0138067fb868aaf962d9321dea7566d13f
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-8007, LUS-8185
Reviewed-on: https://review.whamcloud.com/37915
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/include/lprocfs_status.h
lustre/obdclass/lprocfs_status.c
lustre/obdclass/lprocfs_status_server.c
lustre/osd-ldiskfs/osd_handler.c
lustre/osd-ldiskfs/osd_io.c
lustre/osd-ldiskfs/osd_lproc.c
lustre/osd-zfs/osd_handler.c
lustre/osd-zfs/osd_io.c
lustre/osd-zfs/osd_lproc.c