Whamcloud - gitweb
LU-12168 utils: obdfilter fix for SHORT msgs 10/34610/3
author Alexander Boyko <c17825@cray.com>
Mon, 8 Apr 2019 07:55:24 +0000 (03:55 -0400)
committer Oleg Drokin <green@whamcloud.com>
Tue, 30 Apr 2019 03:35:37 +0000 (03:35 +0000)
Sometimes obdfilter-survey shows SHORT instead of min,max.
This can happen when two signals for the parent process arrive
within a single verbose time period: the counters are updated and
start_time is reset. By default the time period is 1 second.

ost  1 sz 16777216K rsz 2048K obj    4 thr    8
write 3662.99 [4286.00,4528.95] rewrite 3873.87 [4746.85, 4857.48]
read 8088.39      SHORT

The patch fixes this issue by resetting the counters and the time
only when statistics are printed or when all threads have started.

obdfilter-survey can still print SHORT after the patch when the
subtest time is too short (1-2 seconds). The detailed log shows
this case as:

Total: total 8192 threads 4 sec 1.692006 4841.590396/second

Test-Parameters: trivial
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-7110
Change-Id: I9b1521c23e9360216a279ab5c28c39bcaca9974b
Reviewed-on: https://review.whamcloud.com/34610
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/utils/obd.c

index 5903dca..3825730 100644
@@ -530,16 +530,21 @@ static void shmem_snap(int total_threads, int live_threads)
         }
 
         secs = difftime(&this_time, &prev_time);
-        if (prev_valid && secs > 1.0)    /* someone screwed with the time? */
-                printf("%d/%d Total: %f/second\n", non_zero, total_threads,
-                       total / secs);
+       if (prev_valid && secs > 1.0) {   /* someone screwed with the time? */
+               printf("%d/%d Total: %f/second\n", non_zero, total_threads,
+                      total / secs);
 
-        memcpy(counter_snapshot[1], counter_snapshot[0],
-               total_threads * sizeof(counter_snapshot[0][0]));
-        prev_time = this_time;
-        if (!prev_valid &&
-            running == total_threads)
-                prev_valid = 1;
+               memcpy(counter_snapshot[1], counter_snapshot[0],
+                      total_threads * sizeof(counter_snapshot[0][0]));
+               prev_time = this_time;
+       }
+       if (!prev_valid && running == total_threads) {
+               prev_valid = 1;
+               /* drop counters when all threads were started */
+               memcpy(counter_snapshot[1], counter_snapshot[0],
+                      total_threads * sizeof(counter_snapshot[0][0]));
+               prev_time = this_time;
+       }
 }
 
 static void shmem_stop(void)