LU-14699 mdd: proactive changelog garbage collection
Currently changelog garbage collection starts when a user
exceeds the maximum idle timeout. There is also a limit on the
number of idle records, but it is applied only to old changelog
users that have no cur_time field, so it is effectively unused
nowadays. Another problem is that garbage collection starts
only when the changelog is almost full. As a result, the
changelog can often keep very old users that stay far longer
than the idle timeout and hold idle records above the maximum
limit, consuming space for nothing.
Patch reworks changelog GC in the following way:
- GC starts when the changelog is almost full (old way), when
either the idle time or the idle records limit is exceeded,
or when (idle_time * idle_records) exceeds its own limit.
The last limit is calculated as:
(idle_time * idle_records) / 84600 > (1 << 32) which is a
reasonable heuristic for deciding if a user is "too idle"
both when lots of records are being created quickly and
when a user has been idle for a very long time.
- to avoid processing all changelog users each time GC
checks these conditions, both the minimum user record and
the minimum user time are tracked when changelog users are
initialized or purged/canceled. Both values are stored in the
mdd_changelog fields mc_minrec and mc_mintime.
- test 160g is changed to test the new approach, in which idle
indexes are always checked along with idle time
- test 160s is added to sanity.sh to check the heuristic
approach based on the (idle_time * idle_records) value
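
The checks described above can be sketched roughly as follows.
This is an illustration only, not the code from the patch: the
struct, the helper names, and the cur_index, gc_max_idle_time,
gc_max_idle_recs and almost_full fields are hypothetical. Only
mc_minrec, mc_mintime and the heuristic formula come from this
message, and the divisor is used exactly as written above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants of the heuristic, copied from the commit message. */
#define GC_HEURISTIC_DIVISOR 84600ULL
#define GC_HEURISTIC_LIMIT   (1ULL << 32)

/* Hypothetical stand-in for the GC state kept in mdd_changelog. */
struct changelog_gc_state {
	uint64_t mc_minrec;        /* minimum record index among users */
	uint64_t mc_mintime;       /* minimum (oldest) user time, seconds */
	uint64_t cur_index;        /* newest record index (assumed field) */
	uint64_t gc_max_idle_time; /* idle time limit, seconds (assumed) */
	uint64_t gc_max_idle_recs; /* idle records limit (assumed) */
	bool almost_full;          /* "old way" trigger (assumed) */
};

/* Reset the tracked minimums before rescanning the user list. */
static void gc_reset_min(struct changelog_gc_state *st)
{
	st->mc_minrec = UINT64_MAX;
	st->mc_mintime = UINT64_MAX;
}

/* Fold one user into the tracked minimums. */
static void gc_track_user(struct changelog_gc_state *st,
			  uint64_t rec, uint64_t time)
{
	if (rec < st->mc_minrec)
		st->mc_minrec = rec;
	if (time < st->mc_mintime)
		st->mc_mintime = time;
}

/*
 * (idle_time * idle_recs) / DIVISOR > LIMIT, evaluated without the
 * 64-bit multiplication so that huge inputs cannot overflow:
 *   floor(t * r / D) > L  <=>  t * r >= (L + 1) * D
 *                         <=>  r >= ceil((L + 1) * D / t)
 */
static bool idle_product_exceeds(uint64_t idle_time, uint64_t idle_recs)
{
	const uint64_t threshold =
		(GC_HEURISTIC_LIMIT + 1) * GC_HEURISTIC_DIVISOR;
	uint64_t min_recs;

	if (idle_time == 0 || idle_recs == 0)
		return false;
	min_recs = threshold / idle_time + (threshold % idle_time != 0);
	return idle_recs >= min_recs;
}

/* Decide whether GC should start, using only the tracked minimums. */
static bool gc_needed(const struct changelog_gc_state *st, uint64_t now)
{
	uint64_t idle_time = now > st->mc_mintime ?
			     now - st->mc_mintime : 0;
	uint64_t idle_recs = st->cur_index > st->mc_minrec ?
			     st->cur_index - st->mc_minrec : 0;

	if (st->almost_full)
		return true;
	if (idle_time > st->gc_max_idle_time)
		return true;
	if (idle_recs > st->gc_max_idle_recs)
		return true;
	return idle_product_exceeds(idle_time, idle_recs);
}
```

Because only the two minimums are consulted, the check is O(1)
per call; the user list is walked only when the minimums are
rebuilt.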
Fixes: 3442db6faf68 ("LU-7340 mdd: changelogs garbage collection")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I6028f3164212a2377a4fc45b60a826c64f859099
Reviewed-on: https://review.whamcloud.com/45068
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>