I would think it's one of the most basic rules of benchmarking (or so I was taught during my earlier days as a student) that one should repeat the benchmark several times to smooth over the "randomness" inherent in the system.
That seems to be accounted for in this benchmark by just parsing more entries instead. The longer the benchmark runs (if the task is homogeneous), the less the noise should matter.
Yeah, of course. But that would also be the effect if the benchmark were shorter and re-run a hundred times.
Though, granted, in the case of re-running it you can do things like take the minimum or median time, which are much better benchmark metrics than the mean, since the mean is thrown off more by outliers and system noise.
Definitely not trying to defend this as a good benchmarking scheme.
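To make the minimum/median versus mean point concrete, here is a rough Python sketch of the "shorter run, repeated many times" approach. It is not the benchmark being discussed: `workload`, `time_repeated`, and `summarize` are hypothetical names made up for illustration, and the workload is just a stand-in for whatever is actually being measured.

```python
import statistics
import time


def time_repeated(fn, repeats=100):
    """Run fn() `repeats` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the minimum, median, and mean of a list of durations."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "mean": statistics.mean(durations),
    }


def workload():
    # Hypothetical stand-in for the real task (e.g. parsing a batch of entries).
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(summarize(time_repeated(workload, repeats=100)))
```

With made-up timings of `[1.0, 1.1, 1.0, 5.0, 1.05]` seconds, where one run was hit by a hiccup, `summarize` gives a minimum of 1.0 and a median of 1.05, but a mean of about 1.83. A single long run only ever gives you the equivalent of the mean, so the outlier cannot be separated out afterwards.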