If all parallel copies dup the same fd 0, they share the same file
reference count, and f_count suffers heavy lock contention. The syscall
cost of dup/close then accounts for only a small part of the test
result. Allocating a unique fd for each parallel copy removes most of
this unintended lock contention, so the test fully reflects the syscall
cost of dup/close.
If the parallel number is 1, the test result with this patch is the
same as the original one on an ICX server, which is expected. If the
parallel number is large, the test result accurately shows the
syscall cost of dup/close without the impact of data sharing.
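
A minimal sketch of the idea, not the actual patch: each process opens
its own temporary file and runs dup/close on that private fd, so its
f_count is not shared with the other copies. The helper name
dup_close_loop and the temporary file template are hypothetical and
used only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `iterations` dup/close pairs on an fd that
 * belongs only to the calling process.
 * Returns the number of completed pairs, or -1 if no private fd
 * could be created.
 */
long dup_close_loop(long iterations)
{
    char path[] = "/tmp/dupbench-XXXXXX";   /* assumed template name */
    int fd = mkstemp(path);                 /* unique file, unique struct file */
    long done = 0;

    if (fd < 0)
        return -1;
    unlink(path);                           /* keep the fd, drop the name */

    for (long i = 0; i < iterations; i++) {
        int copy = dup(fd);
        if (copy < 0)
            break;
        close(copy);
        done++;
    }

    close(fd);
    return done;
}
```

Calling dup_close_loop() once in every forked copy gives each copy its
own open file, whereas dup(0) in every copy operates on the one open
file inherited from the parent.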
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
title[22] -> title[18] to match length of strings passed.
(not pretty code and ultimately title[] is src to strcpy(),
but at least have the prototype and function definition match)
According to http://man7.org/linux/man-pages/man2/getpid.2.html:
> From glibc version 2.3.4 up to and including version 2.24, the glibc
wrapper function for getpid() cached PIDs, with the goal of avoiding
additional system calls when a process calls getpid() repeatedly.
So it is not suitable to measure system call performance through
getpid(). Calling syscall(SYS_getpid) directly is more appropriate.
Since glibc version 2.25, the PID cache has been removed to fix some
bugs, which makes the test suite wrongly report a performance
regression on system calls.
The same issue was reported to unixbench upstream a long time ago, but
nobody cares. https://github.com/kdlucas/byte-unixbench/pull/58
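
A minimal sketch of the direct call, assuming Linux with glibc. The
helper names raw_getpid and getpid_loop are hypothetical and are not
taken from the test suite.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Always enters the kernel, regardless of any caching in the wrapper. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/*
 * Hypothetical benchmark body: issue `iterations` real getpid system
 * calls and return how many of them succeeded.
 */
long getpid_loop(long iterations)
{
    long calls = 0;

    for (long i = 0; i < iterations; i++) {
        if (syscall(SYS_getpid) > 0)
            calls++;
    }
    return calls;
}
```

Because every iteration goes through syscall(), the measured cost does
not depend on which glibc version is installed.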
Signed-off-by: Yuanhong Peng <yummypeng@linux.alibaba.com>
github: closes#58
* Fix Result Report Race Condition in Pipe-based Context Switching Test
Ensure all report() calls yield correct information.
* Simplify code in Pipe-based Context Switching Test
Remove un-needed iter1 variable
Addresses "slave write failed: Broken pipe; aborting"
There are two processes that are alternating reading and writing
a sequence number of sizeof(unsigned long) size, which is 4 bytes
on 32-bit ILP32 ABI and 8 bytes on 64-bit LP64 ABI. The read/write
passing of the incrementing sequence number occurs in an infinite loop
until an alarm signals each process. There is a race condition
where a signal delivered to one process might close the pipes while
the second process was still attempting to read or write from the
pipes, and before the second process was interrupted with SIGALRM.
This patch fixes the race condition that occurs at the end of the
test run, after the first SIGALRM is delivered.
This patch does not address the paranoid possibility that read() or
write() of 4 or 8 bytes might theoretically be a partial read() or
write(), but that is extremely unlikely except in the case of a signal
being delivered, and the only signal expected is SIGALRM, and the
processing of SIGALRM by report() function does not return. (This
patch adds code to ignore SIGPIPE, so SIGALRM is the only expected
signal.)
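
A minimal sketch of the effect of ignoring SIGPIPE, not the patch
itself: with SIGPIPE ignored, a write() to a pipe whose read end is
already closed fails with EPIPE instead of killing the writer. The
helper name write_to_closed_pipe is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: ignore SIGPIPE, then write one sequence number
 * to a pipe whose read end has been closed.
 * Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the setup failed.
 */
int write_to_closed_pipe(void)
{
    int fds[2];
    unsigned long seq = 1;
    ssize_t n;
    int err;

    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;
    if (pipe(fds) != 0)
        return -1;

    close(fds[0]);                          /* the peer has gone away */
    n = write(fds[1], &seq, sizeof seq);    /* no SIGPIPE is delivered */
    err = (n < 0) ? errno : 0;

    close(fds[1]);
    return err;
}
```

A process that sees EPIPE here can stop its loop quietly and leave the
final report to the SIGALRM handler.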
github: fixes#1
const correctness
format string safety
remove assigned, but unused, variables
fix arguments to execl()
remove defined but unused warnings (for code used only by some tests)
barusan's patch mostly retains compatibility with Linux, but
unconditionally used machdep instead of /proc/cpuinfo.
This attempts to merge the patch without harming behaviour on Linux by
detecting the Darwin platform and using machdep there, while restoring
/proc/cpuinfo elsewhere.
The original source of this repo is https://code.google.com/p/byte-unixbench/
where the project is listed as being released under the GPL v2 license, but
as it was metadata in Google Code but not explicitly named in the project source
code itself, it did not transfer when the project was moved to GitHub.
This change clarifies the license of the project and provides a complete copy of
the license to make it explicit to future users and contributors.
Quoting original author of this patch:
Simply un-limits the 'misc' and 'system' suites.
Half-related thoughts about testing quality:
I'm curious why there's a shell1, shell8, and shell16 set of tests. Aren't the
latter two equivalent to './Run -c 8 shell1' and './Run -c 16 shell1'? I think
shell8 and shell16 are pointless if this is the case.
At the very least, I think shell8 should be out of the default run (the $index
set), because it will essentially give a misleading number if you have more
than a single core in the system. Isn't the purpose of the serial run to
essentially measure how well the system performs on single-threaded activities?
Or perhaps to measure how well a single core performs? Having 'shell8' in the
$index set artificially inflates the score for serialized runs and artificially
damages the score during maxed-out parallelized runs. If you are actually
interested in seeing how well 'shell8' does on exactly one core, shouldn't you
do the equivalent of 'taskset 1' on it, forcing the child processes to stay on
that single core?
End of quote.
Signed-off-by: Carlos L. Torres <carlos.torres@rackspace.com>