Do I need to run the perf userspace tool as system administrator (root), or can I run it (or at least some subcommands) as an ordinary user?
Linux – Do I need root (admin) permissions to run userspace ‘perf’ tool? (perf events are enabled in Linux kernel)
kernel linux not-root-user perf-event
Related Solutions
How to install the perf userland tool as non-root
Get/find sources for kernel-2.6.36-gentoo-r4 (in Gentoo Linux). The first check from this answer ("Actually, first you should look at /usr/src/linux and see if the kernel sources are still installed. You could just copy them to a directory you can write to.") was enough, though instead of copying the whole kernel sources I just linked them:

    $ mkdir -p build
    $ cd build
    $ ln -s /usr/src/linux-2.6.36-gentoo-r4
Create the directory where perf will be built, as I won't be able to write in the ~/build/linux-2.6.36-gentoo-r4 directory.

    $ mkdir -p perf
Actually, this is not what I did at first... the error messages from make were entirely unhelpful at first.

Go to the tools/perf directory in the kernel sources:

    $ cd linux-2.6.36-gentoo-r4/tools/perf
Build perf, not forgetting to pass the O=<destdir> option to the makefile, as the directory is not writable (there would be no such problem if I had copied rather than symlinked the kernel sources).

    $ make O=~/build/perf -k
    Makefile:565: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
    * new build flags or prefix
    CC ~/build/perf/perf.o
    CC ~/build/perf/builtin-annotate.o
    [...]
    CC ~/build/perf/util/scripting-engines/trace-event-python.o
    CC ~/build/perf/scripts/python/Perf-Trace-Util/Context.o
    AR ~/build/perf/libperf.a
    LINK ~/build/perf/perf
    ~/build/perf/libperf.a(trace-event-perl.o): In function `define_flag_value':
    ~/build/linux-2.6.36-gentoo-r4/tools/perf/util/scripting-engines/trace-event-perl.c:127: undefined reference to `PL_stack_sp'
    ~/build/linux-2.6.36-gentoo-r4/tools/perf/util/scripting-engines/trace-event-perl.c:131: undefined reference to `Perl_push_scope'
    [...]
    ~/build/perf/libperf.a(trace-event-python.o): In function `handler_call_die':
    ~/build/linux-2.6.36-gentoo-r4/tools/perf/util/scripting-engines/trace-event-python.c:53: undefined reference to `PyErr_Print'
    [...]
    collect2: ld returned 1 exit status
    make: *** [/home/narebski/build/perf/perf] Error 1
    GEN perf-archive
    make: Target `all' not remade because of errors.
Google for "undefined reference to `Perl_push_scope'". Find "Fail to install perf on slackware 13.1" on unix.stackexchange.com. Follow the advice in the self-answer, or, to be more exact, its diagnosis:

    $ make O=~/build/perf -k NO_LIBPERL=1 NO_LIBPYTHON=1
    Makefile:565: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
    * new build flags or prefix
    CC ~/build/perf/perf.o
    CC ~/build/perf/builtin-annotate.o
    [...]
    CC ~/build/perf/util/probe-finder.o
    AR ~/build/perf/libperf.a
    LINK ~/build/perf/perf
    GEN perf-archive
Note that this is a workaround rather than a solution (I have libperl.so).

Check the Makefile for the default install destination: it is $(HOME). Install perf in one's own home directory:

    $ make O=~/build/perf -k NO_LIBPERL=1 NO_LIBPYTHON=1 install
    Makefile:565: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
    GEN perf-archive
    install -d -m 755 '~/bin'
    install ~/build/perf/perf '~/bin'
    [...]
    install scripts/python/bin/* -t '~/libexec/perf-core/scripts/python/bin'
Check that ~/bin is in PATH.

Check that perf works correctly (don't forget to cd into a writable directory):

    $ cd
    $ perf record -f -- sleep 10
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.001 MB perf.data (~61 samples) ]
The output is a bit redacted, replacing my home directory with ~.
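The steps above could be collected into a small script. The following is only a sketch under stated assumptions: it prints the commands instead of running them, the SRC and OUT defaults and the function names perf_make_cmd and path_contains are made up for illustration, and it uses make -C in place of the cd step described above.

```shell
#!/bin/sh
# Sketch: print (do not run) the commands for an out-of-tree perf build.
# All paths below are assumptions; adjust them to your system.
SRC=${SRC:-/usr/src/linux}           # kernel source tree (may be read-only)
OUT=${OUT:-$HOME/build/perf}         # writable build directory
FLAGS="NO_LIBPERL=1 NO_LIBPYTHON=1"  # workaround flags from the steps above

# Print the make command for a target ("" = default build, or "install").
perf_make_cmd() {
    echo "make -C $SRC/tools/perf O=$OUT $FLAGS $1" | sed 's/ *$//'
}

# Succeed if directory $1 is a component of the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

echo "mkdir -p $OUT"
perf_make_cmd ""
perf_make_cmd install

# The install step puts perf in ~/bin, so that directory must be in PATH.
if path_contains "$HOME/bin" "$PATH"; then
    echo "# $HOME/bin is already in PATH"
else
    echo "# add to your shell profile: PATH=\$HOME/bin:\$PATH"
fi
```

Review the printed commands before running them by hand; the NO_LIBPERL and NO_LIBPYTHON flags remain a workaround, as noted above.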
If you are distributing the computations with MPI, then using an MPI-aware tool would give you more sensible results: with a distributed application, you might have issues of load imbalance, where one MPI process is idle waiting for data to come from other processes. If you happen to be profiling exactly that MPI process, your performance profile will be all wrong.
So, the first step is usually to find out about the communication and load balance pattern of your program, and identify a sample input that gives you the workload you want (e.g., CPU-intensive on rank 0). For instance, mpiP is an MPI profiling tool that can produce a very complete report about the communication pattern, how much time each MPI call took, etc.
Then you can run a code profiling tool on one or more selected MPI ranks. Anyway, using perf on a single MPI rank is likely not a good idea, because its measurements will also include the performance of the MPI library code, which is probably not what you are looking for.
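If you still want to try a profiler on one selected rank, a wrapper script is one possible way to do it. This is a minimal sketch, not a recommended setup: PROFILE_RANK and the function names are hypothetical, and it assumes the launcher exports the rank as OMPI_COMM_WORLD_RANK (Open MPI) or PMI_RANK (MPICH-style launchers). The caveat above about MPI library code appearing in the profile still applies.

```shell
#!/bin/sh
# Hypothetical wrapper: run perf only on one MPI rank, run the others unchanged.
PROFILE_RANK=${PROFILE_RANK:-0}   # hypothetical variable: rank to profile

# Print this process's rank, or nothing if no known launcher variable is set.
current_rank() {
    echo "${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-}}"
}

# Succeed only if rank $1 is known and equals the selected rank $2.
should_profile() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Only act when a command to run was given on the command line.
if [ "$#" -gt 0 ]; then
    rank=$(current_rank)
    if should_profile "$rank" "$PROFILE_RANK"; then
        # Write one data file per profiled rank.
        exec perf record -o "perf.rank$rank.data" -- "$@"
    else
        exec "$@"
    fi
fi
```

Assuming the script is saved as perf-rank.sh (a made-up name), it would be launched as something like `mpirun -np 4 ./perf-rank.sh ./my_app`, where ./my_app stands in for your program.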
Best Answer
What you can do with perf without being root depends on the kernel.perf_event_paranoid sysctl setting.

kernel.perf_event_paranoid = 2: you can't take kernel measurements; only user-space measurements of your own processes are possible. The perf utility can also still be used to analyse existing records with perf ls, perf report, perf timechart or perf trace.

kernel.perf_event_paranoid = 1: you can trace a command with perf stat or perf record, and get kernel profiling data.

kernel.perf_event_paranoid = 0: you can trace a command with perf stat or perf record, and get CPU event data.

kernel.perf_event_paranoid = -1: you get raw access to kernel tracepoints (specifically, you can mmap the file created by perf_event_open; I don't know what the implications are).
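To see which of these cases applies on a given machine, the current value can be read from /proc. The sketch below does that and prints a one-line summary; the function name describe_paranoid is made up, and the summaries are shorthand for the list above rather than official kernel wording.

```shell
#!/bin/sh
# Summarise a perf_event_paranoid level, following the list in the answer above.
describe_paranoid() {
    case "$1" in
        -1) echo "raw access to kernel tracepoints" ;;
        0)  echo "perf stat/record with CPU event data" ;;
        1)  echo "perf stat/record with kernel profiling data" ;;
        2)  echo "most restrictive level described above" ;;
        *)  echo "level not covered by the answer above" ;;
    esac
}

# On Linux the sysctl is exposed as a file under /proc.
PARANOID_FILE=/proc/sys/kernel/perf_event_paranoid
if [ -r "$PARANOID_FILE" ]; then
    level=$(cat "$PARANOID_FILE")
    echo "kernel.perf_event_paranoid = $level: $(describe_paranoid "$level")"
else
    echo "cannot read $PARANOID_FILE (perf events may not be enabled)"
fi
```

The same value is shown by `sysctl kernel.perf_event_paranoid`. Changing it requires root, for example with `sysctl -w kernel.perf_event_paranoid=1`.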