Main Page

From Perf Wiki
(Difference between revisions)
(Internals)
(perf: Linux profiling with performance counters: add Google summer of code section)
 
(27 intermediate revisions by 9 users not shown)
Line 1: Line 1:
<b><big><i><center>...More than just counters...</center></i></big></b>
+
== <tt>perf:</tt> Linux profiling with performance counters ==
 +
''...More than just counters...''
  
 +
=== Introduction ===
  
<big>'''Performance Counters for Linux Wiki'''</big>
+
This is the wiki page for the Linux <tt>perf</tt> command, also called perf_events. <tt>perf</tt> is powerful: it can instrument CPU performance counters, tracepoints, kprobes, and uprobes (dynamic tracing). It is capable of lightweight profiling. It is also included in the Linux kernel, under tools/perf, and is frequently updated and enhanced.
  
This is the wiki page for the perfcounters subsystem in Linux.
+
<tt>perf</tt> began as a tool for using the performance counters subsystem in Linux, and has had various enhancements to add tracing capabilities.
  
Performance counters are special hardware registers available on most modern
+
Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots. <tt>perf</tt> provides rich, generalized abstractions over hardware-specific capabilities. Among others, it provides per-task, per-CPU, and per-workload counters, sampling on top of these, and source code event annotation.
CPUs. These registers count the number of certain types of hw events: such  
+
as instructions executed, cache-misses suffered, or branches mispredicted -
+
without slowing down the kernel or applications. These registers can also
+
trigger interrupts when a threshold number of events have passed - and can
+
thus be used to profile the code that runs on that CPU.  
+
+
The Linux Performance Counter subsystem provides rich abstractions over these
+
hardware capabilities. It provides per task, per CPU and per-workload counters,
+
counter groups, and it provides sampling capabilities on top of those - and more.
+
  
It also provides abstraction for 'software events' - such as minor/major page faults, task migrations, task context-switches and tracepoints.
+
Tracepoints are instrumentation points placed at logical locations in code, such as for system calls, TCP/IP events, file system operations, etc. These have negligible overhead when not in use, and can be enabled by the <tt>perf</tt> command to collect information including timestamps and stack traces. <tt>perf</tt> can also dynamically create tracepoints using the kprobes and uprobes frameworks, for kernel and userspace dynamic tracing. The possibilities with these are endless.
  
There is a new tool ('perf') that makes full use of this new kernel subsystem. It can be used to optimize, validate and measure applications, workloads or the full system.
+
The userspace <tt>perf</tt> command presents a simple-to-use interface with commands like:
  
'perf' is hosted in the upstream kernel repository and can be found under: tools/perf/
+
* <tt>[[Tutorial#Counting_with_perf_stat|perf stat]]</tt>: obtain event counts
 +
* <tt>[[Tutorial#Sampling_with_perf_record|perf record]]</tt>: record events for later reporting
 +
* <tt>[[Tutorial#Sample_analysis_with_perf_report|perf report]]</tt>: break down events by process, function, etc.
 +
* <tt>[[Tutorial#Source_level_analysis_with_perf_annotate|perf annotate]]</tt>: annotate assembly or source code with event counts
 +
* <tt>[[Tutorial#Live_analysis_with_perf_top|perf top]]</tt>: see live event count
 +
* <tt>[[Tutorial#Benchmarking_with_perf_bench|perf bench]]</tt>: run different kernel microbenchmarks
  
== Getting Started ==
+
To learn more, see the examples in the [[Tutorial]] or how to do a [[Top-Down Analysis]].
  
Once you have installed 'perf' on your system, the simplest way to start profiling a userspace program is to use the "perf record" and "perf report" commands as follows:
+
To ask questions or report bugs and issues, write to the [https://lore.kernel.org/linux-perf-users/ mailing list] or use [https://bugzilla.kernel.org/buglist.cgi?bug_status=__open__&order=changeddate%20DESC%2Cpriority%2Cbug_severity&product=Tracing%2FProfiling&query_format=advanced bugzilla].
  
  $ <b>perf record -f -- git gc</b>
+
=== Wiki Contents ===
&nbsp;
+
Counting objects: 1283571, done.
+
Compressing objects: 100% (206724/206724), done.
+
Writing objects: 100% (1283571/1283571), done.
+
Total 1283571 (delta 1070675), reused 1281443 (delta 1068566)
+
[ perf record: Captured and wrote 31.054 MB perf.data (~1356768 samples) ]
+
&nbsp;
+
$ <b>perf report --sort comm,dso,symbol</b> | head -10
+
# Samples: 1355726
+
#
+
# Overhead          Command                            Shared Object  Symbol
+
# ........  ...............  .......................................  ......
+
#
+
    31.53%              git  /usr/bin/git                            [.] 0x0000000009804f
+
    13.41%        git-prune  /usr/bin/git-prune                      [.] 0x000000000ad06d
+
    10.05%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] _nl_make_l10nflist
+
      5.36%        git-prune  /usr/lib/libz.so.1.2.3.3                [.] 0x00000000009d51
+
      4.48%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] memcpy
+
  
For more examples of how 'perf' can be used see [[perf examples]].
+
* [[Tutorial]]
 +
* [[Top-Down Analysis]]
 +
* [[Todo]]
 +
* [[HardwareReference]]
 +
* [[perf_events kernel ABI]]
 +
* [[perf tools support for Intel&reg; Processor Trace]]
 +
* [[Useful Links]]: How perf tools work, examples of usage to solve real problems, observability articles, hardware manuals
 +
* [[Glossary]]
 +
* [[Latest Manual Pages]]
 +
* [[Development]]
  
== TODO list ==
+
=== Google Summer of Code ===
  
=== Perf tools ===
+
As part of the [https://www.linuxfoundation.org/ Linux Foundation], the perf tool has participated in the [https://summerofcode.withgoogle.com/ Google Summer of Code] since 2021. [https://wiki.linuxfoundation.org/gsoc/2024-gsoc-perf Check out the 2024 process].
 
+
* Factorize the multidimensional sorting between perf report and annotate (will be used by perf trace)
+
* Implement a perf cmp (profile comparison between two perf.data)
+
* Implement a perf view (GUI)
+
* Enhance perf trace:
+
** Handle the cpu field
+
** Handle the timestamp
+
** Use the in-perf ip -> symbol resolving
+
** Use the in-perf pid -> cmdline resolving
+
** Implement multidimensional sorting by field name
+
 
+
== Internals ==
+
 
+
* Performance Monitoring Units (PMUs)
+
** [[Nehalem | Intel(TM) x86 Nehalem PMU]]
+
** [[Montecito | Intel(TM) Itanium(TM) 2 PMU]]
+
* Performance Counters for Linux
+
** [[PCLstruct| PCL core kernel data structures]]
+
** [[PCL internals | PCL core kernel internals]]
+
** [[perf internals | perf tool internals]]
+
 
+
== Notes ==
+
 
+
* INSTALLATION:
+
** in order to get the documentation installed you'll need these packages:
+
** I used on RHEL5: yum install asciidoc xmlto
+
*** asciidoc
+
*** tetex-fonts
+
*** tetex-dvips
+
*** dialog
+
*** tetex
+
*** tetex-latex
+
*** xmltex
+
*** passivetex
+
*** w3m
+
*** xmlto
+
** Don't forget to go into tools/perf and do 'make install-man'
+
** without doing the above, you won't be able to run 'perf help <command>'
+

Latest revision as of 17:33, 23 January 2024


perf: Linux profiling with performance counters

...More than just counters...

Introduction

This is the wiki page for the Linux perf command, also called perf_events. perf is powerful: it can instrument CPU performance counters, tracepoints, kprobes, and uprobes (dynamic tracing). It is capable of lightweight profiling. It is also included in the Linux kernel, under tools/perf, and is frequently updated and enhanced.

perf began as a tool for using the performance counters subsystem in Linux, and has had various enhancements to add tracing capabilities.

Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots. perf provides rich, generalized abstractions over hardware-specific capabilities. Among others, it provides per-task, per-CPU, and per-workload counters, sampling on top of these, and source code event annotation.
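
As a rough illustration of counting hardware events, the sketch below wraps perf stat in a small shell function. The function name count_events is hypothetical and only for illustration; the sketch assumes perf is installed, that the CPU exposes these generic events, and that the kernel permits counter access for the current user (see /proc/sys/kernel/perf_event_paranoid).

```shell
# Hypothetical helper: count a few generic hardware events while a command runs.
# perf stat prints the totals to stderr once the command exits.
count_events() {
    perf stat -e cycles,instructions,cache-misses,branch-misses -- "$@"
}
```

Calling count_events ls /usr/lib, for example, would run ls under perf stat and report one total per listed event for that single task.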

Tracepoints are instrumentation points placed at logical locations in code, such as for system calls, TCP/IP events, file system operations, etc. These have negligible overhead when not in use, and can be enabled by the perf command to collect information including timestamps and stack traces. perf can also dynamically create tracepoints using the kprobes and uprobes frameworks, for kernel and userspace dynamic tracing. The possibilities with these are endless.
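
The tracepoint and kprobe workflow described above might be sketched as follows. The function names are hypothetical, the sched:sched_switch tracepoint and any kernel function passed in (for example tcp_sendmsg) are assumed to exist on the running kernel, and system-wide tracing (-a) and perf probe normally require root privileges.

```shell
# List the scheduler tracepoints the running kernel exposes.
list_sched_tracepoints() {
    perf list 'sched:*'
}

# Record context switches on all CPUs, with stack traces, for N seconds
# (default 1). Samples are written to ./perf.data.
trace_context_switches() {
    perf record -e sched:sched_switch -a -g -- sleep "${1:-1}"
}

# Dynamic tracing via kprobes: create a probe on a kernel function,
# record hits for N seconds (default 1), then remove the probe again.
probe_kernel_function() {
    func=$1
    perf probe --add "$func" || return 1
    perf record -e "probe:$func" -a -- sleep "${2:-1}"
    perf probe --del "$func"
}
```

For instance, probe_kernel_function tcp_sendmsg 5 would trace calls to that function for five seconds; the result can then be inspected with perf report or perf script.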

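The userspace perf command presents a simple-to-use interface with commands like:

* perf stat: obtain event counts
* perf record: record events for later reporting
* perf report: break down events by process, function, etc.
* perf annotate: annotate assembly or source code with event counts
* perf top: see live event count
* perf bench: run different kernel microbenchmarks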
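
For instance, a record-then-report session can be sketched as a small shell function. The name profile_command is hypothetical; the sketch assumes perf is installed and that the current directory is writable, since perf record stores its samples in ./perf.data by default.

```shell
# Hypothetical helper: sample a command, then summarize where time was spent.
profile_command() {
    # Sample the command with call graphs (-g); samples land in ./perf.data.
    perf record -g -- "$@" || return 1
    # Break the samples down by command, shared object, and symbol.
    perf report --stdio --sort comm,dso,symbol | head -n 30
}
```

Running profile_command git gc would profile that workload and print the top of the text report; perf annotate can then be pointed at a hot symbol from the report for an instruction-level view.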

To learn more, see the examples in the Tutorial or how to do a Top-Down Analysis.

To ask questions or report bugs and issues, write to the mailing list or use bugzilla.

Wiki Contents

Google Summer of Code

As part of the Linux Foundation, the perf tool has participated in the Google Summer of Code since 2021. Check out the 2024 process.
