From Perf Wiki
Revision as of 21:14, 4 May 2015 by Acme
- Enable callchains for guests (used by perf kvm). Doing this at least for the guest kernel should be quite feasible.
- The feature tests should be performed only when a file that needs those tests will be rebuilt, or at least only when some .c or .h file will be rebuilt.
  - An initial step would be for 'make install-doc' not to run the feature tests, since they are not needed there at all.
- Packages needed for the build should be checked before we start building object files, such as bison (bpetkov)
- Forward port the page fault tracepoints and use them in 'trace':
- Use the highest precision level available by default, e.g.: cycles:pp
- Use Kconfig to allow selecting features and building a minimal version of perf, e.g. one with just 'record' for use on embedded platforms.
  - David Ahern prototyped this; dig up those patches and update them.
- Make the instruction augmentation in the annotate browser platform specific.
  - Right now the augmentations are x86 specific but live in the common code.
- Cherry pick the plugin support in libtraceevent.
- Add build id support in PERF_RECORD_MMAP, so that we can support long running sessions during which components may be updated.
- Allow automatic downloading of DSOs with richer symtabs and DWARF info from debuginfo servers such as darkserver (https://fedoraproject.org/wiki/Darkserver).
- Limit the size of the build id cache (~/.debug), in a way similar to how ccache manages its cache.
- Adopt Vince Weaver's suite of tests in 'perf test'.
- Add reference counters to the dso and thread structs, so that in tools like 'top' we can remove unused threads from the dead_threads list and also unload symbol tables not referenced by any maps.
- Accumulate callchain info in order to get cumulative period info like 'sysprof'.
- Systemtap SDT support in 'perf probe'
- Move build-id trimming from perf-record to perf-archive:
  - Just write the build-id for all DSOs, without trying to process all samples at perf-record time to find out which DSOs had samples and thus should be included in the perf.data build-id header.
  - At perf-archive time, process all samples and trim the result, so that the tarball is smaller.
  - Perhaps even add a heuristic to figure out whether the savings would be worth the trouble of processing all samples: look at the build-id table and sum the file sizes. If the total is below some threshold, don't process the samples, just pack those files straight away; process the samples only if the total exceeds that threshold.
- Implement --initial-delay, already available in 'perf stat', on 'perf trace'.
- Fix 'perf top --stdio -g' to limit the number of lines displayed, as it currently does not take the callchains into account. Perhaps we need to wire this up with the logic for '--max-stack', which is already available for 'perf top'. The problem is that the output scrolls the screen, so we can't see the top entries.
- What I want is that if I am on bar*(), it annotates bar*(): no samples, just the call site (obtained from the callchain) disassembly. This is useful because in many cases there may be multiple call sites within a function and there may be inlines in between. It is hard to track this down if you cannot figure out the surrounding addresses of the call site. (Request made by Stephane Eranian)
- Factorize the multidimensional sorting between perf report and annotate (will be used by perf trace)
- Implement a perf cmp (profile comparison between two perf.data files) (DONE, it's called 'perf diff')
- Implement a perf view (GUI) (Partially done, see 'perf report --gtk')
- Enhance perf trace:
  - Handle the cpu field
  - Handle the timestamp
  - Use the in-perf ip -> symbol resolving
  - Use the in-perf pid -> cmdline resolving
  - Implement multidimensional sorting by field name
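
The threshold heuristic proposed in the 'perf archive' item above could look roughly like the Python sketch below. It is only an illustration of the arithmetic, not perf code: the function name, the default threshold value and the idea of passing in plain file sizes are all assumptions, since the item names no concrete threshold or data structure.

```python
from typing import Iterable

# Hypothetical default; the TODO item does not name a threshold value.
DEFAULT_THRESHOLD_BYTES = 64 * 1024 * 1024


def should_process_samples(dso_sizes: Iterable[int],
                           threshold_bytes: int = DEFAULT_THRESHOLD_BYTES) -> bool:
    """Decide whether trimming is likely worth a pass over all samples.

    dso_sizes holds the on-disk size, in bytes, of every DSO listed in the
    build-id table (how those sizes are obtained is left out of this sketch).
    """
    total = sum(dso_sizes)
    # Only a total above the threshold justifies processing the samples;
    # at or below it, the files would simply be packed as they are.
    return total > threshold_bytes
```

With a threshold of 100 bytes, `should_process_samples([10, 20], 100)` would skip sample processing, while `should_process_samples([60, 50], 100)` would request it.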
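
The cumulative period item above (callchain accumulation in the style of 'sysprof') can be illustrated with a small Python sketch. The sample layout used here, a leaf-first list of symbol names plus a period, is an assumption made for the example and is not how perf stores callchains internally; the function name is likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def accumulate_periods(samples: Iterable[Tuple[List[str], int]]
                       ) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (self_period, cumulative_period) keyed by symbol name.

    Each sample is (callchain, period), where callchain[0] is the symbol
    that was executing (the leaf) and later entries are its callers.
    """
    self_period: Dict[str, int] = defaultdict(int)
    cumulative: Dict[str, int] = defaultdict(int)
    for callchain, period in samples:
        if not callchain:
            continue
        # The leaf is the only symbol that gets the "self" period.
        self_period[callchain[0]] += period
        # Every symbol on the chain gets the cumulative period, counted
        # once per sample so that recursion does not inflate the total.
        for symbol in set(callchain):
            cumulative[symbol] += period
    return dict(self_period), dict(cumulative)
```

In this scheme a function such as main(), which rarely appears as the leaf, still shows a large cumulative period because it sits on almost every callchain.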