Jolsa Howto Measure Child Functions Cycles

From Perf Wiki
(Difference between revisions)
Jump to: navigation, search
(Created page with "== Example program == * ex.c <pre> #define FOO(__f) \ static int __f(int i) \ { \ while (i--); \ } FOO(krava_a); FOO(krava_b); FOO(k...")
 
 
Line 67: Line 67:
 
<pre>
 
<pre>
 
$ perf report -p krava_g
 
$ perf report -p krava_g
Samples: 19K of event 'cycles', Event count (approx.): 17010569690                                                                        
+
Samples: 19K of event 'cycles', Event count (approx.): 17010569690
 
+  22.39%  ex  ex                [.] krava_d                              [other]
 
+  22.39%  ex  ex                [.] krava_d                              [other]
 
+  20.11%  ex  ex                [.] krava_c                              [other]
 
+  20.11%  ex  ex                [.] krava_c                              [other]

Latest revision as of 15:22, 20 June 2013

[edit] Example program

  • ex.c
#define FOO(__f)        \
static int __f(int i)   \
{                       \
        while (i--);    \
}

FOO(krava_a);
FOO(krava_b);
FOO(krava_c);
FOO(krava_d);

static void krava_g(void)
{
        krava_a(50);
        krava_b(50);
        krava_c(50);
        krava_d(50);
}

int main(int argc, char **argv)
{
        while (1) {
                krava_a(200);
                krava_b(200);
                krava_c(200);
                krava_d(200);
                krava_g();
        }
}
  • Makefile
ex: ex.o

[edit] Measurements

  • krava_[abcd] functions are taking 25% of the load
$ perf record  ./ex
^C[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.200 MB perf.data (~52446 samples) ]
./ex: Interrupt
$ perf report --stdio
...
# Overhead  Command      Shared Object                                Symbol
# ........  .......  .................  ....................................
#
    26.41%       ex  ex                 [.] krava_d                         
    25.16%       ex  ex                 [.] krava_b                         
    24.03%       ex  ex                 [.] krava_a                         
    23.95%       ex  ex                 [.] krava_c                         
     0.18%       ex  ex                 [.] krava_g                         

  • but how much krava_g function takes?
  • we need parent info - callchains, so let's record them with -g
$ perf record -g -- ./ex
^C[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.554 MB perf.data (~67889 samples) ]
./ex: Interrupt
  • and say we want to report parent symbol with -p krava_g
$ perf report -p krava_g
Samples: 19K of event 'cycles', Event count (approx.): 17010569690
+  22.39%  ex  ex                 [.] krava_d                              [other]
+  20.11%  ex  ex                 [.] krava_c                              [other]
+  20.00%  ex  ex                 [.] krava_b                              [other]
+  18.80%  ex  ex                 [.] krava_a                              [other]
+   4.90%  ex  ex                 [.] krava_a                              krava_g
+   4.71%  ex  ex                 [.] krava_b                              krava_g
+   4.39%  ex  ex                 [.] krava_c                              krava_g
+   4.28%  ex  ex                 [.] krava_d                              krava_g
+   0.14%  ex  ex                 [.] krava_g                              krava_g
+   0.08%  ex  ex                 [.] main                                 [other]

Previous:

  • parent symbol is displayed in the last column in above listing
  • you can see load gets separated by parent krava_g and the rest [other]
  • main based krava_[abcd] functions now eat only 20% now
  • krava_g based krava_[abcd] functions eat ~20% as well (+- other trash)

Next:

  • we can make the numbers more obvious and sort it by parent with -s parent
$ perf report -p krava_g -s parent --stdio
# Overhead  Parent symbol
# ........  .............
#
    81.54%  [other]      
            |          
            |--27.46%-- krava_d
            |          |          
            |          |--98.60%-- main
            |          |          __libc_start_main
            |          |          
            |           --1.40%-- __libc_start_main
            |          
            |--24.66%-- krava_c
            |          |          
            |          |--98.77%-- main
            |          |          __libc_start_main
            |          |          
            |           --1.23%-- __libc_start_main
            |          
            |--24.52%-- krava_b
            |          |          
            |          |--98.61%-- main
            |          |          __libc_start_main
            |          |          
            |           --1.39%-- __libc_start_main
            |          
            |--23.06%-- krava_a
            |          |          
            |          |--99.22%-- main
            |          |          __libc_start_main
            |          |          
            |           --0.78%-- __libc_start_main
             --0.30%-- [...]

    18.46%  krava_g      
            |          
            |--26.57%-- krava_a
            |          krava_g
            |          main
            |          __libc_start_main
            |          
            |--25.54%-- krava_b
            |          krava_g
            |          main
            |          __libc_start_main
            |          
            |--23.80%-- krava_c
            |          krava_g
            |          main
            |          __libc_start_main
            |          
            |--23.20%-- krava_d
            |          krava_g
            |          main
            |          __libc_start_main
            |          
            |--0.74%-- krava_g
            |          __libc_start_main
             --0.16%-- [...]
  • krava_g takes ~20% plus we can see the load distribution for its child functions
Personal tools