12 Commits

Author SHA1 Message Date
Paul Floyd
be26a1773e dhat: remove initial count value from access count histogram user requests
Based on feedback from Nick Nethercote.
2023-04-27 09:29:44 +02:00
Paul Floyd
424340403c Bug 464103 - Enhancement: add a client request to DHAT to mark memory to be histogrammed 2023-04-21 21:21:23 +02:00
Nicholas Nethercote
8c08253b89 Add support for copy and ad hoc profiling to DHAT. 2020-12-07 19:57:56 +11:00
Mark Wielaard
a489f40f78 docs: Make sure all elements that need it have an id tag.
When generating HTML it is useful if every element that can be referenced
has a stable id. If it doesn't, a random one is generated which makes it
harder to link to parts of the manual on the website. It also generates
spurious diffs. Explicitly add an id tag for the sect2 and sect3 elements
in dh-manual, a unique id for each legalnotice element and for each
FAQ question and answer.
2020-06-09 11:23:46 +02:00
Mark Wielaard
555ddc4753 Use DTD DocBook XML V4.5 everywhere.
This makes the rule for xmllint easier since it doesn't need to
override the DTD to validate against. It also helps with other tools
trying to process the docbookx xml files.
2020-05-14 15:12:02 +02:00
Mark Wielaard
c0916494f6 docbook xml doesn't allow xref inside option, use link instead 2020-05-14 12:43:39 +02:00
Mark Wielaard
9a79b194f2 dh-manual.xml: Don't use computeroutput in title.
It seems legal docbook, but it crashes our xsltproc/pdfxmltex toolchain.
2020-05-13 16:43:51 +02:00
Mark Wielaard
53dd0183d8 dh-manual.xml: Put stray text before graphic in a para. 2020-05-13 15:52:32 +02:00
Mark Wielaard
7425c1bc96 dh-manual.xml: Remove duplicate dh-manual.options id.
Rename one to dh-manual.realloc.
2020-05-13 15:15:45 +02:00
Nicholas Nethercote
968bddcd4b Fix reads and writes counts in DHAT.
If you do `malloc(100)` followed by `realloc(200)`, DHAT now adds 100
bytes to the read and write counts for the implicit `memcpy`. This gives
more reasonable results.

I have long been surprised by low writes-per-byte values of around 0.35
for vectors that are grown by doubling. Counting the implicit `memcpy`
increases those numbers to well above 0.5, which is what you'd expect.

The commit also adds a section to the DHAT docs about `realloc`, because
there is some non-obvious behaviour, some of which confused me just a
couple of days ago.
2020-05-08 08:40:19 +10:00
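
The read/write accounting described in commit 968bddcd4b can be illustrated with a small model. This is only a sketch, not DHAT's actual code: the `Counts` class and `account_realloc` function are hypothetical names, and treating the copied size as `min(old_size, new_size)` is an assumption (the commit message only states the 100-to-200 growth case).

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical per-allocation-site access counters, in bytes."""
    reads: int = 0
    writes: int = 0


def account_realloc(counts: Counts, old_size: int, new_size: int) -> int:
    """Charge the implicit memcpy of a realloc to the read and write counts.

    Assumption: the number of bytes copied is min(old_size, new_size).
    The commit message only confirms the growing case (100 -> 200 copies 100).
    """
    copied = min(old_size, new_size)
    counts.reads += copied   # the old block is read during the copy
    counts.writes += copied  # the new block is written during the copy
    return copied


counts = Counts()
account_realloc(counts, old_size=100, new_size=200)
print(counts)  # Counts(reads=100, writes=100)
```

In this model, a block that is reallocated but never otherwise touched still shows nonzero reads and writes, which matches the commit's stated aim of making the writes-per-byte values less surprising for growing vectors.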
Nicholas Nethercote
b71265fbc9 Mention --num-callers more in DHAT docs. 2019-04-08 10:18:38 +10:00
Nicholas Nethercote
441bfc5f51 Overhaul DHAT.
This commit thoroughly overhauls DHAT, moving it out of the
"experimental" ghetto. It makes moderate changes to DHAT itself,
including dumping profiling data to a JSON format output file. It also
implements a new data viewer (as a web app, in dhat/dh_view.html).

The main benefits over the old DHAT are as follows.

- The separation of data collection and presentation means you can run a
  program once under DHAT and then sort the data in various ways. Also,
  full data is in the output file, and the viewer chooses what to omit.

- The data can be sorted in more ways than previously. Some of these
  sorts involve useful filters such as "short-lived" and "zero reads or
  zero writes".

- The tree structure view avoids the need to choose stack trace depth.
  This avoids both the problem of not enough depth (when records that
  should be distinct are combined, and may not contain enough
  information to be actionable) and the problem of too much depth (when
  records that should be combined are separated, making them seem less
  important than they really are).

- Byte and block measures are shown with a percentage relative to the
  global count, which helps gauge relative significance of different
  parts of the profile.

- Byte and block measures are also shown with an allocation rate
  (bytes and blocks per million instructions), which enables comparisons
  across multiple profiles, even if those profiles represent different
  workloads.

- Both global and per-node measurements are taken at the global heap
  peak ("At t-gmax"), which gives Massif-like insight into the point of
  peak memory use.

- The final/lifetimes stats are a bit more useful than the old deaths
  stats. (E.g. the old deaths stats didn't take into account lifetimes
  of unfreed blocks.)

- The handling of realloc() has changed. The sequence `p = malloc(100);
  realloc(p, 200);` now increases the total block count by 2 and the
  total byte count by 300. Previously it increased them by 1 and 200.
  The new handling is a more operational view that better reflects the
  effect of allocations on performance. It makes a significant
  difference in the results, giving paths involving reallocation (e.g.
  repeated pushing to a growing vector) more prominence.
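
The change in realloc() handling from the last item above can be sketched as a small model of the totals. This is illustrative only and is not taken from dh_main.c: the `totals` function and the event tuples are hypothetical, and the old behaviour is modelled only for the growing case that the message describes.

```python
def totals(events, realloc_counts_as_new_block):
    """Return (total_blocks, total_bytes) for a list of allocation events.

    events: tuples of ("malloc", size) or ("realloc", old_size, new_size).
    realloc_counts_as_new_block: True models the new handling, False the old.
    """
    blocks = 0
    total_bytes = 0
    for event in events:
        if event[0] == "malloc":
            blocks += 1
            total_bytes += event[1]
        elif event[0] == "realloc":
            _, old_size, new_size = event
            if realloc_counts_as_new_block:
                # New handling: the realloc counts as a fresh allocation.
                blocks += 1
                total_bytes += new_size
            else:
                # Old handling (growth case only, an assumption for other
                # cases): the same block is resized, so only the growth
                # is added to the byte total.
                total_bytes += new_size - old_size
    return blocks, total_bytes


# p = malloc(100); realloc(p, 200);
events = [("malloc", 100), ("realloc", 100, 200)]
print(totals(events, realloc_counts_as_new_block=True))   # (2, 300)
print(totals(events, realloc_counts_as_new_block=False))  # (1, 200)
```

Under the new handling every reallocation adds a full block to the totals, which is why paths that repeatedly grow a vector become more prominent in the profile.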

Other things of note:

- There is now testing, both regression tests that run within the
  standard test suite, and viewer-specific tests that cannot run within
  the standard test suite. The latter are run by loading
  dh_view.html?test=1 in a web browser.

- The commit puts all tool lists in Makefiles (and similar files) in the
  following consistent order: memcheck, cachegrind, callgrind, helgrind,
  drd, massif, dhat, lackey, none; exp-sgcheck, exp-bbv.

- A lot of fields in dh_main.c have been given more descriptive names.
  Those names now match those used in dh_view.js.
2019-02-01 14:54:34 +11:00