Adds a script tests/python_test.sh that checks whether there is a python3
binary and that it supports python version 3.9 or higher. Use this script
in the various cachegrind/tests annotate vgtests as prereq.
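As a rough illustration of the prerequisite check, the version test could be expressed in Python as below. This is only a sketch: the real `tests/python_test.sh` is a shell script, and the helper name `python_is_new_enough` is hypothetical.

```python
import sys

# Minimum version, taken from the text above.
MIN_VERSION = (3, 9)


def python_is_new_enough(version=None):
    """Return True if `version` (default: the running interpreter) is >= 3.9."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version[:2]) >= MIN_VERSION
```

A prereq script would map the result onto an exit status, with 0 meaning the tests may run.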
Add a touch of the cgout files so that they are more recent than the
source file. git clone sometimes seems to timestamp the source file
after the cgout files, which generates a warning and a post failure,
at least with FreeBSD on ZFS.
Make other tools consistent with this as well
(using memcheck as the model). Also refactored
the DRD user req names to make it clearer which
are Valgrind user reqs, which are DRD public
user reqs and which are DRD internal user reqs.
Also put back the isFF flag initialization (used for FreeBSD
non-fixed RO ELF segments). I had intended to delete it but
in the end kept it for traces but had already deleted the init code.
For all the changes I've made recently. And also various other changes
that occurred over the past 20 years that didn't previously make it into
the docs.
Also, this change de-emphasises the cache and branch simulation aspect,
because they're no longer that useful. Instead it emphasises the
precision and reproducibility of instruction count profiling.
By not configuring the caches in that case. This requires moving a few
assertions around, because they currently assume that the caches are
configured.
And deprecate the use of `cg_diff` and `cg_merge`.
Because `cg_annotate` can do a better job, even annotating source files
when doing diffs in some cases.
The user requests merging by passing multiple cgout files to
`cg_annotate`, and diffing by passing two cgout files to `cg_annotate`
along with `--diff`.
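Conceptually, merging is element-wise addition of per-location counts and diffing is element-wise subtraction. The sketch below shows that idea on plain dictionaries. It does not parse real cgout files, the names `merge_counts` and `diff_counts` are hypothetical, and the "second minus first" direction of the diff is an assumption.

```python
# A profile is modelled as {(file, function): [count per event]}.
Profile = dict[tuple[str, str], list[int]]


def merge_counts(profiles: list[Profile]) -> Profile:
    """Sum event counts across several profiles (the merge case)."""
    merged: Profile = {}
    for profile in profiles:
        for key, counts in profile.items():
            if key in merged:
                merged[key] = [a + b for a, b in zip(merged[key], counts)]
            else:
                merged[key] = list(counts)
    return merged


def diff_counts(old: Profile, new: Profile) -> Profile:
    """Compute new - old, treating entries missing from one side as zeroes."""
    diff: Profile = {}
    for key in old.keys() | new.keys():
        n_events = len(old[key] if key in old else new[key])
        zeroes = [0] * n_events
        a = old.get(key, zeroes)
        b = new.get(key, zeroes)
        diff[key] = [y - x for x, y in zip(a, b)]
    return diff
```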
Most notably, the "Function summary" section, which printed one CC for each
`file:function` combination, has been replaced by two sections, "File:function
summary" and "Function:file summary".
These new sections both feature "deep CCs", which have an "outer CC" for the
file (or function), and one or more "inner CCs" for the paired functions (or
files).
Here is a file:function example, which helps show which files have a lot of
events, even if those events are spread across a lot of functions.
```
> 12,427,830 (5.4%, 26.3%) /home/njn/moz/gecko-dev/js/src/ds/LifoAlloc.h:
6,107,862 (2.7%) js::frontend::ParseNodeVerifier::visit(js::frontend::ParseNode*)
3,685,203 (1.6%) js::detail::BumpChunk::setBump(unsigned char*)
1,640,591 (0.7%) js::LifoAlloc::alloc(unsigned long)
711,008 (0.3%) js::detail::BumpChunk::assertInvariants()
```
And here is a function:file example, which shows how heavy inlining can result
in a machine code function being derived from source code from multiple files:
```
> 1,343,736 (0.6%, 35.6%) js::gc::TenuredCell::isMarkedGray() const:
651,108 (0.3%) /home/njn/moz/gecko-dev/js/src/d64/dist/include/js/HeapAPI.h
292,672 (0.1%) /home/njn/moz/gecko-dev/js/src/gc/Cell.h
254,854 (0.1%) /home/njn/moz/gecko-dev/js/src/gc/Heap.h
```
Previously these patterns were very hard to find, and it was easy to overlook a
hot piece of code because its counts were spread across multiple non-adjacent
entries. I have already found these changes very useful for profiling Rust
code.
Also, cumulative percentages on the outer CCs (e.g. the 26.3% and 35.6% in the
example) tell you what fraction of all events are covered by the entries so
far, something I've wanted for a long time.
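To make the deep CC idea concrete, here is a minimal sketch of how a file:function summary with cumulative percentages could be derived from flat `(file, function)` counts. The function name `file_function_summary` and the dictionary layout are hypothetical; the real `cg_annotate` data structures may differ.

```python
from collections import defaultdict


def file_function_summary(flat, total):
    """Group flat {(file, function): count} entries into deep CCs keyed by file."""
    inner = defaultdict(dict)
    for (file, func), count in flat.items():
        inner[file][func] = inner[file].get(func, 0) + count

    # Outer CC: the sum of a file's inner CCs.
    outer = {file: sum(funcs.values()) for file, funcs in inner.items()}

    summary = []
    running = 0
    # Largest outer CC first; the running sum gives the cumulative percentage.
    for file in sorted(outer, key=outer.get, reverse=True):
        running += outer[file]
        funcs = sorted(inner[file].items(), key=lambda kv: kv[1], reverse=True)
        summary.append({
            "file": file,
            "count": outer[file],
            "pct": 100 * outer[file] / total,
            "cumulative_pct": 100 * running / total,
            "inner": funcs,
        })
    return summary
```

Swapping the roles of `file` and `func` in the grouping step gives the function:file view.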
Some other, related changes:
- Column event headers are now padded with `_`, e.g. `Ir__________`. This makes
the column/event mapping clearer.
- The "Cachegrind profile" section is now called "Metadata", which is
shorter and clearer.
- A few minor test tweaks, beyond those required for the output changes.
- I converted some doc comments to normal comments. Not standard Python, but
nicer to read, and there are no public APIs here.
- Roughly 2x speedups to `cg_annotate` and smaller improvements for `cg_diff`
and `cg_merge`, due to the following.
- Change the `Cc` class to a type alias for `list[int]`, to avoid the class
overhead (sigh).
- Process event count lines in a single split, instead of a regex
match + split.
- Add the `add_cc_to_ccs` function, which does multiple CC additions in a
single function call.
- Better handling of dicts while reading input, minimizing lookups.
- Pre-computing the missing CC string for each CcPrinter, instead of
regenerating it each time.
- Move it to `auxprogs/`, alongside `pybuild.sh`.
- Disable the annoying design lints, instead of just modifying the
values (which often requires modifying them again later).
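Two of the speedups listed above, parsing a count line with a single split and adding one CC into several accumulators per call, might look roughly like this. The signature of `add_cc_to_ccs`, the helper `parse_count_line`, and the assumed `<line_num> <count>...` line layout are guesses for illustration; the real code may differ.

```python
# A CC is modelled as a plain list of event counts; a type alias avoids the
# per-instance overhead of a class.
Cc = list[int]


def parse_count_line(line: str) -> tuple[int, Cc]:
    """Split '<line_num> <count>...' once and convert every field to int."""
    fields = line.split()
    line_num = int(fields[0])
    return line_num, [int(f) for f in fields[1:]]


def add_cc_to_ccs(cc: Cc, *targets: Cc) -> None:
    """Add `cc` element-wise into each target CC, in place."""
    for i, count in enumerate(cc):
        for target in targets:
            target[i] += count
```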
Users shouldn't ever see this, but it's useful to distinguish this
malformed data file case from the missing symbol case (which is still
shown as `???`).
It's currently written in C, but `cg_annotate` and `cg_diff` are written in
Python. It's better to have them all in the same language.
The good news is that the Python code is 4.5x shorter than the C code.
The bad news is that the Python code is roughly 3x slower than the C
code. But `cg_merge` isn't used that often, so I think it's a reasonable
trade-off.
For all the same reasons I rewrote `cg_annotate` in Python.
The commit also moves the Python "build" steps into
`auxprogs/pybuild.sh`, for easy sharing.
Finally, it very slightly tweaks the whitespace in the output of
`cg_annotate`.
- Every section now has a heading with the long `----` lines above and
below.
- Event names are always shown below that heading, rather than within
it.
- Each unreadable file now gets its own section, much like files that
lack any data.
Currently their width is mostly hard-wired in a quick and dirty fashion.
This commit does them properly, so:
- all columns are always the right width, even ones with really large
percentages
- things like `( 1.00%)` are now `(1.00%)`
- any percentages that would involve a division by zero now show as
`(n/a)` rather than `( 0.00%)`
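The formatting rules in the list above can be sketched as follows. The helper names `format_percent` and `align_column` are hypothetical, and the sketch only assumes that each column is right-aligned to its widest cell.

```python
def format_percent(count: int, total: int) -> str:
    """Return '(x.xx%)', or '(n/a)' when the total is zero."""
    if total == 0:
        return "(n/a)"
    return f"({count * 100 / total:.2f}%)"


def align_column(cells: list[str]) -> list[str]:
    """Right-align every cell to the width of the widest one."""
    width = max(len(c) for c in cells)
    return [c.rjust(width) for c in cells]
```

Computing the width from the actual cells, rather than hard-wiring it, is what keeps very large percentages from breaking the alignment.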
Perl was a reasonable choice for `cg_annotate` in 2002, but not in 2023.
Also, the existing structure of the code is not good. These two things
make it hard to modify `cg_annotate` in any significant way.
Benefits of the change:
- Now written in a language that is (a) nice, and (b) not moribund.
- Easier to maintain, due to (a) abovementioned better language, (b)
better code structure, and (c) better language tooling, such as
formatters, type checkers, and linters.
- The new version is a little shorter.
- It runs about 2x faster.
- Argument handling is more standard. E.g. things like `--context 2`,
`--auto`, `--no-auto` are supported. (The old forms that require `=`
are still supported, though the `=yes`/`=no` forms are deprecated.)
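One way to get this style of argument handling in Python is `argparse` with `BooleanOptionalAction` (available from Python 3.9). The sketch below is an assumption about the approach, not the actual `cg_annotate` parser; the default values are placeholders, and the deprecated `=yes`/`=no` forms are not handled here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cg_annotate_sketch")
    # BooleanOptionalAction generates both `--auto` and `--no-auto`.
    parser.add_argument(
        "--auto", action=argparse.BooleanOptionalAction, default=True
    )
    # argparse accepts both `--context 2` and `--context=2`.
    parser.add_argument("--context", type=int, default=8)
    return parser
```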
The behaviour and output of the new version are identical for typical
uses, but there are some very minor changes for edge cases, which nobody
is likely to notice. For example:
- The file format is slightly changed: I removed support for '.'
counts, which had the same meaning as '0'. This was a feature that
Cachegrind never used, and the old script handled it inconsistently.
- The new version will abort on a malformed data line. The old version
would just print a warning and continue.
The commit also adds a new test `ann3` that tests many parts of
`cg_annotate` that weren't tested previously, and tweaks the existing
`ann2` test.
Both a.c and cgout-test are checked into the repository and
used in testcases. Make sure cgout-test is newer than a.c
before running the post script to prevent warnings like:
@@ WARNING @@ WARNING @@ WARNING @@ WARNING @@ WARNING @@ WARNING @@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ Source file 'a.c' is more recent than input file
'../../cachegrind/tests/cgout-test'.
@ Annotations may not be correct.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
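The effect of touching the data file can be sketched in Python as below. The real fix touches the file from the test setup; the helper name `make_newer` and the one-second margin are assumptions made for illustration.

```python
import os


def make_newer(target: str, reference: str) -> None:
    """Bump `target`'s mtime so that it is later than `reference`'s."""
    ref_mtime = os.stat(reference).st_mtime_ns
    tgt = os.stat(target)
    if tgt.st_mtime_ns <= ref_mtime:
        # Keep the access time; move the modification time one second past
        # the reference so coarse timestamp granularity does not matter.
        os.utime(target, ns=(tgt.st_atime_ns, ref_mtime + 1_000_000_000))
```

For example, `make_newer("cgout-test", "a.c")` would be run before the post script.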
When running `cg_annotate` on files produced with `cg_diff`, it's common
to get multiple occurrences of this pair of errors:
```
Use of uninitialized value $pairs[0] in numeric lt (<) at
/home/njn/grind/ws1/cachegrind/cg_annotate line 848.
Use of uninitialized value $high in numeric lt (<) at
/home/njn/grind/ws1/cachegrind/cg_annotate line 859.
```
This is because `cg_annotate` wasn't properly handling the case where no
source code lines have annotations, which never happens in the normal
case but does happen in `cg_diff` output.
Happily, it turns out that the warnings were harmless, the fix is
trivial, and it doesn't change the output at all.
Rust v0 symbols can have `#` chars in them, things like this:
```
core::panic::unwind_safe::AssertUnwindSafe<<proc_macro::bridge::server::Dispatcher<proc_macro::bridge::server::MarkedTypes<rustc_expand::proc_macro_server::Rustc>> as proc_macro::bridge::server::DispatcherTrait>::dispatch::{closure#14}>, ()>
```
`cg_diff` currently messes these up in two ways.
- It treats anything after a `#` in the input file as a comment. In
comparison, `cg_annotate` only treats a `#` as starting a comment at
the start of a line.
- It uses `#` to temporarily join file names and function names while
processing.
This commit adjusts the parsing to fix the first problem, and changes
the joiner sequence to `###` to fix the second problem.
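The two fixes can be illustrated with a small Python sketch. This is not the `cg_diff` code itself: the helper names are hypothetical, and the sketch assumes that file names never contain `###`.

```python
JOINER = "###"


def is_comment(line: str) -> bool:
    """A '#' starts a comment only when it is the first character of a line."""
    return line.startswith("#")


def join_key(file: str, func: str) -> str:
    """Combine a file name and a function name into one temporary key."""
    return f"{file}{JOINER}{func}"


def split_key(key: str) -> tuple[str, str]:
    """Split a key at the first joiner, leaving single '#' chars untouched."""
    file, func = key.split(JOINER, 1)
    return file, func
```

With a single `#` as the joiner, a function name such as `dispatch::{closure#14}` would be split in the wrong place; the longer sequence makes that collision much less likely.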
Files in the root directory
Several Makefile.am files that have dependencies on FreeBSD autoconf
variables. Included a few new filter files to act as placeholders
to create new freebsd subdirectories.
Updated NEWS with the FreeBSD bugzilla items plus a couple of other
items fixed indirectly.
manpages-index.xml is just to easily get at each individual man page
with xsltproc. It wasn't a complete docbookx xml file. Now that it is,
we can validate it with xmllint. It doesn't fully validate, but we
are close.