Skip to content

Conversation

jacob-keller
Copy link

@jacob-keller jacob-keller commented Oct 21, 2025

I had to look at the kdoc test output for some patches recently and found
it to be essentially unusable for quickly resolving the warning. Its output
is a diff (not even unified context) format which includes every warning
that merely changed its line number. This results in a lot of messes, as
most warnings will have line numbers change when code is added, removed, or
rearranged.

In addition, the test can sometimes pass even when it should fail. In rare
cases where a patch remove more warnings that it adds, the end result will
be considered acceptable.

Rather than using the raw diff output, I think we can do much better. I
came up with the idea of using two files in parallel: the original file,
and one with line numbers removed. Diff the two stripped files to figure
out which lines changed, then use sed to extract the changed lines from the
original file.

To ensure the diff won't produce results if line numbers change, we sort
the output of kernel-doc first by the warning text, and then by the file
name without the line number. In addition, warnings of the form "... line:"
which reference a following source line are squashed to a single line.

The diff can then easily pick up which warnings are actually new. It also
allows the kdoc script to log how many warnings are removed (i.e. fixed) as
an added bonus.

One critical change that took a while to track down is that we need to use
regular pipes for the merge and sort, otherwise the sub-process might not
be finished before the next steps of the algorithm continue.

I'm also considering scrapping this diff-based approach and writing a
simple python program that takes the kernel-doc output and does its own
comparison instead of trying to implement this in shell. Suggestions
welcome for any other improvements.

Here's an example of the original output compared to the new output:

New warnings added
1c1
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:160 missing initial short description on line:
---
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:416 missing initial short description on line:
3,5c3,7
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:269 No description found for return value of 'ice_vc_parse_rss_cfg'
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:334 No description found for return value of 'ice_vf_adv_rss_offload_ena'
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:339 missing initial short description on line:
---
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:525 No description found for return value of 'ice_vc_parse_rss_cfg'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:685 No description found for return value of 'ice_vf_adv_rss_offload_ena'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:802 No description found for return value of 'ice_hash_remove'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1024 No description found for return value of 'ice_map_ip_ctx_idx'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1555 missing initial short description on line:
7,8c9,10
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:347 No description found for return value of 'ice_vc_handle_rss_cfg'
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:476 missing initial short description on line:
---
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1563 No description found for return value of 'ice_vc_handle_rss_cfg'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1673 missing initial short description on line:
10,11c12,13
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:483 No description found for return value of 'ice_vc_config_rss_key'
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:523 missing initial short description on line:
---
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1680 No description found for return value of 'ice_vc_config_rss_key'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1720 missing initial short description on line:
13,14c15,16
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:530 No description found for return value of 'ice_vc_config_rss_lut'
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:569 missing initial short description on line:
---
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1727 No description found for return value of 'ice_vc_config_rss_lut'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1766 missing initial short description on line:
16,19c18,21
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:576 No description found for return value of 'ice_vc_config_rss_hfunc'
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:618 No description found for return value of 'ice_vc_get_rss_hashcfg'
< Warning: drivers/net/ethernet/intel/ice/virt/rss.c:657 No description found for return value of 'ice_vc_set_rss_hashcfg'
< Warning: include/linux/avf/virtchnl.h:1651 missing initial short description on line:
---
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1773 No description found for return value of 'ice_vc_config_rss_hfunc'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1815 No description found for return value of 'ice_vc_get_rss_hashcfg'
> Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1854 No description found for return value of 'ice_vc_set_rss_hashcfg'
> Warning: include/linux/avf/virtchnl.h:1701 missing initial short description on line:
21c23
< Warning: include/linux/avf/virtchnl.h:1662 No description found for return value of 'virtchnl_vc_validate_vf_msg'
---
> Warning: include/linux/avf/virtchnl.h:1712 No description found for return value of 'virtchnl_vc_validate_vf_msg'
Per-file breakdown

Vs...

New warnings added:
Warning: drivers/net/ethernet/intel/ice/virt/rss.c:802 No description found for return value of 'ice_hash_remove'
Warning: drivers/net/ethernet/intel/ice/virt/rss.c:1024 No description found for return value of 'ice_map_ip_ctx_idx'
Per-file breakdown:
      2 drivers/net/ethernet/intel/ice/virt/rss.c

The context description is also improved to include the new/removed count:

Errors and warnings before: 15 This patch: 17 New: 2 Removed: 0

Currently, the kdoc test fails when the number of lines of outout from
./scripts/kernel-doc increases. This is a decent heuristic but can fail in
some cases. For example, if a patch adds one warning but removes another,
the total line count may not change. This could result in false negatives
where some patches do not fail the test despite adding new warnings.

Upon detecting a difference in warning count, the test uses diff to compare
the old and new warnings. This output is not very useful to humans. The
diff comparison includes the line numbers which are impacted by code
movement, making relevant warnings difficult to spot when reviewing test
output.

An ideal kdoc test should determine the actual set of new warnings. To
achieve this, use the following algorithm:

1) Capture the kernel-doc output for the old commit
2) Capture the kernel-doc output for the new commit
3) Carefully join and sort output so that its stable w.r.t line number changes
4) Strip line numbers from the warning content for both old and new output
5) Diff the stripped output, using --*-group-format to just print line
   number ranges.
6) Use the difference line ranges to print from the unstripped output,
   generating a set of new and removed warnings.

By ensuring that the sorted input is stable even if line numbers change,
this correctly locates all of the new warnings, and additionally any
removed warnings, avoiding a false test pass.

In addition, the output is far more useful for humans as the output is 1)
just the new warnings, and doesn't contain diff syntax and 2) doesn't
contain warnings whose only change is the line number.

This is a bit more complicated, especially since it requires reading
through the input and sorting multiple times. However, the end result is
significantly more useful.

Signed-off-by: Jacob Keller <[email protected]>
@jacob-keller
Copy link
Author

I got this hacky version working, but we might be better off taking the time to properly implement the comparison I am thinking of in python or something. Let me know if you think thats worth doing vs merging this.

I also found the per-file breakdown is buggy right now, probably due to the python refactor of kernel-doc.py including "Warning:" output, which I think I fixed in this change.

@kuba-moo
Copy link
Contributor

This doesn't look too terrible. I started adding Python "helpers" (calling out to Python scripts to do harder validation) so that's definitely an option. Should I merge this and we can iterate on improving as time allows? Or do you prefer to take a stab at the Py version right away?

@jacob-keller
Copy link
Author

I started on a python version today. Let me take a stab at it and I'll get back in a day or so and we can merge this if I can't get it working.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants