diff options
author | Mauro Carvalho Chehab <mchehab+huawei@kernel.org> | 2025-04-08 18:09:06 +0800 |
---|---|---|
committer | Jonathan Corbet <corbet@lwn.net> | 2025-04-09 12:10:32 -0600 |
commit | 094a4845789baade4df0ddccd5bd19a88af30b3f (patch) | |
tree | 17e4caa3b2a1a5339965baa4285d6c3696681170 /scripts/lib/kdoc/kdoc_files.py | |
parent | 33a92a5b2e6fae8d653faaa0271111b3237af1f3 (diff) |
scripts/kernel-doc.py: add a Python parser
Maintaining kernel-doc has been a challenge, as there aren't many
perl developers among maintainers. Also, the logic there is too
complex. Having lots of global variables and using pure functions
doesn't help.
Rewrite the script in Python, placing most global variables
inside classes. This should help maintaining the script in long
term.
It also allows a better integration with kernel-doc Sphinx
extension in the future.
I opted to keep this version as close as possible to what we
have already in Perl. There are some differences though:
1. There is one regular expression that required a rewrite:
/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/
As this one uses two features that aren't available by the native
Python regular expression module (re):
- recursive patterns: ?1
- atomic grouping (?>...)
Rewrite it to use a much simpler regular expression:
/\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;/
Extra care should be taken when validating this script, as such
replacement might cause some regressions.
2. The filters are now applied only during output generation.
In particular, "nosymbol" argument is only handled there.
It means that, if the same file is processed twice for
different symbols, the warnings will be duplicated.
I opted to use this behavior as it allows the Sphinx extension
to read the file(s) only once, and apply the filtering only
when producing the ReST output. This hopefully will help
to speed up doc generation
3. This version can handle multiple files and multiple directories.
So, if one just wants to produce a big output with everything
inside a file, this could be done with
$ time ./scripts/kernel-doc.py -man . 2>/dev/null >new
real 0m54.592s
user 0m53.345s
sys 0m0.997s
4. I tried to replicate as much as possible the same arguments
from kernel-doc, with about the same behavior, for the
command line parameters starting with a single dash (-parameter).
I also added one letter aliases for each parameter, and a
--parameter (sometimes with a better name).
5. There are some sutile nuances between how Perl handles
certain regular expressions. In special, the qr operatior,
which compiles a regular expression also works as a
non-capturing group. It means that some regexes like
this one:
my $type1 = qr{[\w\s]+};
needs to be mapped as:
type1 = r'(?:[\w\s]+)?'
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/2fa671a9fb08d03a376a42d46cc0b1d3aab4ae3f.1744106241.git.mchehab+huawei@kernel.org
Diffstat (limited to 'scripts/lib/kdoc/kdoc_files.py')
0 files changed, 0 insertions, 0 deletions