New regex warnings seen in objdump2itb script
In mk/Common.mk there is a target that invokes the bin/objdump2itb script. This seems to run fine under Ubuntu 22.04, but after recently updating to Ubuntu 24.04 (with Python 3.12.3), I see the following:
/opt/toolchains/corev-openhw-gcc-ubuntu2204-20240530/bin/riscv32-corev-elf-objdump \
-d \
-S \
-M no-aliases \
-M numeric \
-l \
/home/mike/GitHubRepos/openhwgroup/cv32e20-dv/master/sim/uvmt/dsim_results/default/hello-world/0/test_program/hello-world.elf | /home/mike/GitHubRepos/openhwgroup/cv32e20-dv/master/bin/objdump2itb - > /home/mike/GitHubRepos/openhwgroup/cv32e20-dv/master/sim/uvmt/dsim_results/default/hello-world/0/test_program/hello-world.itb
/home/mike/GitHubRepos/openhwgroup/cv32e20-dv/master/bin/objdump2itb:77: SyntaxWarning: invalid escape sequence '\S'
FUNC_PATTERN = "(?P<addr>[0-9a-f]{8}) <(?P<name>\S*)>:"
/home/mike/GitHubRepos/openhwgroup/cv32e20-dv/master/bin/objdump2itb:83: SyntaxWarning: invalid escape sequence '\s'
INST_PATTERN = "(?P<addr>[0-9a-f]{1,8}):\t*(?P<mcode>[0-9a-f]{4}([0-9a-f]{4})?)\s{2,}(?P<asm>[a-z].*)$"
/home/mike/GitHubRepos/openhwgroup/cv32e20-dv/master/bin/objdump2itb:89: SyntaxWarning: invalid escape sequence '\S'
SRC_FILE_PATTERN = "^(?P<dir>/\S+)/(?P<file>[^/\s]+):(?P<line>[0-9]*)$"
As the messages indicate, the problem lies in three regular expressions on lines 77, 83 and 89. In each case, the SyntaxWarning: invalid escape sequence '\S' (or '\s') message means that a normal string literal contains a backslash \ followed by the character S or s, which Python does not recognize as a valid string escape sequence.
I have no idea why this wasn't a problem with earlier versions of Ubuntu and/or Python 3.
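A likely explanation, as far as I can tell: Python 3.12 changed the warning category for unrecognized escape sequences from DeprecationWarning (hidden by default) to SyntaxWarning (shown by default), and Ubuntu 22.04 ships Python 3.10. The sketch below is not part of objdump2itb; it compiles a one-line source snippet (the names `source_plain` and `source_raw` are made up for illustration) and records which warning category the running interpreter emits.

```python
import sys
import warnings


def compile_and_collect(source):
    """Compile a source snippet and return the names of any warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # show DeprecationWarning too
        compile(source, "<demo>", "exec")
    return [w.category.__name__ for w in caught]


# Source text containing backslash-S inside a normal string literal.
source_plain = 'PATTERN = "(?P<name>\\S*)"\n'
# The same literal written as a raw string.
source_raw = 'PATTERN = r"(?P<name>\\S*)"\n'

categories_plain = compile_and_collect(source_plain)
categories_raw = compile_and_collect(source_raw)

print("Python", sys.version_info[:2])
print("normal string:", categories_plain)
print("raw string:   ", categories_raw)
```

Running this under both the 22.04 and 24.04 system interpreters should show whether the category is what differs between the two setups.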
In a local workspace, I made the following changes:
diff --git a/bin/objdump2itb b/bin/objdump2itb
index 072397e2..ef479af8 100755
--- a/bin/objdump2itb
+++ b/bin/objdump2itb
@@ -74,19 +74,22 @@ class CFunction:
# Regular expression to extract a function from objdump
# Example:
# 00000256 <end_handler_incr_mepc>:
-FUNC_PATTERN = "(?P<addr>[0-9a-f]{8}) <(?P<name>\S*)>:"
+# Note: raw string to treat \S literally for regex
+FUNC_PATTERN = r"(?P<addr>[0-9a-f]{8}) <(?P<name>\S*)>:"
FUNC_RE = re.compile(FUNC_PATTERN)
# Regular expression to extract an individual instruction
# Example:
# 264: 00a31363 bne t1,a0,26a <end_handler_incr_mepc2>
-INST_PATTERN = "(?P<addr>[0-9a-f]{1,8}):\t*(?P<mcode>[0-9a-f]{4}([0-9a-f]{4})?)\s{2,}(?P<asm>[a-z].*)$"
+# Note: raw string to treat \s literally for regex
+INST_PATTERN = r"(?P<addr>[0-9a-f]{1,8}):\t*(?P<mcode>[0-9a-f]{4}([0-9a-f]{4})?)\s{2,}(?P<asm>[a-z].*)$"
INST_RE = re.compile(INST_PATTERN)
# Regular expression to extract a source annotation for each instruction
# Example:
# /work/strichmo/core-v-verif/cv32e40x/tests/programs/custom/debug_test_trigger/debugger.S:47
-SRC_FILE_PATTERN = "^(?P<dir>/\S+)/(?P<file>[^/\s]+):(?P<line>[0-9]*)$"
+# Note: raw string to treat \S and \s literally for regex
+SRC_FILE_PATTERN = r"^(?P<dir>/\S+)/(?P<file>[^/\s]+):(?P<line>[0-9]*)$"
SRC_FILE_RE = re.compile(SRC_FILE_PATTERN)
parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
This resolves the warnings, and an itb file is written out as expected, but I have no idea whether the generated itb file is correct.
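One way to gain some confidence is to check that the old and new patterns match the same text. The sketch below is a standalone check, not part of objdump2itb. `LEGACY` spells out the strings Python would have stored before the fix: the unrecognized `\S` and `\s` keep their backslash, while `\t` becomes a real tab character. `RAW` holds the patterns from the diff. The dictionary names, the helper, and the whitespace layout of the sample instruction line are assumptions made for illustration; the other two samples come from the comments in the script.

```python
import re

# Patterns as stored by Python before the fix (escapes written out explicitly).
LEGACY = {
    "FUNC": "(?P<addr>[0-9a-f]{8}) <(?P<name>\\S*)>:",
    "INST": "(?P<addr>[0-9a-f]{1,8}):\t*(?P<mcode>[0-9a-f]{4}([0-9a-f]{4})?)\\s{2,}(?P<asm>[a-z].*)$",
    "SRC": "^(?P<dir>/\\S+)/(?P<file>[^/\\s]+):(?P<line>[0-9]*)$",
}

# Patterns after the fix (raw strings, copied from the diff).
RAW = {
    "FUNC": r"(?P<addr>[0-9a-f]{8}) <(?P<name>\S*)>:",
    "INST": r"(?P<addr>[0-9a-f]{1,8}):\t*(?P<mcode>[0-9a-f]{4}([0-9a-f]{4})?)\s{2,}(?P<asm>[a-z].*)$",
    "SRC": r"^(?P<dir>/\S+)/(?P<file>[^/\s]+):(?P<line>[0-9]*)$",
}

# Sample lines; the INST whitespace is a guess at what objdump emits.
SAMPLES = {
    "FUNC": "00000256 <end_handler_incr_mepc>:",
    "INST": "     264:\t00a31363          \tbne\tx6,x10,26a <end_handler_incr_mepc2>",
    "SRC": "/work/strichmo/core-v-verif/cv32e40x/tests/programs/custom/debug_test_trigger/debugger.S:47",
}


def named_groups(pattern, line):
    """Return the named groups captured by pattern on line, or None if no match."""
    match = re.search(pattern, line)
    return match.groupdict() if match else None


results = {
    key: (named_groups(LEGACY[key], line), named_groups(RAW[key], line))
    for key, line in SAMPLES.items()
}

for key, (old, new) in results.items():
    print(key, "same" if old == new else "DIFFERENT", new)
```

For FUNC and SRC the old and new strings are character-for-character identical. For INST the strings differ only in how the tab is spelled (a literal tab versus the regex escape `\t`), which the `re` module treats the same way. A stronger check would be to diff the itb files produced by the old and new script from the same ELF.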