Syntax highlighting for ESCET-languages in AsciiDoc-generated documentation
Split off from #38, to address syntax highlighting specifically.
Tasks
- Choose which syntax highlighter to use.
- Enable chosen syntax highlighter, if not configured already. (!774 (merged))
- Enable or make all scanners/lexers/languages we need. (!774 (merged))
Languages
Languages we currently use for `source` blocks:
- Non-ESCET: bnf, c, console, html, java, javascript, markdown, properties, shell, svg (= xml)
- ESCET: chi, cif, dsm, raildiagram, setext, tooldef
- Some have no language, which is valid.
- Note that `console` and `shell` are different.
General info
General AsciiDoctor information about syntax highlighting:
- https://docs.asciidoctor.org/asciidoc/latest/verbatim/source-highlighter/
- Explains how to enable syntax highlighting, what syntax highlighters are available/supported, etc.
Syntax highlighters
Possible syntax highlighting solutions supported by AsciiDoctor:
Highlighter | Supported by |
---|---|
CodeRay | Asciidoctor, AsciidoctorJ, Asciidoctor PDF |
highlight.js | Asciidoctor, AsciidoctorJ, Asciidoctor.js |
Pygments | Asciidoctor, Asciidoctor PDF |
Rouge | Asciidoctor, AsciidoctorJ, Asciidoctor PDF |
Solution 1: CodeRay
- Pros:
- Supports AsciiDoctorJ/HTML and PDF.
- Cons:
- Highlights only 22 languages. It has C, HTML, Java, JavaScript, and XML (for SVG). It lacks BNF, console, Markdown, properties and shell.
- CodeRay is old. The website looks ancient. As of 2024-01-11, the last release was on 2020-05-30, almost 4 years ago. The last commit was in 2022, but only for build-related changes and the like.
- Its use is now discouraged by AsciiDoctor PDF, which recommends Rouge.
- I can't find how to add your own language scanner and make CodeRay find it. I did find that it supports plugins, but I didn't find information on how that works.
Conclusion: CodeRay is old, and there seems to be no information on how to add your own language. If we could find such documentation, CodeRay could be reconsidered.
Solution 2: highlight.js
- Pros:
- Supported for AsciiDoctorJ/HTML.
- Last release was just a few months ago (2023-10-09). Last commit was only 2 days ago. Seems very much alive.
- It highlights 242 languages, including all the non-ESCET ones that we use.
- It seems to have quite extensive documentation (for the current version).
- A quick experiment showed that it is easy to add a new language and activate it in the generated HTML page.
- highlight.js can be reused for HTML that is not generated by AsciiDoc, such as the to-be-contributed SBE course, which also has a lot of CIF model snippets that would benefit from highlighting (#742 (closed)).
- Cons:
- Not supported for PDF.
- The version used by AsciiDoctor is version 9, which is end of life and end of support. I reported it to AsciiDoctorJ, but they consider it a problem of AsciiDoctor. AsciiDoctor has had an issue open for it for 3 years. I asked them to consider the upgrade, but they responded by putting the burden on me to do the work. It seems we may be stuck on version 9, for which the documentation is also no longer online.
- If we provide highlight.js version 11, the callouts don't work anymore. However, we don't use them anyway, so that is not a big problem.
- Notes:
- AsciiDoctor repo has in `asciidoctor-main/lib/asciidoctor.rb` a fixed version `HIGHLIGHT_JS_VERSION = '9.18.3'`, which is then used to construct base URLs in `lib/asciidoctor/syntax_highlighter/highlightjs.rb` using `base_url = doc.attr 'highlightjsdir', %(#{opts[:cdn_base_url]}/highlight.js/#{HIGHLIGHT_JS_VERSION})`. This means we need to upgrade by ourselves. This should be easy though, as we can provide our own folder for it. We likely want this anyway, as then we get a fixed version, and the generated documentation doesn't rely on a working Internet connection (for syntax highlighting at least).
Conclusion: Can be used for the SBE course as well. Adding our own newer version than AsciiDoctor ships seems easy. We'd have to live with no highlighting in PDF output.
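To give an impression of how small a custom highlight.js language can be, here is a minimal sketch of what a CIF definition might look like. It is not the definition used in the experiment mentioned above. The keyword list is only an illustrative subset of CIF, and the choice of reusable modes is an assumption about which CIF tokens are worth highlighting. The `name` property follows the definition shape of newer highlight.js versions.

```javascript
// Sketch of a highlight.js language definition for CIF.
// Assumption: the keyword list below is an illustrative subset, not the full language.
function cif(hljs) {
  return {
    name: 'CIF',
    keywords: {
      keyword: 'automaton plant requirement supervisor location edge event ' +
               'initial marked goto do when end',
      literal: 'true false',
    },
    contains: [
      hljs.C_LINE_COMMENT_MODE,   // line comments starting with //
      hljs.C_BLOCK_COMMENT_MODE,  // block comments between /* and */
      hljs.QUOTE_STRING_MODE,     // double-quoted string literals
      hljs.C_NUMBER_MODE,         // number literals
    ],
  };
}

// In a page that has already loaded highlight.js (global `hljs`), the
// definition would be registered before highlighting is triggered:
// hljs.registerLanguage('cif', cif);
```

Anything not matched by the keywords or the listed modes is left as plain text, so the definition can start small and grow as needed.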
Solution 3: Pygments
- Pros:
- Supported for PDF.
- Cons:
- Not supported by AsciiDoctorJ/HTML.
Conclusion: Given that it can't be used for HTML output, I didn't check it further.
Solution 4: Rouge
- Pros:
- Supported for AsciiDoctorJ/HTML and PDF.
- It is the preferred syntax highlighter of AsciiDoctor PDF.
- It highlights 219 languages, including all non-ESCET languages we use, except for BNF.
- Its last release was only a few months ago (2023-10-25), and its last commit only six days ago. It seems very much alive.
- It has extensive documentation.
- It is clearly explained how to add your own lexer for your own language. A quick experiment shows that it is easy to add your own lexer to the Rouge repo. They welcome you submitting your own lexer for inclusion in Rouge.
- Cons:
- Rouge only supports loading lexers bundled with Rouge, not external ones. I reported this as an issue. I don't like this, as when we change a language (add a keyword), we first need a new Rouge release before we can highlight the new keyword in our own documentation. It makes us dependent on new Rouge releases.
- Rouge performs the highlighting during the build. This will add to build times. It will also be done for every build, wasting CPU cycles. (This is unlike highlight.js, which only performs the highlighting in the user's browser, and only for actually visited pages.)
- Rouge is a full scanner, and the code must be valid. Some of our code snippets have things like `...` in them to indicate omitted parts. That can't be lexed until we add all such exceptions to the lexer. (For highlight.js you only specify what is to be highlighted.)
- Notes:
- It is possible to load custom Ruby Gems with the AsciiDoctor Maven Plugin, using the `gemPaths` option (see here). It is also shown with an example in this issue. It seems this way we could load Gems even from a local directory in our repo. Whether we can override `rouge` that way, I'm not sure though, as it is meant to load extra plugins, not override existing plugins.
Conclusion: Seems well-supported and alive. But they haven't responded to our issue. And we would be dependent on the Rouge release cycle (given the lack of contributed lexers), and we'd have to add all non-compliant syntax exceptions to the lexer as well.
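The point about snippets with `...` can be illustrated with a small sketch. This is not how highlight.js or Rouge work internally. The helpers `escapeHtml` and `markKeywords` are hypothetical and only mimic the "specify what is to be highlighted" approach: known keywords get wrapped, and everything else, including placeholders for omitted parts, passes through untouched.

```javascript
// Hypothetical helpers, not part of highlight.js or Rouge.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wraps whole-word occurrences of the given keywords in a span and leaves
// all other text as-is (only HTML-escaped).
// Assumption: keywords are plain words without regex metacharacters.
function markKeywords(source, keywords) {
  const pattern = new RegExp('\\b(' + keywords.join('|') + ')\\b', 'g');
  let result = '';
  let last = 0;
  for (const match of source.matchAll(pattern)) {
    result += escapeHtml(source.slice(last, match.index));
    result += '<span class="hljs-keyword">' + match[0] + '</span>';
    last = match.index + match[0].length;
  }
  result += escapeHtml(source.slice(last));
  return result;
}

// A snippet with an omitted part is not a problem for this approach.
const snippet = 'automaton a:\n  ...\nend';
const highlighted = markKeywords(snippet, ['automaton', 'end']);
```

A full lexer would instead have to assign a token to every character of the snippet, which is why such placeholders would need explicit rules there.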
Conclusion
The easy solution for now is to just go for highlight.js. We can change what we need. We then have to live with no highlighting in PDF output, but that is not the main output anyway.
Addresses #38