Proposal to transition inline syntax from substitutions to a formal grammar
The way the inline syntax is processed in AsciiDoc is one of the most ambiguous aspects of the language. This proposal seeks to address that matter.
Background
Inline syntax refers to the identification and interpretation of markup in regular (non-verbatim) text (e.g., paragraphs, headings/titles, reftext, etc.). It’s fair to say that, up to this point, the inline syntax in AsciiDoc has been largely implementation-defined (or, at the very least, implementation-biased). From our research, we’ve concluded that the transition away from substitutions for processing the inline syntax is more pressing than we originally anticipated. It should be noted that the request to revisit the inline syntax traces back nearly a decade. This issue identifies the challenges with the current approach and proposes a new path forward for defining the inline syntax in the AsciiDoc language.
Current state and challenges
In its current state, the AsciiDoc language defines the inline syntax as a sequence of markup substitutions, effectively search and replace. However sophisticated it may be, this methodology still amounts to a battery of regex-based substitutions for inline formatting, in contrast to constructing a proper AST. For regular text, these substitutions consist of: specialchars, attributes, quotes, replacements, macros, and post_replacements.
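To make the methodology concrete, here is a minimal sketch in TypeScript (illustrative only, not Asciidoctor’s actual code; the function names are ours) of two such substitutions applied in sequence:

// Each substitution rewrites the working string in place.
const specialchars = (s: string): string =>
  s.replace(/</g, "&lt;").replace(/>/g, "&gt;")
const quotes = (s: string): string =>
  s.replace(/\*([^*]+)\*/g, "<strong>$1</strong>")

let text = "press *<Enter>* to continue"
text = specialchars(text) // "press *&lt;Enter&gt;* to continue"
text = quotes(text)       // "press <strong>&lt;Enter&gt;</strong> to continue"

// By the second step, the working text is a hybrid of input and output: the
// quotes regex matched against "&lt;Enter&gt;", not what the writer typed, and
// no tree of inline nodes (with source locations) is ever produced.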
There are two glaring problems with this methodology:
- The input changes while substitutions are being applied.
Each successive substitution operates on a slightly different source, a hybrid of input and output. As a consequence, the interpretation of the text can be affected (sometimes dramatically) by the output format and the order of substitutions. The result is often unexpected or surprising for the writer, forcing them to resort to workarounds or defensive markup (e.g., passthroughs).
It also means that the inline syntax cannot be accurately described, since the interpretation is so context-dependent. That, in turn, means it’s not possible to consistently and logically control where substitutions get applied. Another concern is that the specialchars and replacements substitutions cater to the needs of a specific family of converters, at the cost of compromising the integrity of the input.
- The substitution methodology does not produce an abstract syntax tree (AST).
The lack of a parse tree makes it impossible for tools to analyze the structure or locate inline nodes (using source range mappings) in the input without resorting to converter hacks. This deficiency also makes it difficult to extract information from the document or to transform the input before converting it. The location of inline markup is also essential for generating accurate diagnostic messages when things go wrong.
In effect, the parse phase never completes, since parsing is still happening while the document is being converted.
These are severe problems. If this specification were to describe the inline syntax as a sequence of substitutions now, we fear these problems would become so ingrained in the language that it would be impossible to address them in future versions of the specification.
Proposed change
We’ve determined that the substitution methodology is fundamentally incompatible with a parsing grammar. It’s necessary to rethink how the inline syntax is defined to make it possible to parse the AsciiDoc inline syntax into structured data. Thus, in order to specify the AsciiDoc language unambiguously, we’re proposing that the specification effort make the transition now from describing the inline syntax as substitutions to instead defining it using a formal, parsable grammar (the input to a parser generator). This change would also make the inline syntax easier to teach, understand, and process.
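As a rough illustration of the difference, consider a toy grammar (hypothetical, far simpler than what AsciiDoc requires) fed to the peggy parser generator; parsing yields typed nodes with source locations rather than a rewritten string:

import * as peggy from "peggy"

// Toy inline grammar: ordered alternatives produce typed nodes, and each
// action records the node's source location via peggy's location().
const parser = peggy.generate(`
  root = (strong / text)*
  strong = "*" value:$[^*]+ "*" { return { type: "strong", value, location: location() } }
  text = value:$([^*]+ / "*") { return { type: "text", value, location: location() } }
`)

console.log(parser.parse("some *strong* words"))
// -> [ { type: "text", value: "some ", location: {...} },
//      { type: "strong", value: "strong", location: {...} },
//      { type: "text", value: " words", location: {...} } ]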
In a blog post titled An AsciiDoc processor and Pandoc front-end in Haskell, Guillem Marpons provides strong justifications for why the specification should aim to define a grammar for the inline syntax, and also makes the case that it’s achievable.
Consequences
While this change represents a completely different mental model for how the text is interpreted, it does not mean abandoning compatibility. Far from it. Although the inline syntax is not currently defined using a parsable grammar, the grammar is inherent in the writer’s perception of the inline structure. We aim to tease out that grammar so the text is interpreted in a way that matches that perception and, consequently, allows a processor to match the current output as closely as possible. Where the behavior differs, it will likely differ in a way that more closely matches the writer’s expectation, thus being a welcome change. In cases where the current behavior cannot be matched, the text should be interpreted in such a way that no information is lost. We’ll also need to address explicit subs in the source document, likely by limiting permutations and relying on different parsing profiles where necessary.
There are notable benefits ushered in by this change. Some aspects of the syntax that were previously very difficult become much simpler. For example, we’ll be able to support backslash escaping for reserved characters (instead of contextual escaping), as sketched below. Syntax rules will naturally nest (except for marked text directly inside of like marked text), avoiding ambiguity in how boundaries are interpreted and removing the need for authors to resort to hacks. It also allows the syntax to be better behaved, since markup can be interpreted or not based on where it resides in the flow.
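Here’s a hypothetical sketch (not the proposed rules) of escaping as a grammar rule: an escape alternative tried before any markup rule makes the backslash work uniformly wherever it appears:

import * as peggy from "peggy"

// Trying the escape rule first means an escaped mark is consumed as literal
// text before any markup rule can interpret it.
const parser = peggy.generate(String.raw`
  root = (escaped / strong / text)*
  escaped = "\\" char:[*_] { return char }
  strong = "*" value:$[^*]+ "*" { return { strong: value } }
  text = $([^*\\]+ / "*" / "\\")
`)

console.log(parser.parse("real *star*, escaped \\*star\\*"))
// -> [ "real ", { strong: "star" }, ", escaped ", "*", "star", "*" ]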
Feasibility
This proposed change raises the question of whether the inline syntax can even be described using a grammar. We don’t want to end up writing fiction, so we did thorough research to answer this question. We can now say with confidence that it’s possible to accurately describe the inline syntax using a parsing expression grammar (PEG) and that the inline syntax can be parsed efficiently from that grammar (with or without packrat parsing). Not only that, but the parsing becomes much more accurate, handling situations such as nested syntax and non-interpreted regions flawlessly and addressing many scenarios that were previously ambiguous.
However, it does require doing away with the existing substitution order and introducing phases. Specifically, it means introducing the concept of a preprocessor phase to handle passthroughs and attribute references, and a postprocessing step (perhaps a responsibility of the converter) to replace special characters and typography shorthands. (The reason attribute references have to be included in the inline preprocessor is that the replaced value can contribute interpreted text and change the boundary conditions for interpreted text.) The details of these proposed phases will be resolved in separate issues.
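To illustrate the phase ordering (a hypothetical sketch; none of these names are specified): the attribute value below contributes interpreted text, so the reference must be resolved before the grammar parses the line:

const attributes: Record<string, string> = { product: "*AsciiDoc*" }

// Inline preprocessor sketch: resolve attribute references, leaving unknown
// references untouched. A real preprocessor would also track source offsets.
const preprocess = (s: string): string =>
  s.replace(/\{(\w+)\}/g, (ref: string, name: string) => attributes[name] ?? ref)

const resolved = preprocess("try {product} today")
// "try *AsciiDoc* today" -- the strong markup now sits in the text, where the
// generated inline parser can interpret it; resolving after parsing would be
// too late to affect the boundaries of interpreted text.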
Why PEG?
A key question that might come up is: why PEG?
While we’re not saying that the inline syntax can only be parsed using a PEG grammar, PEG has thus far proven to be the most suitable. We don’t think, however, that defining the inline syntax using PEG grammar rules closes the door on using another parsing technology, and we’re happy to receive a proof of how it can be done with one. But we do want to focus on the reasons using PEG will allow us to proceed in a timely fashion.
In recent years, the Python language migrated from a homegrown parser to a PEG parser. In the enhancement proposal for this change (PEP 617), they clearly describe the strengths of PEG. We think the same reasoning applies well to the AsciiDoc syntax. There are a few key points we want to highlight.
- the way (a PEG grammar) is written more closely reflects how the parser will operate when parsing it
- a PEG parser will check if the first alternative succeeds and only if it fails, will it continue with the second or the third one in the order in which they are written
Since the goal of this specification is to clearly describe the AsciiDoc syntax, there’s a key advantage in the grammar being clear, hence point 1. But the critical point is point 2. In the inline AsciiDoc syntax, all sequences of characters are valid. Only some sequences have special meaning, and interpreted text activates additional rules. Thus, it’s essential that the parsing move from start to end, consider the next alternative if a rule fails, and allow a result of no interpreted text if no alternatives match. One could say that PEG is tailor-made for parsing AsciiDoc.
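A toy grammar (hypothetical, not the proposed rules) makes this behavior visible: the markup alternative is tried first, and when it fails (here, an unclosed mark), the parser falls back to plain text, so every sequence of characters is valid input:

import * as peggy from "peggy"

// Ordered choice: strong is attempted first; on failure, the same characters
// are reinterpreted as plain text instead of producing a parse error.
const parser = peggy.generate(`
  root = (strong / text)*
  strong = "*" value:$[^*]+ "*" { return { strong: value } }
  text = $([^*]+ / "*")
`)

console.log(parser.parse("a *b* and *unclosed"))
// -> [ "a ", { strong: "b" }, " and ", "*", "unclosed" ]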
In the blog post by Guillem Marpons cited above, Guillem identifies why a PEG or PEG-like grammar is required for parsing the inline syntax in AsciiDoc.
Exhibit
To conclude this proposal, we present an exhibit of an abbreviated PEG grammar, expressed using the peggy DSL, that demonstrates how to define the inline syntax in AsciiDoc. It demonstrates some of the nuance of consuming characters efficiently while being careful not to consume markup that must be interpreted, all without relying on semantic predicates. With the right rules and strategic use of lookaheads, the AsciiDoc inline syntax can be parsed using a parser generated from a PEG grammar. (And semantic predicates are always an option to fall back on where needed.)
root = node*
node = marked_text / macro / text
marked_text = code / emphasis / strong
code = unconstrained_code / constrained_code
emphasis = unconstrained_emphasis / constrained_emphasis
strong = unconstrained_strong / constrained_strong
macro = xref_shorthand / url_macro / ...
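// Each span tries its unconstrained (double-mark) form before its constrained (single-mark) form; the rules below spell this out for code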
unconstrained_code = pre:($wordy+)? '``' main:(contents:(!'``' @constrained_code / emphasis / strong / macro / unconstrained_code_text)+ '``')
unconstrained_code_text = $(wordy ('`' !'`' / '_' / '*')) / $(not_mark_or_space+ (space not_mark_or_space+)* (space+ / &'``')) / [^`]
constrained_code = '`' !space contents0:(unconstrained_code / emphasis / strong / macro / @'`' !wordy / constrained_code_text) contents1:(unconstrained_code / emphasis / strong / macro / constrained_code_text)* '`' !wordy
constrained_code_text = $(wordy* constrained_left_mark_in_code) / $(not_mark_or_space+ (space not_mark_or_space+)* &('`' !wordy)) / $(space+ (!'`' / &'``' &unconstrained_code / '`')) / @'`' &wordy / escaped / [^ `]
// repeat previous four rules for emphasis, strong, and mark
xref_shorthand = '<<' target:(!space @$[^,>]+) contents:(',' space? @(marked_text / xref_shorthand_text)+)? '>>'
url_macro = protocol:('link:' @'' / @('https://' / 'http://')) !space target:$[^\[]+ '[' contents:(marked_text / macro_contents_text)* ']'
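// text is the fallback rule: it efficiently consumes runs of ordinary characters, including marks that cannot open a span (e.g., a mark preceded by a word character)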
text = $(wordy* ('`' / '_' / '*')) / $(not_mark_or_space+ (space / ':'? !.))+ / space / escaped / .
escaped = '\\' ([`_*<] / $(wordy* ':'))
wordy = [\p{Alpha}0-9]
not_mark_or_space = [^ `_*:<\\]
constrained_left_mark_in_code = '_' / '*'
constrained_left_mark_in_emphasis = '`' / '*'
constrained_left_mark_in_strong = '`' / '_'
space = ' '
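Assuming the elided rules (the emphasis, strong, and mark variants, the remaining *_text helpers, and so on) are filled in, such a grammar can be handed directly to a parser generator. A minimal sketch with peggy (inline.peg is a hypothetical file name):

import * as peggy from "peggy"
import { readFileSync } from "node:fs"

// Generate a parser from the completed grammar and parse an inline fragment
// into structured data instead of applying substitutions to a string.
const inlineParser = peggy.generate(readFileSync("inline.peg", "utf8"))
console.log(JSON.stringify(inlineParser.parse("see the `docs` at <<ref>>")))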
Much of the specification work will be focused on what syntax, and which combinations of syntax, these rules permit.
There are notable absences in the grammar above (as it’s not yet complete). Among them are passthroughs and attribute references. Those will necessarily need to be handled in a preprocessing phase. A proposal to introduce that phase, and how to preserve the source ranges of inlines, will be covered in a separate issue.