Decide which grammar (or grammars) to use to formally describe the AsciiDoc language
The backbone of a proper specification for the AsciiDoc language is a formal grammar. A grammar moves the definition of the language out of implementation-specific code and into something that other implementations can reference and build on. We need to decide which grammar or grammars (that is, which notation system) we will use to formally describe the language.
Using a grammar to describe AsciiDoc as we know it is going to be challenging. It's easy, albeit naive, to assume that we can use EBNF to describe the language. The implementation-defined approach leading up to this specification effort has left some aspects of the language intimately coupled with how it's parsed and, in many cases, has made the language context sensitive (as opposed to context free, as many grammar formalisms require). Existing grammars and parsing formalisms may not apply well to the AsciiDoc language, and we need to determine how well they actually fit. That may mean handling different parts of the language with different grammars. It may also mean making hard decisions about which aspects of the language must change in order for it to be formally defined. Answering these questions will take research.
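To make the context-sensitivity concern concrete, here is a minimal sketch in Python. It assumes one commonly cited behavior: a delimited listing block opens with a line of four or more hyphens and closes only on a line that repeats the opening delimiter exactly. The names `LISTING_DELIMITER` and `find_listing_blocks` are hypothetical and not taken from any AsciiDoc implementation.

```python
import re

# Hypothetical pattern for illustration: a listing delimiter is assumed
# to be a line made up of four or more hyphens and nothing else.
LISTING_DELIMITER = re.compile(r"^-{4,}$")


def find_listing_blocks(lines):
    """Return (start, end) line-index pairs for listing blocks.

    Assumption: a block closes only on a line identical to the line
    that opened it, so the scanner has to remember the opening line.
    """
    blocks = []
    open_index = None
    open_delimiter = None  # context carried forward from the opening line
    for index, line in enumerate(lines):
        if open_delimiter is None:
            if LISTING_DELIMITER.match(line):
                open_index, open_delimiter = index, line
        elif line == open_delimiter:
            blocks.append((open_index, index))
            open_index = open_delimiter = None
    return blocks


# The shorter "----" line is content here, because it does not repeat
# the six-hyphen line that opened the block.
sample = ["------", "puts 1", "----", "puts 2", "------"]
blocks = find_listing_blocks(sample)
```

The scanner cannot decide whether `----` is a closing delimiter or ordinary content without knowing which line opened the block. Matching an opener and closer of the same but unbounded length is the kind of rule that plain EBNF cannot state directly, which is one reason a single context-free grammar may not be enough.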
The purpose of this issue is to select a strategy for using a grammar (or grammars) to describe the AsciiDoc language. It should identify the challenges in using existing formalisms, outline possible approaches, and explain why the proposed strategy should be selected. This is also an opportunity to settle on correct terminology for this aspect of the project.
We're working from the position of retaining compatibility with existing content, but we need to move towards a formal grammar that can describe and validate that compatibility, particularly across implementations.