Clarify how styled paragraphs are parsed and transformed
The AsciiDoc Language currently permits a paragraph to be promoted to a permissible named block by specifying that name as the block style. This is referred to as a styled paragraph. Consider this example:
[quote]
A quote block.
Now consider this example:
[source]
A source block.
The way this is implemented (according to the initial contribution) presents challenges for a language formalism for two reasons:
- The style influences how the ensuing lines are parsed
- The parser generates a named block which has lines instead of a child paragraph (when applicable)
The first challenge is the real obstacle to a language formalism: allowing the style to modify the parsing rules is extremely difficult to express in a grammar. Thus, I'd like to propose a different approach that is simpler to understand and carries little risk of breaking compatibility with existing documents.
There are two parsing models for non-enclosed, non-marked lines: paragraph and literal paragraph. The former is a contiguous group of lines that ends at an empty or interrupting line. The latter is a contiguous group of indented lines that ends at an empty line (no interrupting lines are possible). The latter then drops the uniform indentation at the start of each line.
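To make the two models concrete, here is a rough sketch in Python. It is only an illustration, not how any existing parser is implemented. The function name `parse_unmarked_block`, the dictionary shape of the result, and the `INTERRUPTING` pattern (which only recognizes a few block delimiters) are all assumptions made for this example.

```python
import re

# Hypothetical stand-in for "interrupting line": only a few block delimiters.
INTERRUPTING = re.compile(r"^(-{4,}|={4,}|\*{4,}|_{4,})$")


def parse_unmarked_block(lines, start=0):
    """Parse one paragraph or literal paragraph starting at lines[start].

    Returns (node, next_index). Assumes lines[start] is not empty.
    """
    # An indented first line selects the literal paragraph model.
    literal = lines[start].startswith((" ", "\t"))
    collected = []
    i = start
    while i < len(lines) and lines[i].strip() != "":
        # Only a regular paragraph can be interrupted.
        if not literal and collected and INTERRUPTING.match(lines[i]):
            break
        collected.append(lines[i])
        i += 1
    if literal:
        # Drop the indentation shared by every line.
        indent = min(len(line) - len(line.lstrip()) for line in collected)
        collected = [line[indent:] for line in collected]
    name = "literal" if literal else "paragraph"
    return {"name": name, "lines": collected}, i


paragraph, _ = parse_unmarked_block(["first", "second", "----", "x"])
literal, _ = parse_unmarked_block(["  a", "    b", "", "c"])
```

Note that neither branch looks at the block style. The style only matters later, once the lines have been collected.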
Thus, a styled paragraph is actually a transformation that occurs after parsing of the block is complete. A paragraph is parsed as a paragraph. A literal paragraph is parsed as a literal paragraph. Then the paragraph is promoted to a named block in the rule action. A styled paragraph for a verbatim block does not have to be written as a literal paragraph, but writing it that way is necessary to avoid any interpretation of the lines. Thus, the second example above becomes:
[source]
 A source block.
Which is equivalent to:
[source]
----
A source block.
----
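The promotion step for a verbatim block could look roughly like the following Python sketch. The names `promote_verbatim` and `parse_delimited_listing`, the node shape, and the `VERBATIM_STYLES` subset are hypothetical and chosen for illustration. In particular, using the style as the block name is a simplification.

```python
VERBATIM_STYLES = {"listing", "literal", "source"}  # illustrative subset


def promote_verbatim(parsed, style):
    """Rule action: runs after the paragraph has already been parsed."""
    if style not in VERBATIM_STYLES:
        return parsed
    # One block in, one block out: the lines carry over unchanged.
    return {"name": style, "lines": list(parsed["lines"])}


def parse_delimited_listing(lines, style):
    """Long form: the lines between a pair of ---- delimiters, taken verbatim."""
    if lines[0] != "----" or lines[-1] != "----":
        raise ValueError("expected a block enclosed in ---- delimiters")
    return {"name": style, "lines": lines[1:-1]}


# The literal paragraph has already been parsed and had its indentation dropped.
short_form = promote_verbatim({"name": "literal", "lines": ["A source block."]}, "source")
long_form = parse_delimited_listing(["----", "A source block.", "----"], "source")
assert short_form == long_form
```

The point of the sketch is that the style is consulted only in the action, so the grammar rule for the lines stays the same regardless of the style.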
That brings us to the second challenge. A styled paragraph is just a shorthand way of writing a named delimited block. Thus, once the parser has found a styled paragraph, it should generate the same result as if the block had been written in the long form. For a verbatim block like a source block, this transformation is obvious since a verbatim block cannot have any child blocks. (You had one block, you get one block.) However, a compound block like a sidebar block is more complex.
What we propose is that the paragraph be added as a child of the generated named block. (You had one block, you get two blocks.) However, all metadata on the paragraph will be moved to the parent block. Thus, the paragraph will have no metadata (such as attributes or options). If the writer wants metadata on both the named block and the paragraph, it will be necessary to write the block in long form (as a delimited block with a child paragraph). (Maybe we can hold back certain attributes/options, such as the lead role, text alignment roles, or the hardbreaks option.)
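As a rough illustration of the compound case, the following Python sketch wraps the parsed paragraph in the named block and moves all metadata to the parent. The function name `promote_compound`, the node shape, and the `COMPOUND_STYLES` subset are assumptions for this example. It does not attempt the optional idea of holding back certain roles or options on the child.

```python
COMPOUND_STYLES = {"example", "quote", "sidebar"}  # illustrative subset


def promote_compound(paragraph, style):
    """Wrap a parsed paragraph in a named block; metadata moves to the parent."""
    if style not in COMPOUND_STYLES:
        return paragraph
    # The child paragraph keeps its lines but carries no metadata.
    child = {"name": "paragraph", "metadata": {}, "lines": list(paragraph["lines"])}
    # One block in, two blocks out: the named parent and its child paragraph.
    return {
        "name": style,
        "metadata": dict(paragraph.get("metadata", {})),
        "blocks": [child],
    }


styled = {
    "name": "paragraph",
    "metadata": {"id": "aside", "roles": ["wide"]},
    "lines": ["A sidebar block."],
}
block = promote_compound(styled, "sidebar")
```

Here `block` is a sidebar that owns the `id` and `roles`, and its single child paragraph has empty metadata, which matches what the long form with an undecorated paragraph would produce.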