Clarify syntax and parsing rules for continuing an attribute entry value across multiple lines
Most of the time, an attribute entry occupies a single line. For example:
:source-language: java
When the value is very long, the AsciiDoc syntax allows that value to be split across multiple lines by ending every line except the last with a backslash, called an attribute continuation. This feature is inspired by shell interpreters, such as Bash. For example:
:description: This page is a migration guide. \
It only covers the migration between each LTS release.
The attribute continuation has never been very well defined beyond a basic example. This issue aims to resolve the syntax and parsing rules while also making the feature more robust and universal.
The attribute continuation serves two purposes. First, it tells the parser to append the next line to the value as long as that line is not an interrupting line. If the line is taken, the continuation and the newline that follows it are dropped. If the line is not taken, the continuation is preserved (meaning it remains as part of the value), but not the trailing newline.
Thus, the resolved value of the previous example is as follows:
This page is a migration guide. It only covers the migration between each LTS release.
NOTE: In addition to the interrupting lines for a paragraph, an attribute entry is interrupted by an adjacent attribute entry. Asciidoctor does not always get this requirement right. (Also, it's still unclear whether a list continuation should only be an interrupting line when inside of a list, or at any time).
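To make the basic rule concrete, here is a minimal sketch in Python. It is not how Asciidoctor implements this. The names `resolve_value` and `is_interrupting` are hypothetical, and the interrupting-line check is a deliberate simplification that only recognizes an empty line and an adjacent attribute entry. Escaping and indentation are ignored here.

```python
import re

# Assumption: a simplified attribute entry pattern, used only to detect an
# adjacent attribute entry. The real grammar is more involved.
ATTRIBUTE_ENTRY = re.compile(r"^:!?\w[\w-]*!?:(\s|$)")


def is_interrupting(line):
    # Assumption: only empty lines and attribute entries interrupt.
    # The full set of paragraph interrupting lines is larger.
    return line.strip() == "" or ATTRIBUTE_ENTRY.match(line) is not None


def resolve_value(first, following):
    """Resolve a value from its first line and the lines that follow it."""
    value = first
    for line in following:
        if not value.endswith("\\") or is_interrupting(line):
            # The next line is not taken, so an unused continuation stays
            # in the value as a literal backslash.
            break
        # The line is taken: drop the continuation and the newline.
        value = value[:-1] + line
    return value
```

Called with the first line `This page is a migration guide. \` and the next line from the earlier example, this returns the single-line value shown above. Called with a value that ends in a backslash followed by an empty line, it returns the value with the backslash intact.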
Both Asciidoctor and its predecessor required the attribute continuation to be preceded by a space. However, this is an unnecessary requirement and it makes it impossible to continue the value without introducing a space. It should be possible to use the continuation directly at the end of the line.
:product-code: ISV-\
1234
This attribute entry would produce the value ISV-1234.
Any time we rely on a character to have special meaning, especially a backslash, it should be possible to escape that character. Like with the inline preprocessor, we will want to apply contextual escaping here. What that means is that if there is an even number of backslashes at the end of the line, the last backslash does not act as an attribute continuation and those backslashes are reduced by half. If there is an odd number, the last backslash is an attribute continuation and the remaining backslashes are reduced by half. Escaped backslashes anywhere else in the line are not considered.
Here's an example of how to use a literal backslash at the end of a value:
:instructions: escape markup using \\
However, keep in mind that most of the time this won't be necessary. That's because the backslash is preserved if the attribute entry is interrupted, which it almost always is. So this is unlikely to affect existing documents. Consider this case:
:instructions: escape markup using \

{instructions}
Here's an example of how to use a literal backslash and then continue the value:
:instructions: escape an autolink using \\\
https://example.org
Again, these are pretty rare events, so we're just defining the rules for completeness.
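The even/odd rule above can be sketched as a small Python function. This is only an illustration of the proposed rule under my reading of it; the name `split_continuation` is hypothetical. It looks at the trailing run of backslashes only, so backslashes elsewhere in the line are left alone.

```python
def split_continuation(line):
    """Return (text, continued) for one line of an attribute value."""
    stripped = line.rstrip("\\")
    count = len(line) - len(stripped)  # length of the trailing backslash run
    # Each pair of trailing backslashes is reduced to one literal backslash.
    text = stripped + "\\" * (count // 2)
    # An odd count leaves one unpaired backslash, which is the continuation.
    continued = count % 2 == 1
    return text, continued
```

For a line ending in two backslashes, this yields the text with one literal backslash and no continuation. For a line ending in three, it yields the text with one literal backslash and a continuation. Restoring the backslash when the next line turns out to be an interrupting line is left to the caller and is not shown here.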
An attribute continuation allows the continued value to be aligned with the value on the previous line. Yet, the indentation is dropped from the value. Consider this case:
:description: This page is a migration guide. \
              It only covers the migration between each LTS release.
Shell interpreters also support this feature. In shell interpreters, the repeating spaces are always normalized to a single space. However, I don't think we want that behavior. Instead, all leading indentation should be removed and only the space to the left of the attribute continuation should be kept. That gives the user better control over where the space ends up in the resolved value.
Of course, we have to consider whether we even want to normalize the spaces at all or just keep them as entered. In other words, do we want to encourage this style of formatting in the AsciiDoc source, or should the wrapped line always start at the left margin?
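If we do go with removing the indentation, the behavior described above could look like the following Python sketch. The name `join_wrapped` is hypothetical, and the function assumes the lines have already been identified as belonging to one continued value. Escaping is not handled.

```python
def join_wrapped(lines):
    """Join the lines of a continued value, dropping indentation on wrapped lines."""
    value = ""
    for index, line in enumerate(lines):
        if index > 0:
            # Remove all leading indentation from a wrapped line.
            line = line.lstrip(" \t")
        if line.endswith("\\") and index < len(lines) - 1:
            # Drop the continuation, but keep any space to its left so the
            # author decides where the space ends up in the value.
            line = line[:-1]
        value += line
    return value
```

Note that the repeated spaces are not collapsed, which is where this differs from what a shell does. Only the space the author typed in front of the continuation survives.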
The final point to consider is how to specify a hard wrap. Consider the case when the value of the attribute entry is going to be used in a verbatim block or a paragraph with the hardbreaks option. The author is going to want to be able to preserve the newlines in the attribute value so that they carry over. But this is not possible in AsciiDoc.
Asciidoctor offers a partial compromise by enhancing the attribute continuation to recognize a hard line break shorthand before the continuation. When Asciidoctor detects this case, it preserves the newline. Consider this case:
:lines: one + \
two + \
three
The resolved value would be as follows:
one +
two +
three
This is not a general-purpose feature, and thus I think we can do better. I see three possible ways to express that the newline should be preserved, and there's no need to link it to the hard line break shorthand.
The first option is to use a double attribute continuation offset by a space. For example:
:lines: one\ \
two\ \
three
The escaped space in front of the attribute continuation would tell the processor to keep the newline after the attribute continuation. This is not likely a syntax that would interfere with content. However, it may be costly to parse.
Another option is to take a page from YAML and use the | character in front of the continuation as a hint to keep the newline.
:lines: one|\
two|\
three
However, the risk here is that the pipe character is used to separate table cells, so it could cause an AsciiDoc table cell to end prematurely. Though it could be escaped in that case.
Yet another option is to take a hint from Markdown and use multiple spaces in front of the continuation as a hint to preserve the ensuing newline.
:lines: one  \
two  \
three
This may be the safest and most portable option, and it's not terribly difficult to parse. It's rare that you need spaces at the end of a line, so we're able to take advantage of characters that would otherwise have no meaning. That's ideal for introducing a new feature. When newlines are preserved, indentation on wrapped lines is also preserved.
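Here is a rough Python sketch of how the multiple-space hint might be parsed. The name `join_preserving_newlines` is hypothetical. One detail is an assumption on my part, since it isn't settled above: the spaces that form the hint are dropped from the value, leaving only the newline. Escaping is not handled.

```python
def join_preserving_newlines(lines):
    """Join continued lines, keeping the newline when the hint is present."""
    value = ""
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        if is_last or not line.endswith("\\"):
            # Nothing more to continue; take the line as is.
            value += line
            break
        body = line[:-1]  # drop the attribute continuation
        if body.endswith("  "):
            # Two or more spaces before the continuation: keep the newline.
            # Assumption: the hint spaces themselves are removed.
            value += body.rstrip(" ") + "\n"
        else:
            # Ordinary continuation: the newline is dropped.
            value += body
    return value
```

Indentation on wrapped lines is left untouched in this sketch, in line with keeping indentation when newlines are preserved. A single space before the continuation still behaves as an ordinary continuation.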
We can apply this to the earlier example of the partial syntax offered by Asciidoctor to see how it compares:
:lines: one +  \
two +  \
three
It's nearly the same syntax, but now it's not coupled to the hard line break shorthand.
In summary:
- An attribute entry is interrupted by an adjacent attribute entry or a paragraph interrupting line
- An attribute value can be continued to the next line by ending the line in an attribute continuation (trailing backslash)
- If the attribute continuation is unused, it is preserved at the end of the value
- Indentation is removed from wrapped lines
- The attribute continuation can be escaped using a backslash (any even number of backslashes at the end of the line)
- Newlines in an attribute value can be preserved by preceding the attribute continuation with two spaces