The task at hand is to determine the appropriate format and schema for the TCK input. Currently, the format being used is plain text, which contains only the content of the main AsciiDoc document. While this format may suffice for basic testing scenarios, it won't be suitable for testing more complex cases, such as an AsciiDoc document with includes. In such cases, it would be necessary to have an array of files, including their paths and contents.
It may also be useful to include contextual data in the input, such as the TCK version. Since JSON is already being used as the output format for the TCK, it might be a good idea to use JSON as the format for the TCK input as well; that would keep processing of the input and output consistent.
Using JSON for the input format also makes sense since we are considering adding a client/server interface (in addition to the existing stdin/stdout interface). JSON is a ubiquitous format in client/server data exchange. I believe that using JSON for the input and output format would make it easier to implement a TCK adapter/server (i.e., it's fairly common to implement a server that consumes JSON and returns JSON).
There are some open questions that need to be addressed when deciding on the format and schema of the TCK input. For instance, should the configuration be included, and should information about the test case (such as test name, test suite, specification reference, etc.) also be included?
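For illustration, a hypothetical input packet along these lines could cover the include case (the key names here are placeholders, not a settled schema):

```json
{
  "tck": { "version": "1.0.0-alpha.1" },
  "contents": "= Main\n\nThis is the main document.\n\ninclude::chapter1.adoc[]\n",
  "files": [
    { "path": "chapter1.adoc", "contents": "= Chapter 1\n\nSome content.\n" }
  ]
}
```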
it might be a good idea to consider using JSON as the format for the TCK input.
I like this idea.
For instance, should the configuration be included, and should information about the test case (such as test name, test suite, specification reference, etc.) also be included?
When writing test cases for a TCK, you usually want to provide as little information as possible so implementations are not tempted to cheat the test. And you certainly don't want to expose the expected data.
I think it's reasonable to provide enough information to develop a mock/stub adapter that can echo the expected data back. That sort of defeats the purpose of what I just stated, but perhaps we can provide just enough information that allows that adapter to rely on internal assumptions without making it super obvious how to find the ASG file for a real adapter.
When considering how to pass input files, I still think the harness should read them and provide the contents as a string. We don't want adapters reading files out of the tests folder.
What I'm thinking is that the primary file is represented using top-level keys, and additional files are stored in a separate key. That way, it's clear which file is the main file, which is all the processor needs to know.
What I'm not sure about is how to handle tests that are sensitive to the directory in which the processor is running. We may end up needing something like startDir to control that, but we'll have to see as those tests come up.
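A rough sketch of that shape (the key names are illustrative only):

```json
{
  "file": "main.adoc",
  "contents": "= Main\n\nThis is the main document.\n\ninclude::chapter1.adoc[]\n",
  "files": [
    { "file": "chapter1.adoc", "contents": "= Chapter 1\n\nSome content.\n" }
  ]
}
```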
When considering how to pass input files, I still think the harness should read them and provide the contents as a string. We don't want adapters reading files out of the tests folder.
Furthermore, we shouldn't assume that an implementation/adapter can read files from the filesystem.
What I'm thinking is that the primary file is represented using top-level keys, and additional files are stored in a separate key. That way, it's clear which file is the main file, which is all the processor needs to know.
We should probably use path instead of file if we want to test relative paths:
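For example (again, purely illustrative):

```json
{
  "path": "main.adoc",
  "contents": "= Main\n\ninclude::chapters/chapter1.adoc[]\n",
  "files": [
    { "path": "chapters/chapter1.adoc", "contents": "= Chapter 1\n\nSome content.\n" }
  ]
}
```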
We will also want a key for processor options, one of which is attributes to pass to the API:
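Something like this, where the option and attribute names are placeholders:

```json
{
  "path": "main.adoc",
  "contents": "The product is {product-name}.\n",
  "options": {
    "attributes": { "product-name": "AsciiDoc" }
  }
}
```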
That's a good point.
Processor options will be defined in the -config.json file?
I think it's still reasonable to pass the test name and test context for the purpose of the stub adapter:
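For instance (field names are hypothetical):

```json
{
  "name": "include-simple",
  "context": { "suite": "block/include" },
  "path": "main.adoc",
  "contents": "= Main\n\ninclude::chapter1.adoc[]\n"
}
```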
What I'm not sure about is how to handle tests that are sensitive to the directory in which the processor is running. We may end up needing something like startDir to control that, but we'll have to see as those tests come up.
Processor options will be defined in the -config.json file?
Yes. Not all configuration options will be passed (since some might be to configure how the harness prepares the source for the test), but all configuration will be encapsulated in that file.
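As a purely speculative illustration of what such a -config.json could contain (the harness key in particular is an assumption):

```json
{
  "options": {
    "attributes": { "product-name": "AsciiDoc" }
  },
  "harness": {
    "startDir": "."
  }
}
```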
We might need to pass absolute paths:
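For instance, a file entry such as (hypothetical):

```json
{ "path": "/opt/docs/shared/attributes.adoc", "contents": "..." }
```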
I don't love that idea. But what we could consider is that they are root-relative paths...meaning paths starting from where the primary file is located. But since I haven't tried to make a test for includes yet, it's hard to imagine exactly what we need.
I think we should keep the AsciiDoc files in the 'tests' folder and encode them into JSON format within the TCK.
We have to be careful here because they do need to be stable/shippable artifacts. The reason they are static is because it represents a file we can point to and agree on that's the expected result. I have no problem automating this process to make updates, but I think these files do need to be committed.
We have to be careful here because they do need to be stable/shippable artifacts. The reason they are static is because it represents a file we can point to and agree on that's the expected result.
I'm referring to the input files but your point still stands.
I have no problem automating this process to make updates, but I think these files do need to be committed.
I'm OK with that. My point is that it's easier to understand a test case with AsciiDoc files (especially for advanced test cases with includes). If everything is encoded in a JSON file then it becomes a bit tedious to decipher the test case (since the AsciiDoc contents will be encoded as a single line of text).
Alternatively, we could provide a test case pretty-printer:
$ asciidoc-tck pp block/include/simple
main.adoc<<EOF
= Main
This is the main document.
include::chapter1.adoc[]
EOF
chapter1.adoc<<EOF
= Chapter 1
Some content.
EOF
it's probably better to have an npm task to generate the JSON input file from the AsciiDoc file(s).
Oh, for sure. I should clarify that this is how I was making the test files. I'm just doing it with a different project, the parsing lab (because we actually need some sort of implementation to generate them).
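A minimal sketch of what such a task could run, assuming one directory per test case with the primary file named main.adoc (the script name, layout, and key names are all assumptions):

```js
// scripts/build-input.js (hypothetical): bundle the .adoc files of a test case into a JSON input packet
const fs = require('node:fs')
const path = require('node:path')

const testDir = process.argv[2] // e.g. tests/block/include/simple
const primary = 'main.adoc' // assumed name of the primary file

const adocFiles = fs.readdirSync(testDir).filter((name) => name.endsWith('.adoc'))
const packet = {
  path: primary,
  contents: fs.readFileSync(path.join(testDir, primary), 'utf8'),
  files: adocFiles
    .filter((name) => name !== primary)
    .map((name) => ({ path: name, contents: fs.readFileSync(path.join(testDir, name), 'utf8') })),
}

fs.writeFileSync(path.join(testDir, 'input.json'), JSON.stringify(packet, null, 2) + '\n')
```

This could then be wired up as an npm script that the TCK build invokes for each test directory.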
I was giving this some more thought and I concluded that there are really three ways to send a bundle of files to the implementation's adapter to test:
1. Use a stream with a proprietary header, body, and file terminator (cue old school line processing)
2. Bundle the files into a JSON packet
3. Write files to a temporary directory and pass that directory as the input
I think the first one puts too much burden on the implementation. Since the TCK expects JSON as output, the implementation will already be doing JSON things, so it's better to go with something standard. That's why I like (2). The third option breaks down if the TCK and implementation are on different machines, which is permitted by having a client/server architecture. And the implementation always has the option of writing the files from the JSON packet to a temporary directory, so it's not like (3) is unattainable. It's just not the correct choice over the wire.
An interesting question is how the files should be stored. In the repository, I think they should be stored as separate files. The TCK build would then prepare these files into the JSON packets. That way, as you said, it's easier to add tests. But we could always add tests now in JSON format, then come up with a way to generate that data from files in the future. We probably just need to go with what feels right because ultimately it becomes a maintenance issue.
I think the first one puts too much burden on the implementation. Since the TCK expects JSON as output, the implementation will already be doing JSON things, so it's better to go with something standard. That's why I like (2). The third option breaks down if the TCK and implementation are on different machines, which is permitted by having a client/server architecture. And the implementation always has the option of writing the files from the JSON packet to a temporary directory, so it's not like (3) is unattainable. It's just not the correct choice over the wire.
I agree
An interesting question is how the files should be stored. In the repository, I think they should be stored as separate files. The TCK build would then prepare these files into the JSON packets. That way, as you said, it's easier to add tests. But we could always add tests now in JSON format, then come up with a way to generate that data from files in the future. We probably just need to go with what feels right because ultimately it becomes a maintenance issue.
Yes, I think we should store them as separate files.