RE: Backus-Naur descriptions for STAR and CIF
- Subject: RE: Backus-Naur descriptions for STAR and CIF
- From: "Bollinger, John Clayton" <jobollin@xxxxxxxxxxx>
- Date: Wed, 17 May 2000 20:19:18 +0100 (BST)
Richard Ball wrote:

> Isn't the 80 chars/line something that is only relevant for CIF writing
> (including dictionary creating) programs? The BNF spec. for the parser
> doesn't need to worry about it since the parsing of the incoming file can be
> record length independent. Once the stream has been tokenized then such
> additional restrictions as length of datanames or datavalues can be applied
> and error conditions raised. Or am I missing something?

You are missing two things; or actually, two sides of the same thing.

1) Part of the purpose of BNF is to provide an authoritative description of exactly which constructs are valid "utterances" of the language described. It is not just a description of how to write a parser for that language that provides the expected answers in a particular software context. Why might we want such a thing? Exactly for the reason that Nick provided: English prose, though highly expressive, is open to misinterpretation. Forgetting completely about software for a moment, we humans have to be able to agree about exactly what is and what is not a valid CIF. If the full CIF specification could be expressed in BNF then there would be no room for doubt on any question of the validity of any CIF. Once the humans agree, the BNF has the secondary benefit of being useful as a guide for writing parsing software.

2) Because a putative CIF with one or more lines longer than 80 characters is not valid, a correct parser must be able to identify it as erroneous. If the line-length restriction cannot be formulated in BNF then no correct CIF parser can be written from any BNF-only description.

It will be clear now that I have come around in my thinking about whether a full BNF description of CIF would be desirable. I definitely think it would be. For instance, I suspect that there may be some disagreement on the correct answers to these questions: if a line in a putative CIF contains a space character at position 81, then can it be a valid CIF?
The accepted interpretation of the 80 char/line rule seems to allow line termination characters past position 80; are line termination characters special in that regard, or does the same apply to all whitespace characters? If the former, then does that mean that if a line in my file ends with a [CR][LF] pair, with the [CR] at position 81, then that file is a valid CIF on some platforms but not on others? A full BNF description would answer these questions.

Or how about an example where the existing BNF provides an answer? If Nick's latest BNF for CIF were accepted as authoritative, then I could say with absolute certainty that vcif handles Brian's ciftest10 test file incorrectly by interpreting the ^Z at the end as a data value in the preceding loop. It is absolutely clear from that BNF that a ^Z character cannot be part (or all) of a data value. I would argue for that being the correct interpretation regardless, but clearly Brian thought otherwise: he wrote comments to that effect into the file.

As a side note, when it comes to checking line length restrictions after tokenizing the stream, you have to be exceedingly careful to get it right. You must pay attention to the positions of newlines, of course, but you must also count all the whitespace (in a way consistent with the correct answers to the above questions). It does not suffice to just look at lengths of data name and value strings, because it is without doubt the case that a line consisting of the tag '_t', 78 spaces, and the value '?' (for example) is not valid in a CIF. You also have to pay special attention to lines in a text block, which are no exception to the 80 char/line rule.

Gee, this got pretty lengthy. Sorry about that.

Regards,

John
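
The side note above on line-length checking can be sketched in Python. This is only a minimal sketch, not taken from vcif or any other CIF tool: the function name `overlong_lines`, its parameters, and the choice of CR, LF, and CR LF as the only line terminators are assumptions made for illustration. Whether trailing blanks count toward the limit is precisely the open question raised above, so it is exposed as a switch rather than decided. The check runs on raw lines, so text-block lines are covered as well.

```python
import re

MAX_LINE_LENGTH = 80  # the limit under discussion in this thread


def overlong_lines(text, limit=MAX_LINE_LENGTH, ignore_trailing_whitespace=False):
    """Return (line_number, length) pairs for lines longer than `limit`.

    Assumption: only CR LF, CR, and LF terminate a line, and terminators
    are never counted. Trailing spaces and tabs count unless
    `ignore_trailing_whitespace` is set.
    """
    problems = []
    # Split on raw lines, not tokens, so interior whitespace is counted.
    for number, line in enumerate(re.split(r"\r\n|\r|\n", text), start=1):
        if ignore_trailing_whitespace:
            line = line.rstrip(" \t")
        if len(line) > limit:
            problems.append((number, len(line)))
    return problems


# The example from the message: tag '_t', 78 spaces, value '?'.
# Both tokens are short, but the line itself is 81 characters long.
example = "_t" + " " * 78 + "?\n"
print(overlong_lines(example))
```

A check that looked only at the lengths of the tokens `_t` and `?` would accept this line, which is the pitfall described in the side note.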