RE: Revised draft of CIF 1.1 syntax document
- Subject: RE: Revised draft of CIF 1.1 syntax document
- From: "Bollinger, John Clayton" <jobollin@xxxxxxxxxxx>
- Date: Mon, 16 Sep 2002 17:50:46 +0100 (BST)
> (1) Is a useful purpose served by permitting the use of the
> STAR word stop_ to indicate the end of a loop or loop header?
> Since CIF does not use nested loops, its use for this purpose
> is unnecessary. On the one hand it permits the general usage
> of the STAR stop_ directive in CIFs and (arguably) is of help
> in data recovery from broken CIFs; on the other hand it
> requires parsers to accommodate a new directive and track the
> context accordingly.

stop_ is of no help in recovering broken CIFs if it is not used, and if it is not required then there is no reason to expect that many broken CIFs will use it. This would change if some widely-used CIF generating software started putting in stop_ automatically, of course. I think the issue of recovering broken CIFs is at best a wash, however, because introducing syntactic significance to stop_ also introduces new ways to break CIFs and new modes of ambiguity in broken CIFs.

In a well-formed CIF the additional semantic content of a stop_ is absolutely zero, and in an invalid CIF the semantic content is (necessarily) undefined. Introducing it as an option requires compliant parsers to support it, for little or no additional value. I still fail to see why adding it is even being considered.

> (2) Should the value of a semicolon-delimited text field
> include the final end-of-line? If so, the following two cases
> have identical values: 'foo' and ;foo ; If not, the values
> are different: 'foo' in one case, 'foo\n' in the other.

I still have not seen an answer to my question about what STAR specifies for the construct. Is this an open question on that front as well? If so, then can we assume that the same choice made for CIF will be made for STAR?

As I wrote earlier, the 1994 STAR paper seems pretty clearly (to me) to indicate that the trailing newline is part of the quoted material. (It emphasizes that the construct quotes one or more _lines_ of text.) Are we contemplating a departure from STAR compatibility?
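To make the two candidate readings of question (2) concrete, here is a minimal sketch in Python. The function name text_field_value and the keep_final_eol flag are hypothetical, invented for this illustration only; this is not code from vcif or any other parser. It assumes the field is passed as one string that starts with the opening semicolon and ends with an end-of-line followed by the closing semicolon.

```python
def text_field_value(field: str, keep_final_eol: bool) -> str:
    """Return the value of a semicolon-delimited text field.

    `field` is assumed to be the raw text from the opening ';' through
    the closing ';' (for example ";foo\n;"). `keep_final_eol` selects
    which of the two interpretations under discussion is applied.
    """
    if not (field.startswith(";") and field.endswith("\n;")):
        raise ValueError("not a semicolon-delimited text field")
    # Drop the opening and closing semicolons; the body still ends
    # with the end-of-line that precedes the closing ';'.
    body = field[1:-1]
    # Either keep that final end-of-line or strip it.
    return body if keep_final_eol else body[:-1]


with_eol = text_field_value(";foo\n;", keep_final_eol=True)
without_eol = text_field_value(";foo\n;", keep_final_eol=False)
```

With the final end-of-line kept, the value is 'foo\n' and differs from the quoted string 'foo'; with it stripped, the two are equal.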
Also, although apparently both interpretations have been used in various people's parsers, vcif includes the newline in the quoted material, and vcif is the closest to a reference 1.0 parser implementation that we have.

I continue to view general-purpose multiline quoting as the role of the -- now deferred -- square-bracket quoting mechanism. Semicolon-delimited text blocks do not need to serve that role. It might have been nice if they had originally been defined that way, but by my reading they were not, and it's too late to change that now.

Thanks for the various clarifications and implementation notes. They help.

[...]

> Also the discrete reserved words loop_, stop_ and global_ are
> itemised in a separate table from that describing forbidden
> unquoted substrings at the start of a data value.

So now these are only reserved as complete tokens rather than as initial substrings? Good.

[...]

> Para 42. Discussion of ways to handle machine-dependent <eol>
> across common platforms is prefaced with the header
> "Implementation note:".

Very well. I still read the original specifications to require that all three of the common line termination conventions be supported, but evidently that is not the prevailing opinion. That being the case, I do appreciate the specification of the implementation note. I assume that the same interpretation prevails for STAR?

[...]

> Para. 59. Copied productions for <Exponent>, <UnsignedInteger>
> and <Digit> as given in Appendix A summary table.

In both the Appendix A summary and paragraph 59, the productions for <Number> and <Float> are ambiguous (any text that matches <Number> also matches <Float>). In addition, text of the form 1e5 does not match <Number>, although it is valid in all programming languages I know that support scientific notation. Both issues would be resolved by changing the first alternative of the <Float> production from just <Integer> to <Integer><Exponent>.

Regards,

John Bollinger

--
John C. Bollinger, Ph.D.
Indiana University
Molecular Structure Center
jobollin@indiana.edu
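
The <Number>/<Float> point above can be checked mechanically. The following is a rough sketch in Python, not the draft's actual grammar: the regular expressions are assumptions reconstructed from the description in this message (a <Float> whose first alternative is a bare <Integer>, with an optional exponent only on the decimal-point forms), and the names FLOAT_DRAFT, FLOAT_PROPOSED and number_alternatives are hypothetical.

```python
import re

# Assumed building blocks, modelled on the production names in the draft.
DIGIT = r"[0-9]"
UNSIGNED = rf"{DIGIT}+"
INTEGER = rf"[+-]?{UNSIGNED}"
EXPONENT = rf"[eE][+-]?{UNSIGNED}"
# Forms containing a decimal point, with an optional exponent.
DECIMAL = rf"[+-]?(?:{DIGIT}*\.{UNSIGNED}|{DIGIT}+\.)(?:{EXPONENT})?"

# Assumed draft form: first alternative of <Float> is just <Integer>.
FLOAT_DRAFT = rf"(?:{INTEGER}|{DECIMAL})"
# Proposed form: first alternative becomes <Integer><Exponent>.
FLOAT_PROPOSED = rf"(?:{INTEGER}{EXPONENT}|{DECIMAL})"


def number_alternatives(text: str, float_pattern: str) -> list[str]:
    """List which alternatives of <Number> ::= <Integer> | <Float>
    match the whole of `text` under the given <Float> pattern."""
    matched = []
    if re.fullmatch(INTEGER, text):
        matched.append("Integer")
    if re.fullmatch(float_pattern, text):
        matched.append("Float")
    return matched


for sample in ("42", "1e5", "1.5e3"):
    print(sample,
          number_alternatives(sample, FLOAT_DRAFT),
          number_alternatives(sample, FLOAT_PROPOSED))
```

Under these assumed patterns, "42" matches both alternatives in the draft form but only <Integer> in the proposed form, and "1e5" matches nothing in the draft form but matches <Float> in the proposed form.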