Re: Problems with CIF BNF
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIFStandard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: Problems with CIF BNF
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Mon, 12 Mar 2007 22:12:08 -0400
- In-Reply-To: <45F5B86D.7090100@niehs.nih.gov>
- References: <45F5918A.4030602@niehs.nih.gov><a06230902c21b5216f07d@[192.168.10.211]><45F5B86D.7090100@niehs.nih.gov>
You may find the following links helpful:

http://arcib.dowling.edu/cifiucr/

especially

http://www.bernstein-plus-sons.com/software/ciftest/

and

http://arcib.dowling.edu/vcif/

At 4:30 PM -0400 3/12/07, Joe Krahn wrote:
>I realize that there are a few hacks in the BNF to deal with
>context-dependence, like productions defined as multiple symbols, which
>make it impossible to use as a working BNF. But, there are other
>problems with grammar. With the end-of-line example, the lexer can do
>something 'sensible', but it is still important to have a specific
>definition of whether missing a terminal <eol> makes the CIF invalid.
>
>I can look at CBFlib to see an interpretation of the CIF grammar, but
>someone else's parser may have a different interpretation. In fact, it
>would be good to have a collection of unusual CIF files for parser
>testing, with a consensus as to which ones are valid and which are invalid.
>
>Joe
>
>Herbert J. Bernstein wrote:
>> Without a defined lexer, you cannot do CIF as a BNF; it is context
>> sensitive in its use of whitespace. The question you are raising
>> about EOF should be handled by the lexer, which should deal sensibly
>> with the usual unix problem of disambiguating the case of a final
>> line that ends with eof rather than eol-eof. There is a rather
>> complete bison grammar in CBFlib working on the level of tokens
>> after lexing the input. -- HJB
>>
>>
>> At 1:44 PM -0400 3/12/07, Joe Krahn wrote:
>>> Some parts of CIF are vague. I hoped that the BNF syntax would be a
>>> precise syntax specification, but it has problems. It is central to
>>> properly defining the CIF format, and should therefore be very accurate.
>>>
>>> First, there are some plain syntax errors, like unbalanced braces in the
>>> production of <Float>, and an empty token in the TokenizedComments
>>> production.
>>>
>>> There are also a few hacks like <noteol>, and the lack of rules for the
>>> content of quoted strings. I think it is also a hack for a production
>>> unit to be defined for two elements, like "<eol><UnquotedString>".
>>>
>>> Does EOF count as whitespace? Normally, a text file ends with an <eol>
>>> on the last line, so it is not a problem. With Fortran, you may not be
>>> able to distinguish between them, so it seems that EOF probably should
>>> count as a whitespace token.
>>>
>>> There are also places where the grammar could be simplified, such as:
>>>
>>> { {'e' | 'E' } | {'e' | 'E' } { '+' | '- ' } } <UnsignedInteger>
>>>
>>> written as:
>>> {'e' | 'E' } { '+' | '-' }? <UnsignedInteger>
>>>
>>> Also note the error in the first form copied from the web page: the
>>> minus sign has a space included.
>>>
>>> Should the logical-OR symbol always be contained within braces? This
>>> appears to be inconsistent, but maybe the rule is to require braces when
>>> the members include a quoted character element.
>>>
>>> I will try to edit my own version of the BNF to produce what I think it
>>> is supposed to mean. Answers to some of the above questions will be
>>> helpful in getting it right.
>>>
>>> Thanks,
>>> Joe Krahn
>>> _______________________________________________
>>> comcifs mailing list
>>> comcifs@iucr.org
>>> http://scripts.iucr.org/mailman/listinfo/comcifs
>>
>> _______________________________________________
>> comcifs mailing list
>> comcifs@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/comcifs
>_______________________________________________
>comcifs mailing list
>comcifs@iucr.org
>http://scripts.iucr.org/mailman/listinfo/comcifs
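
The eof versus eol-eof case discussed above can be illustrated with a small pre-lexing step. This is only a sketch of one lenient reading, in which a final line without a terminal <eol> is treated as if it had one. It is not taken from CBFlib or from the CIF specification, and whether such a file should instead be rejected as invalid is exactly the open question in this thread. The function names `normalize_eol` and `logical_lines` are hypothetical.

```python
def normalize_eol(text):
    """Map CRLF and bare CR to LF, then guarantee a terminal LF.

    Assumption: a missing final <eol> is tolerated rather than
    reported as an error; a strict parser could raise here instead.
    """
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    if text and not text.endswith("\n"):
        text += "\n"
    return text


def logical_lines(text):
    """Return the input as a list of lines without terminators.

    After normalization every line ends in LF, so a tokenizer working
    on this list never has to distinguish eof from eol-eof.
    """
    if not text:
        return []
    # The normalized text always ends in "\n", so the final element
    # produced by split() is an empty string that we drop.
    return normalize_eol(text).split("\n")[:-1]


if __name__ == "__main__":
    with_eol = "data_test\n_cell_length_a 5.0\n"
    without_eol = "data_test\n_cell_length_a 5.0"
    # Both spellings yield the same line list under this reading.
    print(logical_lines(with_eol) == logical_lines(without_eol))
```

With this step in front of the lexer, the grammar itself only ever sees <eol>-terminated lines, which is one way to keep the end-of-file question out of the BNF.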