Re: parser validation tools
- Subject: Re: parser validation tools
- From: Brian McMahon <bm@xxxxxxxx>
- Date: Thu, 11 May 2000 09:59:43 +0100 (BST)
> Are datanames and datablock names really allowed to have the comment
> indicator (#) as a valid character in the name (as indicated in the ciftest5
> file)?

Yup. The published STAR BNF [Hall, S. R. & Spadaccini, N. (1994), J. Chem. Inf. Comput. Sci. 34, 505-508] has the following relevant entries:

    <data_heading>   ::= data_<non_blank_char>+
    <data_name>      ::= _<non_blank_char>+
    <non_blank_char> ::= ! shriek character -> ~ tilde character (ASCII 33 - 126)

So the following is a valid STAR File:

    data_is_this_a_valid_#STAR_File?
    _the_answer_#_is 'yes'

COMCIFS discussed some time ago whether restrictions should be imposed on non-alphanumeric characters in data names and datablock names within CIFs specifically. The conclusion was "no". (http://www.iucr.org/iucr-top/cif/comcifs/minutes/msg00017.html)

Admittedly this does make life harder for regular-expression parsing, which is a useful tool in shell, perl, tcl, python and similar languages. For example, if you're matching regexps within a line of text in order to identify and discard a comment, you can't just scan for

    /#.*/

You need at least white space before the hash mark:

    / #.*/

But in fact you also need to check that the hash isn't a legitimate character within a text string, e.g.

    'this is a # legal data value'

I offer as a challenge to anyone who is interested the problem of supplying a regexp that will definitely match a comment on a line.

Brian
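
One possible take on the challenge above, offered as a minimal Python sketch and not as a definitive answer. It tokenises the line from left to right, so a hash mark counts as a comment only when it begins a token. The names `find_comment`, `_SQ`, `_DQ` and `_TOKEN` are invented for this illustration. It assumes the usual CIF convention that a quoted string is closed only by a matching quote followed by white space or end of line, and it looks at a single line in isolation, so it cannot know whether that line sits inside a multi-line semicolon-delimited text field.

```python
import re

# Assumption: a quoted value opens at the start of a token and closes at the
# first matching quote that is followed by white space or end of line.
_SQ = r"'(?:[^']|'(?=\S))*'(?=\s|$)"
_DQ = r'"(?:[^"]|"(?=\S))*"(?=\s|$)'

# Alternatives are tried in order at each token start: a comment, a quoted
# string, or a bare word. A bare word (\S+) swallows any '#' embedded in it.
_TOKEN = re.compile(rf"(?P<comment>#.*)|{_SQ}|{_DQ}|\S+")


def find_comment(line):
    """Return the comment on `line` (from '#' to end of line), or None."""
    for match in _TOKEN.finditer(line):
        if match.group("comment") is not None:
            return match.group("comment")
    return None


if __name__ == "__main__":
    samples = [
        "data_is_this_a_valid_#STAR_File?",
        "_the_answer_#_is 'yes'",
        "'this is a # legal data value'",
        "_the_answer_#_is 'yes' # a real comment",
        "# whole-line comment",
    ]
    for text in samples:
        print(repr(text), "->", repr(find_comment(text)))
```

Only the last two sample lines yield a comment; in the first three the hash mark is part of a name or a quoted value. An unterminated quote is treated here as a bare word, which is a design choice of this sketch and not something the STAR BNF prescribes.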