RE: Fine-tuning CIF dictionary regexes
- Subject: RE: Fine-tuning CIF dictionary regexes
- From: "Bollinger, John Clayton" <jobollin@xxxxxxxxxxx>
- Date: Mon, 18 Apr 2005 10:09:24 -0500
Regarding these two specific REs from mm_cif:

> floating point numbers:
>
> '-?(([0-9]+)[.]?|([0-9]*[.][0-9]+))([(][0-9]+[)])?([eE][+-]?[0-9]+)?'

This RE does not appear to agree with the CIF 1.1 formal grammar, which puts the standard uncertainty after the exponent rather than before it. (See the productions for <Numeric>, <Number>, and <Float>.) Which is right?

> symmetry operations
> '([1-9]|[1-9][0-9]|1[0-8][0-9]|19[0-2])(_[1-9][1-9][1-9])?'

I think it's overkill to use the pattern to so specifically restrict the possible symop number. Which numbers are actually valid in any particular case (and to what specific operation they correspond) depends on other data in the CIF. Since there needs to be validation after the match anyway, then, making the RE a bit looser would allow a processor to recognize errors more specifically. I might write the symop RE like this: '[1-9][0-9]*(_[1-9]{3,3})?'. (That also happens to remove the alternation problem, though that was not my objective.) That way, if I accidentally write 244_555 instead of 24_555, a processor can tell me "bad symop number" instead of "unrecognized token".

Regards,

John Bollinger

--
John C. Bollinger, Ph.D.
Indiana University
Molecular Structure Center
jobollin@indiana.edu

_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://scripts.iucr.org/mailman/listinfo/cif-developers
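
The two points in the message can be sketched in Python. This is only an illustration under stated assumptions, not how any actual CIF processor works. The names `check_symop`, `n_ops`, and `SU_AFTER_EXP` are hypothetical. `n_ops` stands in for whatever count of symmetry operations the rest of the CIF supplies. `SU_AFTER_EXP` is simply the quoted mm_cif pattern with its last two groups swapped; it is not a transcription of the CIF 1.1 grammar. A capture group has been added around the symop number in the looser pattern so that it can be range-checked after the match.

```python
import re

# The two symop patterns from the message. The capture group around the
# number in LOOSE_SYMOP is an addition made here for post-match validation.
STRICT_SYMOP = re.compile(
    r'([1-9]|[1-9][0-9]|1[0-8][0-9]|19[0-2])(_[1-9][1-9][1-9])?')
LOOSE_SYMOP = re.compile(r'([1-9][0-9]*)(_[1-9]{3,3})?')


def check_symop(token, n_ops):
    """Classify a symop token; n_ops (hypothetical) comes from other CIF data."""
    m = LOOSE_SYMOP.fullmatch(token)
    if m is None:
        return "unrecognized token"
    if int(m.group(1)) > n_ops:
        return "bad symop number"
    return "ok"


# The float pattern quoted from mm_cif: uncertainty BEFORE the exponent.
SU_BEFORE_EXP = re.compile(
    r'-?(([0-9]+)[.]?|([0-9]*[.][0-9]+))([(][0-9]+[)])?([eE][+-]?[0-9]+)?')
# Hypothetical reordering for comparison: uncertainty AFTER the exponent.
SU_AFTER_EXP = re.compile(
    r'-?(([0-9]+)[.]?|([0-9]*[.][0-9]+))([eE][+-]?[0-9]+)?([(][0-9]+[)])?')

if __name__ == "__main__":
    for tok in ("24_555", "244_555", "24_55"):
        strict = STRICT_SYMOP.fullmatch(tok) is not None
        print(tok, "| strict match:", strict, "| loose:", check_symop(tok, 192))
    for num in ("1.5(3)e2", "1.5e2(3)"):
        print(num,
              "| su-before-exp:", SU_BEFORE_EXP.fullmatch(num) is not None,
              "| su-after-exp:", SU_AFTER_EXP.fullmatch(num) is not None)
```

With the strict pattern, 244_555 simply fails to match, so the only available diagnosis is a generic one. With the looser pattern it matches and the range check can report the number itself as the problem. The float comparison shows that the two orderings accept disjoint spellings of a number that carries both an uncertainty and an exponent, which is why the question of which one is right matters.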