Re: Opinions on comments as part of the content
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIFStandard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: Opinions on comments as part of the content
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Wed, 7 Mar 2007 20:31:55 -0500
- In-Reply-To: <7.0.1.0.0.20070307230140.0255fec0@cam.ac.uk>
- References: <45EDB89B.20907@niehs.nih.gov><7.0.1.0.0.20070307072016.02573858@cam.ac.uk><45EEF6E5.5050608@niehs.nih.gov><7.0.1.0.0.20070307175957.03ed2e28@cam.ac.uk><45EF1C71.70507@niehs.nih.gov><7.0.1.0.0.20070307230140.0255fec0@cam.ac.uk>
In practice, CIF has one "type" of undefined value -- it is usually called null. Just as the number 1 has many representations (e.g. 1.0, 1, .1e1, etc.), null has two representations: . and ?

From the point of view of a program processing submitted data they have the same meaning -- the user has not provided a value. If you have a specified default, use it. If you don't have a specified default, you have an unspecified value.

From the point of view of the user, they have two very different meanings. The "?" is an invitation to provide a value to be used in place of the default, if any. The "." is not such an invitation.

The question of whether

>> > loop_
>> > _foo _bar
>> > a .
>> > b .
>> >
>> > should be equivalent to
>> > loop_
>> > _foo
>> > a
>> > b

depends on the dictionary. If _bar has been declared mandatory and has a default value, then the first construction is valid, while the second is not. If _bar has been declared implicit then the two constructs are equivalent and, from the point of view of an application, equivalent to

>> > loop_
>> > _foo _bar
>> > a ?
>> > b ?

While it may not be easy to deal with "missing" values in a column of numbers, it is important in many cases to be able to do so.

I for one, not only think it practical to have two ways to represent an unspecified value, I think it to be essential. The distinction between "?" and "." has worked rather well as a way to be able to guide users in filling in appropriate "blanks" without tempting them to override defaults that should be left alone.

-- Herbert

>
>>...
>> > My own heuristics are:
>> > _foo '?'
>> > carries no useful information other than the author hasn't bothered
>> > to remove it from the file
>> > _foo '.'
>> > is highly dangerous as the dictionary can contain default values
>> > which most users have no idea of. Thus the default extinction
>> > correction is (or certainly was) 'Zachariasen' and algorithmically
>> > linking '.' to this value is certain to give misleading info.
>> >
>> > loop_
>> > _foo _bar
>> > a .
>> > b c
>> >
>> > has a null value for one cell - this is required to make a
>> > rectangular table
>> >
>> > loop_
>> > _foo _bar
>> > a .
>> > b .
>> >
>> > should be equivalent to
>> > loop_
>> > _foo
>> > a
>> > b
>> >
>> > and this construct should be avoided
>> >
>> > loop_
>> > _foo _bar
>> > a ?
>> > b ?
>> >
>> > is almost certainly an unedited template and should be replaced by:
>> >
>> > loop_
>> > _foo
>> > a
>> > b
>> >
>> > and finally
>> > loop_
>> > _foo _bar
>> > a ?
>> > b c
>> >
>> > is indistinguishable from
>> >
>> > loop_
>> > _foo _bar
>> > a .
>> > b c
>> >
>> > All these issues come into very sharp focus when processing CIFs - it
>> > is not trivial to manage '.' in a column of otherwise real numbers.
>> >
>> > P.
>>I take a similar approach. They both represent missing values, but
>>missing for different reasons. If one really wants a default value in
>>the dictionary, it should be "if not otherwise specified" and not "if
>>the value is '.'". In that case, both still mean missing, just different
>>reasons.
>>
>>Does ANYBODY really think it is practical to have two types of undefined
>>values?
>>
>>Of course, CIF is just a text archive. There is nothing preventing the
>>use of a string in the middle of an array of real numbers.
>
>If the CIF name occurs in a loop_ and is defined in a dictionary as a
>NUMB then all values must be valid real numbers. If defined as CHAR
>it can be sequence of legal characters (there may be length restrictions).
>
>>Some rules
>>about numeric arrays would be helpful for practical use of CIF.
>
>P.
>
>
>Peter Murray-Rust
>Unilever Centre for Molecular Sciences Informatics
>University of Cambridge,
>Lensfield Road, Cambridge CB2 1EW, UK
>+44-1223-763069
>
>_______________________________________________
>comcifs mailing list
>comcifs@iucr.org
>http://scripts.iucr.org/mailman/listinfo/comcifs
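
For readers who want to see the rule from the top of this message in code, here is a minimal sketch of how an application might treat the two null tokens identically when reading values, while still remembering which one invites user input. Everything here is an assumption for illustration: the function names (resolve, invites_user_input, numeric_column) are hypothetical and belong to no CIF library, the tokens are assumed to be raw unquoted values already split out of a loop_, and numbers are assumed to be plain reals with no standard uncertainty in parentheses.

```python
from typing import List, Optional

# The two CIF representations of "the user has not provided a value".
NULL_TOKENS = {".", "?"}


def resolve(raw: str, default: Optional[str] = None) -> Optional[str]:
    """Return the value a program should use for one unquoted CIF token.

    Both null tokens behave the same: use the dictionary default if one
    was specified, otherwise report the value as unspecified (None).
    """
    if raw in NULL_TOKENS:
        return default
    return raw


def invites_user_input(raw: str) -> bool:
    """Only '?' asks the user to supply a value; '.' does not."""
    return raw == "?"


def numeric_column(tokens: List[str],
                   default: Optional[str] = None) -> List[Optional[float]]:
    """Convert a column of tokens to floats, keeping None for nulls
    that have no default (assumes plain numbers, no s.u. suffix)."""
    out: List[Optional[float]] = []
    for tok in tokens:
        value = resolve(tok, default)
        out.append(None if value is None else float(value))
    return out
```

With this sketch, resolve(".", "Zachariasen") and resolve("?", "Zachariasen") return the same default, while invites_user_input distinguishes them for an editing tool. A column such as ["1.0", ".", ".1e1"] comes back with None in the middle slot rather than failing, which is one way to cope with nulls in an otherwise numeric column.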