Re: CIF Infoset
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: CIF Infoset
- From: David Brown <idbrown@mcmaster.ca>
- Date: Wed, 18 Aug 2004 12:16:45 -0400
- In-Reply-To: <Prayer.1.0.11.0408181224360.11409@hermes-1.csi.cam.ac.uk>
- References: <Pine.LNX.4.44.0408181322570.18193-100000@mostaccioli.csse.uwa.edu.au> <Prayer.1.0.11.0408181224360.11409@hermes-1.csi.cam.ac.uk>
I am also finding this interchange interesting. I have only a couple of short comments to add:

>> Does an infoset for HTML that says
>>
>> <b><!--interpret hello as goodbye-->hello</b> is equivalent to
>> <b>hello</b>? If so, wouldn't that be somewhat dangerous?
>
> HTML chooses to make comments part of the infoset so those two
> documents have different infosets. However the following *are*
> equivalent at infoset level:
>
> <b>hello</b>
> <b >hello</b>
> <b>h<![CDATA[ell]]>o</b>

Technically the comments are not part of the CIF, and in practice the CIFs I handle for Acta Cryst. only contain template comments that are designed to direct the author to include the required information. When CIF editors become more widely used, these comments will not be needed.

>> So if something is
>> numb, you expect it to be a number, irrespective of the lexical eye
>> candy provided by a variety of delimited string forms. If _cell_length is
>> declared numb, then '12.1' and 12.1 are equivalent in interpretation (at
>> the application level).
>
> The CIF specification indicates that these have different semantics.
> If this is now obsolete or deprecated it would make implementations
> simpler.

The quotes are important. The dictionary gives, I believe, the default type, but this can be overridden by the actual type. Thus in the example given above '12.1' would be read as char, and an application would have to decide whether it could convert this to numb. Quoting also matters for data names: in the dictionaries, for example, '_cell_length_a' is not a dataname, though _cell_length_a is. A quoted dataname might occur in a CIF if someone wrote:

_exptl_special_details '_exptl_density_obs unobservable'

>>> Q Does data_global have any semantics? I suspect that formally it does
>>> not, but it seems in widespread use:
>>
>> data_global doesn't exist.
>
> It does (frequently). (I appreciate that global_ is different and
> irrelevant to CIF/DDL1).
> data_global is very frequently used as the
> first block in a multiblock CIF to indicate information that (I
> assume) the author wishes to apply to all blocks. I think it either
> needs deprecating or accepting and formalising.

One of the commonly used templates (I believe that supplied by SHELX) starts with a datablock called data_global, but this is not a reserved name and has no significance beyond being a legitimate form of data_xxxxx. In the template it introduces a datablock that contains the text part of a paper, with the numerical information supplied in one or more additional blocks depending on how many structures are being described. Since formally each datablock in CIF is independent, there is no formal linkage between the data_global datablock and any of the other datablocks that follow.

As Nick points out, global_ is defined in STAR, though not in the current version of CIF. The name is currently reserved in CIF in case we wish to use it later.

> "." is worse because the spec can be interpreted as requiring the
> implementer to insert the default value from the dictionary. At one
> stage this would be interpreted to mean that unless specified all
> extinction corrections were, by default, Zachariasen. Defaults, and
> their insertion, have to be explicitly specified.

I agree there is a problem. In working through dictionary definitions we are trying to remove the default values, and in my view "." should never be used to indicate a default. It should only mean 'this item has no physical meaning in the present context'.

One good example of where defaults make sense is _atom_site_occupancy. In a straightforward structure report this item may not be given, but its absence certainly does not imply that it is irrelevant or not known. It would be assumed by any application to have the value 1.0 unless otherwise stated. A value of '.' for this item should not indicate the default; if the item is present in a CIF, the value should be given explicitly even if it is the same as the default. A value of '.' says that it makes no sense to talk about the occupancy of this atom (it might occur if the atom in question was a dummy atom, which is allowed).

>> I suspect apart from Syd and I, almost no one sucks in dictionaries to
>> validate STAR/CIF file contents. Most just assume they know what they
>> need to and hope the definition of the data item has never changed..

Two editor/browsers are available, and a couple of students are writing a third for me, that do read in the dictionary before reading in the CIF and use the dictionary for validation. The validations are not complete, but at least they test the important items such as enumeration lists, type etc. It is a beginning, and I am trying (as an Acta editor) to educate users to prepare CIFs that will be accessible to the advanced software of the future. However, most users still see CIF as just a more complicated file structure that offers little more than the old formatted output files produced by the principal structure-solving packages.

David

-- 
Dr. I.D.Brown, Professor Emeritus,
Department of Physics and Astronomy
McMaster University, Hamilton
Ontario, Canada
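
As an illustration of the points above on quoting and on '.', here is a minimal Python sketch of how a reader might type a raw value token and resolve an occupancy. It is not taken from any existing CIF library: the function names classify and occupancy, the dict-of-raw-tokens representation of an atom, and the simplified number pattern are all assumptions made for the example. Only the rules themselves (a quoted value is char, '.' means not meaningful in this context, an absent occupancy is taken as 1.0) come from the message.

```python
import re

# Simplified pattern for an unquoted CIF number, optionally followed by a
# standard uncertainty in parentheses, e.g. 12.1(3). An assumption for this sketch.
_NUMB = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?$")


def classify(token: str) -> str:
    """Return 'char', 'numb', 'inapplicable' or 'unknown' for one raw value token."""
    # A token delimited by matching quotes is char, whatever it contains.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return "char"
    if token == ".":
        return "inapplicable"  # no physical meaning in the present context
    if token == "?":
        return "unknown"
    return "numb" if _NUMB.match(token) else "char"


def occupancy(atom: dict):
    """Resolve _atom_site_occupancy for one atom (raw tokens keyed by data name)."""
    token = atom.get("_atom_site_occupancy")
    if token is None:
        return 1.0  # item not given: assume the default of 1.0
    if classify(token) == "numb":
        return float(token.split("(")[0])  # drop any uncertainty before converting
    return None  # '.', '?' or a char value: no usable occupancy


print(classify("12.1"), classify("'12.1'"))
print(occupancy({}), occupancy({"_atom_site_occupancy": "."}))
```

Here the default is applied only when the item is missing; an explicit '.' is never replaced by the dictionary default, and whether a char value such as '12.1' may be converted to a number is left to the application.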
- Follow-Ups:
- Provence and property rights (Peter Murray-Rust)
- Re: CIF Infoset (Dr P. Murray-Rust)
- References:
- Re: CIF Infoset (Nick Spadaccini)
- Re: CIF Infoset (Dr P. Murray-Rust)