Re: CIF Infoset
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: CIF Infoset
- From: "Dr P. Murray-Rust" <pm286@cam.ac.uk>
- Date: Wed, 18 Aug 2004 08:42:15 -0000
- In-Reply-To: <Pine.BSF.4.58.0408171436510.48074@epsilon.pair.com>
- References: <4119295A.6090705@mcmaster.ca> <4119295A.6090705@mcmaster.ca><Pine.BSF.4.58.0408171436510.48074@epsilon.pair.com>
On Aug 17 2004, Herbert J. Bernstein wrote:

> Peter asks some interesting questions. I do not propose to answer
> them in detail here. However, I should point out that interpretation
> of a given CIF may require 4 sets of documents:
>
> 1. The CIF itself.
> 2. The dictionary or dictionaries defining the tags
>    used in the CIF
> 3. The relevant DDLs
> 4. The CIF specification:
>    http://www.iucr.org/iucr-top/cif/spec/version1.1/index.html

I agree with this. A little while ago I was invited to work with Syd and Nick and spent 2 pleasant weeks looking at whether this could be managed in a self-consistent system. In theory, yes. In practice it was questionable whether it was worthwhile and would be used. It is almost isomorphic with the XML schema hierarchy:

DDL -validates-> DDL -validates-> dictionary -validates-> CIF

i.e. the DDL is self-validating. The problem was that *any* change to the DDL has repercussions down the line, and these multiply. In XMLSchema we have

SchemaSchema -validates-> XSDSchema -validates-> instance

The construction of self-consistent schemas in XML has been anything but trivial and has caused much argument. It is unlikely that CIF will benefit from a rerun.

So I have taken the pragmatic view that we have DDL2 and DDL1 as currently accepted and used. As my own interests are currently in DDL1 I have restricted my questions and concerns to CIF (i.e. not STAR) and built software for this. My architecture should be sufficiently modular that if/when CIF extends to fuller STAR it can be enhanced.

> Many of Peter's questions are answered in the specification.

The lexical questions are. I have used the syntax and semantics documents as reference. I have assumed these are formal abstractions of the original published article(s). If they are not, then it would be useful to abstract additional rules. I think that implementers need to know exactly which documents apply and what the rules are.
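
To illustrate why a change to the DDL has repercussions all the way down the validation chain described above, here is a minimal sketch. It assumes a simplified three-layer chain; the layer names and the function `affected_by_change` are hypothetical and are not taken from any CIF or STAR software.

```python
# Hypothetical model of the chain: each layer maps to the layer that validates it.
VALIDATED_BY = {
    "DDL": "DDL",            # assumption: the DDL is self-validating
    "dictionary": "DDL",     # dictionaries are validated against the DDL
    "CIF": "dictionary",     # CIF instances are validated against dictionaries
}


def affected_by_change(layer):
    """Return every layer that needs re-validation when `layer` changes."""
    affected = []
    frontier = [layer]
    while frontier:
        current = frontier.pop()
        for child, validator in VALIDATED_BY.items():
            if validator == current and child not in affected:
                affected.append(child)
                # A self-validating layer must not be queued again.
                if child != current:
                    frontier.append(child)
    return affected


print(affected_by_change("DDL"))         # ['DDL', 'dictionary', 'CIF']
print(affected_by_change("dictionary"))  # ['CIF']
```

In this toy model a dictionary change touches only the CIFs below it, while a DDL change forces re-validation of the DDL itself and of everything downstream.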
>
> The infoset concept is useful, but be warned that the appropriate
> handling of information depends on the context within which you are
> working, regardless of whether you are using CIF or using XML or
> the PDB format. For an application intended to just get at the data,
> comments may be discarded, while for an application intended to reformat
> the presentation of the data, comments are highly significant
> information. Similarly, the particular form of quoting, the
> distinction between "." and "?", etc. may or may not be
> significant. If the application in question is, say, a
> refinement program that just needs to read CIFs to extract
> expected crystallographic data, then construction of the "infoset"
> from a CIF is particularly simple. More demanding applications,
> e.g. in CIF validation and publication suites, may need to deal
> with more subtle data and metadata questions.
>

I am afraid I disagree! If the interpretation of a CIF depends on what program is to be used to process it then it is (IMO) not an abstract archive and transfer format.

Peter M-R
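
One way to picture the point at issue is an infoset that is built once, independently of the consuming program, with each application then deriving its own view from it. The sketch below is only an illustration under stated assumptions: the classes `Item` and `Infoset` and the method `data_view` are hypothetical, the values are made up, and it assumes that an unquoted "." means inapplicable, an unquoted "?" means unknown, and a quoted one is an ordinary string. It is not a CIF parser.

```python
from dataclasses import dataclass, field
from enum import Enum


class ValueKind(Enum):
    DATA = "data"
    INAPPLICABLE = "."   # assumption: unquoted "." placeholder
    UNKNOWN = "?"        # assumption: unquoted "?" placeholder


@dataclass(frozen=True)
class Item:
    tag: str
    raw: str
    quoted: bool = False  # whether the value was delimited by quotes in the file

    @property
    def kind(self):
        # Only unquoted placeholders are special in this sketch.
        if not self.quoted and self.raw == ".":
            return ValueKind.INAPPLICABLE
        if not self.quoted and self.raw == "?":
            return ValueKind.UNKNOWN
        return ValueKind.DATA


@dataclass
class Infoset:
    """Program-independent content: nothing is discarded at this level."""
    items: list = field(default_factory=list)
    comments: list = field(default_factory=list)

    def data_view(self):
        """A derived view for a program that only wants actual values."""
        return {i.tag: i.raw for i in self.items if i.kind is ValueKind.DATA}


info = Infoset(
    items=[
        Item("_cell_length_a", "10.5"),              # made-up value
        Item("_exptl_crystal_density_meas", "?"),    # unknown
        Item("_exptl_crystal_description", "?", quoted=True),  # a literal string
    ],
    comments=["# example comment"],
)
print(info.data_view())
```

Under this design the infoset itself is the same for every consumer; a refinement program and a reformatting tool differ only in which derived view they ask for.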
- Follow-Ups:
- Re: CIF Infoset (Herbert J. Bernstein)
- References:
- COMCIFS open meeting a Florence (David Brown)
- Re: CIF Infoset (Herbert J. Bernstein)