Re: CIF formal specification
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: CIF formal specification
- From: John Westbrook <jwest@rcsb.rutgers.edu>
- Date: Mon, 07 Mar 2005 07:17:55 -0500
- In-reply-to: <Pine.BSF.4.58.0503070648400.40552@epsilon.pair.com>
- Organization: Protein Data Bank at Rutgers University
- References: <42277A92.6000905@rcsb.rutgers.edu><20050303200548.GC22354@emerald.iucr.org><42277A92.6000905@rcsb.rutgers.edu><5.1.1.6.0.20050307094055.0423b9d8@pop.hermes.cam.ac.uk><Pine.BSF.4.58.0503070648400.40552@epsilon.pair.com>
For DDL2 and all of the mmCIF applications:

  + row order is not guaranteed
  + column order in category sections is not guaranteed
  + category order within datablocks is not guaranteed

The major organizational issue for DDL2/mmCIF applications is the
problem introduced if a category is repeated within a single
datablock. This can introduce a variety of ambiguous
merge/update/overwrite situations that are best avoided. I believe
that it would be best to forbid this situation in the syntax
description.

Regards,

John

Herbert J. Bernstein wrote:
> The "in general" on the row ordering means "in general". A careful
> reading of the semantics of most CIF loops shows that they follow
> SQL-like rules on row ordering -- the rows can be presented
> in any order without changing the meaning of the table. However,
> that is not an explicit rule of CIF syntax (as opposed to the
> semantics constraining the way it is used). When possible,
> I think it would be desirable to provide columns which allow
> any and all order dependencies to be resolved from the content
> of the rows, rather than from context (e.g. atom serial numbers
> in atom lists), but I, for one, would be opposed to making
> that a syntactic requirement, especially in view of the many
> existing CIFs that do not comply.
>
> The table merge/split rules are a difference in semantics between
> DDL1 and DDL2. Again, in most cases, SQL-like rules are followed,
> so this is an infrequent problem. I think the current proposed
> wording is a fair representation of facts on the ground.
>
> I understand the desire to have a parser that will be aware of
> all the equivalences and symmetry violations, but as with any
> powerful and evolving language (including XML for that matter),
> some things need to be left to the applications.
>
> Regards,
> Herbert
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>    Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>
>                  +1-631-244-3035
>                  yaya@dowling.edu
> =====================================================
>
> On Mon, 7 Mar 2005, Peter Murray-Rust wrote:
>
>
>>At 20:50 03/03/2005 -0500, Herbert J. Bernstein wrote:
>>
>>><snip/>
>>
>>
>>
>>>  The proposed new wording is not accurate. There is significance to
>>>the ordering of data names, but certain reorderings do not change
>>>the meaning of the CIF. I would suggest the following combined rewrite
>>>of 7:
>>
>>The following is very helpful. In essence it formalises the strategy that I
>>have employed in CIFDOM - the contents of a CIF may be re-ordered in
>>various ways without affecting any meaning. Of course this may surprise,
>>and even upset, some humans and it may be important to provide tools that
>>can reassure them - e.g. to display their tables in a favorite internal order.
>>
>>
>>>7.  A given data name (tag) (see 2.4 and 2.7) may appear no more than
>>>    once in a given data block or save frame. A tag may be followed
>>>    by a single value, or a list of one or more tags may be marked by
>>>    the preceding reserved case-insensitive word loop_ as the headings
>>>    of the columns of a table of values. White space is used to
>>>    separate a data block or save frame header from the contents of
>>>    the data block or save frame, and to separate tags, values and
>>>    the reserved word loop_. Data items (tags along with their
>>>    associated values) that are not presented in a table of values
>>>    may be relocated along with their values within the same data
>>>    block or save frame without changing the meaning of the data block
>>>    or save frame. Complete tables of values (the table column headings
>>>    along with all columns of data) may be relocated within the same
>>>    data block or save frame without changing the meaning of the data
>>>    block or save frame. Within a table of values, each tag may be
>>>    relocated along with its associated column of values within the
>>>    same table of values without changing the meaning of the table of
>>>    values. In general each row of a table of values may also be
>>>    relocated within the same table of values without changing the
>>>    meaning of the table of values.
>>
>>I am not sure what "in general" means. It suggests that there could be some
>>implied semantics (e.g. who is first author, or that the symmetry operations
>>are in a known order - this is indeed the case). I would like to replace
>>all such implied semantics with explicit tags (although there are clearly
>>some current instances where it is a problem).
>>
>>
>>>    Combining tables of values
>>>    or breaking up tables of values would change the meanings,
>>
>>This is certainly true
>>
>>
>>>and
>>>    is likely to violate the rules for constructing such tables
>>>    of values.
>>
>>I can see that this might violate some higher level semantics (e.g.
>>references to components of tables) but I don't see that it violates
>>anything in CIF or DDL1.
>>
>>
>>>I apologize for the complexity of this, but it is actually harder to
>>>specify the meaning of an unordered set than it is to specify the
>>>meaning of an ordered tuple, since the former requires specification
>>>of equivalence classes, while the latter does not.
>>
>>I agree that something of this formality is what is required.
>>
>>P.
>>
>>
>>Peter Murray-Rust
>>Unilever Centre for Molecular Informatics
>>Chemistry Department, Cambridge University
>>Lensfield Road, CAMBRIDGE, CB2 1EW, UK
>>Tel: +44-1223-763069
>>
>>_______________________________________________
>>comcifs mailing list
>>comcifs@iucr.org
>>http://scripts.iucr.org/mailman/listinfo/comcifs
>>
>
> _______________________________________________
> comcifs mailing list
> comcifs@iucr.org
> http://scripts.iucr.org/mailman/listinfo/comcifs

-- 
******************************************************************
 John Westbrook, Ph.D.
 Rutgers, The State University of New Jersey
 Department of Chemistry and Chemical Biology
 610 Taylor Road
 Piscataway, NJ 08854-8087
 e-mail: jwest@rcsb.rutgers.edu
 Ph: (732) 445-4290
 Fax: (732) 445-4320
******************************************************************
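
Illustration (not part of the original messages): the reordering rules discussed in this thread can be sketched as a canonical form in which item order, table order, column order and row order all drop out, so that two data blocks are equivalent exactly when their canonical forms match. This is only a sketch under stated assumptions: the in-memory layout (a dict of single-valued items plus a list of (tags, rows) tables) and the function names canonical_block, canonical_loop, category_of and repeated_categories are hypothetical, row order is treated as never significant (the thread notes this holds only "in general"), and the repeated-category check looks only at looped categories named in the DDL2 "_category.item" style.

```python
from collections import Counter


def category_of(tag):
    """Return the category part of a DDL2-style tag, e.g. '_atom_site.id' -> 'atom_site'."""
    return tag.lstrip("_").split(".", 1)[0].lower()


def canonical_loop(tags, rows):
    """Sort columns by tag, then sort rows, so column and row order do not matter."""
    order = sorted(range(len(tags)), key=lambda i: tags[i].lower())
    sorted_tags = tuple(tags[i].lower() for i in order)
    sorted_rows = tuple(sorted(tuple(row[i] for i in order) for row in rows))
    return sorted_tags, sorted_rows


def canonical_block(items, loops):
    """Build an order-independent form of one data block.

    items: dict mapping tag -> single value (all values are strings here)
    loops: list of (tags, rows) tables
    Raises ValueError if a tag appears more than once in the block.
    """
    all_tags = [t.lower() for t in items]
    all_tags += [t.lower() for tags, _ in loops for t in tags]
    repeated = sorted(t for t, n in Counter(all_tags).items() if n > 1)
    if repeated:
        raise ValueError(f"tag appears more than once: {repeated}")
    canon_items = tuple(sorted((t.lower(), v) for t, v in items.items()))
    canon_loops = tuple(sorted(canonical_loop(tags, rows) for tags, rows in loops))
    return canon_items, canon_loops


def repeated_categories(loops):
    """Return the categories that occur in more than one table of the block."""
    counts = Counter()
    for tags, _ in loops:
        for cat in {category_of(t) for t in tags}:
            counts[cat] += 1
    return sorted(c for c, n in counts.items() if n > 1)


# Same table with columns and rows permuted: the canonical forms compare equal.
items = {"_cell.length_a": "10.0"}
loops_a = [(["_atom_site.id", "_atom_site.type_symbol"], [["1", "C"], ["2", "N"]])]
loops_b = [(["_atom_site.type_symbol", "_atom_site.id"], [["N", "2"], ["C", "1"]])]
same_meaning = canonical_block(items, loops_a) == canonical_block(items, loops_b)

# One category split across two tables: the situation proposed to be forbidden.
split = [(["_atom_site.id"], [["1"]]), (["_atom_site.type_symbol"], [["C"]])]
ambiguous = repeated_categories(split)
```

Splitting or merging tables changes the canonical form, which matches the point in the thread that such operations change the meaning; the repeated-category check is kept separate because it reflects DDL2/mmCIF practice rather than a rule of CIF syntax.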
- References:
- Re: CIF formal specification (John Westbrook)
- CIF formal specification (Brian McMahon)
- Re: CIF formal specification (Peter Murray-Rust)
- Re: CIF formal specification (Herbert J. Bernstein)