Re: Draft and analysis of proposed change to DDL1.4 to fix _atom_site_aniso_label
- Subject: Re: Draft and analysis of proposed change to DDL1.4 to fix _atom_site_aniso_label
- From: "Herbert J. Bernstein" <yaya@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Wed, 22 Jun 2005 17:47:27 -0400
- In-Reply-To: <42B98BDD.8080607@mcmaster.ca>
- References: <1119428522.31134.71.camel@anbf10> <42B98BDD.8080607@mcmaster.ca>
I think this would be a wonderful topic to discuss in Florence. One issue to consider is the very important need to be able to reliably load CIF data into a database, update the information in the database, and then extract views of the data that serve the needs of different user communities.

To this end, I would like to suggest that we embed within the DDL structure a clear, clean path from any CIF to a fully normalized relational database structure. DDL2 is already very close to that ideal, as is STARDDL. As changes are made in DDL1, or in merging DDL1 with STARDDL to make a new-DDL, it would be helpful if we did that there as well. This does not mean that every CIF has to be decomposed into its lowest level tables, but it should mean that, given any CIF, two new-DDL-compliant applications loading that CIF into a fully normalized internal database will always end up with the same table structure internally, because the relevant dictionary has made all the category and parent-child relationships clear.

This means that the core-cif dictionary really should have an atom_site_aniso category, that all the atom_site_aniso tags should be moved into that category, and that we should accept Nick's approach of allowing CIFs to be presented with tables made up from two or more categories by joins. There is a good example of this in the STARDDL documentation. As long as we know the full category structure, this does no harm in the database load context, and it gives great flexibility to users and application programmers in working with denormalized versions of the data structure for the sake of efficiency.

Yes, this does increase the burden on CIF software library developers. Nothing would need to change for software writing CIFs. Whether they write one atom_site loop with atom_site_aniso tags mixed in or write two separate loops, their CIFs would be valid.
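As a minimal sketch of what that equivalence could look like to a reading program, the Python below assumes a hypothetical dictionary extract in which the aniso data names already sit in their own atom_site_aniso category, as proposed above. CATEGORY_OF, CATEGORY_KEY, PARENT_KEY, normalize and join are names invented for this illustration, not part of any CIF library, and the coordinate and U values are made up. It loads a combined loop and a pair of separate loops into the same normalized tables, then joins them back into a denormalized view.

```python
# Hypothetical dictionary extract (an assumption for this sketch): the
# category of each data name, the key item of each category, and the
# parent item of each child key.
CATEGORY_OF = {
    "_atom_site_label": "atom_site",
    "_atom_site_fract_x": "atom_site",
    "_atom_site_aniso_label": "atom_site_aniso",
    "_atom_site_aniso_U_11": "atom_site_aniso",
}
CATEGORY_KEY = {
    "atom_site": "_atom_site_label",
    "atom_site_aniso": "_atom_site_aniso_label",
}
PARENT_KEY = {"_atom_site_aniso_label": "_atom_site_label"}


def normalize(loops):
    """Distribute loop rows into one table per category, keyed by label."""
    tables = {cat: {} for cat in CATEGORY_KEY}
    for loop in loops:
        for row in loop:
            for cat, key in CATEGORY_KEY.items():
                # Non-key items of this category that carry a real value
                # ("." and "?" are the CIF placeholders for no value).
                items = {
                    tag: val
                    for tag, val in row.items()
                    if CATEGORY_OF[tag] == cat and tag != key
                    and val not in (".", "?")
                }
                if not items:
                    continue
                # In a combined loop the child key is absent, so fall
                # back to the parent key named by the dictionary.
                key_value = row.get(key, row.get(PARENT_KEY.get(key)))
                tables[cat].setdefault(key_value, {key: key_value}).update(items)
    return tables


def join(tables):
    """Left-join atom_site with atom_site_aniso into a denormalized view."""
    rows = []
    for label, site in tables["atom_site"].items():
        merged = dict(site)
        aniso = tables["atom_site_aniso"].get(label, {})
        merged.update(
            {t: v for t, v in aniso.items() if t != "_atom_site_aniso_label"}
        )
        rows.append(merged)
    return rows


# Illustrative data only: one loop with aniso tags mixed in ...
combined = [
    {"_atom_site_label": "O1", "_atom_site_fract_x": "0.1234",
     "_atom_site_aniso_U_11": "0.021"},
    {"_atom_site_label": "H1", "_atom_site_fract_x": "0.5000",
     "_atom_site_aniso_U_11": "."},
]
# ... and the same content written as two separate loops.
separate = [
    [{"_atom_site_label": "O1", "_atom_site_fract_x": "0.1234"},
     {"_atom_site_label": "H1", "_atom_site_fract_x": "0.5000"}],
    [{"_atom_site_aniso_label": "O1", "_atom_site_aniso_U_11": "0.021"}],
]

# Both presentations end up as the same internal table structure.
assert normalize([combined]) == normalize(separate)
```

Dropping the "." placeholders is a choice made for this sketch so that atoms without anisotropic parameters do not create empty atom_site_aniso rows; a real loader might prefer to keep them.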
But processors for CIF reading would have to accept the possibility that they might have to normalize a combined table by distributing its contents into multiple tables, or, if they wanted to work with a denormalized table, that they might have to do a join internally. I think it is time to take at least that much from the approach suggested by STARDDL.

At 12:03 PM -0400 6/22/05, David Brown wrote:
>I have been reading this correspondence with interest (as have other
>members of the group), but I did not feel that I had much to offer
>as James and Nick seemed to be sorting things out on their own.
>James' resolution of the problem, suggesting additions to the DDL1
>dictionary, sounds like a good fix which certainly encapsulates the
>essence of what we were thinking when we developed the _atom_site
>section of the core dictionary. The separate aniso loop was put in
>because many people seemed to like keeping the positional and
>displacement parameters in separate tables, and this was the
>convention adopted by SHELX and other software, presumably in order
>to keep each row of the table on one 80 character line. The
>_atom_site_aniso_label was added only because STAR did not allow
>_atom_site_label to be repeated in the second loop. I should point
>out, however, that COMCIFS does not have authority to change or
>approve DDLs, it can only approve CIF dictionaries. I am not sure
>who is in charge of DDLs, probably Nick and Syd.
>
>As Nick pointed out we developed CIF by the seat of our pants. The
>first CIF dictionary was conceived as a typeset printed document and
>it was only later that it was realized that it could be typeset by
>storing the dictionary on a computer as a STAR document. Still
>later it was realized that a STAR dictionary could be used to
>validate CIFs and even later it was realized that CIFs could
>have a relational structure. Thus DDL was developed on the fly to
>accommodate CIF dictionaries that were already well developed.
>During this period Acta Cryst. was tooling up to accept structure
>reports in CIF and decisions had to be made quickly at a time when
>it was impossible to foresee all the implications of what we were
>doing. There were also compromises that were thought necessary to
>make CIF acceptable to the community, and it was in this spirit that
>Acta Cryst. accepted many CIFs into its archives that were not
>strictly CIF conformant.
>
>Software has taken a long time to catch up with the potential of
>what was designed into CIF and its DDLs. Browser-editors that
>validate coreCIFs against the dictionary have only appeared in the
>last couple of years, more than a decade after the release of the
>core dictionary, and even these do not validate the relational
>structure. By hindsight (i.e., with ten years' experience as well as
>the appearance of XML) it is clear that we should have done some
>things differently, and at Florence we need to review the whole
>question of where CIF goes from here. We may decide that we need to
>adopt starDDL which has been more carefully thought out, but there
>will be a cost. All the dictionaries will need to be revised, the
>changes will have to be sold to the community and the trauma of
>transition will have to be minimized. It would, however, give us a
>chance to get it right the second time.
>
>Apparently there has been a systems failure in Chester, which is why
>there has been such a stunning silence from that quarter.
>
>David Brown
>
>_______________________________________________
>cif-developers mailing list
>cif-developers@iucr.org
>http://scripts.iucr.org/mailman/listinfo/cif-developers

--
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
Office: +1-631-244-3035
Lab (KSC 020): +1-631-244-3451
yaya@dowling.edu
=====================================================