No Subject
- To: Multiple recipients of list <comcifs-l@iucr.org>
- From: "I. David Brown" <idbrown@mcmail.cis.McMaster.CA>
- Date: Thu, 12 Oct 2000 15:29:46 +0100 (BST)
The Chair's Report to Comcifs, October 2000

It is now over a year since Comcifs met at the IUCr Congress in Glasgow and, while there have not been many postings to the Comcifs discussion list, the various Comcifs subcommittees have been busy. To keep all members of Comcifs informed of these activities, I would like to review the progress that has been made during the last year and draw attention to both the problems and the opportunities that lie ahead.

Several groups have spent the last few years constructing new dictionaries and their efforts are now beginning to bear fruit. A number of dictionaries have recently been submitted to Comcifs and are being reviewed by the Dictionary Review Committee (Brown, McMahon, Westbrook) before being passed to Comcifs for formal approval.

Version 2 of the mmCIF dictionary (Berman and Fitzgerald), which incorporates a number of suggestions made by Kim Henrick of the European Bioinformatics Institute, was submitted for Comcifs approval last summer. Somewhat belatedly, this version, which is already in use as the archive format for the Protein Data Bank, received formal Comcifs approval at the end of September.

A dictionary (imgCIF/CBF, Hammersley, Bernstein and Sweet) designed for the transmission and archiving of images from array detectors, but also suitable for use with any multidimensional image, has now been reviewed and formal approval is expected soon. This dictionary differs from the others in that it describes files that can be written in two formats, a regular CIF format (imgCIF) and a binary format (Crystallographic Binary File, CBF). The latter is a binary representation of imgCIF designed to be used when conversion of a binary image to the ASCII format of CIF would require too large an overhead. The two files are identical in content and structure, the only difference being the format in which the image is stored.
Two other dictionaries are also awaiting the attention of the Dictionary Review Committee: msCIF (Madariaga), designed for the reporting of modulated structures, and symCIF (Brown), containing a basic set of items for defining space group symmetry.

Three further dictionaries are in an advanced state of preparation and are expected to be presented for Comcifs review and approval soon. These are the small angle scattering dictionary, sasCIF (Svergun and Malfois), endorsed by the IUCr Commission on Small Angle Scattering and already in trial use; the dictionary for reporting electron densities, rhoCIF (Mallinson), endorsed by the IUCr Commission on Charge and Momentum Density; and the dictionary for magnetic structures, magCIF (Sikora and Pytlik), prepared by the Database of Magnetic Structures Determined by Neutron Diffraction in Krakow.

The dictionary for diffuse scattering, dsCIF (Proffen), has made less progress. No extensions have been submitted during the past year for either the core dictionary, coreCIF (Brown), which is now in regular use by Acta Cryst. B, C and E and other journals, or the powder diffraction dictionary, pdCIF (Toby), which has been adopted as the future standard of the Powder Diffraction File.

All this activity in developing dictionaries is gratifying, particularly as it represents the adoption of the CIF standard by a number of the commissions of the IUCr. Within a year or so most of the major fields of crystallography will have dictionaries allowing them to archive or transfer information using CIF. Each of these dictionaries is a major project involving several years of hard work on the part of a number of contributors. Only the names of the leaders have been given above, but behind each leader is a team of experts in the field who have worked hard to provide the tight definitions required by CIF. All these people deserve our thanks.
The macromolecular crystallography community has been successfully lobbying for the adoption of STAR (the file structure used in CIF) by other molecular biology groups in order to simplify data exchange between them. As a result of these efforts mmCIF has been recognised by the Object Management Group as providing the Common Object Request Broker Architecture (CORBA) standard for the exchange of macromolecular information between different databases. This work has been carried out by Westbrook and Doug Greer of UCSD.

The rapid development of all these dictionaries is bringing into focus the need for software to read, write and manipulate CIFs. Most of the major crystallographic software packages can now either read or write CIFs, but these routines are geared to specific applications. As yet there is no coherent suite of CIF application software that can be used by people preparing crystallographic programs. With the widespread availability of dictionaries, the need for supporting software is becoming more urgent. While our community has the expertise and the commitment to devote to the challenging job of preparing dictionaries, we are not as well equipped to deal with the even more challenging task of developing good software. Comcifs is now developing a strategy to ensure that we have the software to exploit the potential of the dictionaries we have written.

CIF is sometimes perceived as being nothing more than a convenient format for exchanging crystallographic information, but it is much more than this. The machine-readable dictionaries are a compendium of crystallographic information which can be read and used by programs that contain no hard-coded information about crystallography. A generic CIF editor, when it is written, will be able to read in a CIF together with the appropriate dictionaries and a template of items required for submission to a journal or database. Some of these items may be found in the input CIF.
The remainder would automatically be requested from the user. The editor would validate any information entered by the user to make sure that it conformed to the requirements of the dictionary, and it would ensure that the output CIF was properly structured.

For applications such as a generic editor it is important to be able to concatenate two or more dictionaries, since it is not feasible to include all the items found in the core dictionary (for example) explicitly in each of the specialised dictionaries that are coming on line. Comcifs has adopted a protocol prepared by Bernstein, McMahon and Westbrook for creating a virtual dictionary by a run-time concatenation of real dictionaries. However, the software to perform this concatenation, like much of the other software, has yet to be written.

There has been some activity in software development, though much more is needed. Both the Protein Data Bank and the Cambridge Crystallographic Data Centre are developing CIF editors (Westbrook and Johnson respectively). Although these are designed primarily for the use of their own contributors, they will undoubtedly be useful to others. A software discussion list has been set up on the IUCr web site to allow software developers to share their ideas and problems.

The IUCr has adopted a policy statement which points out that the Union's copyright of CIF is designed only to protect the CIF standard and does not imply ownership of either files written in CIF or software designed to read and write CIF. This statement is an essential part of Comcifs' efforts to encourage new software. It seeks to assure software developers that the only requirement imposed by the Union is that when the term CIF is used it must refer to a file that conforms to the CIF standard approved by Comcifs.
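
To make the idea of a generic editor working from a virtual dictionary more concrete, here is a minimal Python sketch. It is an illustration only, not the protocol adopted by Comcifs and not any existing software. The `ItemDefinition` class, the function names, the rule that a later dictionary overrides an earlier one, and the item `_example_local_note` are all assumptions made for this example. The two cell items are heavily simplified stand-ins for core dictionary definitions.

```python
import re
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ItemDefinition:
    """Simplified, hypothetical stand-in for one dictionary definition."""
    name: str
    type_code: str                    # "numb" or "char"
    minimum: Optional[float] = None   # lower bound, if any
    maximum: Optional[float] = None   # upper bound, if any


Dictionary = Dict[str, ItemDefinition]


def build_virtual_dictionary(dictionaries: Iterable[Dictionary]) -> Dictionary:
    """Merge dictionaries in order. Assumption of this sketch: a definition
    in a later dictionary replaces one of the same name in an earlier one."""
    virtual: Dictionary = {}
    for dictionary in dictionaries:
        virtual.update(dictionary)
    return virtual


def missing_items(cif: Dict[str, str], template: Iterable[str]) -> List[str]:
    """Template items absent from the CIF, i.e. those to request from the user."""
    return [name for name in template if name not in cif]


# Matches a trailing standard uncertainty such as "(1)" in "5.432(1)".
_UNCERTAINTY = re.compile(r"\(\d+\)$")


def validate_item(name: str, value: str, dictionary: Dictionary) -> List[str]:
    """Return a list of problems found for one item (empty if none)."""
    definition = dictionary.get(name)
    if definition is None:
        return [f"{name}: not defined in any loaded dictionary"]
    if definition.type_code != "numb":
        return []  # character items are accepted as-is in this sketch
    try:
        number = float(_UNCERTAINTY.sub("", value))
    except ValueError:
        return [f"{name}: '{value}' is not a number"]
    problems = []
    if definition.minimum is not None and number < definition.minimum:
        problems.append(f"{name}: {number} is below {definition.minimum}")
    if definition.maximum is not None and number > definition.maximum:
        problems.append(f"{name}: {number} is above {definition.maximum}")
    return problems


# Simplified core-like definitions and a hypothetical private dictionary.
CORE = {
    "_cell_length_a": ItemDefinition("_cell_length_a", "numb", minimum=0.0),
    "_cell_angle_alpha": ItemDefinition("_cell_angle_alpha", "numb", 0.0, 180.0),
}
LOCAL = {
    "_example_local_note": ItemDefinition("_example_local_note", "char"),
}

virtual = build_virtual_dictionary([CORE, LOCAL])
cif = {"_cell_length_a": "5.432(1)", "_cell_angle_alpha": "190"}
template = ["_cell_length_a", "_cell_angle_alpha", "_example_local_note"]

print("Items to request from the user:", missing_items(cif, template))
for item_name, item_value in cif.items():
    for problem in validate_item(item_name, item_value, virtual):
        print("Problem:", problem)
```

In this sketch the editor itself knows nothing about crystallography: the permitted range of a cell angle comes entirely from the dictionary entry, so adding a new specialised or private dictionary extends what the editor can check without changing its code.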
Recently a formal description of the CIF-STAR syntax in Backus-Naur Form (BNF) has been prepared by Nick Spadaccini to provide answers to those tricky software questions about what exactly is, and what is not, allowed in CIF. All of these activities are a necessary prelude to the development of CIF software.

Another problem that is becoming more urgent is the existence of two incompatible Dictionary Definition Languages, DDL1 and DDL2, since dictionaries written in different languages cannot be concatenated. Until now this has not been a serious problem, though all the items in the coreCIF dictionary (written in DDL1) have had to be converted and incorporated explicitly into the mmCIF dictionary (written in DDL2). Help is on its way in the form of a new DDL which will be upwardly compatible with both earlier versions. This new version, being developed by Hall and Westbrook, will have increased functionality including, among other features, machine-readable expressions that will allow items not present in the CIF to be calculated from those that are.

A suite of such dictionaries will encapsulate all the relationships of crystallography. One can imagine a day when there will be no need for crystallographic relationships to be hard-coded into the software. A generic CIF program could, in principle, perform any calculation that is given in the CIF dictionary and, since private dictionaries can be concatenated with official CIF dictionaries, expressions could easily be added without having to alter other definitions or the software. CIF will then truly represent the language of crystallography. Realistically, however, this day is well in the future. Until we have found a way of developing an extensive base of CIF-handling software, the potential that lies at the heart of the CIF system will not be fully realised.

This review lists the CIF projects that have been undertaken during the past year by many individuals, only a few of whom I have mentioned by name.
Details of most of the projects mentioned here are available on the IUCr web site. I have tried to put the work of CIF into some perspective in order to show the challenges that still face us, specifically the problem of developing software. I have also tried to look beyond the immediate problems to provide a view of the distant goal towards which we are moving, namely the evolution of CIF from a crystallographic file structure into a crystallographic computer language that has a functionality similar to human language. In this language, CIF dictionaries will not just define the terms used. They will be encyclopaedias of crystallography that contain all the important information about the discipline.

David Brown
Chair of Comcifs

*****************************************************
Dr. I. David Brown, Professor Emeritus
Brockhouse Institute for Materials Research,
McMaster University, Hamilton, Ontario, Canada
Tel: 1-(905)-525-9140 ext 24710
Fax: 1-(905)-521-2773
idbrown@mcmaster.ca
*****************************************************