Triennial report to the General Assembly

For the information of members of the list server, I enclose below a copy
of the 1999-2002 triennial report of COMCIFS to the General Assembly.

				David Brown
				Chair of COMCIFS

Report below; keep on scrolling.
*****************************************************
Dr. I. David Brown, Professor Emeritus
Brockhouse Institute for Materials Research,
McMaster University, Hamilton, Ontario, Canada
Tel: 1-(905)-525-9140 ext 24710
Fax: 1-(905)-521-2773
idbrown@mcmaster.ca
*****************************************************
Report of COMCIFS to the General Assembly of the International
Union of Crystallography for the triennium 1999-2002.

1. Purpose and Membership
     COMCIFS is the committee appointed by the Executive Committee of the
IUCr to oversee the Crystallographic Information File (CIF) project on
behalf of the Union.  It currently consists of six voting members
appointed by the Executive Committee of the IUCr and an unlimited number
of non-voting members added at the discretion of the chair.  The
non-voting members comprise those with an interest in the development of
CIF who request to be placed on the COMCIFS list server.  Both kinds of
members are fully involved in the work of COMCIFS, but only voting members
approve CIF policies and dictionaries. The voting membership during the
current triennium comprises:

               David Brown (chair)
               Brian McMahon (secretary)
               Helen Berman
               Herbert Bernstein
               Syd Hall
               Gotzon Madariaga

Paul Edgington was appointed to COMCIFS in 1999 but resigned
during the course of the triennium.

2. Overview
     It is now over a decade since Acta Crystallographica adopted the
Crystallographic Information File (CIF) for the submission and archiving
of crystal structures.  When first adopted, CIF was intended as a medium
for authors to submit structure reports electronically to the journals
and, after publication, for the coordinates to be available to a user's
program.  Since then CIF has evolved from a simple transfer and archiving
format into a crystallographic language equipped with a dictionary that
can be interpreted by computer.  In the not too distant future computer
software will acquire its crystallographic knowledge from these
dictionaries rather than have it hard-coded into the programs.  A computer
will transparently combine the knowledge in the CIF dictionaries with the
numerical information in the crystallographic databases to generate
answers to a user's queries.  Two elements are needed to bring this vision
to reality.  The first is a set of dictionaries that capture the breadth
of crystallographic knowledge and the second is a set of programs that can
bring together the knowledge contained in the dictionaries with the
structural information contained in the databases.  No other discipline
has such a comprehensive set of dictionaries, but we currently lack the
software to exploit the potential that these dictionaries offer.

3. Dictionary Definition Languages
     The names and properties of items that can appear in a CIF are
defined in dictionaries that are themselves structured using the same STAR
syntax as CIF.  The Dictionary Definition Language (DDL) defines the names
and properties of items that can appear in a dictionary.  The approved CIF
dictionaries are written in one of two versions of the DDL.  DDL2, being
more structured and less forgiving than DDL1, was designed to meet the
requirements of the molecular biology community where highly automated
procedures are needed to handle the rapid increase in experimental
information.  These in turn require a highly structured dictionary.
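
     As an illustration of what a computer-readable dictionary makes
possible, the following minimal Python sketch checks data items against
their definitions.  It is not taken from any COMCIFS software.  The
DICTIONARY table is a hand-written stand-in for a few definitions (a real
dictionary is a STAR file that would have to be parsed), the types and
ranges shown are illustrative, and the function names strip_su and
check_item are hypothetical.

```python
# Hypothetical stand-in for a few dictionary definitions.  A real CIF
# dictionary is a STAR file; plain dicts are used here to keep the
# sketch self-contained.  Types and ranges are illustrative only.
DICTIONARY = {
    "_cell_length_a": {"type": "numb", "range": (0.0, None)},
    "_cell_angle_alpha": {"type": "numb", "range": (0.0, 180.0)},
    "_symmetry_cell_setting": {"type": "char", "range": None},
}


def strip_su(value):
    """Drop a trailing standard uncertainty, e.g. '5.959(1)' -> '5.959'."""
    return value.split("(", 1)[0]


def check_item(name, value, dictionary=DICTIONARY):
    """Return a list of problems found for one data item (empty if none)."""
    definition = dictionary.get(name)
    if definition is None:
        return [f"{name}: not defined in the dictionary"]
    if definition["type"] != "numb":
        return []  # character items are not checked in this sketch
    try:
        number = float(strip_su(value))
    except ValueError:
        return [f"{name}: '{value}' is not a number"]
    low, high = definition["range"] or (None, None)
    problems = []
    if low is not None and number < low:
        problems.append(f"{name}: {number} is below the minimum {low}")
    if high is not None and number > high:
        problems.append(f"{name}: {number} is above the maximum {high}")
    return problems
```

The checking code itself contains no crystallographic knowledge: adding
an entry to the table is enough to make a new item checkable.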

     Currently under development is a new dictionary language that will
both simplify and extend the capabilities of CIF and should, in the longer
term, remove the incompatibilities between existing dictionaries.  It will
include computer-readable definitions (algorithms) that will tell an
application how to derive any item of crystallographic information from
the basic experimental results contained in a CIF.  When fully functional
this will revolutionize crystallographic computing since a single program
will be able to calculate any item of crystallographic interest provided
a dictionary definition exists.  It will no longer be necessary to write
crystallographic program code; testing a new algorithm will be as simple
as adding a new definition to the dictionary.
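
     The idea of a single program that derives items from dictionary
definitions can be sketched in a few lines of Python.  This is only an
illustration under stated assumptions: ordinary Python functions stand in
for the computer-readable definitions, which is not the syntax of the
dictionary language under development, and the names METHODS, derive and
cell_volume are hypothetical.  The one definition shown uses the standard
formula for the volume of a unit cell.

```python
import math


def cell_volume(get):
    """Unit-cell volume from the cell lengths and angles (in degrees)."""
    a, b, c = (get(f"_cell_length_{x}") for x in "abc")
    ca, cb, cg = (
        math.cos(math.radians(get(f"_cell_angle_{x}")))
        for x in ("alpha", "beta", "gamma")
    )
    return a * b * c * math.sqrt(
        1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg
    )


# Hypothetical table of definitions: item name -> how to compute it.
METHODS = {"_cell_volume": cell_volume}


def derive(name, data, methods=METHODS):
    """Return the value of an item, computing it if it is not in the data."""
    if name in data:
        return data[name]
    if name not in methods:
        raise KeyError(f"{name}: no value and no definition to derive it")
    # The definition asks for whatever it needs through `derive`, so an
    # item may depend on other items that are themselves derived.
    return methods[name](lambda dep: derive(dep, data, methods))
```

In this sketch, making a new item available means adding one entry to
METHODS; the derive function itself does not change.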

4. CIF Dictionaries
     The strength of CIF lies in the extensive suite of dictionaries that
COMCIFS has developed.  Six dictionaries have now been approved and
advanced drafts of three more are being tested in their communities.

The Core CIF Dictionary is used to describe crystal structures with small
unit cells.  Minor additions to the core dictionary were approved in March
1999 and January 2001 (version 2.2) but a major review of this dictionary
is planned for the coming year in order to address the problems raised
following a review of the archives held by Acta Crystallographica and the
Cambridge Crystallographic Data Centre.

The Powder Diffraction CIF Dictionary is now routinely used for the
submission of Rietveld refined structures to Acta Crystallographica.
CIFs containing the powder patterns of these structures are forwarded to
the International Center for Diffraction Data for inclusion in the Powder
Diffraction File.

The Macromolecular CIF Dictionary is being used for the archive
of the Protein Data Bank (PDB) and software has been developed
for manipulating these CIFs, but it will be some time before all
the current macromolecular software is converted from the now
obsolescent PDB format.  Version 2.0 of the dictionary was
approved in September 2000.

The Image CIF/CBF Dictionary is designed for the transmission and
archiving of images, specifically from area detectors.  Because
these images can be very large, a Crystallographic Binary File
(CBF) has been defined to provide a binary representation of an
imageCIF.  Version 1.0 of this dictionary was approved in January
2001.

The Modulated Structure CIF Dictionary will be used for the
submission of incommensurate and modulated structure reports to
Acta Crystallographica. Version 1.0 was approved in July 2001.

The Symmetry CIF Dictionary provides a structured description of
crystallographic symmetry that will replace the symmetry items
defined in the current core dictionary.  The addition of advanced
symmetry concepts is planned.  Version 1.0 was approved in
December 2001.

The Electron Density Dictionary will be used for reporting
electron densities.  A draft endorsed by the IUCr Commission on
Charge, Spin and Momentum Density has been circulated to members
of the Commission for final evaluation before being presented to
COMCIFS for approval.

The Small Angle Scattering Dictionary is sponsored by the IUCr
Commission on Small Angle Scattering and a draft is in trial use
prior to being presented to COMCIFS for approval.

The Magnetic Structures Dictionary is being prepared by the
Database of Magnetic Structures Determined by Neutron Diffraction
in Krakow.  It should soon be presented to COMCIFS for approval.

     After approval by COMCIFS, each dictionary is provided with
a Dictionary Maintenance Group appointed from people within the
discipline.  It monitors the use of the dictionary and proposes
revisions for COMCIFS approval.  Requests for additions or
changes to any of the dictionaries should be addressed to the
appropriate dictionary maintenance group.

5. Software
     CIF is a powerful crystallographic language with a well developed
vocabulary, but such a language is of little use without the software to
manipulate it.  Many existing crystallographic programs have been modified
to read and write CIFs, but so far few programs exploit CIF's full
potential, namely the ability to extract their knowledge of
crystallography directly from the dictionaries.  While the writing of such
programs is essential to the future of crystallography, it is not yet
considered a priority for those responsible for distributing
crystallographic resources.

     Among the programs that have been written is a generic CIF editor
(one that obtains its crystallographic knowledge from the CIF dictionary)
prepared by the Protein Data Bank for DDL2 dictionaries.  A CIF editor for
the core dictionary is soon to be released by the Cambridge
Crystallographic Data Centre, and a number of CIF toolkits are available
to help programmers interface their applications to CIF.  A software list
server on the IUCr web site encourages software developers to share their
ideas and problems.

     A possible short term solution to the shortage of software is to
interface CIF to XML (eXtensible Markup Language), a similar standard
developed by the information technology community.  Both CIF and XML store
knowledge about a discipline in dictionaries (called DTDs in XML).  CIF is
further advanced in terms of its knowledge base (dictionaries) though XML
currently has a wider range of software tools.  A CIF to XML conversion
program has been written.
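
     To give a flavour of what such an interface involves, the following
minimal Python sketch turns simple CIF name-value pairs into XML.  It is
not the conversion program mentioned above.  The element names (cif,
datablock, item) and the function name cif_items_to_xml are assumptions
made for this example, and only single-line items are handled; loops and
multi-line text fields are ignored.

```python
import xml.etree.ElementTree as ET


def cif_items_to_xml(cif_text):
    """Convert simple single-line CIF items to an XML string."""
    root = ET.Element("cif")
    block = None
    for line in cif_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower().startswith("data_"):
            # Start of a data block; the block name follows 'data_'.
            block = ET.SubElement(root, "datablock", name=line[5:])
        elif line.startswith("_") and block is not None:
            parts = line.split(None, 1)
            if len(parts) == 2:
                name, value = parts
                item = ET.SubElement(block, "item", name=name)
                item.text = value.strip("'\"")  # drop surrounding quotes
    return ET.tostring(root, encoding="unicode")
```

Once the data are in XML they can be read with general-purpose XML
tools, although the meaning of each item is still defined by the CIF
dictionary.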

6. Publicity
     Because of the rapid pace at which information technology is
advancing, and the expected benefits of CIF to the
crystallographic community, education is an important aspect of
COMCIFS's work.  All CIF dictionaries and COMCIFS discussions can
be inspected on the IUCr web site and reports on the development
of CIF appear in the IUCr Newsletter, but more work is needed to
prepare the community for the changes ahead.

7. Interoperability
     The rise of the World Wide Web and other Internet protocols
has fuelled new developments in information interchange and it is
important that CIF collaborate in these endeavours, particularly
with neighbouring disciplines.  Such initiatives include the
Chemical Markup Language (CML - a DTD for describing chemical
structures within SGML or XML documents), macromolecular
structure descriptions in terms of Common Object Request Broker
Architecture (CORBA) objects, and Resource Description Framework
(RDF) schemas.  Several other disciplines working with
macromolecules are developing STAR DDL2 dictionaries that can be
merged directly with CIF dictionaries, allowing CIF software to
access information in related fields.

8. Intellectual Property
      Ownership of CIF is vested in the IUCr in order to prevent
the development of incompatible CIF dialects.  However, because
the IUCr wishes to see the standard widely used without implied
threats of legal action for software that inadvertently fails to
follow the standard, COMCIFS is exploring friendly ways to ensure
that the CIF standard is understood by its users and that
archived CIFs properly conform to the standard.

9. Achieving the Vision
     The rapid development of CIF requires vigilance on several
different fronts.  The technical development of the standard is
well advanced, and the dictionaries that support it are
unequalled in their coverage, but software that can make use of
the advanced features of CIF has not developed at the same rate.
While the crystallographic community has accepted CIF as a
convenient medium for the exchange of crystallographic
information, it remains largely unaware of how CIF will
fundamentally alter the way in which we manage information and
computing in crystallography.  These are problems that we will
try to address in the coming triennium.

10. Acknowledgements
     It is a pleasure to extend thanks to the many people who
have volunteered so much of their time and expertise to the work
of COMCIFS.  Particular thanks are extended to the IUCr staff in
Chester for their unstinting support.


David Brown
Chair