Discussion List Archives

Annual report

For your information I enclose below the final version of the annual
Comcifs report for 2000 that has been submitted to the Executive
Committee.  There are only a few changes from the draft circulated earlier
for comment.

IDB

----------------------------------------------------

 Report of COMCIFS to the IUCr Executive Committee for 2000       
                         

Mandate
-------
COMCIFS is the committee appointed by the Executive Committee to
maintain the Crystallographic Information File (CIF) standard
owned by the IUCr.

Committee structure
-------------------
COMCIFS consists of a small number of voting members appointed by
the Executive Committee and a much larger number of non-voting
members appointed by the chair of COMCIFS.  The latter are on the
COMCIFS mailing list and are invited to comment on any COMCIFS
business.   Most business is carried out by email and, to ease
the load on the small number of COMCIFS voting members, much
detailed work is carried out by committees such as the Dictionary
Maintenance Groups, the Dictionary Review Committee, the
Publicity Committee, the Software Development Committee and the
Dictionary Definition Language Committee.  Many of these groups
run formal email discussions maintained by the staff in Chester.
COMCIFS committees collaborate with the IUCr Commissions as
appropriate and CIF users are normally welcome to join the
discussion list of any group in which they have an interest.  All
approved dictionaries, and some dictionaries close to approval,
are posted on the IUCr web site where many of the CIF discussions
can also be viewed.

Membership
----------
Members are appointed following each General Assembly.  Current
voting members are:

     David Brown (Chair)
     Brian McMahon (Coordinating Secretary)
     Helen Berman
     Herbert Bernstein
     Sydney Hall
     Gotzon Madariaga

In 2000 Paul Edgington resigned from his position as a voting
member of COMCIFS and from his committees as a result of a change
in his position at the Cambridge Crystallographic Data Centre
(CCDC).  His place on the committees has been taken by Owen
Johnson of the CCDC.

Dictionaries
------------
Approval of new and revised CIF dictionaries continues to be a
major part of COMCIFS activities.  Each new dictionary is
compiled by a working group, often in conjunction with the
appropriate IUCr Commission, and each existing dictionary is
maintained by a Dictionary Maintenance Group.  Recommendations
from these groups are closely examined by a Dictionary Review
Committee to ensure CIF compliance before being passed to the
voting COMCIFS members for formal approval.  Compiling a
dictionary is a challenging and time-consuming occupation and
several drafts are usually exchanged between the Dictionary
Review Committee and the dictionary compilation group before a
new dictionary is ready for approval.  We are fortunate to have a
number of volunteers willing to contribute a significant amount
of their time to this effort.

In September formal approval was given to Version 2 of the mmCIF
dictionary, which is now in use at the Protein Data Bank.  This
approval was combined with a formal lifting of the 80-character
line restriction for files written using this dictionary.

In November COMCIFS approved the imgCIF/CBF dictionary used to
record and transfer information on images, specifically the
images produced by 2-dimensional detectors.  This project broke
new ground for COMCIFS because it also contains a specification
for an equivalent binary file format: the Crystallographic Binary
File (CBF).  Approval of this dictionary was followed by the
appointment of a Dictionary Maintenance Group consisting of H.
Bernstein, R. Sweet and J. Westbrook, who were all closely
involved with the original version of this dictionary. This group
already has a draft of version 2 which can be found at
http://www.bernstein-plus-sons.com/software/CBF.doc/cif_img_1.1.3.html
and which is expected to be approved in 2001.

A number of minor changes in the coreCIF dictionary have also
been approved in the light of experiences gained in the
submission of reports of crystal structures to the primary
journals and databases.

Currently under final review by the Dictionary Review Committee
are the modulated structure dictionary (msCIF) and a dictionary
containing the basic symmetry concepts used in crystallography
(symCIF).

Draft dictionaries for electron density (rhoCIF), magnetic
structures (magCIF) and small angle scattering (sasCIF) were
submitted to the Dictionary Review Committee and are currently
undergoing revision to bring them into conformity with CIF
standards. 

The increase in the number of dictionaries, many of which draw on
definitions supplied in other dictionaries, has led to the
development by B. McMahon, H. Bernstein and J. Westbrook of a
protocol for merging two or more dictionaries into a larger
virtual dictionary.  This protocol will also allow official CIF
dictionaries to be merged with local dictionaries to allow
individual laboratories to customize their CIF applications.
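
As a rough illustration of the merging idea only (the protocol
itself is not reproduced in this report), the sketch below overlays
a local dictionary on an official one.  It assumes that each
dictionary can be modelled as a Python mapping from data name to a
mapping of attributes; the function name merge_dictionaries, the
attribute keys and the local data name _mylab_sample_id are all
hypothetical.

```python
def merge_dictionaries(*dictionaries):
    """Overlay dictionaries left to right into one virtual dictionary.

    Assumption: later dictionaries add new data names and add or
    override individual attributes of names defined earlier.
    """
    virtual = {}
    for dictionary in dictionaries:
        for data_name, attributes in dictionary.items():
            # Copy so that the input dictionaries are never modified.
            merged = dict(virtual.get(data_name, {}))
            merged.update(attributes)
            virtual[data_name] = merged
    return virtual


# Hypothetical fragment of an official dictionary.
core = {
    "_cell_length_a": {"type": "numb", "units": "angstroms"},
    "_cell_volume": {"type": "numb", "units": "angstroms_cubed"},
}

# Hypothetical local dictionary: one tightened definition, one new item.
local = {
    "_cell_volume": {"enumeration_range": "0.0:"},
    "_mylab_sample_id": {"type": "char"},
}

virtual = merge_dictionaries(core, local)
```

In this sketch the virtual dictionary keeps every official
definition, gains the laboratory's own item, and carries the extra
range attribute on _cell_volume, while the two input dictionaries
are left unchanged.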

Software
--------
Developing the necessary software for manipulating CIFs is
currently a major concern.  While the crystallography community
has the expertise needed to prepare new dictionaries, it has a
relatively small pool of expertise in the type of sophisticated
software that can exploit the full potential of the dictionaries. 
One approach, pursued by H. Bernstein, has been to exploit the
information-handling techniques of extensible markup language
(XML) by writing programs to interconvert CIF and XML. However,
while XML is provided with a rich set of tools for managing and
manipulating document structure, it still has rather few
domain-specific applications, and is not an automatic candidate
for mining the full information content of CIFs. Nevertheless,
until there is more generic software available for processing
STAR files such as CIF, the CIF language will not be able to
achieve its full potential.  COMCIFS encourages writers of
crystallographic software to make full use of the capabilities
built into the standard.

Most of the software currently available for CIF is in the form
of toolboxes to help others write CIF applications.  However,
there is an urgent need to provide the user community with the
tools for preparing and editing CIFs.  The program enCIFer, to be
released by the Cambridge Crystallographic Data Centre in late
2001, has many features that crystallographers will find useful. 
These include a browser that provides clear error markup, an
alphabetic view of data names, data entry panes containing the
dictionary definitions, buttons for the special character
sequences frequently used in CIF text, spreadsheet loop displays,
text searching, a text editing window, user templates and a
crystal structure visualizer.  EnCIFer is based on the DDL1 core
dictionary and is designed for use by the small-structure
community.  A similar editor, ADIT, has been written for the DDL2
mmCIF dictionary and is designed primarily for users of the
Protein Data Bank.

Relationships with other bodies
-------------------------------
The chair of COMCIFS sits ex-officio on the Commission on
Crystallographic Nomenclature, which also appoints a member to
monitor COMCIFS activities.  Many of the dictionary committees
are either sponsored by or have close ties with the corresponding
IUCr commission.  Several members of COMCIFS are working on the
text for Volume G of International Tables for Crystallography,
the volume which describes the CIF standards.  The secretary of
COMCIFS has been appointed IUCr Representative to CODATA, and
gave a presentation on IUCr publishing and data activities
involving CIF at the CODATA conference at Lake Maggiore, Italy,
in October 2000.

In order to simplify data exchange, the macromolecular
crystallography community has been successfully lobbying other
molecular biology groups to adopt the STAR file structure, the
syntax used by CIF.  As a result of the efforts of J. Westbrook,
D. Greer and others, mmCIF has been recognized by the Object
Management Group as providing the Common Object Request Broker
Architecture (CORBA) standard for the exchange of macromolecular
information between databases.

Future developments
-------------------
Although CIF was originally developed as a simple file structure
for recording information on crystal structures, it is developing
into a fully-featured language for manipulating crystallographic
information. The purpose of the CIF dictionaries is to provide
computer access to that information.  Most of the attributes of a
data item described in the dictionaries, e.g., whether a
particular value is expressed as a number or as a character
string, can already be parsed by a computer. Computers are, of
course, unable to interpret the crystallographic definitions,
which remain accessible only to humans.  However, the
relationships between different data items can be described in
machine-readable terms, and thus allow computers to build more
detailed models of complex crystallographic objects. One of the
current exciting extensions proposed for CIF is the development
of a Dictionary Relational Expression Language (dREL) which will
provide algorithmic expressions that allow values for each item
in the dictionary to be derived from other items, e.g., the
calculation of the density from the cell mass and cell volume. 
If values of these latter items are not present in the CIF, the
computer will use the algorithms in the dictionary to calculate
the cell volume from the lattice parameters and the cell mass
from the list of atoms.  A preliminary account of dREL has been
given by Spadaccini, Hall and Castleden (2000) J. Chem. Inf.
Comput. Sci. 40, 1289-1301.  While a dictionary written in dREL
is primarily intended to allow a computer to calculate derived
values not currently stored in a given CIF, it will incidentally
provide precise definitions of, and relationships between,
crystallographic concepts, allowing it to be used as an on-line
crystallographic encyclopaedia.
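
The density example above can be sketched in ordinary Python to
show the kind of fallback evaluation described.  This is not dREL
syntax, and it is only a minimal sketch under stated assumptions: a
CIF is modelled as a plain mapping of data names to values, and the
names get_value, METHODS, _cell_mass and _atom_counts are
hypothetical, introduced purely for illustration.

```python
import math

# 1 dalton per cubic angstrom expressed in Mg m^-3 (numerically g cm^-3).
DA_PER_A3_TO_MG_PER_M3 = 1.66053907

# Small illustrative table of atomic masses in daltons.
ATOMIC_MASS = {"Na": 22.990, "Cl": 35.45}


def derive_volume(cif):
    """Cell volume in cubic angstroms from the six lattice parameters."""
    a, b, c = (get_value(cif, "_cell_length_" + x) for x in "abc")
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(get_value(cif, "_cell_angle_" + x)))
        for x in ("alpha", "beta", "gamma")
    )
    return a * b * c * math.sqrt(
        1 - cos_a**2 - cos_b**2 - cos_g**2 + 2 * cos_a * cos_b * cos_g
    )


def derive_mass(cif):
    """Cell mass in daltons from the (hypothetical) per-cell atom counts."""
    counts = get_value(cif, "_atom_counts")
    return sum(ATOMIC_MASS[element] * n for element, n in counts.items())


def derive_density(cif):
    """Density in Mg m^-3 from cell mass and cell volume."""
    mass = get_value(cif, "_cell_mass")
    volume = get_value(cif, "_cell_volume")
    return DA_PER_A3_TO_MG_PER_M3 * mass / volume


# Stand-in for the methods a dictionary would attach to each item.
METHODS = {
    "_cell_volume": derive_volume,
    "_cell_mass": derive_mass,
    "_exptl_crystal_density_diffrn": derive_density,
}


def get_value(cif, name):
    """Return the stored value if present, otherwise derive it."""
    if name in cif:
        return cif[name]
    if name in METHODS:
        return METHODS[name](cif)
    raise KeyError(name)


# Rock-salt cell: neither volume, mass nor density is stored.
nacl = {
    "_cell_length_a": 5.64, "_cell_length_b": 5.64, "_cell_length_c": 5.64,
    "_cell_angle_alpha": 90.0, "_cell_angle_beta": 90.0,
    "_cell_angle_gamma": 90.0,
    "_atom_counts": {"Na": 4, "Cl": 4},
}
density = get_value(nacl, "_exptl_crystal_density_diffrn")
```

Here the lattice parameters and atom counts alone are enough to
obtain a density near 2.16 Mg m^-3.  Had _cell_volume been present
in the file, the stored value would have been used in place of the
derived one.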

IUCr Support
------------
It is my pleasure to express COMCIFS's thanks to the IUCr office
for its support, particularly in supplying web sites and
discussion groups, and the services of Brian McMahon as our very
effective secretary.

Respectfully submitted by 
I.D.Brown, Chair


*****************************************************
Dr.I.David Brown,  Professor Emeritus
Brockhouse Institute for Materials Research, 
McMaster University, Hamilton, Ontario, Canada
Tel: 1-(905)-525-9140 ext 24710
Fax: 1-(905)-521-2773
idbrown@mcmaster.ca
*****************************************************