Report of IUCr Representative to CODATA 2002-2004
- To: epc@iucr.org
- Subject: Report of IUCr Representative to CODATA 2002-2004
- From: Brian McMahon <bm@iucr.org>
- Date: Fri, 17 Sep 2004 14:44:35 +0100
CODATA requires a report on activities from each member organization for presentation to its General Assembly (this will take place in Berlin in November). For your interest I append a copy of the report I have just sent to CODATA, only slightly beyond the deadline.

Best wishes
Brian

------------------------------------------------------------------------------

Report to CODATA of Activities of the International Union of Crystallography (IUCr) 2002-2004

The International Union of Crystallography (IUCr) is a scientific union adhering to the International Council for Science (ICSU). Its objectives are to promote international cooperation in crystallography and to contribute to all aspects of crystallography, to promote international publication of crystallographic research, to facilitate standardization of methods, units, nomenclatures and symbols, and to form a focus for the relations of crystallography to other sciences.

Crystallographic Databases
--------------------------

Several independent databases exist that store and manage the results of crystal structure determinations. Among the most important are

- the Cambridge Structural Database for organic and metal-organic small-molecule structures and oligonucleotides (CSD);
- the Protein Data Bank for protein and nucleic acid structures (PDB);
- the Inorganic Crystal Structure Database for inorganic materials (ICSD);
- the Metals Crystallographic Data File for metals (CRYSTMET).

Other crystallographic databases store non-structural data, including:

- the NIST Biological Macromolecule Crystallization Database and the NASA Archive for Protein Crystal Growth Data;
- the Powder Diffraction File.

These databases are curated by independent organisations, but the IUCr monitors their development through a standing Database Committee (CCD) that reports directly to the Union's Executive Committee.

Among the activities noted during the period by the CCD are:

- Publication of a special issue of the journal Acta Crystallographica devoted to the crystallographic databases. The issue contained current descriptions of the major databases and their access software systems, together with a number of papers that reviewed their research applications across a very broad range of science.
- The number of macromolecular structures deposited with the Protein Data Bank is growing at a high rate (18% during 2002); new deposits are being ably assimilated under the management of the Research Collaboratory for Structural Biology. By mid-July 2003 the total number of holdings was over 21500. An extensive program has been undertaken to standardize the archival holdings and export them in CIF and XML formats.
- In 2003, the Cambridge Crystallographic Data Centre released Version 1.0 of the enCIFer program for CIF validation and editing, which is available for free download from their web site for bona fide research use.
- The International Centre for Diffraction Data released PDF-4/Organics, a set of powder patterns calculated from the contents of the Cambridge Structural Database.

Data Exchange
-------------

Development continues on the Crystallographic Information File (CIF), the standard file format for archiving and exchanging crystallographic data. A new dictionary of data items for reporting accurate electron densities in crystals (rhoCIF) was released in August 2003, and revised versions were released during 2003 of the dictionaries of core data items in small-molecule structural crystallography (coreCIF) and of image-plate data, annotation and analysis (imgCIF). These complement the stable dictionaries of data items for use with powder diffraction (pdCIF), modulated structures (msCIF), macromolecular structures (mmCIF) and the description of crystallographic symmetry (symCIF).

As part of its data standardization program, and in preparation for harvesting and annotating the expected large number of structural genomics results, the Protein Data Bank has developed an extensive dictionary of CIF data items complementing the mmCIF dictionary. These items will be tested and refined in a number of forthcoming procedures, including the development with the IUCr of a structure reports section in a new journal of the Acta Crystallographica family, and are likely to lead in due course to an expansion of the standard mmCIF data dictionary.

The IUCr representative to CODATA has been privileged to collaborate as co-editor with Professor S. R. Hall, the inventor of CIF, on a volume in the reference series International Tables for Crystallography that will provide a complete and authoritative documentation of the CIF standard. This Volume will be published in early 2005.

Interoperability
----------------

CIF remains the data exchange format of choice within crystallography, but interest is growing in format translation into and out of XML. The Protein Data Bank now makes available macromolecular structure descriptions in XML (using a schema derived from the mmCIF data model). Collaboration continues with IUPAC on such topics as the development of a chemical identifier (the IUPAC-NIST Chemical Identifier INChI), and an IUCr working party on phase identifiers is producing a recommendation for the incorporation of crystal structure phases within INChI.

Data Validation
---------------

All structural data sets published in IUCr journals have since 1990 been checked for internal consistency by software capable of reading CIF submission or deposit files directly. The checking procedures are published on the web, and a public service to return a standard report on structures subjected to these checks has been established at http://checkcif.iucr.org.
This service has begun to attract sponsorship from scientific publishers and databases, and has become accepted as a community standard for reviewing and assessing the consistency and quality of small-molecule and inorganic structure determinations.

Electronic Publishing
---------------------

The IUCr continues to publish six primary research journals in crystallography, and a seventh covering the technology, instrumentation and uses of synchrotron radiation. A new online-only journal of Structural Biology and Crystallization Communications will be launched in 2005, and will include macromolecular structure determinations.

The experimental data sets discussed in these publications are freely available for download as supplementary files. Access to these files is not restricted to journal subscribers. The IUCr collaborates with the Cambridge Crystallographic Data Centre and the Inorganic Crystal Structure Database to check new submissions for prior publication, and is continuing to pursue its goal of establishing bidirectional hyperlinks between published literature articles and associated database records.

Open Access to Crystallographic Data
------------------------------------

Although the IUCr has always provided open access to the crystallographic data associated with its publications, and has championed the open availability of structural data sets (e.g. in the recommendations of the InterUnion Bioinformatics Group co-sponsored by CODATA), it has become aware of community concerns that more is needed to secure this goal. Two community-led initiatives are currently taking shape. In one, voluntary submission by researchers of their individual data sets to a crystallography open database is being encouraged. In the other, service crystallography facilities are collaborating with national scientific computing grid funding agencies to collect and store data sets.
The motivations behind the two approaches are different; one seeks to make visible data sets that accompany published articles but are either not held as supplementary material at all or are released only to journal subscribers; the other aims to provide early access for the existing databases to collected data. Both approaches make provision for the deposit of data that does not accompany published literature, and both make use of open-source open-access software for data harvesting, storage and searching.

The IUCr has certain concerns regarding the critical evaluation of data collected in these ways, and the longevity of community-sponsored data repositories; it intends to work with the parties concerned to increase the value of these initiatives. Among aspects to be considered are the provision of critical analyses of data sets (for which checkCIF reports may be suitable), the establishment of portals or common search engines covering a multiplicity of such sites, and mirroring to safeguard long-term access.

Long-Term Preservation of Digital Content
-----------------------------------------

The IUCr's interest in long-term preservation of digital publications and data continues with a project undertaken for ICSTI to design and propagate a questionnaire to assess current practice within crystallography. The IUCr representatives to ICSTI and CODATA have worked together on the project, and are currently analysing data from over 600 individual and 20 institutional respondents. The results will be published in 2005.

Brian McMahon
IUCr
17 September 2004

------------------------------------------------------------------------------

_______________________________________________
Epc mailing list
Epc@iucr.org
http://scripts.iucr.org/mailman/listinfo/epc