The crystallographic information file (CIF) was developed by the IUCr in the early 1990s as a standardized format to document single-crystal structure determinations and to exchange the results between laboratories. The data items used in CIF are described in a series of data dictionaries. The syntax used in these data dictionaries is described in a data definition language (DDL), of which there are two versions: DDL 1.4 and DDL 2.1. The data dictionary for the original single-crystal information (now called the core) and the powder diffraction (pdCIF) dictionary both use a somewhat updated version of the original syntax, DDL 1.4, while the macromolecular CIF (mmCIF) dictionary uses the more complex DDL 2.1.
The CIFEDIT program, described here, was created to view and edit CIFs, with particular emphasis on files that contain multiple blocks, as these are quite important in pdCIF. For DDL 1.4 version dictionaries, CIFEDIT can use information from the dictionaries to validate CIF data items, as well as display the definitions for these items.
On Windows and Macintosh computers, CIFEDIT will usually be invoked by dragging a CIF onto the icon for CIFEDIT. Alternatively, one can click on the icon and then locate the appropriate file via a file open window. One can also configure the operating system to associate a file extension (such as .CIF) with CIFEDIT, so that double-clicking (or, in Windows, left-clicking) on such a file will open that CIF in CIFEDIT. This configuration can be selected for Windows during installation. For Macintosh, users must set this up themselves.
On Unix computers, the CIFEDIT program is typically started by typing
cifedit [file.cif]
where file.cif is the optional name of a file to be read. If no file is supplied on the command line, a file browser is opened to select a file to use.
While the CIF is being read, a message such as the one to the right will be seen. After the read is complete, the main CIFEDIT screen (shown below) displays the CIF data block(s) and data items in a hierarchical format.
The minus sign (-) to the left of block name data_NISI_publ can be toggled using the mouse to hide all of the data items included in the block, as is shown to the left.
As is shown to the left, the "folder" icon indicates a data block, a single page indicates a CIF data item, and an icon with a pair of pages indicates a data loop. The contents of loops are not shown by default, but clicking on the plus sign (+) to the left of the loop causes the entry to be expanded to show the contents. As an example, compare loop_0 and loop_2 to the left.
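The block / item / loop hierarchy described above can be illustrated with a short Python sketch. This is not CIFEDIT's own parser (CIFEDIT is a Tcl/Tk program); the function name parse_cif_outline and the sample values are assumptions made for illustration, and the sketch ignores semicolon-delimited text fields and other parts of the CIF syntax.

```python
import re

def _unquote(token):
    """Strip one pair of matching quotes from a value token."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    return token

def _is_keyword(token):
    """True for data names, block headers and the loop_ keyword."""
    low = token.lower()
    return token.startswith("_") or low.startswith("data_") or low == "loop_"

def parse_cif_outline(text):
    """Return {block: [("item", name, value) or ("loop", names, rows)]}."""
    # Drop whole-line comments, then split into quoted strings or bare words.
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    tokens = re.findall(r"'[^']*'|\"[^\"]*\"|\S+", "\n".join(lines))
    blocks, current, i = {}, None, 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.lower().startswith("data_"):
            current = blocks.setdefault(tok, [])
            i += 1
        elif current is None:
            i += 1  # ignore anything before the first data block
        elif tok.lower() == "loop_":
            i += 1
            names = []
            while i < len(tokens) and tokens[i].startswith("_"):
                names.append(tokens[i])
                i += 1
            values = []
            while i < len(tokens) and not _is_keyword(tokens[i]):
                values.append(_unquote(tokens[i]))
                i += 1
            width = max(len(names), 1)
            rows = [values[j:j + width] for j in range(0, len(values), width)]
            current.append(("loop", names, rows))
        elif tok.startswith("_"):
            value = _unquote(tokens[i + 1]) if i + 1 < len(tokens) else "?"
            current.append(("item", tok, value))
            i += 2
        else:
            i += 1
    return blocks

# Illustrative input only; the numeric values are made up.
SAMPLE = """\
data_NISI_publ
_cell_length_a 5.4307(2)
loop_
_atom_site_label
_atom_site_fract_x
Ni1 0.0
Si1 0.25
"""

outline = parse_cif_outline(SAMPLE)
for block, entries in outline.items():
    print(block)
    for entry in entries:
        if entry[0] == "item":
            print("   item:", entry[1], "=", entry[2])
        else:
            print("   loop:", ", ".join(entry[1]), "with", len(entry[2]), "rows")
```

Each block maps to an ordered list of entries, which mirrors the way the browser shows single items and loops under a block name.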
As is shown above, clicking on a CIF data item causes the value associated with that data item to be displayed in the right-hand side window. In the example shown above, the data name _cell_length_a has been selected. This is defined in the CIF dictionary as a number between 0 and infinity with units of Angstroms. These units (A) are displayed adjacent to the entry box.
If the mode is changed from "browse" to "edit" (see control in lower right), the value can be edited, as is seen below. When appropriate, input is validated so that only valid numbers in the allowed range are accepted and standard uncertainties (esds) are entered only where allowed. Likewise, if the CIF dictionary defines an enumerated list of values for a data name, a menu button is offered in place of an entry box. In this way, only a valid entry from the list can be selected.
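The kind of numeric check described above can be sketched in Python as follows. The function name check_number, its parameters and the messages are hypothetical; the sketch only shows one plausible way to accept a number with an optional standard uncertainty in parentheses, such as 5.4307(2), and to apply a range limit. It is not CIFEDIT's actual validation code.

```python
import re

# A number, optionally followed by a standard uncertainty in parentheses.
NUMBER_RE = re.compile(
    r"^([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?:\((\d+)\))?$"
)

def check_number(text, minimum=None, maximum=None, esd_allowed=True):
    """Return (ok, message) for a CIF-style numeric string."""
    if text in (".", "?"):
        return True, "special value"
    match = NUMBER_RE.match(text)
    if match is None:
        return False, "not a valid number"
    if match.group(2) is not None and not esd_allowed:
        return False, "standard uncertainty not allowed"
    value = float(match.group(1))
    if minimum is not None and value < minimum:
        return False, "below allowed minimum"
    if maximum is not None and value > maximum:
        return False, "above allowed maximum"
    return True, "ok"

# _cell_length_a is defined as a number between 0 and infinity.
print(check_number("5.4307(2)", minimum=0.0))
print(check_number("-1.2", minimum=0.0))
```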
CIF loops
CIF loops allow multiple values to be associated with one or more data items, in effect defining a table of data. Clicking on the entry for a loop causes all the data names in the loop to be displayed, as is shown below.
When a loop is displayed, extra controls appear, as are defined below. Note that only the first entry listed will be available in "Browse" mode.
It is also possible to click on the data name for an item inside a loop; in this case, all entries for that data item in the loop are displayed (a column). This mode is not available for very large loops, as it would require too much memory to display all the entries. The maximum number of entries is controlled by the variable CIF(maxRows), which can be customized.
- Loop element #
- The "Loop element #" spinbox is used to select which "row" from the loop is displayed. The up arrow advances to the next row, while the down arrow goes back one row. Numbers can also be typed into the entry box; the number is accepted when Enter is pressed. The keyboard up and down arrows can also be used to move between entries. Other keys such as Page Up, Home, etc. move in larger increments.
- Add to loop
- In edit mode, a new row can be added to the end of a loop using the "Add to loop" button. The value for each new entry is initialized as "?" (meaning the value is unknown or unspecified).
- Delete loop entry
- In edit mode, this deletes the current row from the loop. First select the row to delete with the "Loop element #" spinbox. Note that the values are displayed for confirmation before the delete operation is performed. It is not possible to delete all entries from a loop, so this button is disabled when a loop has only a single row defined.
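The loop operations listed above (selecting a row, viewing a column, adding a row filled with "?", and refusing to delete the last remaining row) can be modelled with a small table class. The class name Loop and its methods are hypothetical and serve only as an illustration of the behavior described; they do not reflect CIFEDIT's internal data structures.

```python
class Loop:
    """A CIF loop viewed as a table: one column per data name."""

    def __init__(self, names, rows):
        self.names = list(names)
        self.rows = [list(row) for row in rows]

    def row(self, number):
        """Return row 'number' (1-based, like the spinbox) as a dict."""
        return dict(zip(self.names, self.rows[number - 1]))

    def column(self, name):
        """Return every entry for one data name."""
        index = self.names.index(name)
        return [row[index] for row in self.rows]

    def add_row(self):
        """Append a row of '?' values and return its 1-based number."""
        self.rows.append(["?"] * len(self.names))
        return len(self.rows)

    def can_delete(self):
        """A loop must keep at least one row."""
        return len(self.rows) > 1

    def delete_row(self, number):
        """Remove and return a row, unless it is the only one left."""
        if not self.can_delete():
            raise ValueError("cannot delete the only row in a loop")
        return self.rows.pop(number - 1)

# Illustrative values only.
loop = Loop(["_atom_site_label", "_atom_site_fract_x"],
            [["Ni1", "0.0"], ["Si1", "0.25"]])
new_row = loop.add_row()
print(new_row, loop.row(new_row))
print(loop.column("_atom_site_label"))
```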
CIF errors
Parse Errors
If errors are noted as a CIF is parsed, a special entry is listed immediately after the block name, labeled "Parse-errors", as is shown below. Note that the "go to line" button in the Show (Hide) CIF contents window can be very convenient for locating and repairing these errors.
Validation Errors
Many other types of errors can be determined by comparing data values against the definitions found in the appropriate CIF dictionary. For example, the core dictionary specifies that the only valid values for _diffrn_radiation_probe are x-ray, neutron, electron, and gamma. If a CIF has this value specified as proton, it will be flagged as an error. Likewise, _atom_type_number_in_cell is specified as a number, zero or greater. An error condition will exist if the value for this is specified as a negative number or as a string that is not a valid number (the special CIF values . and ? are valid, however). Pressing the "Validate CIF" button at the bottom of the main window causes the CIF to be scanned for errors in data values. If errors are located, a window will be displayed. Also, an entry with "Validation-errors" is added to the browser.
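The two examples above can be expressed as a small dictionary-driven check. The DEFINITIONS table below is a hand-written stand-in for what would be read from a CIF dictionary, and validate_items is a hypothetical helper; the exact rule format, the case-sensitive comparison and the error messages are assumptions, and standard uncertainties in parentheses are not handled in this sketch.

```python
# Hand-written stand-in for rules that would come from a CIF dictionary.
DEFINITIONS = {
    "_diffrn_radiation_probe": {
        "enum": {"x-ray", "neutron", "electron", "gamma"},
    },
    "_atom_type_number_in_cell": {"type": "numb", "min": 0},
}

def validate_items(items, definitions=DEFINITIONS):
    """Return a list of (data_name, message) for values that break a rule."""
    errors = []
    for name, value in items.items():
        rule = definitions.get(name)
        # Unknown names are skipped; '.' and '?' are always valid.
        if rule is None or value in (".", "?"):
            continue
        if "enum" in rule:
            if value not in rule["enum"]:
                errors.append((name, "value not in allowed list"))
        elif rule.get("type") == "numb":
            try:
                number = float(value)
            except ValueError:
                errors.append((name, "not a valid number"))
                continue
            if "min" in rule and number < rule["min"]:
                errors.append((name, "below allowed minimum"))
    return errors

for name, message in validate_items({
    "_diffrn_radiation_probe": "proton",
    "_atom_type_number_in_cell": "-2",
}):
    print(name, "->", message)
```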
The actions available on the menu bar are listed below.
File Menu
Commands in the File menu consist of:
- Open
- Allows a new CIF to be read and displayed. If the current CIF has been edited, you are prompted to save or discard the edits.
- Save
- This option is available when the CIF has been edited; it saves the edited CIF back to the current file.
- Save As
- This option is used to save the current CIF under a new file name.
- Exit
- Exits the program. If the current CIF has been edited, you are prompted to save or discard the edits.
Edit Menu
- Undo
- As changes are made to the CIF template, they are recorded and can be reversed using this "Undo" menu command. There is no limit to the number of changes that are recorded. However, some actions (for example, adding new items to the CIF, opening the CIF for manual editing, or saving the file to disk) cannot be undone and will cause the undo buffer to be cleared. When there are no actions to undo, the menu item is disabled.
- Redo
- If changes have been reversed with the "Undo" menu command, the changes can be reapplied using the "Redo" button. The list of changes available for "Redo" is cleared when a new edit is made or when the undo buffer is cleared.
- Validate CIF
- The "Validate CIF" menu command causes the CIF to be scanned for errors in data values. If errors are located, a window will be displayed. Also, an entry with "Validation-errors" is added to the browser for each block with errors.
- Mode
- CIFEDIT operates in two modes: one where the CIF is viewed but cannot be changed (browse) and one where the CIF can be changed (edit). Select between these modes using:
- Browse
- In browse mode, the CIF can be examined, but no changes can be made.
- Edit
- In edit mode, the contents of the CIF can be changed.
- Add
- This is used to add entries to the CIF. If there is more than one block in the CIF, you will be asked to select the block to add to, using a window like the one to the right. You may add a new loop or individual CIF entries, using one of these menu commands:
- New Loop
- The "New Loop" option provides the window shown below for selecting data names. The window shows a list of categories, where clicking on the category shows the data names assigned to the category (note that a CIF loop may contain only data items from a single category). Select the desired items to include in the loop by clicking on them and then press the "Insert" button in the upper right to create the loop. Data items in DDL1.4 dictionaries that are not allowed to be used in loops are displayed in italics and cannot be selected.
Note that as data items are selected, their definitions are displayed in the CIF Definitions window, and can be viewed if this window is visible.
To allow data items to be located, it is also possible to search for the occurrence of character strings in data item names. After entering a string in the box, press "Search" to locate matching data items; the first matching item will be displayed. The "Next" button will then advance to subsequent matches. Note that data items will be matched if the entered string occurs anywhere in the data item name, so the string "_atom" will match both "_atom_site_fract_x" and "_chem_comp_atom.charge". Wildcards may be used in search strings, so that "chem*charge" will also match "_chem_comp_atom.charge". Case is ignored in searches, so "len_Q" and "len_q" locate the same matches.
- Item by Category
- It is also possible to use the "category browser" to select individual data items to add to the CIF. The window is very similar to that above, except that only one item may be selected at a time. String searches work in the same fashion as for adding loops.
- Item by Name
- The other option for adding individual data items provides a "data name browser" (shown below) for locating data items. Data items may be listed in overall alphabetical order, or when "sort by dict." is selected, items are grouped by dictionary.
If a string is entered and the "search" button is pressed, only data items that match the search string are displayed. Note that data items will be matched if the entered string occurs anywhere in the data item name, so the string "_atom" will match both "_atom_site_fract_x" and "_chem_comp_atom.charge". Wildcards may be used in search strings, so that "chem*charge" will also match "_chem_comp_atom.charge". Case is ignored in searches, so "len_Q" and "len_q" locate the same matches.
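The search rules described for the data name browsers (match anywhere in the name, "*" wildcards, case ignored) can be approximated in Python with the standard fnmatch module. The helper search_names is hypothetical and is only a sketch of the matching rule, not CIFEDIT's own search code; note that fnmatch also treats "?" and "[" as special characters, which may differ from CIFEDIT's behavior.

```python
import fnmatch

def search_names(query, names):
    """Return the names that contain 'query', honoring '*' and ignoring case."""
    # Surrounding the query with '*' lets it match anywhere in the name.
    pattern = "*" + query.lower() + "*"
    return [n for n in names if fnmatch.fnmatchcase(n.lower(), pattern)]

NAMES = ["_atom_site_fract_x", "_cell_length_a", "_chem_comp_atom.charge"]

print(search_names("_atom", NAMES))
print(search_names("chem*charge", NAMES))
```

Lowercasing both the query and the name before calling fnmatchcase keeps the comparison case-insensitive on every platform.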
Windows Menu
- Show (Hide) CIF contents
- The "Show CIF contents" button causes a window to be displayed that shows the text of the CIF, as shown below. As CIF data items are selected by clicking on data names or through use of the other buttons, the window is scrolled forward or backward to show the appropriate section. Note that it is possible to make editing changes directly to the CIF using this window, but this will clear the information in the Undo buffer. Also, if the "Open for Editing" button is pressed, it is assumed that the CIF has been changed.
After the "Show CIF contents" button is pressed, the label changes to "Hide CIF contents"; pressing the button again causes the window to be hidden.
- Show (Hide) CIF definitions
- As CIF data names are selected, their definitions are shown in the CIF Definitions window, as shown to the right. After the "Show CIF definitions" button is pressed, the label changes to "Hide CIF definitions"; pressing the button again causes the window to be hidden.
Options Menu
Note that the options in this menu may be saved in a file, which will then override the default settings on future occasions when the program is run. The name of the file is .cifedit_cfg on MacOS X and Unix or cifedit.cfg on Windows. The file is located in the user's home directory in Unix/OS X, or "C:\Documents and Settings\{username}" in Windows-2000 and -XP, but in some of the older versions of Windows (I'm not sure which), the file will be placed in C:\.
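As a rough illustration of the naming rule above, the following Python sketch picks the options file name by platform. The function options_file is hypothetical, it assumes the file sits directly in the user's home (profile) directory, and it does not cover the older Windows versions that place the file in C:\.

```python
import os
import sys

def options_file(platform=sys.platform, home=None):
    """Return the expected path of the saved-options file."""
    if home is None:
        home = os.path.expanduser("~")
    # cifedit.cfg on Windows, .cifedit_cfg on Unix and MacOS X.
    name = "cifedit.cfg" if platform.startswith("win") else ".cifedit_cfg"
    return os.path.join(home, name)

print(options_file())
```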
- Select Dictionaries
- The "Select Dictionaries" menu command is used to determine which dictionaries will be used to process the CIF. This is done using the window shown to the right. By default, CIFEDIT uses all dictionaries in the ciftools/dict subdirectory. However, additional dictionaries in any location on your computer may be opened with the program as well. The "Save current settings" button causes the current dictionary settings, as well as the other option values in this menu, to be saved as described above. The up and down arrows are used to change the position of a selected dictionary in the list. The order matters because, if the same definition is read from more than one dictionary, only the last definition is used. If this happens, a warning message will be generated when the "Show Duplicate Dict. defs." checkbutton (see below) is selected.
Note that when a dictionary is first opened, an index file is generated for that dictionary, which is written in the same location as the dictionary. These index files contain an array element for each CIF data name in the selected CIF dictionaries. Each element includes a reference to the name of the dictionary file and a reference to where the definition for the data name is found, as well as the data type, units, and validation ranges or lists of allowed values. The reference for the definition is its location, expressed as the number of bytes in the file that must be skipped over; this allows definitions to be accessed quickly. If the dictionary files change, these offsets likely need to change, so if a change in the modification date for a file is noted, the index files will be regenerated.
- Hide (Show) Button Bar
- This causes the button bar on the bottom of the browser window (see below) to be shown or hidden.
- Screen font
- The size of the fonts used in the program can be adjusted to suit the computer display and the user preferences.
- Show Duplicate Dict. defs.
- When dictionaries are read, a check is made to see if data items are defined in more than one dictionary. When this occurs, the last definition that is processed is the one that is used. When the "Show Duplicate Dict. defs." option is checked, a warning message is displayed when duplicate dictionary definitions are encountered; the message is shown every time the dictionaries are read.
- Maximum data items
- The CIFEDIT program reads the entire CIF into memory. This means that if a very large CIF is read, the operating system can start to "gasp for breath" (what really happens is called "thrashing", where the computer spends all its time page-faulting and gets nothing done). To prevent this from happening, there is a limit on the number of CIF data items (each value in a loop is counted as an item) that will be read from a CIF. If more than this number of items is encountered, processing of the CIF is stopped and a warning message is displayed.
The maximum size of CIF that can be handled will depend on the capabilities of your operating system, the amount of memory and the number of other programs that are running. By default, the maximum number of data items is set to 100,000 -- which seems fine for most computers and allows CIFs as large as 0.5 MB to be read. If you have a lot of memory and want to work with big CIFs, you can raise this limit or even disable it using the "Maximum data items" option.
- Save options
- The "Save options" menu item causes the current dictionary settings, as well as the other option values in this menu, to be saved as described above.
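The byte-offset index described under "Select Dictionaries" can be sketched as follows. This is an illustration of the general technique only: the function names are hypothetical, the sketch indexes data_ block headers rather than individual data names, it keeps the index in memory instead of writing an index file, and the tiny dictionary written here is made up. CIFEDIT's real index files also record data type, units and validation limits.

```python
import os
import tempfile

def build_index(path):
    """Map each data_ block header to the byte offset where it starts."""
    offsets = {}
    with open(path, "rb") as fh:
        while True:
            position = fh.tell()
            line = fh.readline()
            if not line:
                break
            if line.lower().startswith(b"data_"):
                offsets[line.split()[0].decode("ascii")] = position
    # Remember the modification time so a changed file can be detected.
    return {"mtime": os.path.getmtime(path), "offsets": offsets}

def index_is_stale(path, index):
    """True when the file changed after the index was built."""
    return os.path.getmtime(path) != index["mtime"]

def read_definition(path, index, block):
    """Seek directly to a block and return its text up to the next block."""
    with open(path, "rb") as fh:
        fh.seek(index["offsets"][block])
        lines = [fh.readline()]
        for line in fh:
            if line.lower().startswith(b"data_"):
                break
            lines.append(line)
    return b"".join(lines).decode("ascii")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mini.dic")  # made-up dictionary file
    with open(path, "wb") as fh:
        fh.write(b"data_cell_length_a\n_type numb\n"
                 b"data_cell_length_b\n_type numb\n")
    index = build_index(path)
    print(index["offsets"])
    print(read_definition(path, index, "data_cell_length_b"))
```

Storing offsets means a definition can be fetched with a single seek instead of rescanning the whole dictionary, which is the speed benefit the text refers to.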
Help Menu
- About
- Displays the version number of the program.
- Web page
- Causes this page to be displayed.
The buttons and controls on the bottom of the main window have the following functions:
- Close
- The Close button causes the program to exit. If there are unsaved changes, the user is offered the chance to save the edits to disk.
- Show (Hide) CIF contents
- The "Show CIF contents" button causes a window to be displayed that shows the text of the CIF, as is shown above. As CIF data items are selected by clicking on data names or through use of the other buttons, the window is scrolled forward or backward to show the appropriate section.
Note that it is possible to make editing changes directly to the CIF using this window, but this will clear the information in the Undo buffer. Also, if the "Open for Editing" button is pressed, it is assumed that the CIF has been changed.
After the "Show CIF contents" button is pressed, the label changes to "Hide CIF contents"; pressing the button again causes the window to be hidden.
- Show (Hide) CIF definitions
- As CIF data names are selected, their definitions are shown in the CIF Definitions window, as is shown above. After the "Show CIF definitions" button is pressed, the label changes to "Hide CIF definitions"; pressing the button again causes the window to be hidden.
- Validate CIF
- Pressing the "Validate CIF" button causes the CIF to be scanned for errors in data values. If errors are located, a window will be displayed. Also, an entry with "Validation-errors" is added to the browser for each block with errors.
- Undo
- This button provides a quick way to access the "Undo" menu command, which allows some changes to the CIF to be reversed.
- Redo
- If changes have been reversed with the "Undo" menu command, the changes can be reapplied using this "Redo" button or the equivalent "Redo" menu command.
- Edit Mode
- The CIFEDIT program works in two modes:
- browse
- In browse mode, the CIF can be examined, but no changes can be made.
- edit
- In edit mode, the contents of the CIF can be changed.
- Save
- Changes made to the CIF are not saved to the disk file automatically. When changes have been made, but have not been saved to disk, this button is made active.
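The Undo and Redo behavior described for the buttons and menu commands (an unlimited history, a redo list that is cleared by any new edit, and a buffer that is cleared by actions that cannot be undone) follows the usual two-stack pattern. The class below is a minimal sketch of that pattern with hypothetical names; it is not taken from the CIFEDIT source.

```python
class EditHistory:
    """Two-stack undo/redo for edits to a dict of CIF items."""

    def __init__(self):
        self.undo_stack = []
        self.redo_stack = []

    def edit(self, items, name, new_value):
        """Apply a change and record it; a new edit empties the redo list."""
        self.undo_stack.append((name, items.get(name), new_value))
        self.redo_stack.clear()
        items[name] = new_value

    def can_undo(self):
        return bool(self.undo_stack)

    def can_redo(self):
        return bool(self.redo_stack)

    def undo(self, items):
        name, old, new = self.undo_stack.pop()
        items[name] = old
        self.redo_stack.append((name, old, new))

    def redo(self, items):
        name, old, new = self.redo_stack.pop()
        items[name] = new
        self.undo_stack.append((name, old, new))

    def clear(self):
        """Called for actions that cannot be undone, such as saving."""
        self.undo_stack.clear()
        self.redo_stack.clear()

# Illustrative values only.
items = {"_cell_length_a": "5.43"}
history = EditHistory()
history.edit(items, "_cell_length_a", "5.4307(2)")
history.undo(items)
print(items, history.can_redo())
history.redo(items)
print(items, history.can_undo())
```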
The pdCIFplot and CIFEDIT programs have been combined into a single distribution, CIFTOOLS. The basic code is the same on all platforms; however, for many common operating systems, a platform-specific Tcl/Tk shell is included in a platform-specific release. Please refer to the CIFTOOLS installation instructions for download and installation details.
This program has benefitted from comments of Brian McMahon of the IUCr. Richard L. Harlow first got me interested in the problem of a universal file format for powder diffraction data, leading eventually to my involvement with CIF and then this programming effort. I may someday forgive him.
The author of CIFEDIT is a U.S. Government employee, which means that CIFEDIT is not subject to copyright. Have fun with it. Modify it. Please add new features and make them available to the rest of the world.
Neither the U.S. Government nor the author makes any warranty, expressed or implied, or assumes any liability or responsibility for the use of this information or the software described here. Brand names cited herein are used for identification purposes and do not constitute an endorsement by NIST.
Comments, corrections or questions: crystal@NIST.gov
$Revision: 1.5 $ $Date: 2003/12/29 23:33:07 $