Phase ID Discussion Paper 7
- To: A Working Group of the IUCr Commission on Crystallographic Nomenclature <phase-identifiers@iucr.org>, Steve Stein <steve.stein@nist.gov>
- Subject: Phase ID Discussion Paper 7
- From: David Brown <idbrown@mcmaster.ca>
- Date: Fri, 20 Feb 2004 14:21:10 -0500
Dear Colleagues,

This email contains PhaseID Discussion Paper #7. The previous discussion paper was the first draft of our report to the IUCr Commission on Crystallographic Nomenclature (CCN) that was circulated just before Christmas. Rather than circulate another complete draft of the report, most of which would be repetitive, I focus here on the items that need to be resolved before the final draft report is prepared for discussion and approval.

Can you please respond by MARCH 26, after which I will put together a final draft of our report to the IUCr Commission on Crystallographic Nomenclature.

The Discussion Paper follows.

David Brown

***************************************
I.David Brown, Professor Emeritus of Physics
Brockhouse Institute for Materials Research
McMaster University, Hamilton, Ontario Canada L8S 4M1
Tel: +905 525 9140 ext 24710
Fax: +905 521 2773
email: idbrown@mcmaster.ca
***************************************

DISCUSSION PAPER # 7

This document contains the following sections:

1. Comments received on the first draft of the report
2. Construction of the IUPAC chemical identifier (IChI)
3. Proposed additions to IChI for phase identification
4. Incorporation of the phase identifier into the IUCr-CCN Phase Transition Symbol
5. The use of the phase identifier in databases

1. COMMENTS RECEIVED ON THE FIRST DRAFT
---------------------------------------

I received comments from Pierre Villars and Sidney Abrahams in response to the first draft of our report. These are included here, organized by topic, with my (IDB) response as necessary.

1.1 GENERAL COMMENTS
--------------------

PIERRE VILLARS wrote: I can fully agree with your chapter: 6. Recommendations

   Layer 5. State of matter: gas, liquid, crystal etc.
   Layer 6. The space group number
   Layer 7. Wyckoff Sequence

SIDNEY ABRAHAMS wrote: I am in full agreement with your proposal that a unique comprehensive identifier for each chemical compound be formed by adding the crystal phase identifier to the IChI chemical identifier. In reading your first draft, however, it is striking that no mention is made of the proposed method(s) of implementing such a system, possibly because they seem obvious to the specialist.

IDB response: The reason this information was missing in the first draft report is two-fold. I first wanted to get agreement on the principles before going into technical details, and secondly I did not at the time have a clear picture of the actual structure of IChI, since there is not yet a document giving a formal description. This detail has now been added and is given below.

1.2 MORE INFORMATION ON ICHI
----------------------------

SIDNEY: May I start by suggesting WG members may find the July 2002 presentation on preliminary thinking about IChI to be of interest, as given at:

   http://www.iupac.org/symposia/conferences/CIandXML_jul02/ICHI_Stein_jul2002.pdf

although the summary in your Section 5 shows that considerable progress has been made since that conference. However, members may well wish to see further details of the results agreed upon during the IChI workshop at NIST in November 2003. Are these expected to become available soon?

IDB Response: I have not had any word about when a report will be forthcoming on the November meeting. However, details of IChI relevant to our work are given below.

1.3 HOW MANY ADDITIONAL LAYERS ARE NEEDED?
------------------------------------------

SIDNEY: I agree with your proposal to add three crystallographic layers to the four IChI chemical layers. The choice between single and multiple letter codes depends upon the answers given to the questions above [these questions can be found in section 1.5 below, IDB].
I also agree with use of the space group number for layer 6 and, if necessary, with the Wyckoff multiplicity and letters in layer 7. I doubt if use of the Bravais symbol in the identifier would be of value.

PIERRE: No they [the Bravais symbol and reduced cell] are not needed. There exists quite some cases with same composition and same space group number and same Wyckoff Sequence (after standardization with STIDY and COMPARE). Niggli's Reduced cell is for further distinction not helpful for such cases, one possibility is to add to each point-set its Atomic Environment AET's (Coordination Polyhedra), see e.g. references:

   - J.L.C. Daams et al., J Alloys and Compounds, 1997, 252, 110-142
   - J.L.C. Daams et al., J Alloys and Compounds, 1994, 215, 1-34
   - J.L.C. Daams et al., J Alloys and Compounds, 1993, 197, 177-196
   - J.L.C. Daams et al., J Alloys and Compounds, 1993, 197, 243-269
   - J.L.C. Daams et al., J Alloys and Compounds, 1992, 182, 1-33

IDB comment: At this point I don't think it is necessary to define the AET as an additional layer but Pierre may disagree. If during the course of use ambiguities are found, the working group could reconvene to discuss the need for a further layer. Since there were no other objections to the use of three additional layers, I am assuming that everyone else is in agreement and we can move on to the next step (see below).

1.4 INCLUSION OF THE IDENTIFIER IN THE IUCr-CCN PHASE TRANSITION NOMENCLATURE
-----------------------------------------------------------------------------

DAVID BROWN (earlier comment): The information given in the IUCr-CCN Phase Transition Nomenclature includes:

   1. the common symbol used to identify this phase (e.g., alpha, II, etc.),
   2. the temperature (and pressure) range in which it is stable,
   3. the Hermann-Mauguin symbol and number of the space group (more than one space group may be given, or the Bravais symbol may be given if the space group is not known),
   4. Z, the number of formula units in the conventional unit cell (though the formula unit is not defined within the symbol),
   5. the ferroic properties and
   6. the structure type.

SIDNEY: Addition of the comprehensive IChI identifier in a new field, probably the leading field, in the CCN phase nomenclature [see Acta Cryst. (2001). A57, 614-626 and Acta Cryst. (1998). A54, 1028-1033] would be appropriate in database compilations. It would probably be inappropriate elsewhere.

PIERRE: If the structure type assignment is properly done (after standardization with STIDY and COMPARE), and each prototype is defined by a unique combination of Space Group Number/Wyckoff Sequence/AET's all is included in item 6) the structure type.

IDB response: Pierre is referring to the formal structure type as defined in the Pauling file. The structure type defined as item 6 in the CCN nomenclature can be any description chosen by the user and is therefore not suitable for machine searching. A new field for IChI would be better, but whether it should appear first or last in the sequence, and what format it should have, are questions best addressed by those who were on the working group that defined this symbol. See the proposed discussion for our report given in section 4 below.

1.5 APPLICATION OF THE PHASE IDENTIFIER IN DATABASES
----------------------------------------------------

PIERRE: Yes, it [the phase identifier as proposed in the first draft] is acceptable. The Pauling File has already included this information.

SIDNEY: However, a number of questions are likely to arise in reading our Report and I suggest it would be of value to our readers if it contained a section that addressed these and related issues so that our recommendations are set in their fullest context. These issues include the following:

   1. Once a unique identifier system has been agreed, must it be reduced to a single algorithm to avoid the introduction of variant identifiers?
   2. If the latter is the case, then would it be advantageous to state or merely refer to the algorithm?
   3. Must each database adapt the algorithm to match its specific contents or is that the responsibility of the user?

To the extent possible, the new section should respond to these and similar questions.

IDB response: These questions are now addressed in Section 5 below which is a draft section for our report.

1.6 ANOTHER COMMENT
-------------------

SIDNEY: In the example of a material with a single crystal form, OsI3, I note it is not listed in the ICSD. Perhaps a better choice should be made?

IDB response: This was taken from the Pauling file. I was looking for a non-trivial example of a compound in which only one phase was known. This is not simple to find. The Pauling file allowed me to perform that kind of search but there are not many examples - even NaCl is known in two phases under different conditions.

2. CONSTRUCTION OF THE IUPAC CHEMICAL IDENTIFIER
------------------------------------------------

With the acceptance of the idea of introducing additional layers into IChI to identify crystallographic phases, we are now ready to discuss how this should be implemented. I first give a description of IChI as it now exists, before making recommendations for the structure of the additional layers.

2.1 CURRENT FORMAT OF IChI
--------------------------

The following is an example of an IChI: the slash, /, is used to separate the layers.

   1.00Beta/C6H9N3O3/CT:7-4(10)1-2(5(8)11)3(1)6(9)12/H:1-3H,(H2,7,10)(H2,8,11)(H2,9,12)/SC:1-,2-,3-/I:(1D)/SC:m/is:0/ST:abs

The following is an explanation of the above IChI as far as I can figure it out. The important items are the first three or four which are easy to understand - the remainder in this example deal with a description of the stereochemistry and will not frequently be used. Most if not all of the work done in developing IChI has focused on organic molecules and resolving isomers, tautomers and enantiomers.
The connectivity of infinite structures has not yet been addressed. This should not present a problem for devising an IChI for phase identification because if the composition and the space group are given (the two essential layers for any phase identification), the connectivity is not usually needed.

   1.00Beta/                            # Version of IChI
   C6H9N3O3/                            # Sum formula
   CT:7-4(10)1-2(5(8)11)3(1)6(9)12/     # Basic connectivity
   H:1-3H,(H2,7,10)(H2,8,11)(H2,9,12)/  # Hydrogen connectivity
   SC:1-,2-,3-/                         # Stereocenters, sp3
   I:(1D)/                              # Isotopes (H1 is deuterium)
   SC:m/                                # ?
   is:0/                                # Inverted stereo (absolute stereo only)
   ST:abs                               # Abs (absolute), rel (relative) or rac (racemic)

All but the first two items (which are required) are introduced by one of the tags listed below:

   "CT:";   /* connectivity */
   "H:";    /* H-atoms */
   "C:";    /* charge */
   "DB:";   /* double bond stereo */
   "SC:";   /* stereo centers sp3 */
   "is:";   /* mark sp3 inverted stereo */
   "SR:";   /* mark sp3 racemic stereo */
   "ST:";   /* abs, rel, rac */
   "I:";    /* isotopic atoms */
   "fH:";   /* fixed H -- first item in non-taut */
   "N:";    /* orig. at numbers in canonical order */
   "NT:";   /* non-tautomeric orig. at numbers */
            /* in canonical order -- first item */
            /* in non-tautomeric aux info */
   "E:";    /* atoms equivalence */
   "tE:";   /* tautomeric groups equivalence */
   "iC:";   /* inverted (stereo) Centers */
   "iN:";   /* inverted sp3 stereo orig. atom */
            /* numbers in canonical order */
   "NI:";   /* isotopic orig. at numbers in */
            /* canonical order */
            /* first item in isotopic aux info */
   "TR:";   /* transposition of components in */
            /* non-tautomeric representation */
   "CRV:";  /* charges, radical, valence*/
   "XYZ:";  /* xyz-coordinates */

An XML version of IChI has also been defined, but this is a straightforward coding of the text version described above. It is highly verbose and somewhat opaque. It is designed for computers and is best left for computers to read. CIF versions of IChI would use the canonical form shown above.
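
The layer structure described above (a version, a sum formula, then tagged layers separated by slashes) can be illustrated with a minimal Python sketch. This is not part of any IChI software; the function name parse_ichi is hypothetical, and the sketch assumes that no layer value contains a slash and that every layer after the first two carries a "TAG:" prefix. Because a tag can occur more than once (SC: appears twice in the example), the tagged layers are kept as an ordered list of pairs and not as a dictionary.

```python
from typing import List, Tuple


def parse_ichi(identifier: str) -> Tuple[str, str, List[Tuple[str, str]]]:
    """Split a text IChI into (version, sum formula, tagged layers).

    Hypothetical helper for illustration only. Assumes '/' never occurs
    inside a layer and that layers after the first two look like 'TAG:value'.
    """
    layers = [part.strip() for part in identifier.split("/") if part.strip()]
    if len(layers) < 2:
        raise ValueError("an identifier needs at least a version and a sum formula")
    version, formula = layers[0], layers[1]

    tagged: List[Tuple[str, str]] = []
    for layer in layers[2:]:
        tag, sep, value = layer.partition(":")
        if not sep:
            raise ValueError(f"layer {layer!r} has no tag")
        # Keep order and duplicates: the same tag may legitimately repeat.
        tagged.append((tag, value))
    return version, formula, tagged


example = (
    "1.00Beta/C6H9N3O3/CT:7-4(10)1-2(5(8)11)3(1)6(9)12/"
    "H:1-3H,(H2,7,10)(H2,8,11)(H2,9,12)/SC:1-,2-,3-/I:(1D)/SC:m/is:0/ST:abs"
)
version, formula, tagged = parse_ichi(example)
```

Applied to the example identifier, the first two items come back as the version and the sum formula, and the remaining seven layers come back in their original order with their tags.
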
I assume that the chemical element symbols are case sensitive so as to distinguish between CO and Co, but it may be that the current testing of IChI has not extended to element symbols composed of two letters.

3. PROPOSED ADDITIONS TO IChI FOR PHASE IDENTIFICATION
------------------------------------------------------

The remaining text in this document is designed to be part of the final report. Comments that are not part of the report are indicated by text enclosed between ******* strings.

3.1 NEW TAGS

The following is a list of additional tags required for phase identification expressed in the form of an IChI. These would be used in conjunction with existing IChI tags, in particular the IChI version number and the composition:

   "PH:"  /* phase or state of matter. Allowed values are: */
          /* gas, liq, amp, sol, xtl, lxl, qxl */
   "SG:"  /* Space group number, integers between 1 and 230 */
   "WS:"  /* Wyckoff sequence, any lower case letter */
          /* or & (for alpha) */

3.2 FORMAL DEFINITIONS:

COMPOSITION: The composition layer in IChI for a crystalline phase must give the contents of the formula unit of the crystal. This is a unit in general no smaller than the crystallographic asymmetric unit and no larger than the primitive unit cell. It is NOT the same as the formula of the molecule of interest unless the molecule is the only component of the crystal. Other components, including solvents of crystallization, must be explicitly included. Wherever possible the formula unit is chosen so that the multipliers of the elements are integers with no common divisor, but this is not always possible. In cases where one or more of the multipliers is non-integral, the size of the formula unit is indeterminate and only the relative multipliers are meaningful. Testing should be carried out in this case by normalizing the multipliers, e.g., by converting the largest multiplier to 1.00 and the others in proportion.
When non-integral multipliers are encountered, searches should include a tolerance factor to allow for experimental uncertainties or to retrieve related compounds of the same phase having a similar but not identical composition. The tolerance should be large enough to recognize that phase identifiers that include trace elements are equivalent to identifiers in which the trace elements have been omitted either because they were not determined or because they were not considered to be important.

PH: This layer gives the phase or state of matter. Seven flags are defined. Others could be formally added to this list if a need is demonstrated.

   gas
   liq   liquid
   amp   amorphous
   sol   solid of unknown form
   xtl   crystal
   lxl   liquid crystal
   qxl   quasi-crystal

Only if the value of PH is 'xtl' will the following two layers be meaningful.

SG: This is a number between 1 and 230 inclusive, being the number of the space group of the crystal as given in International Tables for Crystallography Vol A. The following space group pairs are identical except for their chirality: 76=78, 91=95, 92=96, 144=145, 151=153, 152=154, 169=170, 171=172, 178=179, 180=181, 212=213. The chirality is often not determined and is only significant if the crystal contains a chiral molecule. Since molecular chirality is already described elsewhere in IChI, only the lower space group number of each pair should be used. However, one of the forbidden numbers may be inadvertently used and software should be prepared to convert it to its legal equivalent. There are many cases where the true space group is not known, or the structure is incommensurate. Different approximate space groups might be assigned by different workers in which case a valid match would be missed, but there seems little that can be done to overcome this situation.

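The two special cases discussed above, screening the SG layer for forbidden numbers and comparing normalized compositions within a tolerance, can be sketched in Python as follows. This is an illustration under stated assumptions, not a proposed algorithm for the report: all function names are hypothetical, the formula parser handles only flat sum formulae such as TiO2 or Fe0.95O (no brackets), the default tolerance of 0.02 is an arbitrary placeholder, and the mapping uses the eleven enantiomorphic space-group pairs of International Tables Vol. A (including 151/153 and 152/154).

```python
import re
from typing import Dict

# Enantiomorphic pairs: higher (forbidden) number -> lower (legal) number.
ENANTIOMORPH = {78: 76, 95: 91, 96: 92, 145: 144, 153: 151, 154: 152,
                170: 169, 172: 171, 179: 178, 181: 180, 213: 212}

_TERM = r"[A-Z][a-z]?(?:\d+(?:\.\d+)?)?"


def canonical_space_group(number: int) -> int:
    """Return the legal SG value, converting a forbidden number if needed."""
    if not 1 <= number <= 230:
        raise ValueError(f"space group number {number} is outside 1..230")
    return ENANTIOMORPH.get(number, number)


def parse_formula(formula: str) -> Dict[str, float]:
    """Parse a flat, case-sensitive sum formula (assumption: no brackets)."""
    if not re.fullmatch(f"(?:{_TERM})+", formula):
        raise ValueError(f"cannot parse formula {formula!r}")
    counts: Dict[str, float] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[element] = counts.get(element, 0.0) + (float(number) if number else 1.0)
    return counts


def normalize(counts: Dict[str, float]) -> Dict[str, float]:
    """Scale the multipliers so that the largest one becomes 1.00."""
    largest = max(counts.values())
    return {element: n / largest for element, n in counts.items()}


def same_composition(a: str, b: str, tol: float = 0.02) -> bool:
    """Compare two formulae after normalization; tol is a placeholder value.

    An element missing from one formula counts as zero, so trace elements
    below the tolerance do not prevent a match.
    """
    na, nb = normalize(parse_formula(a)), normalize(parse_formula(b))
    return all(abs(na.get(el, 0.0) - nb.get(el, 0.0)) <= tol
               for el in set(na) | set(nb))
```

For example, canonical_space_group(78) yields 76, and same_composition("TiO2", "Ti2O4") is true because both normalize to the same relative multipliers, whereas "TiO2" and "TiO" do not match.
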
WS: The Wyckoff sequence is an alphabetic list of the Wyckoff symbols (letters) of the occupied special positions, with each letter followed by the number of crystallographically distinct atoms that occupy the site if this number is different from 1. International Tables for Crystallography Vol. A lists the Wyckoff letters for all special positions, that is, all sites having a crystallographically distinct site symmetry. Before determining the Wyckoff sequence, the structure must be normalized according to the algorithm used in the program STRUCTURE TIDY, details of which are given in Parthe, E., Gelato, L.M. (1984). Acta Crystallogr. A40, 169-183, Parthe, E., Gelato, L.M. (1985). Acta Crystallogr. A41, 142-151, and Gelato, L.M., Parthe, E. (1987). J. Appl. Crystallogr. 20, 139-143. The allowed letters in this layer include all the lower case letters (as defined in the ASCII coding) and the character '&' representing the Greek letter alpha which appears in space group 47.

3.3 EXAMPLES

   Rutile   IChIversio-x/TiO2/PH:xtl/SG:136/WS:af2

******* We could use some additional examples, particularly of organic crystals ************

4. INCORPORATION OF THE PHASE IDENTIFIER INTO THE IUCr-CCN PHASE TRANSITION SYMBOL
----------------------------------------------------------------------------------

The IUCr-CCN Phase Transition Symbol [Acta Cryst. (2001). A57, 614-626 and Acta Cryst. (1998). A54, 1028-1033] is composed of six fields defined as follows:

   1. the common symbol used to identify this phase (e.g., alpha, II, etc.),
   2. the temperature (and pressure) range in which it is stable,
   3. the Hermann-Mauguin symbol and number of the space group (more than one space group may be given, or the Bravais symbol may be given if the space group is not known),
   4. Z, the number of formula units in the conventional unit cell (though the formula unit is not defined within the symbol),
   5. the ferroic properties and
   6. the structure type.

The formats of the fields in this symbol are not tightly structured and may contain non-ASCII characters as the symbol was not intended for computer use. Given the different purposes and structure of the IUCr-CCN Phase Transition Symbol and IChI, it is arguable whether any purpose is served by incorporating the IChI Phase Identifier into the IUCr-CCN Phase Transition Symbol. However, the complete IChI Phase Identifier symbol could be included as one or more additional fields. Because the IUCr-CCN Phase Transition Symbol and the canonical form of the IChI Phase Identifier both use slashes as field separators, the IChI Phase Identifier must either be incorporated as a series of different fields, or the slash separator in the IChI Phase Identifier must be converted to some other symbol.

******** Members of the present working group who also served on the CCN Phase Transition Symbol Working Group are invited to suggest the best way in which this could be done in the spirit of the original symbol. Otherwise we can leave the matter unresolved in our final report ********

5. THE USE OF PHASE IDENTIFIER IN DATABASES
-------------------------------------------

Since the IChI Phase Identifier is parsable, each of the layers can be reformatted in any way that suits the needs of a particular database. Most crystallographic databases will already have fields containing the sum formula and the space group number, and adding a field for the Wyckoff sequence should present no difficulty. The 'state of matter' field, PH, would not need to be present since it must have the value 'xtl' if the phase is in a crystallographic database.

******** Is this true for the Protein Data Bank? ********

Software designed to search the database for examples of a target phase would need to extract from the database the information identified in this and other IChI documents.
Even if the IChIs are given in their canonical form, they must still be parsed and compared layer by layer, since two different identifiers may not contain the same number of layers, or the search may not be carried out at its full depth if, for example, chirality or isotopic content were not important. All the proposed fields can be searched by looking for identical bit sequences, except for the SG field which should be screened for illegal numbers, and the composition field in cases where non-integral multipliers are given. In the latter case, the composition must be normalized as discussed above and compared with a predetermined tolerance.

-----end of file-----