further proposal
- To: Multiple recipients of list <phase-identifiers@iucr.org>
- Subject: further proposal
- From: "I. David Brown" <idbrown@mcmail.cis.mcmaster.ca>
- Date: Mon, 2 Dec 2002 20:40:19 GMT
Dear Colleagues,

Thank you, Pierre, for your description of the phase identifier used in the PAULING project. You have obviously had some experience with the problem and we would do well to build on that experience. In this email I comment on your suggestions, make a new proposal and test the proposal on the phases in the Pb-Sb-S system.

You outline 5 fields in your symbol which are similar to the six fields proposed in our earlier email. In the following text I include your descriptions as indented paragraphs.

1.
    We introduced and used during the last 7 years the following 'Phase Identifier', covering non-organic ordered single phase materials:
    i) Chemical System (alphabetically sorted), e.g. Al-Ge-Pb

This is a form of giving the composition, but without specifying the relative abundances of the different elements. One way in which this could be made more specific would be to order the elements in decreasing order of their frequency in the chemical formula. Where two elements have the same frequency the ordering would be alphabetic. Here are some examples:

Formula               Abundance     Alphabetically
                      ordered       ordered
---------------------------------------------------------------
Na2SO4                O-Na-S        Na-O-S
Na2SO4 doped with K   O-Na-S-K      K-Na-O-S
VO                    O-V           O-V
V2O3, VO2, V2O5       V-O           O-V
Mg3Al2Si3O12          O-Mg-Si-Al    Al-Mg-O-Si
Mg3Al2(SiO4)3         O-Mg-Si-Al    Al-Mg-O-Si
POCl3                 Cl-O-P        Cl-O-P

Giving them in abundance order would increase the information content and would allow for searches for close matches as well as exact matches. Possible close matches would include:

a) The same elements present in any order;
b) ignoring minor constituents by matching only the elements in the shorter of the two strings. In the list above O-Na-S-K could be considered a match with O-Na-S, which would be difficult to do using alphabetic ordering.
c) interchanging two adjacent elements, useful when a given phase has a range of compositions.

2.
    ii) Structure Type (using the standardization program STIDY and the concept used in Gmelin's TYPIX HBs), e.g. CsCl type, Al4Ba type

This could be seen as an extension of item b in the earlier proposal, giving the phase type indicating liquid, amorphous etc, but 'xtl' (crystal) could be replaced with one of the standard structure types where applicable. An enumeration list of allowed values for this field would be needed. Clearly not all structures can be classified into a particular structure type and perhaps a value, say 'oxtl' (for other crystal type) would be needed for a crystal that cannot be assigned into one of the enumerated structure types. 'xtl' would continue to apply to any crystalline compound and would match any of the structure types as well as 'oxtl'.

3.
    iii) Pearson Symbol (using Pearson's definition, but replacing A, B, C by S (side-centered))
    iv) Space group number

The earlier proposal used the crystal system, the space group number and the atom count in the unit cell as separate items equivalent to the Pearson symbol. The lattice centring is redundant if the space group number is given. I omitted this item because of the danger that it can be misassigned. For example the lattice centring symbol of space group 2 is P even if the author describes the space group as I-1, but someone not fully familiar with space group theory might be tempted to assign a value of I. The danger also arises if the identifier is assigned by an unsophisticated computer program. You have chosen to use S for single side-centred cells (and this is logical) but we should follow the current International Tables convention which is to use E.

I agree that listing the number of occupied sites, i.e., treating them all as if their occupancy is 1.00, is better than using the actual cell contents as it should always be an unambiguous integer.
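
Returning to the abundance ordering suggested under item 1, the ordering rule and the three kinds of close match (a, b, c) might be sketched in Python roughly as follows. This is only an illustration: the function names are invented for the sketch, and it assumes simple formulae with integer counts and no brackets.

```python
import re
from collections import Counter


def element_counts(formula):
    """Count elements in a flat formula such as 'Na2SO4'.

    Assumption for this sketch: no brackets and only integer counts.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def chemical_system(formula):
    """Elements in decreasing abundance; ties are broken alphabetically."""
    counts = element_counts(formula)
    return "-".join(sorted(counts, key=lambda el: (-counts[el], el)))


def same_elements(a, b):
    """Close match (a): the same elements present in any order."""
    return sorted(a.split("-")) == sorted(b.split("-"))


def match_ignoring_minor(a, b):
    """Close match (b): compare only as many elements as the shorter string has."""
    ea, eb = a.split("-"), b.split("-")
    n = min(len(ea), len(eb))
    return ea[:n] == eb[:n]


def match_with_adjacent_swap(a, b):
    """Close match (c): identical, or differing by one swap of adjacent elements."""
    ea, eb = a.split("-"), b.split("-")
    if len(ea) != len(eb):
        return False
    diff = [i for i in range(len(ea)) if ea[i] != eb[i]]
    if not diff:
        return True
    return (len(diff) == 2 and diff[1] == diff[0] + 1
            and ea[diff[0]] == eb[diff[1]] and ea[diff[1]] == eb[diff[0]])
```

With these definitions chemical_system("Na2SO4") gives "O-Na-S" and chemical_system("POCl3") gives "Cl-O-P", and match_ignoring_minor("O-Na-S-K", "O-Na-S") is true, which is the doped Na2SO4 case mentioned under (b).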

When calculating the number of occupied sites we will need to decide whether to adopt the hexagonal or rhombohedral setting for rhombohedral space groups. Good arguments can be given for either choice. My suggestion is to use the hexagonal cell as this corresponds to a choice of an R centred cell.

4.
    v) Formula + modification as unique name within a chemical system (to make it computer friendly we gave each combination within a chemical system a number)

In the earlier draft proposal the unique phase identifier of the kind described above was described as an 'external' identifier because it has to be assigned by an external agency, while internal identifiers are those that can be assigned from properties internal to the material. The choice of the names 'external' and 'internal' was, perhaps, unfortunate as you might regard an internal identifier as one used internally within a data center. The difficulty in using such an arbitrary identifier is finding an agency willing to maintain a registry of such numbers in the public domain. In the first instance we were interested in seeing if we could manage without such an arbitrary identifier. Otherwise we would have to find an agency (or agencies) willing to maintain the phase identifier list.

5.
    The most ambiguous item is v), but it was necessary to introduce it. There we use a standardized way to sort the chemical elements, and the most often occurring groups [SO4], ....

I agree. Items i to iv will give many false matches because they are not sufficient to identify a phase uniquely. Special symbols for complex ions such as SO4 must be used with care. A complex such as S2O7 might be described as O(SO3)2 where S2O7 and SO3 might both be named complexes. My preference would be to avoid any structural chemical interpretation as these depend so much on the approach of the person making the description (see the two alternative formulae for garnet given in the table above).
Such a scheme can be made to work within a data center where there is someone to enforce conformity in ambiguous cases, but our identifier must follow rules that permit of only one possible construction. Perhaps something like the sum formula could be used. My earlier proposal was to use a reduced sum formula, recognizing that the formula does not change by multiplying all the abundances by the same factor, e.g., HgCl and Hg2Cl2. The solution might be to allow any sum formula to be used but require the searching algorithm to match only the relative abundances with some user-determined tolerance allowed (see the example below). This was the intent of the earlier complex set of rules for writing the chemical formula.

Writing the formula for organic crystals presents its own problems. Would someone from CCDC or elsewhere with experience in organic structures like to comment?

6.
    p.s. I have attached a file with the chemical element (functional group) sorted used to formulate the unique formula (please install first the MPDS font to see the numbers as subscripts)

This file ended up appended to the email distributed from the list-server rather than attached. It is best to send items to the list-server as text included in the main message, or arrange some other means of distributing non-ASCII files (e.g. from a web site). Perhaps we should not worry about these details until we have the broader framework defined.

7.
    For cases where the information i)-iv) is not known, we replaced it by a '*'.

This agrees with the earlier proposal that not all items need to be included.

8.
    As a help for the PAULING FILE editors we have created an 'internal' 'DISTINCT PHASES TABLE', which contains the following information: i)-v), and additional info like: Dm, Tm, color, common name, info about T/p-stability, info about chemical property, ..... I agree with you that the additional info are not good as 'Phase Identifier', but they helped us many times to add to an e.g. physical property entry the 'Phase Identifier' (as very often the structure type is not mentioned, but they write e.g. a green cubic phase,....). With this described approach we were able to give to all entries a 'Phase Identifier'. At present the PAULING FILE contains already about 100,000 structure/diffraction entries, about 65,000 physical property entries and about 20,000 phase diagrams, so our experience is already based on many practical examples.

9.
    e) CAS number: Is very bad, this has nothing to do with a phase. In PAULING FILE we have in average 15 publications dealing with the same phase.

Agreed that the CAS number is tricky at the best of times and I don't favour this as a primary key, but some people might use it and it could be helpful for organic compounds where it is already widely used and, at least for a pure molecular compound, is unambiguous. For other cases it can be ambiguous. The CAS number of CuSO4 might be used to refer to the solid CuSO4.5(H2O) which (I assume) has its own CAS number. Thus a correct match might be discarded because the two keys used different CAS numbers. In any case we have no control over how they are assigned and used or even whether these numbers would be available in the public domain.

10.
    As unique document name I would recommend: CODEN, year, volume, (part), first page, last page (part only given if no volume given).

I agree, but it is not particularly relevant to the assignment of a phase identifier unless we intend to include a bibliography as part of the key. I do not see any value in including the bibliographic reference.

REVISED PROPOSAL:

Combining Pierre's suggestions with our earlier proposal, I recommend the following refined list of items for the identifier. These are ordered with the most important elements first, though the example given below suggests this may not be the best order.

1: Chemical system: gives the elements present ordered in decreasing order of abundance.
   Alphabetical ordering is used for elements with the same abundance.
2: Phase type, including the structure type if known.
3: Space group number (conventions needed for enantiomorphic pairs e.g. P41 and P43).
4: Crystal system (redundant if the space group is known but useful if it is not and therefore it should always be given even when the space group is known).
5: Lattice centring in the standard setting (like the crystal system this is redundant if the space group is known).
6: Number of occupied sites in the conventional unit cell defined by 5. This is an integer and is not necessarily the same as the number of atoms in the unit cell. It differs from the definition in the Pearson symbol. (What should we choose as the conventional cell of a rhombohedral crystal?)
   Items 4, 5 and 6 can be concatenated into a Pearson-like symbol.
7: Chemical sum formula (starting with C and H and then alphabetically ordered. Only the relative element abundances are used).
8: Mineral name
9: Colour (useful if known for otherwise poorly characterized materials. Not a primary key as some materials come in a variety of colours.)

EXAMPLE

I have chosen the PbS - Sb2S3 system as an example. In this case the phase depends on composition rather than temperature or pressure. It contains a variety of different phases, but some of the more ephemeral intergrowth phases are not included in the list, and some of those listed may only be stable in the presence of impurities which I have not noted.

Note that the space group is not as good an indicator as one might expect because it is easily misassigned and there is no easy way of spotting closely related space groups, e.g., Pbn21 (33) and Pbnm (62), though group-subgroup relations could be built into a sophisticated search algorithm.
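
Item 7 of the revised list compares only relative element abundances, so that HgCl and Hg2Cl2 count as the same formula. A minimal sketch of such a comparison is given here for illustration. The function names, the representation of a formula as a dictionary of element counts, and the default tolerance of 0.01 are all assumptions of the sketch, not part of the proposal.

```python
def relative_abundances(formula):
    """Convert element counts to fractions of the total atom count.

    `formula` is a dict such as {"Hg": 2, "Cl": 2}; this representation
    is an assumption of the sketch.
    """
    total = sum(formula.values())
    return {element: count / total for element, count in formula.items()}


def formulas_match(a, b, tolerance=0.01):
    """True if both formulae contain the same elements and every atomic
    fraction agrees to within `tolerance` (a user-chosen value)."""
    if set(a) != set(b):
        return False
    ra, rb = relative_abundances(a), relative_abundances(b)
    return all(abs(ra[el] - rb[el]) <= tolerance for el in ra)
```

Here formulas_match({"Hg": 1, "Cl": 1}, {"Hg": 2, "Cl": 2}) is true because both reduce to equal atomic fractions, while HgCl against HgCl2 fails at any small tolerance. A larger tolerance would let a searcher accept phases reported with slightly different compositions.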

Of course there is a difficulty in allowing a match between closely related space groups since many phase transitions involve the loss of a single symmetry element, so the distinction between 33 and 62 may be highly significant. The chemical formula comes out looking quite good as an identifier in this example. For convenience I have given the ratio of Sb to total cation (as calculated from the formula) on the right. The examples are mostly taken from the ICSD (but see also Acta Cryst. (1994) B50, 524-538 and references there).

To fit each ID onto a single text line I have concatenated the components above, separating them with =.

 #  Proposed ID                                           Sb/(Sb+Pb)
---------------------------------------------------------------------
 1  S-Pb-Sb=liq
 2  S-Sb-Pb=liq
 3  Pb-S=NaCl=225=cF8=Pb-S=Galena                         0.00
 4  S-Pb-Sb=oxtl=19=oP96=Pb7-S13-Sb4=*                    0.36
 5  S-Pb-Sb=oxtl=62=oP*=Pb3-S6-Sb2=*                      0.40
 6  S-Pb-Sb=oxtl=62=oP80=Pb5-S11-Sb4=Boulangerite         0.44
 7  S-Pb-Sb=oxtl=14=mP160=Pb5-S11-Sb4=Boulangerite        0.44
 8  S-Pb-Sb=oxtl=62=oP84=Pb4.82-S11-Sb4.11=Boulangerite   0.46
 9  S-Pb-Sb=oxtl=62=oP96=Pb9-S22-Sb9=Boulangerite         0.50
10  S-Pb-Sb=oxtl=15=mC152=Pb9-S21-Sb8=Semseyite           0.47
11  S-Pb-Sb=oxtl=62=oP36=Pb2-S5-Sb2=*                     0.50
12  S-Pb-Sb=oxtl=55=oP38=Pb4-S11-Sb4=*                    0.50
13  S-Sb-Pb=oxtl=15=mC136=Pb7-S19-Sb8=Heteromorphite      0.53
14  S-Sb-Pb=oxtl=2=aP50=Pb5-S14-Sb6=*                     0.55
15  S-Sb-Pb=oxtl=12=mP46=Pb4-S13-Sb6=Robinsonite          0.60
16  S-Sb-Pb=oxtl=1=aP46=Pb4-S13-Sb6=Robinsonite           0.60
17  S-Sb-Pb=oxtl=15=mC120=Pb5-S17-Sb8=Plagionite          0.62
18  S-Sb-Pb=oxtl=173=hP76=Pb18-S81-Sb42=Zinkenite         0.70
19  S-Sb-Pb=oxtl=173=hP72=Pb1.6-S7-Sb3.4=Zinkenite        0.68
20  S-Sb-Pb=oxtl=15=mC104=Pb3-S15-Sb8=Fueloeppite         0.73
21  S-Sb=Sb2S3=62=oP20=S3-Sb2=Stibnite                    1.0
22  S-Sb=Sb2S3=15=mC104=S15-Sb9.8=Stibnite                1.0
23  S-Sb=Sb2S3=47=oP40=S3-Sb2=Stibnite                    1.0
24  S-Sb=Sb2S3=31=oP20=S3-Sb2=Stibnite                    1.0

NOTES. (The number following the phase number in the following list is the number of different structure determinations reported in the ICSD).

 #  Comment
---------------------------------------------------------------
 1  In the liquid phases the symbol indicates which metal predominates
 2
 3  (11)
 4  (1)
 5  This phase is not well characterized
 6  (2) Note the different space groups and compositions reported for Boulangerite
 7  (1)
 8  (1)
 9  (1)
10  (1)
11  (2)
12  (1) This composition is not electroneutral
13  (1)
14  (1) Pearson symbol given as aI100 in ICSD
15  (1) Pearson symbol given as mI92 in ICSD
16  (1) Wrong assignment of space group in this structure determination
17  (1)
18  (1) Pearson symbol given as hP71 in ICSD
19  (1) Zinkenite assigned different compositions and site occupancies
20  (4)
21  (5)
22  (1) Pearson symbol given as mC99 in ICSD may correspond to actual cell contents
23  (1) Probably incorrect space group
24  (1) Probably incorrect space group

CONCLUSIONS

1. Although the same space groups appear in different phases, no two phases with the same space group have the same number of occupied atomic sites.

2. The chemical formula is a better distinguisher of phases than the space group because of the number of wrong space group assignments. Errors in the crystal system and lattice type go along with the space group error.

3. If the formula is known, the chemical system is redundant and we need not include it. If the chemical system is known but the formula is not, the abundances in the formula could be replaced by *. Do we need the chemical system?

Your comments and suggestions are welcome.

Best wishes

David

*****************************************************
Dr.I.David Brown, Professor Emeritus
Brockhouse Institute for Materials Research,
McMaster University, Hamilton, Ontario, Canada
Tel: 1-(905)-525-9140 ext 24710
Fax: 1-(905)-521-2773
idbrown@mcmaster.ca
*****************************************************
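
For anyone wishing to experiment with the IDs in the EXAMPLE table, the following Python sketch splits an ID at the = separators and recomputes the Sb/(Sb+Pb) ratio from the formula field. The field names and function names are invented for the sketch; the field order is taken from the table (chemical system, phase type, space group, Pearson-like symbol, formula, mineral name), and short IDs such as the liquid entries simply yield fewer fields.

```python
import re

# Hypothetical field names, in the order used in the EXAMPLE table.
FIELD_NAMES = ["system", "phase_type", "space_group",
               "pearson", "formula", "mineral"]


def parse_id(identifier):
    """Split an ID on '=' and label the fields that are present."""
    return dict(zip(FIELD_NAMES, identifier.split("=")))


def parse_formula(field):
    """Turn 'Pb4.82-S11-Sb4.11' into {'Pb': 4.82, 'S': 11.0, 'Sb': 4.11}.

    A missing count is read as 1, so 'Pb-S' gives one Pb and one S.
    """
    counts = {}
    for term in field.split("-"):
        match = re.fullmatch(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", term)
        if match is None:
            raise ValueError(f"cannot read formula term {term!r}")
        counts[match.group(1)] = float(match.group(2)) if match.group(2) else 1.0
    return counts


def sb_fraction(counts):
    """Ratio of Sb to total cation, Sb/(Sb+Pb), as in the table's last column."""
    sb = counts.get("Sb", 0.0)
    pb = counts.get("Pb", 0.0)
    return sb / (sb + pb)


entry = parse_id("S-Pb-Sb=oxtl=62=oP80=Pb5-S11-Sb4=Boulangerite")
ratio = sb_fraction(parse_formula(entry["formula"]))
print(entry["mineral"], round(ratio, 2))  # Boulangerite 0.44
```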