CoreCIFchem Discussion #6
- To: coreCIFchem@iucr.org
- Subject: CoreCIFchem Discussion #6
- From: David Brown <idbrown@mcmaster.ca>
- Date: Mon, 04 Oct 2004 14:51:16 -0500
Dear Colleagues,

I have attached a text file containing the latest discussion paper (#6) on the creation of a chemical description of a structure in CIF. I have taken note of Howard's suggestion and produced what should be a good version for fine tuning. There is only one outstanding problem - the description of the geometry for infinitely connected structures, but this can probably be sorted out without too much difficulty. Otherwise the present draft presents a simpler and more flexible format than the earlier drafts.

The attached file runs to 35 pages, for which I apologize, but there are unfortunately no shortcuts.

I would like to work on the next version (or start preparing dictionary copy) after Dec 31, so I would like to have all your comments by then. If this does not give you enough time, let me know. The deadline can be changed to suit your timetables.

Best wishes

David

Prof. David Brown
Brockhouse Institute for Materials Research
McMaster University, Hamilton, Ontario, Canada L8S 4M1
Fax 905 521 2773
DISCUSSION PAPER #6

For the sake of keeping the project moving I am setting DECEMBER 31, 2004 as the deadline for responses, but this deadline can be extended if you need more time.

PLEASE SEND YOUR COMMENTS TO coreCIFchem@iucr.org

David

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CONTENTS OF THIS DISCUSSION PAPER #6
-----------------------------------
1. COMMENTS RECEIVED ON DISCUSSION PAPER #5
2. PREAMBLE TO THE REPORT TO THE CORE DICTIONARY MAINTENANCE GROUP
3. SAMPLE CIFS
   3.1 CIF FOR TNT
   3.2 CIF FOR CaCrF5
4. COMPARISON OF THE ABOVE PROPOSAL WITH mmCIF
5. SAMPLE CIFS WITH COMMENTS REMOVED

PLEASE SEND YOUR COMMENTS TO coreCIFchem@iucr.org before DECEMBER 31, 2004
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

COMMENTS RECEIVED ON DISCUSSION PAPER #5

Each of these comments is followed by my response. Other more specialized comments are placed within the sample CIFs.

Comments by Howard D. Flack
-----------------------
I've only really considered David's comments and the TNT example. I did not work in detail at the CaCrF5 example. I've made a new skeleton CIF of the TNT example which to me is simpler and easier to read. I guess it breaks many CIF rules of syntax.

IDB Response
------------
I have used Howard's skeleton as the basis for the present proposal which is more CIF conformant than Howard's draft and, I hope, simpler. I have reworked the CaCrF5 CIF using the same scheme.

HDF on tecton v molecular unit
--------------------------------------
I've used the word 'tecton' to mean a general building block instead of molecular_unit. I heard it used in a talk by Guy Orpen but Guy has written to me to say he did not invent it. He has sent me a few references which I have not yet had time to read.

IDB response
-----------
I have adopted this terminology to refer to a collection of bonded atoms whose topology we describe. These may not conform to the concept of a tecton used elsewhere, so I hope this does not cause confusion.

HDF on unique atom identifiers
------------------------------
One of the aspects of David's implementation as seen in the TNT example which troubles me, is the necessity for each atom in each molecule to have a unique identifier as coded in '_molecular_unit_atom_mu_id'. As far as I remember there are already several million molecules that are known and giving every atom in each molecule a unique identifier is cumbersome to say the least. I definitely think that we will see molecular libraries come into existence either locally or globally. New molecules can be added to a library by editing, cutting and pasting in bits from other molecules so again atom identifiers containing the molecular name are very heavy in use.

One perverse aspect of using unique atom identifiers over a set of molecules is that it does not per se ensure molecular integrity. In defining the bond topology it is quite possible to do the stupid thing of defining a bond between two atoms which are not in the same unit.

Maybe this is yet another thing that I have not really understood about the CIF syntax. David is suggesting the construction of a unique reference item by the concatenation of two others. Why not just use the initial pair of reference items together as a unique pointer in its own right? This is what I have done in my 'improved' CIF.

IDB response
------------
The atom name only needs to be unique within the CIF so most of Howard's concerns are not a problem. However, we should distinguish between the atom name that is used daily by the chemist and the 'list-reference' required by CIF. Every list (i.e., loop) in a CIF must have a list-reference which is a string that serves as a unique address for a particular line in the list. In the _atom_site list, for example, the list reference is the _atom_site_label which also serves as the common name of the atom used by the crystallographer, e.g., C1, N21.

There are two views that one can take.
The first says that since the crystallographer is not going to assign the same label to two different atoms, the loop can be kept simple by also using the chemical name (_*_atom_site_label) as the CIF list-reference. The second view argues that it is best to keep the list-reference, which is required for CIF file management, separate from items that convey chemical or crystallographic information. The underlying philosophy of CIF is to separate the syntax (grammar) of the file from the semantics (the information it contains) so that a program can manipulate the file without knowing anything about crystallography. This philosophy favours the second view. The implications of these two views are brought out clearly in the examples we are discussing. If there is more than one tecton in the list of atoms that define the topologies, the same chemical name may be used more than once. For example, in the TNT case C1 appears in the description of the topology of the TNT molecule as well as in the description of the topology of the benzene ring. In order to provide a unique address for each line, the list-reference must distinguish between these, i.e., it must also include some identification of the tecton. This means that two items are required to give a unique address which leads to unnecessary complications in programming. In effect the syntax (the number of items required for the list reference) is dependent on the type of information in the loop. The extreme example of this approach is seen in the torsion (dihedral) angles loop where the list-reference would have to involve five separate items, the tecton_id and the labels of the four atoms that define the angle! This requires that programs that make use of the relational structure of CIF must be able to handle list-references that consist of many items. 
If, on the other hand, we assign unique list-references that carry no chemical information, such as assigning sequential numbers to the different lines (as I did in the previous discussion paper), each dihedral angle is identified by a number, and the four atoms that define the dihedral angle are each identified by the number that represents their address (their id) in the atom list. While this makes programming easier since each list-reference consists of a single item, it makes the CIF more difficult to read since the reader must check back to the atoms list to find out which atom wears the number 35 (say).

The list reference does not have to be a number - it can be any string provided that it is unique within the list, so one way to reconcile these two approaches is to construct a composite list-reference from chemically meaningful terms, e.g., for the atom list one might choose A.C1, A.C2, B.C1, etc. where A and B distinguish the different tectons. These tecton_atom_ids would be parents to items that appear in other lists where they are used to identify the atoms that form bonds, angles etc. CIF treats the list-reference strings as unparsable and assumes they contain no chemical information; the author of a CIF can use any desired string. However, to help us discuss these sample CIFs I have used composite list-references in the places where they will make our life easier. Elsewhere I have used sequential numbers or letters.

In summary, the list-reference is always a string that has no semantic content but is used solely for file management (e.g., locating particular lines in a list). Such strings are never parsed by the computer. The chemical information always resides in other items on the line.

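The difference between the two views can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the variable names and the 'A.C1' style of composite string are made up for the example and are not part of any dictionary. It shows that a torsion angle needs five items to address its atoms when the pair (tecton, label) is the list-reference, but only four single-item pointers when each atom has one opaque string.

```python
# Hypothetical rows of an atom list: (tecton_id, atom_label, type_symbol).
rows = [
    ("A", "C1", "C"),
    ("A", "C2", "C"),
    ("A", "N2", "N"),
    ("A", "O21", "O"),
    ("B", "C1", "C"),  # the label C1 is reused in a second tecton
]

# View 1: the pair (tecton_id, label) is itself the list-reference.
by_pair = {(tecton, label): symbol for tecton, label, symbol in rows}

# View 2: one opaque string per row, built here as 'A.C1' purely for
# human readability; a program looks it up whole and never parses it.
by_string = {f"{tecton}.{label}": symbol for tecton, label, symbol in rows}

# Under view 1 a torsion angle is addressed by a tecton id plus four labels.
torsion_v1 = ("A", "C1", "C2", "N2", "O21")
symbols_v1 = [by_pair[(torsion_v1[0], label)] for label in torsion_v1[1:]]

# Under view 2 it holds four single-item pointers into the atom list.
torsion_v2 = ("A.C1", "A.C2", "A.N2", "A.O21")
symbols_v2 = [by_string[ref] for ref in torsion_v2]
```

Both lookups recover the same atoms; the second simply keeps every pointer to a single item, which is the programming convenience argued for above.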
HDF proposes that all bonds and angles be defined in terms of atoms
-------------------------------------------------------------------
In defining the geometry of a tecton David uses two atoms to define a geometric bond, two topological bonds to define an angle and worries what should be the correct way of doing a dihedral angle either by way of atoms or bonds. I maintain that the only correct way to define the geometry is in all three cases to use a set of atoms: 2 for distances, 3 for angles and 4 for dihedral angles. The reason is as follows: the geometry section allows interatomic distances to be specified but nothing requires that the two atoms concerned form a bond as defined by the topology; similarly the three atoms used to specify an angle may or may not be forming bonds as specified by the topology; etc. One fairly frequently specifies angles by specifying interatomic distances between atoms which are not bonded as defined by the topology.

IDB response
------------
I agree. This simplifies the CIF since (almost) all bonds, distances, angles and mappings use the atom_ids that are defined in the tecton_topology_atom loop.

HDF on avoiding unnecessary repetition
--------------------------------------
Another aspect which seemed rather heavy in David's TNT example was the repetition of certain information. As I see it there are four conformers (aa, bb, ab, ba in David's nomenclature) all of which correspond to the same molecule, meaning to the same molecular topology. To improve this state of affairs it seems natural to input a unit defining only its topological features - the TNT molecule - and then reuse this unit several times adding in either the minimum or complete geometric information necessary to distinguish the 4 conformers. So I ended up with four units of identical topology but differing geometry and one unit defining only the topology. This makes the relationship between the conformers and the parent molecule easy to perceive in the file.
I felt that it was essential to be able to define all four conformers. Although I well understand that often one can not make an unequivocal assignment of molecules or conformers in the case of a disordered crystal structure, it is certainly also the case at present that models of disorder in molecular crystal-structure determinations are being used which have no possible interpretation in terms of the (assumed) constituent molecules. IDB response ------------ This problem has been addressed in the present draft. My solution adds an extra layer but it is simpler than Howard's. The topology and geometry are kept separate and the tecton is defined at the topology level while the conformers are only introduced at the geometry level. Each of the distances and angles are given only once and each is flagged so that it can be identified with one or more of the conformers. In this way all four conformers are defined without the need to repeat any of the distances or angles. Howard mentions crystallographic models of disorder that have no atomic description. A method is currently being developed by the CIF Core Dictionary Maintenance Group whereby the number of electrons in a diffuse patch of electron density will be indicated by including dummy atoms in the _atom_site loop, representing atoms presumed to be present in the crystal. A direct mapping from the topology to the real and dummy atoms in the atom_site loop should be possible even in these cases even though the positions of the dummy atoms are not defined. HDF comment on compatibility with INChI --------------------------------------- I would like to have some reassurance that the molecular data structures we are trying to define in CIF are as compatible as possible with those used in the IUPAC project for producing unique chemical identifiers (I've forgotten its name yet again). IDB response ------------ The name is now INChI, the IUPAC-NIST CHemical Identifier. 
I don't think there is a problem at the topology level, but I don't know if there are problems in dealing with conformers. I will have to look into this. HDF comment on dangling bonds ----------------------------- In the chemical sub-groups like the 1,2,4,6 benzene and nitro groups, it makes sense to me to include the dangling bonds. I'm also very much in favour of including ALL the atoms especially the hydrogen atoms. IDB response ------------ Fine. I have retained this feature of Howard's proposal, but see the note below about the difficulties of mapping dangling bonds. HDF comment on mapping ---------------------- I was disturbed by David's use of the word 'map'. In mathematics it has a very precise meaning [If you map set A on to set B then you have to assign one single element of B to every element in A. This means every element in A has to have a unique son in B although several different elements in A can lead to the same element in B. Also whereas every element in A must have a son in B, not every element in B has to be the son of an element in A.] Especially in the relation between the tectons and the crystal structure, these criteria were not being obeyed. IDB response ------------ While it would be good to adhere to the mathematical practice, I wonder how well it would be observed by crystallographers most of whom are unaware of the mathematical rules, particularly as the kind of 'mapping' we do in this file does not, in general, follow the mathematical rules. Perhaps we should use a different word, but I have not been able to think of a good substitute. I have continued to use the word 'map' in the present draft in the absence of a suitable alternative even though the mathematical rules of mapping are not followed. 
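Howard's definition of a map can be made concrete with a short Python check. This is a minimal sketch, not part of the proposal: the function name and the atom and site labels are hypothetical. It tests whether a list of (tecton atom, crystal atom) pairs assigns exactly one crystal atom to every tecton atom, which is the condition that the identifications in this file do not, in general, satisfy.

```python
def is_mathematical_map(pairs, domain):
    """True if every element of `domain` has exactly one target in `pairs`."""
    targets = {}
    for source, target in pairs:
        targets.setdefault(source, set()).add(target)
    return all(len(targets.get(element, set())) == 1 for element in domain)


# Hypothetical tecton atoms and crystal site labels, for illustration only.
tecton_atoms = ["T.C1", "T.N2", "T.C7"]

# Ordered case: each tecton atom is identified with one crystal atom.
ordered = [("T.C1", "C1"), ("T.N2", "N2"), ("T.C7", "C7")]

# Disordered case: T.N2 is identified with two sites and T.C7 with none.
disordered = [("T.C1", "C1"), ("T.N2", "N2A"), ("T.N2", "N2B")]

print(is_mathematical_map(ordered, tecton_atoms))     # True
print(is_mathematical_map(disordered, tecton_atoms))  # False
```

Several tecton atoms pointing at the same crystal atom would still pass this check, in line with the definition quoted above; it is the one-to-many and unmapped cases that break it.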
HDF on Disorder in molecular crystals: -------------------------------------- As Greg makes clear, it may well be that there are several interpretations (mappings) which relate the topological definition of a molecule to the atomic coordinates determined from crystal-structure analysis. We must be sure that we provide a mechanism to encode these alternate molecular interpretations and associated geometry. It seems to me that the needs for journal publishing (i.e. checking) disordered structures and the way that they are subsequently entered into a database are somewhat different. For the publishing/checking side of the business one needs to provide a mechanism to evaluate the structural sense of the molecules including the disordered part of the structure. I've seen too many papers where the disordered part of the structure makes absolutely no molecular sense at all. (One of my colleagues in inorganic chemistry recently received a paper to referee in which about 50% of the electron density was modelled through Ton Spek's BYPASS procedure with no attempt at any molecular interpretation of the disordered region. To our minds in that case the structure analysis could have been improved so we recommended reanalysis.) On the other hand I think that for the data bases, tentative interpretations of disordered regions have much less use and probably what is required is that although the topology of the complete molecule be defined, the mapping (atoms and bonds) between the topology and the crystal structure relate only to those parts of the molecule which are well ordered in the crystal structure. > since it does not require the author to specify how the disordered atoms > sites are combined in the individual molecules, I'm very suspicious of that. One must provide a mechanism that allows multiple mappings of the topological definition of the molecule onto the atoms seen in the crystal-structure analysis. 
Of course it's not for coreCIFchem to 'require' such information but I certainly see that it could be put to very good use for the purposes of checking a crystal-structure analysis. IDB response ------------ The present proposal has a remarkable flexibility. The topology can be mapped to the crystal structure with or without reference to the conformation, but if the conformers are specified, they can be combined in any desired way that matches the known or supposed molecular structure of the crystal. Similarly different isomers may be mapped to the crystal in any combination. (N.B. isomers differ at the topological level, conformers have the same topology but differ at the geometry level). HDF on the need for all the atoms in the bond graph to be in the crystal ------------------------------------------------------------------------ > It is not necessary that the molecular units (tectons) account for all > the atoms found in the crystal structure, nor that the crystal structure > contain all the atoms specified in the molecular units. I have no trouble with the first part of the sentence but the second part after 'nor' leaves me somewhat perplexed. I expected that all of the atoms specified in the molecular units would be in the crystal structure even if one could not see them clearly. Could you give examples of what you have in mind here. IDB response ------------ The inherent structure of the CIF does not require that every atom in the crystal map onto an atom in the tecton and vice versa. To require such a restriction seems unnecessary and difficult to enforce in any automatic way. Someone may wish to define a tecton for which no crystal structure has been reported, or for which only the unit cell is known. They may wish to define a monomer and a dimer as two tectons, where only the monomer appears in the crystal. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2. 
PREAMBLE TO THE REPORT TO THE CORE DICTIONARY MAINTENANCE GROUP ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The crystallographic information given in a CIF consists of the atomic coordinates in the asymmetric unit, the symmetry operations needed to generate all the atoms in the crystal, the lattice parameters and some interatomic distances. The present proposal provides a means of adding a chemical description of the structure in the form of a bond graph (2-D structure) that identifies how the atoms are arranged in molecules and complexes (tectons), and which interatomic distances correspond to chemical bonds. It also allows the ideal molecular geometry to be specified for comparison with the observed geometry in the crystal. The atoms in the bond graph are identified with the corresponding atoms reported in the crystal so that a database search on the topology of the molecule graph can retrieve its crystallographic structure, or alternatively, the distance between two atoms in the crystal can be identified with a chemical bond in a particular molecule. With this addition, the CIF may include a description of the bonding topology of one or more tectons, a tecton being defined as a group of atoms linked by bonds, usually representing a molecule, complex or functional group. There is no limit to the number of tectons that may be described in a given CIF and there is provision for mapping the atoms of one tecton onto the atoms of another, as well as identifying the atoms in a tecton with the atoms in the crystal. A chemical description of the contents of a crystal, distinct from the crystallographic description, serves a variety of uses. It describes the contents of the crystal in the language of chemistry rather than the language of crystallography. 
This allows the structure determinations of crystals containing particular molecules (tectons) to be located by searching on their bond topology, permitting the coordinates of the atoms that form the tecton to be retrieved. Further, the crystal structure determination does not itself identify which atoms are bonded. A topological description of the bonding network supplements the information given in the crystal structure report by identifying which interatomic distances correspond to chemical bonds. The chemical description can be used to identify the different molecules that compose a crystal, or the crystallographically distinct copies of the same molecule. Finally the proposed chemical description allows the ideal geometry and conformation of the tectons to be specified - information which can be used during the refinement of a crystal structure (e.g., in defining rigid groups) or for validating the experimental bond distances and angles.

The proposed chemical description of a tecton is given in the CIF in a number of groups of related categories or loops. The first group identifies the different tectons described and provides a description of their bond topologies, i.e., the list of atoms and the list of bonds that link these atoms. The second group gives the ideal geometries of the tectons and identifies the different possible conformers. The final group allows the tectons to be mapped onto each other and identifies the atoms and bonds in the tectons with the atoms and interatomic distances in the crystal.

The topologies of the tectons are described in the tecton_topology categories in the form of a list of atoms and the bonds between them.

    TECTON_TOPOLOGY        Lists the different tectons described
    TECTON_TOPOLOGY_ATOM   Lists the atoms in each tecton
    TECTON_TOPOLOGY_BOND   Lists the bonds between the atoms

The topological description does not include any information on the geometry of the tecton but it does distinguish between isomers.
The decision as to what constitutes a tecton is left to the author but a tecton would normally correspond to a molecule or, in the case of the infinitely bonded solids typically found in inorganic compounds, the tecton would normally be chosen as the formula unit, the smallest group of atoms that contains all the chemical elements in the same proportions as they are found in the crystal.

The conformation and geometry of the tectons are given in the tecton_conformer and tecton_geom categories, the former identifying the different conformers that may be present, the latter defining their geometry.

    TECTON_CONFORMER         Lists different conformers and their properties
    TECTON_CONFORMER_EQUIV   Defines the geometry labels of conformer
    TECTON_GEOM_ATOM         Gives coordinates of ideal geometry
    TECTON_GEOM_DIST         Gives ideal interatomic distances
    TECTON_GEOM_ANGLE        Gives ideal bond angles
    TECTON_GEOM_TORSION      Gives ideal torsion angles

The geometry may be given either by specifying atomic coordinates or by supplying bond lengths, angles and torsion angles.

The tectons are mapped onto each other in the category:

    MAP_TECTON_ATOM          Maps atoms of one tecton onto those of another

and the atoms of the tecton are identified with atoms in the crystal in the categories:

    MAP_TECTON2CRYSTAL_ATOM  Identifies tecton with crystal atom
    MAP_TECTON2CRYSTAL_BOND  Identifies tecton bond with crystal distance

The way in which CIF describes the tectons and their mappings is illustrated by two sample CIFs, one of a disordered organic molecule, the other of an infinitely connected inorganic solid.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3. SAMPLE CIFS
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

At this stage in the development of the tecton description we are not attempting to write dictionary definitions, but only create two sample CIFs to ensure that the file structure is organized in a way that can handle both organic and inorganic crystals in the simplest possible way.

The first CIF describes the structure of the molecule trinitrotoluene, TNT. It shows how a molecule with a finite bond graph is handled when the molecule lies on a Wyckoff special position and two of the nitro groups are disordered. By way of illustration, tectons corresponding to several subunits of the molecule are also defined and are mapped onto the molecule itself.

The second CIF describes the structure of CaCrF5 which has an infinite bond graph and a formula unit that spans more than one asymmetric unit.

[Editorial comment: Data names may be changed in the final report and dictionary definitions will eventually be needed. Suggestions for better names are welcome. Items marked as 'list-reference' are required for the management of the CIF's relational file structure and must be unique for each line in a list. The list-reference item in one loop is frequently parent to similarly named items in other loops. There is at least one serious unresolved problem (in the geom categories of the second CIF). Its solution is deferred to the next draft. The following sample CIFs contain extensive comments to explain how the CIF is to be interpreted. The same CIFs with the comments stripped out appear at the end of this file so that one can see more clearly what they look like.]

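The two rules stated in the editorial comment (a list-reference is unique within its loop, and a child item must match a list-reference in its parent loop) lend themselves to an automatic check. The Python sketch below is illustrative only: the helper names are hypothetical, each loop is assumed to have been read into a list of dictionaries keyed by data name, and the NITRO row is deliberately left out of the parent loop to show what an orphaned child looks like.

```python
def repeated_list_references(rows, key):
    """Return list-reference values that occur more than once in a loop."""
    seen, repeated = set(), []
    for row in rows:
        value = row[key]
        if value in seen:
            repeated.append(value)
        seen.add(value)
    return repeated


def orphaned_children(child_rows, child_key, parent_rows, parent_key):
    """Return child values that match no list-reference in the parent loop."""
    parents = {row[parent_key] for row in parent_rows}
    return [row[child_key] for row in child_rows if row[child_key] not in parents]


# Parent loop (NITRO omitted on purpose) and a child loop pointing into it.
tectons = [{"_tecton_topology_id": "TNT"}, {"_tecton_topology_id": "BNZ"}]
atoms = [
    {"_tecton_topology_atom_id": "T.C1", "_tecton_topology_atom_tecton_id": "TNT"},
    {"_tecton_topology_atom_id": "B.C1", "_tecton_topology_atom_tecton_id": "BNZ"},
    {"_tecton_topology_atom_id": "N.N1", "_tecton_topology_atom_tecton_id": "NITRO"},
]

print(repeated_list_references(atoms, "_tecton_topology_atom_id"))  # []
print(orphaned_children(atoms, "_tecton_topology_atom_tecton_id",
                        tectons, "_tecton_topology_id"))  # ['NITRO']
```

Neither check needs to know what an atom or a tecton is, which is the sense in which the list-reference serves file management rather than chemistry.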
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

3.1 FIRST SAMPLE CIF
--------------------

                 TRINITROTOLUENE

           O        CH3        O
           |         |         |
     O --- N2        C1        N6 --- O
             \     /    \     /
               C2          C6
               |            |
          H -- C3          C5 -- H
                 \        /
                     C4
                     |
                     N4
                    /  \
                   O    O

In the fictitious crystal structure I have invented for the purposes of this illustration, the molecule contains a crystallographic mirror plane that passes through the methyl group and the N4 nitro group and is perpendicular to the plane of the molecule. The N2 and N6 nitro groups are related by the mirror plane and are disordered with the two components each having occupation numbers of 0.5. Because of the disorder the crystallographic structure does not define the point group of an individual molecule. By choosing one combination of the disordered nitro groups the molecule would have Cs symmetry, but by choosing a different combination the individual molecules would have C1 symmetry. Either or both combinations may of course be present in the real crystal but x-ray diffraction cannot distinguish between them.

############# Beginning of first CIF #############
#
#
data_disordered_TNT
#
# The first set of loops defines the topology of the TNT molecule
# (tecton 1) and two subunits of the molecule (tectons 2 and
# 3). The subunit definitions likely would not often be used but are
# included here to show that it can be done.
#
# If a crystal contained molecules of more than one compound, or more than one
# isomer of a compound, each would be described by a separate tecton.
# If the crystal contained more than one copy of the same molecule in the
# asymmetric unit (Z'>1) the topology of the tecton would be given only once
# but it would be mapped onto all the crystallographically distinct copies.
#
# According to CIF practice, all the items in a loop belong to the same
# 'category'. The category name forms the first part of the datanames of all
# items in the loop.
#
# The list-reference items in each loop are unique to the line in which they
# appear and constitute an address that a program can use to establish
# relationships between the items described in the different loops. Each loop
# must have a list-reference.
#
# The part of the CIF describing the crystallographic structure is omitted in
# this first example, but its contents should be self-evident in the map
# loops.
#
############################################################
# DEFINING THE TECTONS
#
# The first loop lists the different tectons being defined
# together with their properties. We may wish to define other properties, such as
# net charge and formal charge carried by the tecton.
#
# I have made some fairly drastic changes to Howard's proposal here. Howard
# listed both the tectons and their conformations in the same loop and
# therefore included both topological and geometric information in the same
# loop as follows:
#
# loop_
# _tecton_id                        # List reference / short name
# _tecton_name                      # name e.g. full IUPAC name
# _tecton_formula
# _tecton_Zprime
# _tecton_geometric_class           # See below
# _tecton_graph_automorphism_group  # See below
# _tecton_chirality                 # time-averaged if no geometry given
# _tecton_type
# TNTaa 'aa 2,4,6 trinitrotoluene' 'C7 H5 N3 O6' ? Cs  ? achiral conformer
# TNTbb 'bb 2,4,6 trinitrotoluene' 'C7 H5 N3 O6' ? Cs  ? achiral conformer
# TNTab 'ab 2,4,6 trinitrotoluene' 'C7 H5 N3 O6' ? C1  ? *TNTba  conformer
# TNTba 'ba 2,4,6 trinitrotoluene' 'C7 H5 N3 O6' ? C1  ? *TNTab  conformer
# TNT   '2,4,6 trinitrotoluene'    'C7 H5 N3 O6' 1 C2h ? achiral molecule
# subnz '1,2,4,6 benzene ring'     'C6 H2'       1 C2h ? achiral moiety
# nitro 'nitro group'              'N O2'        3 C2h ? achiral group
#
# Need to find out about IUPAC rules for naming conformers
# *TNTba means chiral being the enantiomer of TNTba
#
# I have separated the isomer (tecton) from the conformer and treated each in
# separate categories.
# # HDF comments # ------------ # _tecton_graph_automorphism_group encodes the symmetry of the graph as a # group of permutations (of atoms). If I understand correctly there are no # standard symbols for these automorphism groups although it seems that in a # fair number of cases they are isomorphic to a point group in three # dimensions. So often one could use a Schoenflies symbol. [Even that is # equivocal - point groups Cs, Ci and C2 are isomorphic] The TNT graph should # be given the symbol C2v for its _tecton_graph_automorphism_group. [C2v is # isomorphic to D2 and C2h. Why is it I prefer C2v? Am I (are we) too geometry # oriented? # I think we (David?) need to call John Rutherford again.] What one does in # the case that the graph automorphism group is not isomorphic to a 3D point # group, I do not have the least idea apart from writing a ? . # # IDB response # ------------ # Including this item which expresses the symmetry of the graph of the tecton # (but not its geometry) is an interesting idea, but in the absence of a # recognized system of symbols it is difficult to know what should be given # here. For this reason I have not included this in the draft below. # # HDF comment # ----------- # >_molecular_unit_details # Note especially that I think that it should be possible to retrieve the # 'molecular' information (topology and geometry) from a data bank / data # base. Each 'molecule' should stand in its own right. So the sort of comment # that David has in his _details "This is the whole molecule, A portion of the # TNT molecule, A group that appears three times in the TNT molecule" should # not be included in the above loop as this renders the information dependent # on a particular instance. # # IDB response # ------------ # Howard has not included a _*_details item in his draft though I would argue # that it is useful for someone looking at the CIF. 
These 'details' cannot be # computer-interpreted during retrieval from a databank so they clearly do not # influence the way a computer would search and retrieve information. # # The CIF dictionary already contains instructions for drawing a 2-D molecular # diagram in the group of chemical_conn categories. Although the # chemical_conn categories also describe the topology of a molecule they are # not a substitute for the tecton categories because 1) they are restricted to # organic molecules, 2) they are designed only to display a molecular diagram, # 3) only one molecule can be described and 4) the atoms are not mapped onto # the atom_sites in the crystal. # It would, however, be possible to include an item # _tecton_atom_conn_atom_number as a child of # _chemical_conn_atom_number in the following loop to allow the tecton # to be mapped to chemical_conn and hence plotted as a 2-D diagram. # Alternatively we could supply 2-D coordinates directly for our atoms in the # tecton_topology_atom loop. This would be preferable because it avoids the # limitations of the chemical_conn categories listed above. # # I have replaced _tecton_type with _tecton_special_details. It is not clear # if _tecton_type is just a free text field for the benefit of human readers # (in which case a _*_details field works as well) or whether there would # be an enumeration list, which would require us to define what is a molecule, # what is a group and what is a moiety. I would rather not go there! # loop_ _tecton_topology_id # List-reference _tecton_topology_name # Name e.g. 
full IUPAC name _tecton_topology_formula # Numbers of atoms in the tecton _tecton_topology_Zprime # Number of symmetry independent copies of the # tecton in the crystal _tecton_topology_special_details TNT '2,4,6 trinitrotoluene' 'C7 H5 N3 O6' 1 molecule BNZ '1,2,4,6 benzene ring' 'C6 H2' 1 moiety NITRO 'nitro group' 'N O2' 2 group # # The tecton_topology_atom loop that follows next defines the atoms in the # tecton and their chemical properties. # # A _tecton_topology_atom_id has been added as the list-reference and this # item is the parent to many items found in subsequent loops. For # convenience in this discussion I have constructed it out of the first # letter of _tecton_topology_id followed by _tecton_topology_atom_label. This # ensures uniqueness in the list while making it clear which atom is referred # to. # # I have removed the valence as something that needs more thought, if indeed # it is needed at all. # # I have added an item _tecton_topology_atom_chirality which is not needed in # this example, but is needed in chiral structures to identify any atom that # serves as a chiral center. Chirality is not captured by the topology, but # it is, like topology, a feature of the structure that can only be changed by # breaking and making bonds. It is included here because it is more closely # related to the topology than to the geometry which can be changed without # breaking any bonds. I will defer to others what values should be associated # with this item - presumably some letter like R or S. # loop_ _tecton_topology_atom_id # List reference, parent of many items _tecton_topology_atom_tecton_id # Child of _tecton_topology_id _tecton_topology_atom_label # For human use only _tecton_topology_atom_type_symbol # Child of _atom_type_symbol _tecton_topology_atom_coord_number # Number of bonds formed by this atom _tecton_topology_atom_chirality _tecton_topology_atom_details T.C1 TNT C1 C 3 . ? T.C2 TNT C2 C 3 . ? T.C3 TNT C3 C 3 . ? T.C4 TNT C4 C 3 . ? 
T.C5 TNT C5 C 3 . ?
T.C6 TNT C6 C 3 . ?
T.C7 TNT C7 C 4 . ?
T.H3 TNT H3 H 1 . ?
T.H5 TNT H5 H 1 . ?
T.H71 TNT H71 H 1 . ?
T.H72 TNT H72 H 1 . ?
T.H73 TNT H73 H 1 . ?
T.N2 TNT N2 N 3 . ?
T.O21 TNT O21 O 1 . ?
T.O22 TNT O22 O 1 . ?
T.N4 TNT N4 N 3 . ?
T.O41 TNT O41 O 1 . ?
T.O42 TNT O42 O 1 . ?
T.N6 TNT N6 N 3 . ?
T.O61 TNT O61 O 1 . ?
T.O62 TNT O62 O 1 . ?
B.C1 BNZ C1 C 3 . 'benzene ring'
B.C2 BNZ C2 C 3 . 'benzene ring'
B.C3 BNZ C3 C 3 . 'benzene ring'
B.C4 BNZ C4 C 3 . 'benzene ring'
B.C5 BNZ C5 C 3 . 'benzene ring'
B.C6 BNZ C6 C 3 . 'benzene ring'
B.H3 BNZ H3 H 1 . 'benzene ring'
B.H5 BNZ H5 H 1 . 'benzene ring'
N.N1 NITRO N1 N 3 . 'nitro group'
N.O1 NITRO O1 O 1 . 'nitro group'
N.O2 NITRO O2 O 1 . 'nitro group'
# The next loop defines the bonds in each of the tectons, again giving
# just the topological properties of the bonds, not their geometries.
#
# In the TECTON_BOND category the atoms are identified by two children of
# _tecton_topology_atom_id.
#
# Howard has added dangling bonds to show the full coordination around all the
# atoms in the tecton as well as indicating the points at which the tecton is
# attached to other species. The dummy atoms are indicated by the default '.'
# meaning that this atom cannot be defined. It would make sense to include
# the dangling bonds when, for example, the benzene ring is mapped onto TNT in
# the map_tecton loop. However, this requires that the atom at the far end of
# the dangling bond be given a name, which in turn means that the name must be
# added to the tecton_topology_atom list in order to preserve the parent-child
# relations. It would be necessary to identify such atoms as dummies, which
# could be done by assigning them a non-existent atom_type such as X, though
# this would in turn have to be defined in the atom_type loop. It all seems a
# little convoluted. A simpler scheme may be possible.
# To avoid these problems the dangling bonds are not mapped and the CIF is
# fully compliant.
#
# Changes from Howard's proposal:
# ------------------------------
# 1. A separate list-reference item (_tecton_topology_bond_id) has been added
# to avoid having to combine three items to define the list reference. Since
# this item is not currently the parent to any further item in this CIF, a
# simple number works well (but see the different situation in Sample CIF #2).
#
# 2. The atom_labels have been replaced by _tecton_topology_bond_atom_ids that
# provide the computer-readable link to the tecton_topology_atom list.
#
# 3. The direct link to the tecton_topology_id has been removed as this
# information can be recovered by referring to the tecton_topology_atom list.
#
# 4. For the bond type I have adopted the conventions used in
# chemical_conn_bond, which are those suggested by CCDC.
#
loop_
_tecton_topology_bond_id
_tecton_topology_bond_atom1_id # Child of _tecton_topology_atom_id
_tecton_topology_bond_atom2_id # Child of _tecton_topology_atom_id
_tecton_topology_bond_type
1 T.C1 T.C2 arom # TNT benzene ring
2 T.C2 T.C3 arom
3 T.C3 T.C4 arom
4 T.C4 T.C5 arom
5 T.C5 T.C6 arom
6 T.C6 T.C1 arom
7 T.C3 T.H3 sing
8 T.C5 T.H5 sing
9 T.C7 T.C1 sing # TNT Methyl group
10 T.C7 T.H71 sing
11 T.C7 T.H72 sing
12 T.C7 T.H73 sing
13 T.N2 T.C2 sing # TNT N2 nitro group
14 T.N2 T.O21 delo
15 T.N2 T.O22 delo
16 T.N4 T.C4 sing # TNT N4 nitro group
17 T.N4 T.O41 delo
18 T.N4 T.O42 delo
19 T.N6 T.C6 sing # TNT N6 nitro group
20 T.N6 T.O61 delo
21 T.N6 T.O62 delo
22 B.C1 B.C2 arom # 1,2,4,6 substituted benzene ring
23 B.C2 B.C3 arom
24 B.C3 B.C4 arom
25 B.C4 B.C5 arom
26 B.C5 B.C6 arom
27 B.C6 B.C1 arom
28 B.C3 B.H3 sing
29 B.C5 B.H5 sing
30 B.C1 . sing # A dangling bond
31 B.C2 . sing
32 B.C4 . sing
33 B.C6 . sing
34 N.N1 N.O1 delo # Nitro group
35 N.N1 N.O2 delo
36 N.N1 .
sing # # HDF on 'delocalized' # ------------------- # Between C1 and C2: # (sigma) there is a sigma bond due to the overlap of a lobe of an sp2 # hybrid on C1 with a lobe of an sp2 hybrid on C2 with consequent sharing of # electrons. That part of the 'bond' is not delocalized. # (pi) participation in a localized pi bond due to overlap of the pz # orbitals and consequent sharing of electrons. # I don't think the C1-C2 interaction should be described as 'delocalized'. # Only a part of the bond could be so described. # # IDB response # ------------ # I have adopted the convention used by CCDC. # ########################################################### # DEFINING THE TECTON CONFORMERS AND GEOMETRY # # The disordered nitro groups can be combined in four different ways, a-a and # b-b (both with Cs symmetry), and a-b and b-a (both with C1 symmetry, one # being the enantiomer of the other). # # These different combinations give rise to different conformers which have # the same topology but different geometries. # The definition of the different conformers is thus related to the # description of geometry, rather than topology. # The topology in this case is indicated by the name TNT, the conformers by # names such as TNTab. # # The following loop appears in Howard's draft as a way of allowing the # geometry common to all conformers to appear only once. In his draft TNTaa # etc. as well as TNT were defined as tectons in a previous loop (see above). # # loop_ # _tecton_topology_combine_id # Child of _tecton_id # _tecton_topology_combine_source_id # Child of _tecton_id # TNTaa TNT # Means any information about TNT also applies, as such, to # #TNTaa # TNTbb TNT # TNTab TNT # TNTba TNT # MAY BE THIS CAN BE DONE LEGALLY WITHIN CURRENT CIF SYNTAX BY SAVE FRAMES ??? 
# [Save frames are used in dictionaries but are not yet part of CIF - IDB]
#
# In the draft presented here the combining of the geometries is achieved in a
# different way which, I believe, is more appropriate for CIF, is more
# flexible and would make programming simpler. The above loop therefore is
# not part of the current draft.
#
# The first loop in this group of categories is one that identifies the
# different conformers, but if only one conformation is present this loop may
# be omitted unless one wishes to give properties of the geometry as a whole
# such as the point group of the tecton.
#
# Since in the TNT example the ideal geometries of the conformers differ only
# in the torsion angles, the remaining geometry of the molecule is
# common and need only be given once. This means that each conformer is
# described in part by items that give the common geometry and in part by
# items that give the distinctive geometry of the conformer (the torsion
# angles in this case). Each distance or angle in the geom loops is assigned
# a conformer_label (e.g. aa, ab, all) to identify which conformer (or group
# of conformers) it describes. The second loop in this group
# (tecton_conformer_equiv) associates each conformer_id with the appropriate
# conformer_labels.
# Then follow the tecton_geom loops which define the atomic coordinates, the
# interatomic distances, the angles and the torsion angles.
#
# Howard's draft gives the symmetry of the conformer using the dataname
# _tecton_geometric_class rather than _tecton_conformer_point_group. He
# explains this item thus:
# "_tecton_geometric_class is the orientation-independent specification of the
# tecton point group according to its geometry. It's probably best to use a
# Schoenflies symbol as there is no choice of basis implicit in the geometry
# as given by interatomic distances, interatomic (dihedral) angles.
I suppose # that if the geometry is given as a set of coordinates, it might be # justifiable to specify a point group with a H-M point group symbol for the # tecton orientation corresponding to the atomic coordinates." # loop_ _tecton_conformer_id # List-reference _tecton_conformer_tecton_id # Child of _tecton_topology_id _tecton_conformer_point_group # Schoenflies point group symbol of conformer _tecton_conformer_chirality # We need to define allowed symbols _tecton_conformer_details TNTaa TNT Cs achiral 'TNTaa conformer' TNTbb TNT Cs achiral 'TNTbb conformer' TNTab TNT C1 +x 'TNTab conformer' TNTba TNT C1 -x 'TNTba conformer' # The next loop associates each conformer with one or more of the # conformer_labels used in the tecton_geom loops below. # The full geometry of a conformer can be found by extracting only those geom # items marked with a label listed opposite that conformer. loop_ _tecton_conformer_equiv_id # List-reference _tecton_conformer_equiv_conformer_id # Child of _tecton_conformer_id _tecton_conformer_equiv_label # Parent to #_tecton_geom_atom_conformer_label, etc. 1 TNTaa all 2 TNTaa aa 3 TNTbb all 4 TNTbb bb 5 TNTab all 6 TNTab ab 7 TNTba all 8 TNTba ba # Next follow the tecton_geom_atom, dist, angle and torsion loops that # give the geometry. # # All bonds and angles are defined in terms of child links # from _tecton_topology_atom_id. This uniquely links the geometry to the # atoms in the bond graph (however there are problems in the second example # CIF). # # I have shortened some of Howard's names and the structure of Howard's CIF # has been significantly changed to produce a consistent and simple # description that obeys CIF rules. # # The first loop defines the atoms in terms of their coordinates. This loop # is not needed if coordinates are not used since it adds nothing to the atom # properties given in the description of the topology (c.f. Sample CIF #2). 
#
# _tecton_geom_atom_id is a child of _tecton_topology_atom_id and since it is
# unique in the geom list as well as the topology list it can also serve as
# the list-reference.
#
# By way of illustration the geometry of the benzene ring in this example is
# defined by atomic coordinates, but the remaining geometries are defined by
# their bonds and angles.
#
# Note that since the benzene ring geometry is common to all conformers, it
# carries the conformer_label 'all' which is defined above as indicating a
# geometry that is common to all four conformers.
#
loop_
_tecton_geom_atom_id # List-reference, child of _tecton_topology_atom_id
_tecton_geom_atom_conformer_label # Child of _tecton_conformer_equiv_label
_tecton_geom_atom_coord_x # Coordinates of atom in Angstroms
_tecton_geom_atom_coord_y #
_tecton_geom_atom_coord_z #
_tecton_geom_atom_details
T.C1 all 0.037 0.146 -0.124 ?
T.C2 all 1.378 0.562 0.134 ?
T.C3 all 1.846 1.421 0.204 ?
T.C4 all 2.567 1.834 0.304 ?
T.C5 all 1.745 1.563 0.245 ?
T.C6 all 0.962 0.498 0.103 ?
T.H3 all 2.13 1.72 0.24 ?
T.H5 all 1.84 2.05 0.36 ?
#
# Distances are defined in the next loop in terms of the two terminal atoms
# which are identified through children of _tecton_topology_atom_id.
#
# These distances do not have to correspond to the bonds
# defined in the topology. They may represent non-bonding contacts or
# distances between atoms well removed from each other.
#
# These are notional (ideal) distances, not those observed in the crystal.
#
loop_
_tecton_geom_dist_id # List-reference
_tecton_geom_dist_conformer_label # Child of _tecton_conformer_equiv_label
_tecton_geom_dist_atom1_id # Child of _tecton_topology_atom_id
_tecton_geom_dist_atom2_id # Child of _tecton_topology_atom_id
_tecton_geom_dist_distance # Distance atom1-atom2 in Angstroms
1 all T.C7 T.C1 1.54 # TNT methyl group
2 all T.C7 T.H71 1.05
3 all T.C7 T.H72 1.05
4 all T.C7 T.H73 1.05
5 all T.N4 T.C4 1.43 # TNT N4 nitro group
6 all T.N4 T.O41 1.18
7 all T.N4 T.O42 1.18
8 all T.N2 T.C2 1.43 # TNT N2 nitro group
9 all T.N2 T.O21 1.18
10 all T.N2 T.O22 1.18
11 all T.N6 T.C6 1.43 # TNT N6 nitro group
12 all T.N6 T.O61 1.18
13 all T.N6 T.O62 1.18
# The angles are defined in terms of the atom_ids of three defining atoms.
# The angle is given in degrees and is formed at atom2.
#
loop_
_tecton_geom_angle_id # List-reference
_tecton_geom_angle_conformer_label # Child of _tecton_conformer_equiv_label
_tecton_geom_angle_atom1_id # Child of _tecton_topology_atom_id
_tecton_geom_angle_atom2_id # Child of _tecton_topology_atom_id
_tecton_geom_angle_atom3_id # Child of _tecton_topology_atom_id
_tecton_geom_angle_angle # Angle in degrees
1 all T.C1 T.C7 T.H71 109 # TNT Methyl group
2 all T.C1 T.C7 T.H72 109
3 all T.C1 T.C7 T.H73 109
4 all T.H71 T.C7 T.H72 109
5 all T.H72 T.C7 T.H73 109
6 all T.H73 T.C7 T.H71 109
7 all T.O41 T.N4 T.C4 117 # TNT N4 nitro group
8 all T.O42 T.N4 T.C4 117
9 all T.O41 T.N4 T.O42 126
10 all T.O21 T.N2 T.C2 117 # TNT N2 nitro group
11 all T.O22 T.N2 T.C2 117
12 all T.O21 T.N2 T.O22 126
13 all T.O61 T.N6 T.C6 117 # TNT N6 nitro group
14 all T.O62 T.N6 T.C6 117
15 all T.O61 T.N6 T.O62 126
#
# In the torsion angle loop given next the four conformers are
# differentiated.
#
# One of the torsion angles is common to all conformers.
This is indicated by
# tecton_geom_torsion_conformer_label having the value 'all',
# as defined in the tecton_conformer_equiv loop.
#
# I have changed Howard's 'dihedral' into 'torsion' to conform with usage in
# core_CIF.
#
loop_
_tecton_geom_torsion_id # List-reference
_tecton_geom_torsion_conformer_label # Child of _tecton_conformer_equiv_label
_tecton_geom_torsion_atom1_id # Child of _tecton_topology_atom_id
_tecton_geom_torsion_atom2_id # Child of _tecton_topology_atom_id
_tecton_geom_torsion_atom3_id # Child of _tecton_topology_atom_id
_tecton_geom_torsion_atom4_id # Child of _tecton_topology_atom_id
_tecton_geom_torsion_angle # Torsion angle in degrees
1 all T.C3 T.C4 T.N4 T.O41 90
2 aa T.C1 T.C2 T.N2 T.O21 10.5
3 aa T.C1 T.C6 T.N6 T.O61 10.5
4 bb T.C1 T.C2 T.N2 T.O21 -10.5
5 bb T.C1 T.C6 T.N6 T.O61 -10.5
6 ab T.C1 T.C2 T.N2 T.O21 10.5
7 ab T.C1 T.C6 T.N6 T.O61 -10.5
8 ba T.C1 T.C2 T.N2 T.O21 -10.5
9 ba T.C1 T.C6 T.N6 T.O61 10.5
############################################################
# MAPPING THE TECTONS ONTO EACH OTHER
#
# The next loop maps the atoms of the subunit tectons onto the main tecton.
# It can of course map any tecton onto any other tecton provided the regions
# of the tectons mapped are isomorphic.
#
# This mapping will not often be needed but is included to show how it can be
# done.
#
# It is only necessary to map the atoms, since there is no ambiguity
# about where the bonds occur as long as the bond graph is finite.
#
# Howard remarks that:
# AT LEAST SOME OF THE FOLLOWING "MAPS" ARE NOT MAPS IN THE STRICT
# MATHEMATICAL SENSE. MAYBE ANOTHER WORD IS NEEDED.
# (Any suggestions?)
#
# The following loop proposed by Howard is not needed. It is implicit in the
# mapping of the atoms (see below).
# # loop_ # _map_tecton2tecton_id # List reference / Map Name # _map_tecton2tecton_source_id # Child of tecton_id # _map_tecton2tecton_image_id # Child of tecton_id # MAPsubnz2TNT subnz TNT # MAPnitro2TNTnitroN2 nitro TNT # MAPnitro2TNTnitroN4 nitro TNT # MAPnitro2TNTnitroN6 nitro TNT # # The changes made from Howard's draft of the next loop are: shortening and # simplifying the name, replacing the first item (the map name - see above) by # a list-reference (given here as a number). The map name (which indicates # which tectons are being mapped onto which) is implicit in the atom_ids (and # in this example is made explicit to the reader by the use of children of # _tecton_topology_atom_id to identify the atoms being mapped). # # This example is a true mapping in the sense described by Howard since each # subunit is mapped onto the main molecule, though the syntax does not prevent # the mapping being presented the other way around (the main molecule 'mapped' # onto the subunits which violates the strict mathematical rules of mapping), # unless it is made clear that atom1 is always mapped onto atom2. # I don't see how the mathematical rules can be expressed in machine-readable # form in the dictionary, or even if we should insist on a mathematically # correct definition. Maybe 'map' is not the right term to use, but I cannot # think of anything better. It is more a kind of equivalencing or # identification. # # This loop is now much simpler and very flexible. # # See the note above tecton_topology_bond re dangling bonds. 
# loop_ _map_tecton_atom_map_id # List reference _map_tecton_atom_atom1_id # Child of _tecton_topology_atom_id _map_tecton_atom_atom2_id # Child of _tecton_topology_atom_id 1 B.C1 T.C1 # mapping 1,2,4,6 benzene moiety onto TNT 2 B.C2 T.C2 3 B.C3 T.C3 4 B.C4 T.C4 5 B.C5 T.C5 6 B.C6 T.C6 7 B.H3 T.H3 8 B.H5 T.H5 9 N.N1 T.N2 # mapping the nitro group onto the TNT N2 group 10 N.O1 T.O21 11 N.O2 T.O22 12 N.N1 T.N4 # mapping the nitro group onto the TNT N4 group 13 N.O1 T.O41 14 N.O2 T.O42 15 N.N1 T.N6 # mapping the nitro group onto the TNT N6 group 16 N.O1 T.O61 17 N.O2 T.O62 # ############################################################## # MAPPING THE TECTONS TO THE CRYSTAL # # The next loop maps two of the conformers to the crystal (or vice versa, the # strict mathematical definition of mapping does not work here). # # As before I have changed Howard's draft by changing the first item from the # map name to a numerical list-reference. # # The atoms of the isomer are identified by _tecton_topology_atom_id, the # conformer is identified by the conformer_label. The conformer label may be # omitted if it is not known which conformers are present in a disordered # structure. # # I have arbitrarily assumed that the crystal contains equal numbers of the # two Cs conformers, though I could as easily have included all four in # various proportions. # # The occupation number indicates how much of each conformer (or isomer) is # present. The occupation numbers of the atoms in the crystal are defined in # the atom_site loop and must not be less than the sum of the corresponding # occupation numbers of the conformers. # # Columns 2 to 4 in the list give information about the atom in the tecton, # Columns 5 and 6 identify the atom in the crystal, the first identifying # the atom in the atom_site list, the second the symmetry operation applied to # the coordinates of that atom. 
#
# The letters a and b distinguish the two positions of the disordered nitro
# groups in the crystallographic asymmetric unit, each having an occupancy of
# 0.5.
#
# I have assumed that the crystallographic mirror operation that relates the
# two halves of the TNT molecule has the _space_group_symop_id of 2. Lattice
# translations of the symmetry operations are not needed and are therefore not
# included in this example (but see sample CIF #2).
#
# As before, the following loop proposed by Howard is not needed as the
# information on the conformers is already defined above in the
# tecton_conformer sections.
# loop_
# _map_tecton2crystal_id # List reference / Map Name
# _map_tecton2crystal_tecton_id # Child of tecton_id
# # CRYSTAL ID ????
# # OCCUPATION PARAMETER?
# MAPTNTaa2crystal TNTaa
# MAPTNTbb2crystal TNTbb
# MAPTNTab2crystal TNTab
# MAPTNTba2crystal TNTba
# MAPTNT2crystal TNT
loop_
_map_tecton2crystal_atom_id # List-reference
_map_tecton2crystal_atom_atom_id # Child of _tecton_topology_atom_id
_map_tecton2crystal_atom_conformer_label # Child of _tecton_conformer_equiv_label
_map_tecton2crystal_atom_occup_number # Occupation number of tecton atom
_map_tecton2crystal_atom_atom_site_label # Child of _atom_site_label
_map_tecton2crystal_atom_symop_id # Child of _space_group_symop_id
1 T.C1 all 1 C1 1
2 T.C2 all 1 C2 1
3 T.C3 all 1 C3 1
4 T.C4 all 1 C4 1
5 T.C5 all 1 C3 2
6 T.C6 all 1 C2 2
7 T.H3 all 1 H3 1
8 T.H5 all 1 H3 2
9 T.C7 all 1 C7 1
10 T.H71 all 1 H71 1
11 T.H72 all 1 H72 1
12 T.H73 all 1 H71 2
13 T.N4 all 1 N4 1
14 T.O41 all 1 O41 1
15 T.O42 all 1 O42 1
# SIDE CHAINS
16 T.N2 aa 0.5 N2a 1
17 T.O21 aa 0.5 O21a 1
18 T.O22 aa 0.5 O22a 1
19 T.N6 aa 0.5 N2a 2
20 T.O61 aa 0.5 O21a 2
21 T.O62 aa 0.5 O22a 2
22 T.N2 bb 0.5 N2b 1
23 T.O21 bb 0.5 O21b 1
24 T.O22 bb 0.5 O22b 1
25 T.N6 bb 0.5 N2b 2
26 T.O61 bb 0.5 O21b 2
27 T.O62 bb 0.5 O22b 2
#
# Howard remarks: IF THE SIDE CHAINS WERE LONGER, EVEN THE ABOVE SYNTAX WOULD
# BE TOO CUMBERSOME.
# IDB: But there is no duplication - the list could not be shorter.
#
############ End of first CIF ################
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4.2 SECOND SAMPLE CIF
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
######### Beginning of second CIF #############
#
#
# EXAMPLE OF A STRUCTURE WITH AN INFINITE BOND GRAPH
#
# CaCrF5 is chosen to illustrate how infinite bond graphs are treated.
#
# CaCrF5 consists of chains of corner-linked CrF6 octahedra running along the
# c axis of a crystal belonging to space group C2/c. The Cr and the linking F
# atom (F3) reside on 2-fold axes that are perpendicular to c. The Ca atoms
# lie between the chains on the same 2-fold axes.
#
# The crystal structure of CaCrF5 is represented by an array of atoms linked
# by bonds into an infinitely connected network with translational symmetry.
# A finite graph, which retains all the local properties of the atoms, can be
# extracted from the infinite graph as follows:
# One first extracts one formula unit (in this case the seven atoms in the
# chemical formula). This requires that fourteen bonds linking the formula
# unit to the rest of the infinite network be broken, but such broken bonds
# always occur in pairs since they are necessarily related in pairs by one of
# the translational symmetry operations of the space group (translations,
# glides or screws). The broken bonds of each pair are then connected to each
# other, adding (in this case) seven further bonds to the finite bond graph.
# Therefore in some cases a pair of atoms in the finite graph may be linked
# by more than one bond. This is indicated in the graph by a double or triple
# line, etc. In CaCrF5 three such pairs of atoms are linked by two bonds as
# shown in the bond graph below.
The inclusion of two lines between a pair of
# atoms in the graph does NOT indicate a double bond (a bond of order 2), but
# rather two different bonds whose bond order is not specified. Where two (or
# more) bonds are shown as linking the same two atoms in the finite graph,
# they connect two different pairs of atoms in the infinite graph and in the
# crystal structure as can be seen from the map_tecton2crystal_bond loop.
#
# Information on the long-range order is lost when the infinite graph is
# reduced to a finite graph, but the short-range order, i.e., the nearest
# neighbour environments that contain the chemical bonds, is preserved.
# A crude representation of the finite graph showing the six bonds between Cr
# and F, and the seven bonds between Ca and F, is given below. In the crystal
# F1 and F4 are related by a crystallographic 2-fold axis, as are F2 and F5.
#
#      |------------ F2 -------------|
#      |                             |
#      |------------ F1 =============|
#      |                             |
# Cr1 -|============ F3 -------------|- Ca
#      |                             |
#      |------------ F4 =============|
#      |                             |
#      |------------ F5 -------------|
#
#
data_Ca_Cr_F5
#
# In this example a complete CIF is given including the atomic coordinates and
# the symmetry operations. The description of the tecton is followed by the
# mapping between the tecton and the crystal structure. As there is only one
# conformer, the conformer loops are not used.
#
###############################################################
# DEFINITION OF THE CRYSTALLOGRAPHIC STRUCTURE
#
# Based on Wu and Brown (1973) Mat. Res. Bull. 8, 593-8.
#
_chemical_formula_sum 'Ca Cr F5'
_cell_length_a 9.0050
_cell_length_b 6.4720
_cell_length_c 7.5330
_cell_angle_alpha 90.00
_cell_angle_beta 115.85
_cell_angle_gamma 90.00
_cell_formula_units_Z 8
_space_group_name_H-M_alt 'C 2/c'
_space_group_name_Hall '-C 2yc'
loop_
_space_group_symop_id
_space_group_symop_operation_xyz
1 ' X, Y, Z'
2 '-X, Y,-Z+1/2'
3 '-X,-Y,-Z'
4 ' X,-Y, Z+1/2'
5 ' X+1/2, Y+1/2, Z'
6 '-X+1/2, Y+1/2,-Z+1/2'
7 '-X+1/2,-Y+1/2,-Z'
8 ' X+1/2,-Y+1/2, Z+1/2'
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_U_iso_or_equiv
_atom_site_adp_type
Ca1 0.50000 0.04260 0.25000 0.10000 Uiso
Cr1 0.00000 0.00000 0.00000 0.10000 Uiso
F1 0.00970 -0.29340 -0.02910 0.10000 Uiso
F2 -0.22730 -0.02300 -0.11740 0.10000 Uiso
F3 0.00000 -0.07210 0.25000 0.10000 Uiso
#
loop_
_geom_bond_atom_site_label_1
_geom_bond_atom_site_label_2
_geom_bond_distance
_geom_bond_site_symmetry_1
_geom_bond_site_symmetry_2
Ca1 F1 2.391 1_555 5_555
Ca1 F1 2.391 1_555 6_555
Ca1 F1 2.292 1_555 7_545
Ca1 F1 2.292 1_555 8_545
Ca1 F2 2.215 1_555 3_555
Ca1 F2 2.215 1_555 4_655
Ca1 F3 2.494 1_555 5_555
Cr1 F1 1.918 1_555 1_555
Cr1 F1 1.918 1_555 3_555
Cr1 F2 1.848 1_555 1_555
Cr1 F2 1.848 1_555 3_555
Cr1 F3 1.940 1_555 1_555
Cr1 F3 1.940 1_555 3_555
#
#############################################################
# DEFINITION OF THE FORMULA UNIT AS A TECTON
#
# The next loop lists the tectons, in this case the only tecton defined
# contains one formula unit.
#
loop_
_tecton_topology_id # List reference
_tecton_topology_formula
_tecton_topology_special_details
1 'Ca Cr F5' 'The formula unit'
#
# The next loop lists the seven atoms that compose the tecton and
# gives their chemical properties. Note that the atom_site list in the
# crystallographic items given above only contains five atoms because the
# molecular unit occupies two asymmetric units and two F atoms are duplicated
# by the two-fold axis.
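As a cross-check on the crystallographic data above, the following minimal Python sketch recomputes three of the Cr1-F distances listed in the _geom_bond loop from the cell parameters and the fractional coordinates. It is an illustration only and not part of the proposal: the function name and the SITES dictionary are hypothetical, and the formula assumes a monoclinic cell (alpha = gamma = 90 degrees) with both atoms taken at symmetry operation 1.

```python
import math

# Cell parameters copied from the CIF above (lengths in Angstroms).
A, B, C = 9.0050, 6.4720, 7.5330
BETA = math.radians(115.85)  # monoclinic angle

# Fractional coordinates copied from the atom_site loop above.
SITES = {
    "Cr1": (0.00000, 0.00000, 0.00000),
    "F1": (0.00970, -0.29340, -0.02910),
    "F2": (-0.22730, -0.02300, -0.11740),
    "F3": (0.00000, -0.07210, 0.25000),
}


def monoclinic_distance(p, q):
    """Distance in Angstroms between two fractional positions.

    Valid only when alpha = gamma = 90 degrees, so that the only
    non-zero cross term in the metric is the a-c term.
    """
    dx, dy, dz = (q[i] - p[i] for i in range(3))
    d2 = (
        (A * dx) ** 2
        + (B * dy) ** 2
        + (C * dz) ** 2
        + 2.0 * A * C * dx * dz * math.cos(BETA)
    )
    return math.sqrt(d2)


for label in ("F1", "F2", "F3"):
    d = monoclinic_distance(SITES["Cr1"], SITES[label])
    print(f"Cr1-{label}: {d:.3f}")
```

The values printed should agree, to within rounding, with the Cr1-F1, Cr1-F2 and Cr1-F3 entries in the _geom_bond loop. Distances to symmetry-generated atoms would additionally require applying the relevant symmetry operation and lattice translation to the second position before calling the function.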
#
# _tecton_topology_atom_id is the list-reference and as in the previous
# example has been constructed from the atom_label. The tecton number is not
# needed here because the atom_label is sufficient to ensure uniqueness.
# _tecton_topology_atom_tecton_id is next shown. Although there is only one
# tecton, this item is needed to link the atoms to the formula in the
# tecton_topology loop above.
# _tecton_topology_atom_label is included for the benefit of the user. It has
# no parent or child and is not required for CIF management. The CIF
# identifies the atom by _tecton_topology_atom_id.
# _tecton_topology_atom_valence is the atomic valence as used in the bond
# valence model, a model which is included here because it is based on the
# topological properties of the bond network.
# _tecton_topology_atom_coord_number is the number of bonds in the bond graph
# that terminate on the atom.
#
loop_
_tecton_topology_atom_id # List-reference
_tecton_topology_atom_tecton_id # Child of _tecton_topology_id
_tecton_topology_atom_label
_tecton_topology_atom_type_symbol # Child of _atom_type_symbol
_tecton_topology_atom_valence
_tecton_topology_atom_coord_number # Number of bonds formed by this atom
_tecton_topology_atom_details
Ca 1 Ca1 Ca 2 7 ?
Cr 1 Cr1 Cr 3 6 ?
F1 1 F1 F -1 3 ?
F2 1 F2 F -1 2 ?
F3 1 F3 F -1 3 ?
F4 1 F4 F -1 3 'Related to F1 by crystallographic symmetry'
F5 1 F5 F -1 2 'Related to F2 by crystallographic symmetry'
#
# The next loop lists the bonds in the tecton. Some bonds appear
# twice (e.g. Cr.F3.1 and Cr.F3.2). The atoms of the tecton
# specified in these cases (e.g., atoms Cr and F3) map onto
# different atom pairs in the crystal as can be seen in the
# map_tecton2crystal_bond loop below.
#
# _tecton_topology_bond_id is the list-reference and in this example has been
# constructed from the ids of the two atoms that form the bond since it is
# parent to _map_tecton2crystal_bond_id.
#
# _tecton_topology_bond_valence is a quantity determined from the topology and
# is used to calculate (in this case) the ideal bond lengths given in the
# tecton_geom_dist loop.
#
loop_
_tecton_topology_bond_id # List reference
_tecton_topology_bond_atom_id_1 # Child of _tecton_topology_atom_id
_tecton_topology_bond_atom_id_2 # Child of _tecton_topology_atom_id
_tecton_topology_bond_valence # Predicted bond valence
_tecton_topology_bond_type
Cr.F1 Cr F1 0.48 ?
Cr.F4 Cr F4 0.48 ?
Cr.F2 Cr F2 0.61 ?
Cr.F5 Cr F5 0.61 ?
Cr.F3.1 Cr F3 0.41 ?
Cr.F3.2 Cr F3 0.41 ?
Ca.F1.1 Ca F1 0.26 ?
Ca.F1.2 Ca F1 0.26 ?
Ca.F4.1 Ca F4 0.26 ?
Ca.F4.2 Ca F4 0.26 ?
Ca.F2 Ca F2 0.39 ?
Ca.F5 Ca F5 0.39 ?
Ca.F3 Ca F3 0.18 ?
#
############################################################
# DEFINITION OF THE TECTON GEOMETRY
#
# The tecton_geom_atom loop is omitted as the geometry is not defined here in
# terms of ideal atomic coordinates.
#
# The ideal interatomic distances are next given.
# They can be compared to the observed distances given in the
# crystallographic _geom_bond list above.
#
# The distances defined in this list do not need to be the same as the bonds
# defined in the tecton_topology_bond list; some distances given here may be
# between non-bonded atoms.
#
# Since the list-reference is not parent to any other item, an alphabetical
# sequence of letters has been chosen (to show that any string is valid).
#
# The distances are defined by the second and third items in the loop.
#
# The bond valence which is used to calculate these distances is identical to
# the value given in tecton_topology_bond_valence so we may not need this item
# here.
##
## NOTE: There is a weakness in the present definition that requires more
## thought. As presently defined, the geometry is given only for the atoms in
## the finite bond graph, i.e.
the atoms in the tecton from which the
## infinite crystal is generated, but there will be occasions when it is
## necessary to give distances and angles between atoms that span more than
## one tecton. This could be done by specifying the ideal geometry using the
## atoms in the crystal or alternatively by introducing the translational
## symmetry operations between tectons that are needed to specify the long
## range order. The choice of which method to use requires more thought -
## it is deferred to a later draft.
##
#
loop_
_tecton_geom_dist_id # List-reference
_tecton_geom_dist_atom1_id # Child of _tecton_topology_atom_id
_tecton_geom_dist_atom2_id # Child of _tecton_topology_atom_id
_tecton_geom_dist_distance # Ideal bond distance in Angstroms
_tecton_geom_dist_valence # Same as _tecton_topology_bond_valence
_tecton_geom_dist_details
A Cr F1 1.93 0.48 'Bond distances calculated from bond valences'
B Cr F4 1.93 0.48 'Bond distances calculated from bond valences'
C Cr F2 1.84 0.61 'Bond distances calculated from bond valences'
D Cr F5 1.84 0.61 'Bond distances calculated from bond valences'
E Cr F3 1.99 0.41 'Bond distances calculated from bond valences'
F Cr F3 1.99 0.41 'Bond distances calculated from bond valences'
G Ca F1 2.34 0.26 'Bond distances calculated from bond valences'
H Ca F1 2.34 0.26 'Bond distances calculated from bond valences'
I Ca F4 2.34 0.26 'Bond distances calculated from bond valences'
J Ca F4 2.34 0.26 'Bond distances calculated from bond valences'
K Ca F2 2.19 0.39 'Bond distances calculated from bond valences'
L Ca F5 2.19 0.39 'Bond distances calculated from bond valences'
M Ca F3 2.48 0.18 'Bond distances calculated from bond valences'
# Similar angle and torsion loops could also be given but are omitted here for
# brevity. Torsion angles are rarely used to define the geometry of inorganic
# compounds.
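The draft does not say how the ideal distances were obtained from the bond valences. The Python sketch below shows one way the listed numbers can be reproduced, using the usual bond-valence expression R = R0 - b ln(s). The parameter values (b = 0.37 Angstroms, R0 = 1.657 Angstroms for Cr(III)-F and 1.842 Angstroms for Ca-F) are assumptions taken from commonly tabulated bond-valence parameters and are not stated anywhere in the discussion paper. The function and variable names are hypothetical. The sketch also sums the bond valences at each atom so they can be compared with the atomic valences given in the tecton_topology_atom loop.

```python
import math

# ASSUMED bond-valence parameters (Angstroms); not given in the draft.
B_PARAM = 0.37
R0 = {"Cr": 1.657, "Ca": 1.842}  # cation-fluorine R0 values

# (cation, anion, bond valence) copied from the tecton_topology_bond loop.
BONDS = [
    ("Cr", "F1", 0.48), ("Cr", "F4", 0.48),
    ("Cr", "F2", 0.61), ("Cr", "F5", 0.61),
    ("Cr", "F3", 0.41), ("Cr", "F3", 0.41),
    ("Ca", "F1", 0.26), ("Ca", "F1", 0.26),
    ("Ca", "F4", 0.26), ("Ca", "F4", 0.26),
    ("Ca", "F2", 0.39), ("Ca", "F5", 0.39),
    ("Ca", "F3", 0.18),
]


def ideal_distance(cation, valence):
    """Ideal cation-F distance (Angstroms) from R = R0 - b * ln(s)."""
    return R0[cation] - B_PARAM * math.log(valence)


def valence_sums(bonds):
    """Sum the bond valences terminating on each atom of the finite graph."""
    sums = {}
    for cation, anion, s in bonds:
        sums[cation] = sums.get(cation, 0.0) + s
        sums[anion] = sums.get(anion, 0.0) + s
    return sums


for cation, anion, s in BONDS:
    print(f"{cation}-{anion}  s={s:.2f}  R={ideal_distance(cation, s):.2f}")

for atom, total in sorted(valence_sums(BONDS).items()):
    print(f"{atom}: valence sum {total:.2f}")
```

Under these assumed parameters the computed distances round to the values in the tecton_geom_dist loop, and the valence sums come out at 3 for Cr, 2 for Ca and 1 for each F, matching the magnitudes of _tecton_topology_atom_valence. Other parameter sets would give slightly different distances.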
#
############################################################
#  MAPPING THE TOPOLOGY ONTO THE CRYSTAL
#
# The next loop maps the atoms of the tecton onto the atoms of the
# crystal.
#
# Note that atoms F4 and F5 in the tecton map onto
# symmetry-generated copies of F1 and F2 in the crystal.
#
# The additional translation components of the symmetry operation are included
# here by way of illustration, though they are strictly necessary only in the
# bond loop.
#
loop_
_map_tecton2crystal_atom_id                # List-reference
_map_tecton2crystal_atom_atom_id           # Child of _tecton_topology_atom_id
_map_tecton2crystal_atom_atom_site_label   # Child of _atom_site_label
_map_tecton2crystal_atom_symop_id          # Child of _space_group_symop_id
_map_tecton2crystal_atom_trans_x
_map_tecton2crystal_atom_trans_y
_map_tecton2crystal_atom_trans_z
1  Ca  Ca1  1  0 0 0
2  Cr  Cr1  1  0 0 0
3  F1  F1   1  0 0 0
4  F2  F2   1  0 0 0
5  F3  F3   1  0 0 0
6  F4  F1   3  0 0 0
7  F5  F2   3  0 0 0
#
# The next loop maps the bonds from the tecton onto the crystal. This loop
# is only needed for infinitely connected structures because these are the
# only graphs in which there can be more than one bond between the same
# pair of atoms in the tecton.
#
# Since this loop maps the bonds of the tecton directly onto pairs of atoms in
# the crystal, the tecton bond is sufficiently defined by
# _tecton_topology_bond_id, but the bond in the crystal must be defined fully
# in terms of two atom_site_labels and their corresponding symmetry operations
# including the additional lattice translations. It should be sufficient to
# use _map_tecton2crystal_bond_bond_id as the list-reference (though I have
# not done this here).
#
# Note that bonds 5 and 6 map onto different pairs of atoms in the crystal.
#
# The bonds I have labelled 'link' are those that link the atoms in the
# tecton (the atoms that form the finite bond graph) to the atoms in
# symmetry-related tectons in the infinite graph. The remaining bonds are
# those formed between the atoms within the tecton.
#
loop_
_map_tecton2crystal_bond_id                  # List-reference
_map_tecton2crystal_bond_bond_id             # Child of _tecton_topology_bond_id
_map_tecton2crystal_bond_atom_site_label_1   # Child of _atom_site_label
_map_tecton2crystal_bond_symop_1             # Child of _space_group_symop_id
_map_tecton2crystal_bond_trans_x_1
_map_tecton2crystal_bond_trans_y_1
_map_tecton2crystal_bond_trans_z_1
_map_tecton2crystal_bond_atom_site_label_2   # Child of _atom_site_label
_map_tecton2crystal_bond_symop_2             # Child of _space_group_symop_id
_map_tecton2crystal_bond_trans_x_2
_map_tecton2crystal_bond_trans_y_2
_map_tecton2crystal_bond_trans_z_2
_map_tecton2crystal_bond_dist                # Observed distance (optional)
_map_tecton2crystal_bond_details
 1  Cr.F1    Cr1  1  0 0 0   F1  1  0  0 0   1.918  ?
 2  Cr.F4    Cr1  1  0 0 0   F4  1  0  0 0   1.918  ?
 3  Cr.F2    Cr1  1  0 0 0   F2  1  0  0 0   1.848  ?
 4  Cr.F5    Cr1  1  0 0 0   F5  1  0  0 0   1.848  ?
 5  Cr.F3.1  Cr1  1  0 0 0   F3  1  0  0 0   1.940  ?
 6  Cr.F3.2  Cr1  1  0 0 0   F3  3  0  0 0   1.940  link
 7  Ca.F1.1  Ca1  1  0 0 0   F1  5  0  0 0   2.391  link
 8  Ca.F1.2  Ca1  1  0 0 0   F1  6  0  0 0   2.292  link
 9  Ca.F4.1  Ca1  1  0 0 0   F4  5  0 -1 0   2.391  link
10  Ca.F4.2  Ca1  1  0 0 0   F4  6  0 -1 0   2.292  link
11  Ca.F2    Ca1  1  0 0 0   F5  1  0  0 0   2.215  ?
12  Ca.F5    Ca1  1  0 0 0   F2  4  1  0 0   2.215  link
13  Ca.F3    Ca1  1  0 0 0   F3  5  0  0 0   2.494  link
#
################# End of second CIF ####################

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4. COMPARISON OF THE ABOVE PROPOSAL WITH mmCIF
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

mmCIF has a chemical description which is designed for biological
molecules. The contents of the crystal are divided into a small number of
ENTITIES which are classified as either polymers (e.g. a protein molecule),
non-polymers, or water. A category called struct_asym describes which
entities are found in the asymmetric unit.
Polymeric entities are typically composed of monomers or COMPONENTS which
are described in the category CHEM_COMP. The definitions in this set of
categories are very similar to our definitions in the tecton_topology and
tecton_geom categories. Chem_comp is designed to give the contents and
geometries of the individual monomers that compose the macromolecules. It
describes the ideal geometry of the monomers either in terms of Cartesian
coordinates or in terms of bond lengths and angles.

Our proposal uses _map_tecton2crystal_ to map the tectons onto the crystal
structure. In mmCIF, by contrast, the atom_site loop itself contains
pointers to the corresponding atom in chem_comp. This arrangement does not
work for small molecules, where an atom listed in the atom_site loop
(asymmetric unit) may map onto more than one atom in the tecton, e.g. if the
tecton contains crystallographic symmetry as in the case of CaCrF5 above.

We should make the definitions of items in the tecton_topology and
tecton_geom categories correspond exactly to those used in chem_comp to
allow direct translation between the two categories (this should not be a
problem).

chem_comp defines a very large number of additional properties such as the
chirality of individual atoms and planes of atoms, as well as properties
that are of interest only in biological structures. We may wish to add some
of these to our lists.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
5. SAMPLE CIFS WITH COMMENTS REMOVED
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

data_disordered_TNT

loop_
_tecton_topology_id                # List-reference
_tecton_topology_name              # Name e.g. full IUPAC name
_tecton_topology_formula
_tecton_topology_Zprime            # Number of symmetry independent
                                   # copies of the tecton in the crystal
_tecton_topology_special_details
TNT    '2,4,6 trinitrotoluene'  'C7 H5 N3 O6'  1  molecule
BNZ    '1,2,4,6 benzene ring'   'C6 H2'        1  moiety
NITRO  'nitro group'            'N O2'         2  group

loop_
_tecton_topology_atom_id             # List-reference
_tecton_topology_atom_tecton_id      # Child of _tecton_topology_id
_tecton_topology_atom_label
_tecton_topology_atom_type_symbol    # Child of _atom_type_symbol
_tecton_topology_atom_coord_number   # Number of bonds formed by this atom
_tecton_topology_atom_chirality
_tecton_topology_atom_details
T.C1   TNT    C1   C  3  .  ?
T.C2   TNT    C2   C  3  .  ?
T.C3   TNT    C3   C  3  .  ?
T.C4   TNT    C4   C  3  .  ?
T.C5   TNT    C5   C  3  .  ?
T.C6   TNT    C6   C  3  .  ?
T.C7   TNT    C7   C  4  .  ?
T.H3   TNT    H3   H  1  .  ?
T.H5   TNT    H5   H  1  .  ?
T.H71  TNT    H71  H  1  .  ?
T.H72  TNT    H72  H  1  .  ?
T.H73  TNT    H73  H  1  .  ?
T.N2   TNT    N2   N  3  .  ?
T.O21  TNT    O21  O  1  .  ?
T.O22  TNT    O22  O  1  .  ?
T.N4   TNT    N4   N  3  .  ?
T.O41  TNT    O41  O  1  .  ?
T.O42  TNT    O42  O  1  .  ?
T.N6   TNT    N6   N  3  .  ?
T.O61  TNT    O61  O  1  .  ?
T.O62  TNT    O62  O  1  .  ?
B.C1   BNZ    C1   C  3  .  'benzene ring'
B.C2   BNZ    C2   C  3  .  'benzene ring'
B.C3   BNZ    C3   C  3  .  'benzene ring'
B.C4   BNZ    C4   C  3  .  'benzene ring'
B.C5   BNZ    C5   C  3  .  'benzene ring'
B.C6   BNZ    C6   C  3  .  'benzene ring'
B.H3   BNZ    H3   H  1  .  'benzene ring'
B.H5   BNZ    H5   H  1  .  'benzene ring'
N.N1   NITRO  N1   N  3  .  'nitro group'
N.O1   NITRO  O1   O  1  .  'nitro group'
N.O2   NITRO  O2   O  1  .  'nitro group'

loop_
_tecton_topology_bond_id          # List-reference
_tecton_topology_bond_atom1_id    # Child of _tecton_topology_atom_id
_tecton_topology_bond_atom2_id    # Child of _tecton_topology_atom_id
_tecton_topology_bond_type
 1  T.C1  T.C2   delocalized   # TNT benzene ring
 2  T.C2  T.C3   delocalized
 3  T.C3  T.C4   delocalized
 4  T.C4  T.C5   delocalized
 5  T.C5  T.C6   delocalized
 6  T.C6  T.C1   delocalized
 7  T.C3  T.H3   single
 8  T.C5  T.H5   single
 9  T.C7  T.C1   single        # TNT Methyl group
10  T.C7  T.H71  single
11  T.C7  T.H72  single
12  T.C7  T.H73  single
13  T.N2  T.C2   single        # TNT N2 nitro group
14  T.N2  T.O21  delocalized
15  T.N2  T.O22  delocalized
16  T.N4  T.C4   single        # TNT N4 nitro group
17  T.N4  T.O41  delocalized
18  T.N4  T.O42  delocalized
19  T.N6  T.C6   single        # TNT N6 nitro group
20  T.N6  T.O61  delocalized
21  T.N6  T.O62  delocalized
22  B.C1  B.C2   delocalized   # 1,2,4,6 substituted benzene ring
23  B.C2  B.C3   delocalized
24  B.C3  B.C4   delocalized
25  B.C4  B.C5   delocalized
26  B.C5  B.C6   delocalized
27  B.C6  B.C1   delocalized
28  B.C3  B.H3   single
29  B.C5  B.H5   single
30  B.C1  .      single        # A dangling bond
31  B.C2  .      single
32  B.C4  .      single
33  B.C6  .      single
34  N.N1  N.O1   delocalized   # Nitro group
35  N.N1  N.O2   delocalized
36  N.N1  .      single

loop_
_tecton_conformer_id             # List-reference
_tecton_conformer_tecton_id      # Child of _tecton_topology_id
_tecton_conformer_point_group    # Schoenflies point group symbol of conformer
_tecton_conformer_chirality
_tecton_conformer_details
TNTaa  TNT  Cs  achiral  'TNTaa conformer'
TNTbb  TNT  Cs  achiral  'TNTbb conformer'
TNTab  TNT  C1  +x       'TNTab conformer'
TNTba  TNT  C1  -x       'TNTba conformer'

loop_
_tecton_conformer_equiv_id             # List-reference
_tecton_conformer_equiv_conformer_id   # Child of _tecton_conformer_id
_tecton_conformer_equiv_label          # Parent to _tecton_geom_atom_conformer_label
1  TNTaa  all
2  TNTaa  aa
3  TNTbb  all
4  TNTbb  bb
5  TNTab  all
6  TNTab  ab
7  TNTba  all
8  TNTba  ba

loop_
_tecton_geom_atom_id                # List-reference, child of _tecton_topology_atom_id
_tecton_geom_atom_conformer_label   # Child of _tecton_conformer_equiv_label
_tecton_geom_atom_coord_x           # Coordinates of atom in Angstrom
_tecton_geom_atom_coord_y           #
_tecton_geom_atom_coord_z           #
_tecton_geom_atom_details
T.C1  all  0.037  0.146  -0.124  ?
T.C2  all  1.378  0.562   0.134  ?
T.C3  all  1.846  1.421   0.204  ?
T.C4  all  2.567  1.834   0.304  ?
T.C5  all  1.745  1.563   0.245  ?
T.C6  all  0.962  0.498   0.103  ?
T.H3  all  2.13   1.72    0.24   ?
T.H5  all  1.84   2.05    0.36   ?

loop_
_tecton_geom_dist_id                # List-reference
_tecton_geom_dist_conformer_label   # Child of _tecton_conformer_equiv_label
_tecton_geom_dist_atom1_id          # Child of _tecton_topology_atom_id
_tecton_geom_dist_atom2_id          # Child of _tecton_topology_atom_id
_tecton_geom_dist_distance          # Distance atom1-atom2 in Angstroms
 1  all  T.C7  T.C1   1.54   # TNT methyl group
 2  all  T.C7  T.H71  1.05
 3  all  T.C7  T.H72  1.05
 4  all  T.C7  T.H73  1.05
 5  all  T.N4  T.C4   1.43   # TNT N4 nitro group
 6  all  T.N4  T.O41  1.18
 7  all  T.N4  T.O42  1.18
 8  all  T.N2  T.C2   1.43   # TNT N2 nitro group
 9  all  T.N2  T.O21  1.18
10  all  T.N2  T.O22  1.18
11  all  T.N6  T.C6   1.43   # TNT N6 nitro group
12  all  T.N6  T.O61  1.18
13  all  T.N6  T.O62  1.18

loop_
_tecton_geom_angle_id                # List-reference
_tecton_geom_angle_conformer_label   # Child of _tecton_conformer_equiv_label
_tecton_geom_angle_atom1_id          # Child of _tecton_topology_atom_id
_tecton_geom_angle_atom2_id          # Child of _tecton_topology_atom_id
_tecton_geom_angle_atom3_id          # Child of _tecton_topology_atom_id
_tecton_geom_angle_angle             # Angle in degrees
 1  all  T.C1   T.C7  T.H71  109   # TNT Methyl group
 2  all  T.C1   T.C7  T.H72  109
 3  all  T.C1   T.C7  T.H73  109
 4  all  T.H71  T.C7  T.H72  109
 5  all  T.H72  T.C7  T.H73  109
 6  all  T.H73  T.C7  T.H71  109
 7  all  T.O41  T.N4  T.C4   117   # TNT N4 nitro group
 8  all  T.O42  T.N4  T.C4   117
 9  all  T.O41  T.N4  T.O42  126
10  all  T.O21  T.N2  T.C2   117   # TNT N2 nitro group
11  all  T.O22  T.N2  T.C2   117
12  all  T.O21  T.N2  T.O22  126
13  all  T.O61  T.N6  T.C6   117   # TNT N6 nitro group
14  all  T.O62  T.N6  T.C6   117
15  all  T.O61  T.N6  T.O62  126

loop_
_tecton_geom_torsion_id                # List-reference
_tecton_geom_torsion_conformer_label   # Child of _tecton_conformer_equiv_label
_tecton_geom_torsion_atom1_id          # Child of _tecton_topology_atom_id
_tecton_geom_torsion_atom2_id          # Child of _tecton_topology_atom_id
_tecton_geom_torsion_atom3_id          # Child of _tecton_topology_atom_id
_tecton_geom_torsion_atom4_id          # Child of _tecton_topology_atom_id
_tecton_geom_torsion_angle             # Torsion angle in degrees
1  all  T.C3  T.C4  T.N4  T.O41   90
2  aa   T.C1  T.C2  T.N2  T.O21   10.5
3  aa   T.C1  T.C6  T.N6  T.O61   10.5
4  bb   T.C1  T.C2  T.N2  T.O21  -10.5
5  bb   T.C1  T.C6  T.N6  T.O61  -10.5
6  ab   T.C1  T.C2  T.N2  T.O21   10.5
7  ab   T.C1  T.C6  T.N6  T.O61  -10.5
8  ba   T.C1  T.C2  T.N2  T.O21  -10.5
9  ba   T.C1  T.C6  T.N6  T.O61   10.5

loop_
_map_tecton_atom_map_id      # List-reference
_map_tecton_atom_atom1_id    # Child of _tecton_topology_atom_id
_map_tecton_atom_atom2_id    # Child of _tecton_topology_atom_id
 1  B.C1  T.C1    # mapping 1,2,4,6 benzene moiety onto TNT
 2  B.C2  T.C2
 3  B.C3  T.C3
 4  B.C4  T.C4
 5  B.C5  T.C5
 6  B.C6  T.C6
 7  B.H3  T.H3
 8  B.H5  T.H5
 9  N.N1  T.N2    # mapping the nitro group onto the TNT N2 group
10  N.O1  T.O21
11  N.O2  T.O22
12  N.N1  T.N4    # mapping the nitro group onto the TNT N4 group
13  N.O1  T.O41
14  N.O2  T.O42
15  N.N1  T.N6    # mapping the nitro group onto the TNT N6 group
16  N.O1  T.O61
17  N.O2  T.O62

loop_
_map_tecton2crystal_atom_id                # List-reference
_map_tecton2crystal_atom_atom_id           # Child of _tecton_topology_atom_id
_map_tecton2crystal_atom_tecton_label      # Child of _tecton_conformer_equiv_label
_map_tecton2crystal_atom_occup_number      # Occupation number of tecton atom
_map_tecton2crystal_atom_atom_site_label   # Child of _atom_site_label
_map_tecton2crystal_atom_symop_id          # Child of _space_group_symop_id
 1  T.C1   all  1    C1    1
 2  T.C2   all  1    C2    1
 3  T.C3   all  1    C3    1
 4  T.C4   all  1    C4    1
 5  T.C5   all  1    C3    2
 6  T.C6   all  1    C2    2
 7  T.H3   all  1    H3    1
 8  T.H5   all  1    H3    2
 9  T.C7   all  1    C7    1
10  T.H71  all  1    H71   1
11  T.H72  all  1    H72   1
12  T.H73  all  1    H71   2
13  T.N4   all  1    N4    1
14  T.O41  all  1    O41   1
15  T.O42  all  1    O42   1
16  T.N2   aa   0.5  N2a   1
17  T.O21  aa   0.5  O21a  1
18  T.O22  aa   0.5  O22a  1
19  T.N6   aa   0.5  N2a   2
20  T.O61  aa   0.5  O21a  2
21  T.O62  aa   0.5  O22a  2
22  T.N2   bb   0.5  N2b   1
23  T.O21  bb   0.5  O21b  1
24  T.O22  bb   0.5  O22b  1
25  T.N6   bb   0.5  N2b   2
26  T.O61  bb   0.5  O21b  2
27  T.O62  bb   0.5  O22b  2
############ End of first CIF ################
#
############ Start of second CIF #############

data_Ca_Cr_F5

_chemical_formula_sum       'Ca Cr F5'
_cell_length_a              9.0050
_cell_length_b              6.4720
_cell_length_c              7.5330
_cell_angle_alpha           90.00
_cell_angle_beta            115.85
_cell_angle_gamma           90.00
_cell_formula_units_Z       8
_space_group_name_H-M_alt   'C 2/c'
_space_group_name_Hall      '-C 2yc'

loop_
_space_group_symop_id
_space_group_symop_operation_xyz
1  ' X, Y, Z'
2  '-X, Y,-Z+1/2'
3  '-X,-Y,-Z'
4  ' X,-Y, Z+1/2'
5  ' X+1/2, Y+1/2, Z'
6  '-X+1/2, Y+1/2,-Z+1/2'
7  '-X+1/2,-Y+1/2,-Z'
8  ' X+1/2,-Y+1/2, Z+1/2'

loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_U_iso_or_equiv
_atom_site_adp_type
Ca1   0.50000   0.04260   0.25000  0.10000  Uiso
Cr1   0.00000   0.00000   0.00000  0.10000  Uiso
F1    0.00970  -0.29340  -0.02910  0.10000  Uiso
F2   -0.22730  -0.02300  -0.11740  0.10000  Uiso
F3    0.00000  -0.07210   0.25000  0.10000  Uiso
#
loop_
_geom_bond_atom_site_label_1
_geom_bond_atom_site_label_2
_geom_bond_distance
_geom_bond_site_symmetry_1
_geom_bond_site_symmetry_2
Ca1  F1  2.391  1_555  5_555
Ca1  F1  2.391  1_555  6_555
Ca1  F1  2.292  1_555  7_545
Ca1  F1  2.292  1_555  8_545
Ca1  F2  2.215  1_555  3_555
Ca1  F2  2.215  1_555  4_655
Ca1  F3  2.494  1_555  5_555
Cr1  F1  1.918  1_555  1_555
Cr1  F1  1.918  1_555  3_555
Cr1  F2  1.848  1_555  1_555
Cr1  F2  1.848  1_555  3_555
Cr1  F3  1.940  1_555  1_555
Cr1  F3  1.940  1_555  3_555

loop_
_tecton_topology_id                # List-reference
_tecton_topology_formula
_tecton_topology_special_details
1  'Ca Cr F5'  'The formula unit'

loop_
_tecton_topology_atom_id             # List-reference
_tecton_topology_atom_tecton_id      # Child of _tecton_topology_id
_tecton_topology_atom_label
_tecton_topology_atom_type_symbol    # Child of _atom_type_symbol
_tecton_topology_atom_valence        # Formal oxidation state
_tecton_topology_atom_coord_number   # Number of bonds formed by this atom
_tecton_topology_atom_details
Ca  1  Ca1  Ca   2  7  ?
Cr  1  Cr1  Cr   3  6  ?
F1  1  F1   F   -1  3  ?
F2  1  F2   F   -1  2  ?
F3  1  F3   F   -1  3  ?
F4  1  F4   F   -1  3  'Related to F1 by crystallographic symmetry'
F5  1  F5   F   -1  2  'Related to F2 by crystallographic symmetry'

loop_
_tecton_topology_bond_id           # List-reference
_tecton_topology_bond_atom_id_1    # Child of _tecton_topology_atom_id
_tecton_topology_bond_atom_id_2    # Child of _tecton_topology_atom_id
_tecton_topology_bond_valence      # Predicted bond valence
_tecton_topology_bond_type
Cr.F1    Cr  F1  0.48  ?
Cr.F4    Cr  F4  0.48  ?
Cr.F2    Cr  F2  0.61  ?
Cr.F5    Cr  F5  0.61  ?
Cr.F3.1  Cr  F3  0.41  ?
Cr.F3.2  Cr  F3  0.41  ?
Ca.F1.1  Ca  F1  0.26  ?
Ca.F1.2  Ca  F1  0.26  ?
Ca.F4.1  Ca  F4  0.26  ?
Ca.F4.2  Ca  F4  0.26  ?
Ca.F2    Ca  F2  0.39  ?
Ca.F5    Ca  F5  0.39  ?
Ca.F3    Ca  F3  0.18  ?

loop_
_tecton_geom_dist_id          # List-reference
_tecton_geom_dist_atom1_id    # Child of _tecton_topology_atom_id
_tecton_geom_dist_atom2_id    # Child of _tecton_topology_atom_id
_tecton_geom_dist_distance    # Ideal bond distance in Angstroms
_tecton_geom_dist_valence     # Bond valence corresponding to dist.
_tecton_geom_dist_details
A  Cr  F1  1.93  0.48  'Bond distances calculated from bond valences'
B  Cr  F4  1.93  0.48  'Bond distances calculated from bond valences'
C  Cr  F2  1.84  0.61  'Bond distances calculated from bond valences'
D  Cr  F5  1.84  0.61  'Bond distances calculated from bond valences'
E  Cr  F3  1.99  0.41  'Bond distances calculated from bond valences'
F  Cr  F3  1.99  0.41  'Bond distances calculated from bond valences'
G  Ca  F1  2.34  0.26  'Bond distances calculated from bond valences'
H  Ca  F1  2.34  0.26  'Bond distances calculated from bond valences'
I  Ca  F4  2.34  0.26  'Bond distances calculated from bond valences'
J  Ca  F4  2.34  0.26  'Bond distances calculated from bond valences'
K  Ca  F2  2.19  0.39  'Bond distances calculated from bond valences'
L  Ca  F5  2.19  0.39  'Bond distances calculated from bond valences'
M  Ca  F3  2.48  0.18  'Bond distances calculated from bond valences'

loop_
_map_tecton2crystal_atom_id                # List-reference
_map_tecton2crystal_atom_atom_id           # Child of _tecton_topology_atom_id
_map_tecton2crystal_atom_atom_site_label   # Child of _atom_site_label
_map_tecton2crystal_atom_symop_id          # Child of _space_group_symop_id
_map_tecton2crystal_atom_trans_x
_map_tecton2crystal_atom_trans_y
_map_tecton2crystal_atom_trans_z
1  Ca  Ca1  1  0 0 0
2  Cr  Cr1  1  0 0 0
3  F1  F1   1  0 0 0
4  F2  F2   1  0 0 0
5  F3  F3   1  0 0 0
6  F4  F1   3  0 0 0
7  F5  F2   3  0 0 0

loop_
_map_tecton2crystal_bond_id                  # List-reference
_map_tecton2crystal_bond_bond_id             # Child of _tecton_topology_bond_id
_map_tecton2crystal_bond_atom_site_label_1   # Child of _atom_site_label
_map_tecton2crystal_bond_symop_1             # Child of _space_group_symop_id
_map_tecton2crystal_bond_trans_x_1
_map_tecton2crystal_bond_trans_y_1
_map_tecton2crystal_bond_trans_z_1
_map_tecton2crystal_bond_atom_site_label_2   # Child of _atom_site_label
_map_tecton2crystal_bond_symop_2             # Child of _space_group_symop_id
_map_tecton2crystal_bond_trans_x_2
_map_tecton2crystal_bond_trans_y_2
_map_tecton2crystal_bond_trans_z_2
_map_tecton2crystal_bond_dist                # Observed distance (optional)
_map_tecton2crystal_bond_details
 1  Cr.F1    Cr1  1  0 0 0   F1  1  0  0 0   1.918  ?
 2  Cr.F4    Cr1  1  0 0 0   F4  1  0  0 0   1.918  ?
 3  Cr.F2    Cr1  1  0 0 0   F2  1  0  0 0   1.848  ?
 4  Cr.F5    Cr1  1  0 0 0   F5  1  0  0 0   1.848  ?
 5  Cr.F3.1  Cr1  1  0 0 0   F3  1  0  0 0   1.940  ?
 6  Cr.F3.2  Cr1  1  0 0 0   F3  3  0  0 0   1.940  link
 7  Ca.F1.1  Ca1  1  0 0 0   F1  5  0  0 0   2.391  link
 8  Ca.F1.2  Ca1  1  0 0 0   F1  6  0  0 0   2.292  link
 9  Ca.F4.1  Ca1  1  0 0 0   F4  5  0 -1 0   2.391  link
10  Ca.F4.2  Ca1  1  0 0 0   F4  6  0 -1 0   2.292  link
11  Ca.F2    Ca1  1  0 0 0   F5  1  0  0 0   2.215  ?
12  Ca.F5    Ca1  1  0 0 0   F2  4  1  0 0   2.215  link
13  Ca.F3    Ca1  1  0 0 0   F3  5  0  0 0   2.494  link
# End of sample CIFs
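
As a rough illustration of how the ideal distances in the tecton_geom_dist
loop can follow from the bond valences in the tecton_topology_bond loop, the
Python sketch below uses the usual exponential bond valence expression
s = exp((R0 - R)/b), i.e. R = R0 - b ln(s). The parameters R0(Cr-F) = 1.657,
R0(Ca-F) = 1.842 and b = 0.37 Angstroms are assumptions taken from commonly
tabulated values; they are not stated in this draft, although they do
reproduce the distances listed for CaCrF5. The function and variable names
are illustrative only.

```python
import math

# Assumed bond valence parameters in Angstroms (not given in the draft).
B = 0.37
R0 = {("Cr", "F"): 1.657, ("Ca", "F"): 1.842}

# (bond id, atom 1, atom 2, predicted valence), copied from the
# tecton_topology_bond loop of the CaCrF5 example.
BONDS = [
    ("Cr.F1", "Cr", "F1", 0.48), ("Cr.F4", "Cr", "F4", 0.48),
    ("Cr.F2", "Cr", "F2", 0.61), ("Cr.F5", "Cr", "F5", 0.61),
    ("Cr.F3.1", "Cr", "F3", 0.41), ("Cr.F3.2", "Cr", "F3", 0.41),
    ("Ca.F1.1", "Ca", "F1", 0.26), ("Ca.F1.2", "Ca", "F1", 0.26),
    ("Ca.F4.1", "Ca", "F4", 0.26), ("Ca.F4.2", "Ca", "F4", 0.26),
    ("Ca.F2", "Ca", "F2", 0.39), ("Ca.F5", "Ca", "F5", 0.39),
    ("Ca.F3", "Ca", "F3", 0.18),
]


def ideal_distance(r0, valence, b=B):
    """Invert s = exp((r0 - r) / b) to get the distance r for valence s."""
    return r0 - b * math.log(valence)


def valence_sums(bonds):
    """Sum the bond valences around every atom id in the bond list."""
    sums = {}
    for _bond_id, atom1, atom2, valence in bonds:
        sums[atom1] = sums.get(atom1, 0.0) + valence
        sums[atom2] = sums.get(atom2, 0.0) + valence
    return sums


if __name__ == "__main__":
    for bond_id, cation, _anion, valence in BONDS:
        r = ideal_distance(R0[(cation, "F")], valence)
        print(f"{bond_id:8s} {valence:4.2f} {r:5.2f}")
    for atom, total in sorted(valence_sums(BONDS).items()):
        print(f"{atom:3s} {total:4.2f}")
```

With these inputs the valence sums come to 3 for Cr, 2 for Ca and 1 for each
F, matching the magnitudes of the formal oxidation states given in the
tecton_topology_atom loop.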
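
The bond mapping for CaCrF5 can be checked in the same spirit. The Python
sketch below applies a symmetry operation and a lattice translation to an
atom site, as in the _map_tecton2crystal_bond loop, and computes the
resulting distance in the monoclinic cell. The helper names (apply_symop,
site_position, distance) are hypothetical, only the three symmetry
operations needed for the chosen rows are included, and the distance formula
assumes alpha = gamma = 90 degrees as in this example.

```python
import math
from fractions import Fraction

# Cell and atom sites copied from the CaCrF5 example.
A, B, C, BETA = 9.0050, 6.4720, 7.5330, 115.85
SITES = {
    "Ca1": (0.5, 0.0426, 0.25),
    "Cr1": (0.0, 0.0, 0.0),
    "F1": (0.0097, -0.2934, -0.0291),
    "F2": (-0.2273, -0.0230, -0.1174),
    "F3": (0.0, -0.0721, 0.25),
}
SYMOPS = {1: "X,Y,Z", 3: "-X,-Y,-Z", 5: "X+1/2,Y+1/2,Z"}


def apply_symop(op, xyz):
    """Apply an operation such as '-X+1/2,Y+1/2,-Z+1/2' to fractional xyz."""
    values = dict(zip("XYZ", xyz))
    result = []
    for component in op.replace(" ", "").upper().split(","):
        total = 0.0
        # Put a '+' before each '-' so every term can be split off in turn.
        for term in component.replace("-", "+-").split("+"):
            if not term:
                continue
            sign = -1.0 if term.startswith("-") else 1.0
            body = term.lstrip("-")
            if body in values:
                total += sign * values[body]
            else:
                total += sign * float(Fraction(body))
        result.append(total)
    return tuple(result)


def site_position(label, symop_id, trans=(0, 0, 0)):
    """Position of a crystal atom: symmetry operation, then lattice shift."""
    xyz = apply_symop(SYMOPS[symop_id], SITES[label])
    return tuple(x + t for x, t in zip(xyz, trans))


def distance(p, q):
    """Distance in Angstroms between two fractional points (monoclinic)."""
    dx, dy, dz = (qi - pi for pi, qi in zip(p, q))
    cos_beta = math.cos(math.radians(BETA))
    d2 = ((A * dx) ** 2 + (B * dy) ** 2 + (C * dz) ** 2
          + 2.0 * A * C * dx * dz * cos_beta)
    return math.sqrt(d2)


if __name__ == "__main__":
    # (bond id, site 1, symop 1, site 2, symop 2, listed distance)
    rows = [
        ("Cr.F1", "Cr1", 1, "F1", 1, 1.918),
        ("Cr.F3.2", "Cr1", 1, "F3", 3, 1.940),
        ("Ca.F1.1", "Ca1", 1, "F1", 5, 2.391),
        ("Ca.F3", "Ca1", 1, "F3", 5, 2.494),
    ]
    for bond_id, s1, op1, s2, op2, listed in rows:
        d = distance(site_position(s1, op1), site_position(s2, op2))
        print(f"{bond_id:8s} calculated {d:.3f}  listed {listed:.3f}")
```

Only rows whose atom_site labels appear in the atom_site loop are used here,
all with zero lattice translation; a non-zero translation would be passed as
the third argument of site_position.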
_______________________________________________
coreCIFchem mailing list
coreCIFchem@iucr.org
http://scripts.iucr.org/mailman/listinfo/corecifchem