

RE: Discussion #2

  • To: "Chemical information in core CIF" <corecifchem@iucr.org>
  • Subject: RE: Discussion #2
  • From: "Bollinger, John Clayton" <jobollin@indiana.edu>
  • Date: Mon, 1 Dec 2003 17:47:06 -0500

David Brown wrote:

[...]

>     HDF, coming at this problem from a different direction, 
> expressed shock at the thought that the chemists have no 
> unique way of defining a molecule.
> 
>     From what I learned at the workshop, chemists not only
> have no unique way of defining a molecule, they don't even
> care!  Most of the ontologies (dictionaries) being developed
> in chemistry are, like CIF, closely related to an experimental
> technique where the question of defining a molecule is not
> important.  Peter Murray-Rust's CML defines a molecule as
> composed of atoms, but leaves it to the author to state which
> atoms.  Miloslav Nic in Prague is developing GTML (Graph Theory
> Mark-up Language) which can be used for molecular descriptions,
> but this also assumes that the molecular graph is already known.

[...]

> properties.  The use of graph theory separates out the 
> intrinsically chemical concepts from the graph theoretical 
> description that can be manipulated mathematically.
> 
>     The bond graph represents a chemical interpretation of
> the 3D geometry, i.e., the geometry tells us which atoms are
> neighbours but not where the bonds are to be found.  The bonds
> are assigned by applying various rules relating to the chemical
> properties of the atoms.  However, not all bonds are of equal
> value; some are clearly stronger than others, i.e., they
> survive many of the physical and chemical treatments we can
> subject them to such as melting or dissolution in a solvent.
> Weaker bonds do not survive this treatment.  We can thus
> imagine that each of the possible edges in the graph is
> associated with a number representing its 'strength'.  (The
> 'strength' would be zero for the edges that could not possibly
> represent a chemical bond).  I will deliberately avoid defining
> in detail what I mean by 'strength', but qualitatively it
> represents the number of electron pairs associated with the
> bond and it obeys (by definition in some treatments) the rule
> that the sum of the bond 'strengths' received by any atom is
> equal to the number of valence electrons the atom uses for
> bonding.  (Implicit in this description is the notion that a
> bond 'strength' is not restricted to integer values).  For
> over a century chemists have struggled to find a tight
> quantitative definition for bond 'strength' under such names
> as bond order, bond number, bond valence, electrostatic bond
> strength, etc., each definition trying to capture the concept
> in numeric form.  All of these definitions are incomplete in
> one way or another, but in principle they allow us to order
> the bonds from strongest to weakest.  Assuming that we can at
> least determine this order even if we cannot assign actual
> numbers to the bond 'strength', our problem then reduces to
> the question of where to place the cut-off between the bonds
> that are shown on the graph and those that are omitted.

[...]

>     Before we try to define CIF items for particular chemical
> concepts, we need to have a consensus about the definition of
> a molecule.  I have made some suggestions above, and I would be
> interested in people's comments.  Is graph theory a fruitful
> way to go or should we take a different approach?  What are
> the problems we might encounter using the approach described
> above?

I find the idea of relating chemical properties to a bond graph to be
rather attractive, although I confess to the influence of a bit of
background in formal mathematics.

One aspect that David left unexplored is the possibility of applying
multiple properties to bond graph edges.  One need not choose a single
measure of bond strength, nor make measures of bond strength the only
properties a bond graph edge may have.  For instance, one could apply an
explicit bond categorization (e.g. "covalent", "dative", "hydrogen
bond", "non-bond").  One could also express purely chemical information,
such as the fact that "this is the bond that is broken in the course of
the von Foo reaction".  CIF is conveniently flexible in this regard, as
authors may include exactly the properties they wish to describe while
ignoring all others.
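
To make the idea of several optional properties per edge concrete, here
is a minimal Python sketch.  Everything in it is an assumption made for
illustration: the property names (distance, category, valence, note),
the atom labels, and the numbers are hypothetical and are not proposed
CIF data names or real measurements.

```python
from typing import Any

# Hypothetical sketch: each bond-graph edge is an unordered pair of atom
# labels mapped to an open-ended set of properties.  An author records
# only the properties he or she cares about.
bond_graph = {}

def add_edge(a: str, b: str, **properties: Any) -> None:
    """Record, or extend, the properties of the edge between atoms a and b."""
    bond_graph.setdefault(frozenset((a, b)), {}).update(properties)

def edges_with(prop: str):
    """Return the sorted atom pairs of every edge that defines `prop`."""
    return sorted(tuple(sorted(edge))
                  for edge, props in bond_graph.items() if prop in props)

# Illustrative values only.
add_edge("C1", "O1", distance=1.43, category="covalent")
add_edge("O1", "H1", distance=0.96, category="covalent", valence=1.0)
add_edge("H1", "O2", distance=1.85, category="hydrogen bond")
# A purely chemical annotation added later to an existing edge.
add_edge("O1", "C1", note="broken in the von Foo reaction")
```

Because the edge key is unordered, the last call extends the C1-O1 edge
rather than creating a second one, and edges_with("valence") finds only
the one edge for which a valence was supplied.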

One oddity I see with that approach is that the interatomic distances
currently described by _geom_bond_distance fit nicely into the
collection of properties that could be associated with bond graph edges,
but many other _geom_* items do not.  Graph theorists do have concepts
that could be applied there, but we must take care to avoid making CIF
(more) incomprehensible to mere mortals.  Perhaps, though, it does make
sense to consider whether all the various geom_* categories should be
subsumed into a scheme such as this -- they are all examples of data
that have both crystallographic significance and chemical significance.
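
One possible reading of why the other geometry items fit less neatly,
offered only as an assumption: a bond angle involves three atoms, so it
belongs to a pair of edges sharing a middle atom rather than to a
single edge.  The short Python sketch below enumerates such pairs from
an edge list; the function name angle_paths and the atom labels are
hypothetical.

```python
from itertools import combinations

def angle_paths(edges):
    """List every path a-b-c formed by two edges that share the atom b."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    paths = []
    for centre in sorted(neighbours):
        # Each unordered pair of neighbours of `centre` defines one angle.
        for a, c in combinations(sorted(neighbours[centre]), 2):
            paths.append((a, centre, c))
    return paths

# Illustrative water-like fragment: two edges meeting at O1.
water_edges = [("O1", "H1"), ("O1", "H2")]
water_angles = angle_paths(water_edges)
```

Here the single angle H1-O1-H2 is a property of the two-edge path, not
of either edge alone, which is the sense in which distances and angles
would sit at different levels of a bond-graph scheme.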

As chemists in general have no single consistent definition of a
molecule, it would be fruitless for us to attempt to impose a universal
one of our own in hopes of satisfying everyone.  The alternatives I see
are

(1) to choose our own definition for CIF purposes and use it
consistently;
(2) to support diverse CIF items with which to describe multiple
different molecule concepts;
(3) to provide sufficient data for a chemist to apply his or her own
definition of "molecule"; or
(4) to attempt to ignore molecules altogether.

Although (4) is perhaps most true to pure crystallography, I think it is
least suitable for our purpose.  Option (2) strikes me as inelegant and
short-sighted.  Option (1) might be feasible if we could actually come
up with a suitable definition, but I question whether that is possible.
That leaves option (3), to which category I would assign most
applications of the bond graph approach that I have imagined so far.
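
As a rough illustration of option (3), suppose a CIF supplied a
'strength' for each edge and left the cut-off to the reader.  A chemist
could then derive molecules as the connected pieces of the graph that
remain once weaker edges are dropped.  The Python sketch below assumes
exactly that; the function name, the atom labels, and the strengths are
made up for the example and correspond to no particular bond-strength
definition.

```python
def molecules(atoms, edges, cutoff):
    """Group atoms into connected components, keeping only those edges
    whose strength is at least `cutoff`."""
    neighbours = {atom: set() for atom in atoms}
    for (a, b), strength in edges.items():
        if strength >= cutoff:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, groups = set(), []
    for start in atoms:
        if start in seen:
            continue
        # Depth-first walk over the retained edges from `start`.
        stack, group = [start], set()
        while stack:
            atom = stack.pop()
            if atom in group:
                continue
            group.add(atom)
            stack.extend(neighbours[atom] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

# Two water-like fragments joined by one weak contact (made-up numbers).
atoms = ["O1", "H1", "H2", "O2", "H3", "H4"]
edges = {
    ("O1", "H1"): 0.8, ("O1", "H2"): 1.0,
    ("O2", "H3"): 1.0, ("O2", "H4"): 1.0,
    ("H1", "O2"): 0.2,  # weak contact between the two fragments
}
strict = molecules(atoms, edges, cutoff=0.5)
loose = molecules(atoms, edges, cutoff=0.1)
```

With the stricter cut-off the weak contact is ignored and two separate
three-atom molecules result; with the looser one all six atoms form a
single unit.  The data are the same in both cases, and only the
reader's choice of cut-off differs.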


John Bollinger

--

John C. Bollinger, Ph.D.
Indiana University
Molecular Structure Center

jobollin@indiana.edu 