Re: Powder CIF Proposals
- Subject: Re: Powder CIF Proposals
- From: "ROBIN SHIRLEY (USER)" <R.Shirley@xxxxxxxxxxxx>
- Date: Fri, 20 Oct 2000 13:48:50 +0100 (BST)
Apologies for these delayed responses to Nick's comments of 29 Sept:

1) 00-2-11.2) _pd_index_appendix

> > The sort of indexing history envisaged in my original proposal can
> > now be captured and updated automatically in the form of the
> > Crysfire logfile for that dataset - an example is attached.

> My main concern with these proposals is that I would like to see
> the dictionary definitions for these data items. Until then I
> accept that all of these proposed items are reasonable but I would
> like to know how I would parse their contents. For instance how
> will an indexed history be represented and parsed?

The intention is that this should essentially provide an opportunity
to record a human-readable summary of the indexing history of the
current dataset. Thus, even if a Crysfire logfile now forms a good
basis for this section, it should remain undefined free text, placed
between lines that contain a semicolon in column 1.

e.g.

  _pd_index_appendix
  ;
   Any human- or program-generated text that summarises
   the indexing history of this dataset.
  ;

2) _pd_index_merit

> > Thus this would become:
> > _pd_index_merit M FOM program
> >
> > (e.g. _pd_index_merit 21.7 M20 ITO12,
> > or _pd_index_merit 54.215 M1 CRYS934h).

> This is syntactically incorrect, and it begs the question are
> these 3 components of one object (a list) or 3 separate objects?

Thanks for pointing out the incorrect syntax, which slipped in while
the concept was developing in response to people's suggestions.

My original intention was that this should simply be a piece of
quoted text:

e.g.
  _pd_index_merit 'M20 = 21.7 (ITO12)'

But in response to people's feedback this became elaborated into a
sequence of three separate items (which could be looped if necessary):

  M         (the numerical value)
  FOM       (the generic type, left as quoted text)
  "program", or as I now prefer, "source"
            (another piece of quoted text, which for example
            summarises the program version or other source of the
            specific algorithm used)

e.g.
  _pd_index_merit_M       21.7
  _pd_index_merit_FOM     'M20'
  _pd_index_merit_source  'ITO12'

The reason for leaving the FOM and source items as quoted text is
that I see no early prospect of standardising them, and I doubt
whether such a restriction would actually be desirable.

Some possible FOM terms are still rather fluid. This is especially
true of the most popular, M20, whose original definition left it to
the judgement of the implementor what was meant by an "indexed" line,
and which has since been subject to various reinterpretations and
extensions. There are also newer FOM contenders (e.g. FN, M1, PM)
which outperform M20 for particular purposes.

Thus I'm not sure that we are ready to try to compile a list of
standardised FOM definitions. A way round this is to leave them as
text.

This argument applies more strongly in the case of "source", which
could then be left open to whatever elaboration might seem helpful,
such as the addition of a reference or a URL.

> I also have a general comment concerning worries of the potential
> size of data files and looped items. I think many are increasingly
> coming around to the idea that it is important to retain
> "primitive" (read non-derivable) data as much as possible. If these
> trial cells are "relevant" to the discipline then there should be a
> mechanism for retaining such information.

I have no particular position on this issue, except to point out that
one should perhaps not rely too boldly on the increasing storage
capacity of modern computers, since such lists could easily become
very large, and in the case of indexing most of their bulk would
refer to low-probability hypotheses.

This is why I tend to favour keeping relatively concise summary logs
in a section such as _pd_index_appendix rather than retaining more
bulky looped lists.
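
As an illustration of the parsing question raised above, here is a minimal Python sketch of how the three separate merit items and the free-text appendix might be read. It is not a CIF parser: the function name `parse_items` is hypothetical, loops are not handled, and `shlex` is used only as a stand-in for CIF's own quoting rules, which differ in detail.

```python
import shlex


def parse_items(cif_text):
    """Collect simple tag/value pairs and semicolon-delimited text fields.

    Simplified for illustration: no loop_ support, and values are split
    with shell-style quoting rather than true CIF quoting.
    """
    items = {}
    pending_tag = None
    lines = cif_text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(";") and pending_tag is not None:
            # Free-text field: runs until the next line with ';' in column 1.
            body = [line[1:]]
            i += 1
            while i < len(lines) and not lines[i].startswith(";"):
                body.append(lines[i])
                i += 1
            items[pending_tag] = "\n".join(body).strip()
            pending_tag = None
            i += 1  # step past the closing ';' line
            continue
        for token in shlex.split(line):
            if token.startswith("_"):
                pending_tag = token          # a data name
            elif pending_tag is not None:
                items[pending_tag] = token   # its value, quotes removed
                pending_tag = None
        i += 1
    return items


SAMPLE = """\
_pd_index_merit_M       21.7
_pd_index_merit_FOM     'M20'
_pd_index_merit_source  'ITO12'
_pd_index_appendix
;
Any human- or program-generated text that summarises
the indexing history of this dataset.
;
"""

items = parse_items(SAMPLE)
merit_value = float(items["_pd_index_merit_M"])   # only M is numeric
merit_type = items["_pd_index_merit_FOM"]         # left as free text
merit_source = items["_pd_index_merit_source"]    # left as free text
```

Only the M item is converted to a number; the FOM and source items are kept as uninterpreted strings, which matches the intention of leaving them unstandardised.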
Best wishes

Robin Shirley

-------------------------------------------------------
Date: Fri, 29 Sep 2000 08:25:33 +0100 (BST)
From: Nick Spadaccini <nick@cs.uwa.edu.au>
Subject: Re: Powder CIF Proposals

On Thu, 28 Sep 2000, ROBIN SHIRLEY (USER) wrote:

> 00-2-11.1) _pd_proc_quadr_Q (or _pd_index_quad_Q - see discussion
> below)
>
> I accept that if this could be derived directly from
> _pd_peak_d_spacing, then the case for including it would be weak,

The fact that a quantity may be directly derivable from another is NOT
an argument for its exclusion. Such an argument would (strictly) see
structure coordinates (as an extreme example) not defined since these
are derivable from intensity measurements.

The STAR developers have spent the last three years working on the
definition and semantics of the method attributes supported by STAR
and by inference CIF. This is the mechanism by which the exact
relationships between data items may be specified (algorithmically).
Hence in our prototype the dictionary (which is the MOST important
component of the STAR and CIF systems) is literally compiled into a
suite of Java and Python objects. A request for a data item results in
the value if stored in the data file or an invocation of the objects
which will eventually result in a value by evaluation.

My point here is that the fact that some quantity is derivable from
another is an important INCLUSION to be made into the dictionary
rather than a reason to exclude it.

> 00-2-11.2) _pd_index_appendix

> The sort of indexing history envisaged in my original proposal can
> now be captured and updated automatically in the form of the Crysfire
> logfile for that dataset - an example is attached.

My main concern with these proposals is that I would like to see the
dictionary definitions for these data items. Until then I accept that
all of these proposed items are reasonable but I would like to know
how I would parse their contents. For instance how will an indexed
history be represented and parsed?

> Thus this would become:
> _pd_index_merit M FOM program
>
> (e.g. _pd_index_merit 21.7 M20 ITO12,
> or _pd_index_merit 54.215 M1 CRYS934h).

This is syntactically incorrect, and it begs the question are these 3
components of one object (a list) or 3 separate objects?

I also have a general comment concerning worries of the potential size
of data files and looped items. I think many are increasingly coming
around to the idea that it is important to retain "primitive" (read
non-derivable) data as much as possible. If these trial cells are
"relevant" to the discipline then there should be a mechanism for
retaining such information.

cheers

Nick

--------------------------------
Dr Nick Spadaccini
Department of Computer Science      voice: +(61 8) 9380 3452
University of Western Australia     fax:   +(61 8) 9380 1089
Nedlands, Perth, WA 6907            email: nick@cs.uwa.edu.au
AUSTRALIA                           web:   http://www.cs.uwa.edu.au/~nick
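
The stored-or-derived lookup described in the quoted message (a request returns the stored value if present, otherwise a dictionary method is evaluated) can be sketched in Python as follows. This is only an illustration, not the STAR prototype: the names `METHODS`, `get_item` and `q_from_d` are hypothetical, and the relation Q = 10^4/d^2 is assumed here as the usual indexing convention rather than taken from any dictionary definition.

```python
def q_from_d(d_spacing):
    """Assumed convention for illustration: Q = 10^4 / d^2, d in angstroms."""
    return 1.0e4 / d_spacing ** 2


# Hypothetical method table: derived tag -> (source tag, function).
METHODS = {
    "_pd_proc_quadr_Q": ("_pd_peak_d_spacing", q_from_d),
}


def get_item(data, tag):
    """Return the stored value if present, otherwise derive it."""
    if tag in data:
        return data[tag]
    if tag in METHODS:
        source_tag, func = METHODS[tag]
        return func(get_item(data, source_tag))
    raise KeyError(tag)


# Derived on request, because only the d-spacing is stored.
derived_q = get_item({"_pd_peak_d_spacing": 2.0}, "_pd_proc_quadr_Q")

# A stored value takes precedence over the method.
stored_q = get_item(
    {"_pd_peak_d_spacing": 2.0, "_pd_proc_quadr_Q": 2499.0},
    "_pd_proc_quadr_Q",
)
```

With this arrangement a derivable item costs nothing in file size unless someone chooses to store it, which is one way to read the argument for including such items in the dictionary.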