Re: Accent escape sequences
- To: jrh@anbf2.kek.jp, "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: Accent escape sequences
- From: Joe Krahn <krahn@niehs.nih.gov>
- Date: Tue, 06 Mar 2007 10:41:22 -0500
- In-Reply-To: <1173161912.31606.47.camel@anbf10>
- References: <45E72969.1090100@niehs.nih.gov> <20070302101147.GA26353@emerald.iucr.org> <Pine.BSF.4.58.0703020830490.46806@epsilon.pair.com> <45EA0C29.5060604@niehs.nih.gov> <a06230900c20fde7910a9@[192.168.2.101]> <45EC3846.5070001@niehs.nih.gov> <20070305160044.GB13871@emerald.iucr.org><1173161912.31606.47.camel@anbf10>
James Hester wrote:
> Thinking about the mechanics of implementing these suggestions, it would
> make sense to define different types of text field using the
> _item_type_list.code DDL2 attribute. Currently mmCIF appears to have
> only 'text' for multiline data, and imgCIF has 'binary' in addition to
> this. A new type (e.g. 'mime') could specify a regex that matches a
> mime header, something like what is done for the imgCIF 'binary' type.

Should these follow standard MIME types for better standardization, or should they just have an optional subtype code specific to CIF, to allow for better specialization? I would go with the latter.

> A variation on this would be to define a larger number of
> _item_type_list.codes corresponding to the various text formats of
> interest, for example 'ascii_markup','tex','html','mathml'. This would
> mean that the format of a given data item would be determined at
> dictionary writing time if a single type code is given in the
> dictionary. While this might work and be quite useful when writing
> dictionaries, it is probably too onerous when producing data files. So
> the data dictionary would specify a list of possible text type codes,
> and a magic number or mime header would be useful in the data item text
> field in order to disambiguate.

Why is that too onerous for text fields? CIF already has the problem that everything is plain text in the absence of a dictionary, yet there is no numeric flag. CIF would be much more self-defined if there were one, but the current design is to base everything on a dictionary.

In general, a generic CIF parser should be able to handle all of the text fields unprocessed, and leave it to the reader to make sense of them. That is why multipart/alternative is good; even the raw form is readable. The only caveat added by MIME is that regions between the multi-part boundaries may contain the "<eof>;" end mark.
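As a rough illustration of the points above, here is a minimal Python sketch of how a reader-side tool might treat such a field. It is only a sketch under stated assumptions: the regex, the function names and the sample field are all hypothetical and are not taken from any CIF dictionary or from imgCIF. It uses the standard-library email parser to split a multipart/alternative body, and it flags lines that begin with a semicolon, since such a line would terminate a CIF semicolon-delimited text field early.

```python
import re
from email import message_from_string

# Hypothetical pattern for a 'mime' item type: the field must begin
# with a Content-Type header. Not taken from any real dictionary.
MIME_HEADER_RE = re.compile(r"^Content-Type:\s*[\w.+-]+/[\w.+-]+", re.IGNORECASE)


def looks_like_mime(field):
    """True if the text field appears to start with a MIME header."""
    return MIME_HEADER_RE.match(field.lstrip("\n")) is not None


def semicolon_lines(field):
    """1-based numbers of lines starting with ';'.

    Such a line would end a CIF text field prematurely, so a writer
    would have to escape or reject it.
    """
    return [n for n, line in enumerate(field.splitlines(), 1)
            if line.startswith(";")]


def alternatives(field):
    """Map content type to payload text for each part of the field."""
    msg = message_from_string(field.lstrip("\n"))
    if not msg.is_multipart():
        return {msg.get_content_type(): msg.get_payload()}
    return {part.get_content_type(): part.get_payload()
            for part in msg.get_payload()}


# Hypothetical field content with a plain-text and a TeX form.
SAMPLE = (
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "\n"
    "E = m c^2\n"
    "--b1\n"
    "Content-Type: text/x-tex\n"
    "\n"
    "$E = mc^2$\n"
    "--b1--\n"
)

if __name__ == "__main__":
    print(looks_like_mime(SAMPLE))
    print(sorted(alternatives(SAMPLE)))
    print(semicolon_lines(SAMPLE))
```

A generic CIF parser would not need any of this; it could pass the raw field through untouched, and only tools that care about the alternatives would call something like `alternatives()`.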

> Regarding the suggestion that there be several representations of the
> same text using a mime multipart approach, I think caution is warranted
> insofar as this might relate to dictionary data items (as opposed to
> data file data items), in that all of the parts should be kept
> synchronised, entailing more work, and work which involves specialised
> knowledge.

Perhaps alternative parts need to include a flag indicating which form is the authoritative representation. Then, if it is changed and the other form(s) are not, the alternatives must be deleted or marked "invalid" until they can be remade properly. However, it is certainly worth avoiding excessive use. In general, something like an equation should only be edited by the author(s), whereas most other users will handle it in "read only" form.

This could be used for something like internationalization of CIF dictionaries as well. In that case, I assume that English would be the primary reference, and other contributors could add translations. If the English part changes, the translations could be flagged as out-of-date until a language contributor can update them.

Of course, my native language is American English, so it is not a big deal for me. What do non-English crystallographers think? I have wondered about the possibility of having US and UK alternatives for words like metre. Or should we just declare US spelling as wrong?

Joe
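
One possible way to implement the "authoritative form plus out-of-date flag" idea is sketched below in Python. This is an assumption-laden illustration, not a proposal for CIF syntax: instead of a manually maintained flag, each alternative records a SHA-256 digest of the authoritative text it was derived from, so that staleness can be detected automatically. The class and method names (`MultiForm`, `Alternative`, `stale`) are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


def digest(text):
    """SHA-256 hex digest of a text, used as a change detector."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Alternative:
    text: str
    source_digest: str  # digest of the authoritative text it was made from


@dataclass
class MultiForm:
    authoritative: str  # e.g. the English definition or the source equation
    alternatives: dict = field(default_factory=dict)

    def add(self, label, text):
        """Add or refresh an alternative against the current authoritative text."""
        self.alternatives[label] = Alternative(text, digest(self.authoritative))

    def stale(self):
        """Labels of alternatives derived from an older authoritative text."""
        current = digest(self.authoritative)
        return sorted(label for label, alt in self.alternatives.items()
                      if alt.source_digest != current)


if __name__ == "__main__":
    item = MultiForm("Unit-cell length a, in angstroms.")
    item.add("fr", "Longueur a de la maille, en angstroms.")
    print(item.stale())   # nothing is out of date yet
    item.authoritative = "Unit-cell length a, in angstroms, at the measurement temperature."
    print(item.stale())   # the translation now needs updating
```

The same bookkeeping would apply to alternative renderings of an equation: only the authoritative part is edited, and the derived parts are reported as stale until someone regenerates them.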