Fwd: Re: A Search Engine for Searching Across Distributed Eprint Archives
- To: epc@iucr.org
- Subject: Fwd: Re: A Search Engine for Searching Across Distributed Eprint Archives
- From: Pete Strickland <ps@iucr.org>
- Date: Thu, 21 Oct 2004 09:11:57 +0100
- Organization: IUCr
---------- Forwarded Message ----------

Subject: Fwd: Re: A Search Engine for Searching Across Distributed Eprint Archives
Date: Wednesday 20 October 2004 4:42 pm
From: Barry Mahon <barry.mahon@IOL.IE>
To: ICSTI-L@DTIC.MIL

Dear All,

For those of you who don't see this list, a comment from Stevan Harnad about 'cross server' searching of articles.... As we know he doesn't think this presents a problem, something many would disagree with.... but here he argues, in effect, for a 'wait and see' policy for when we have all the material in OA.

Bye, Barry

On Wed, 20 Oct 2004, Donat Agosti wrote:

> Something, which bothers me and doesn't show up in most of the
> discussion of open access, is the construction of search tools across
> digital publications (and potentially millions of pages of legacy
> information). In the end, this will be the real issue, not just reading
> another publication face to face.

The real issue -- and the 1st, 2nd, 3rd and Nth priority today -- is Open Access (OA) *content*: The full-texts of the 2.5 million annual articles published in the world's 24,000 peer-reviewed journals are still not openly accessible online (only about 20% of them are). It is merely distraction and dreaming to worry about search tools when the OA content is not yet there for them to search!

Having said that, cross-archive search tools (for the little OA content we have so far) already *do* exist (and they are already far more powerful than their sparse content yet deserves!):

http://oaister.umdl.umich.edu/o/oaister/
http://citebase.eprints.org/
http://www.scirus.com/srsapp/

And (I promise you), providing more OA content is guaranteed to inspire the creation of more and more such tools, with more and more powerful capacities. So please, don't worry about more powerful search tools when the cupboards are still bare: Fill the cupboards and the search tools will come, hungrily!

> What do you think about that? It seems, that the big publishing houses
> are already thinking about that, and that they developed such facilities.

The big publishing houses' cupboards are *not* bare: They have the 100% Toll Access content on which to provide ever more powerful search tools. Let's provide 100% Open Access content and then watch what happens!

> This of course is one of the most important tools, for data
> mining, extraction, or just finding the right piece of information. It
> also means, that we look beyond self-archived pdf documents to searchable
> documents with some mark up of their logic content included. Any ideas?

Two ideas:

(1) Provide the full-text Open Access content, and the tools for finding, mining and extracting from it will come with the territory.

(2) The primary target is journal articles, which consist primarily of text. The most powerful means of text-processing today is full-text inversion. (This is part of the magic that google does.) Enhancing this with citation-linking (in place of google's ordinary linking), plus some hub/authority analysis, citation and download ranking, co-citation analysis, co-text (semantic/similarity) analysis, and full-text boolean search, and I think you will have search capabilities to surpass your wildest dreams.

The only missing element is the content. Please let's not forget that, and lapse into Oneirology instead of Open Access Provision!

Stevan Harnad

----------------------------------------------------
This message has been processed by Firetrust Benign.
-------------------------------------------------------

--
Best wishes

Peter Strickland
Managing Editor
IUCr Journals

----------------------------------------------------------------------
IUCr Editorial Office, 5 Abbey Square, Chester CH1 2HU, England
Phone: 44 1244 342878
Fax: 44 1244 314888
Email: ps@iucr.org
Ftp: ftp.iucr.org
WWW: http://journals.iucr.org/

NEWSFLASH: Complete text of all IUCr journals back to 1948 now online!
Visit Crystallography Journals Online for more details
_______________________________________________
Epc mailing list
Epc@iucr.org
http://scripts.iucr.org/mailman/listinfo/epc