EPrints Technical Mailing List Archive
Message: #04146
[EP-tech] Antwort: Use of truncation in advanced searches
- To: eprints-tech@ecs.soton.ac.uk
- From: martin.braendle@id.uzh.ch
- Date: Tue, 21 Apr 2015 11:18:36 +0200
Hi Gilles,
Our repository has about 80,000 records, 56% of them with full text, so it is comparable to yours.
Advanced search for thermograph* in:
title: immediate (1-2 seconds)
documents (full text): 20-30 seconds. The mysql daemon goes up to 70-100% CPU load.
Quick search (Xapian):
title:thermograph* : immediate
thermograph* : immediate
On our help page (http://www.zora.uzh.ch/help/) we recommend Quick Search as the tool of choice, and Advanced Search only for very precise searches.
From a recent debugging session (on another issue) I know that, behind the scenes, EPrints translates an advanced search query into a series of dozens of complicated SQL statements. It may be that these are not optimized for certain cases.
If it were as simple as
select distinct ei.eprintid from eprint__rindex ei, eprint e where ei.field='documents' and ei.word like 'thermograph%' and e.eprint_status='archive' and e.eprintid=ei.eprintid;
then the query would be answered in a fraction of a second. But it isn't that simple, and can't be, and the EPrints software engineers have surely put a lot of effort into the database engine part of EPrints to cover all possible situations.
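To make the simplified query above concrete, here is a minimal sketch that runs it against a toy in-memory SQLite database. The table and column names are taken from the query itself; everything else (the sample rows, the index `rindex_field_word`, the helper names `build_toy_index` and `truncated_search`, and the use of SQLite rather than MySQL) is an assumption for illustration only and is not how EPrints builds its searches.

```python
import sqlite3


def build_toy_index(conn):
    """Create a tiny stand-in for the two tables named in the query (assumed layout)."""
    conn.executescript("""
        CREATE TABLE eprint (eprintid INTEGER PRIMARY KEY, eprint_status TEXT);
        CREATE TABLE eprint__rindex (field TEXT, word TEXT, eprintid INTEGER);
        -- Hypothetical index; a prefix LIKE can only be fast if one like this exists.
        CREATE INDEX rindex_field_word ON eprint__rindex (field, word);
    """)
    conn.executemany(
        "INSERT INTO eprint VALUES (?, ?)",
        [(1, "archive"), (2, "archive"), (3, "buffer"), (4, "archive")],
    )
    conn.executemany(
        "INSERT INTO eprint__rindex VALUES (?, ?, ?)",
        [
            ("documents", "thermography", 1),
            ("documents", "thermographie", 1),   # same eprint, second match
            ("documents", "thermographic", 2),
            ("documents", "thermography", 3),    # eprint 3 is not in 'archive'
            ("title", "thermograph", 4),         # different field
            ("documents", "thermometer", 4),     # does not match the prefix
        ],
    )


def truncated_search(conn, field, prefix):
    """Return ids of archived eprints with an indexed word starting with `prefix`.

    Assumes `prefix` contains no LIKE metacharacters (% or _).
    """
    rows = conn.execute(
        """
        SELECT DISTINCT ei.eprintid
        FROM eprint__rindex ei
        JOIN eprint e ON e.eprintid = ei.eprintid
        WHERE ei.field = ?
          AND ei.word LIKE ?
          AND e.eprint_status = 'archive'
        ORDER BY ei.eprintid
        """,
        (field, prefix + "%"),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_toy_index(conn)
result = truncated_search(conn, "documents", "thermograph")
print(result)
```

With the sample rows above, only eprints 1 and 2 qualify: eprint 3 is filtered out by its status, and eprint 4 has no matching word in the documents field.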
Best regards,
Martin
--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Winterthurerstr. 190
CH-8057 Zürich
From: Gilles Fournié <gilles.fournie@cirad.fr>
To: eprints-tech@ecs.soton.ac.uk
Date: 21/04/2015 10:22
Subject: [EP-tech] Use of truncation in advanced searches
Sent by: eprints-tech-bounces@ecs.soton.ac.uk
Hi,
I have a question about right-hand truncation in advanced searches.
If we search (in the title field, for example) for:
thermography
the search runs for 1 to 3 seconds before returning results.
If we extend our search to:
thermography thermographie
the search time is about the same.
But if we try to use a wildcard:
thermograph*
the search takes a very long time (several minutes)!
Has anybody else experienced such delays?
Any clues about what we can do to solve this problem?
(Our archive contains about 91,000 eprints.)
Best regards,
GF