EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #02708
< Previous (by date) | Next (by date) > | < Previous (in thread) | Next (in thread) > | Messages - Most Recent First | Threads - Most Recent First
[EP-tech] Re: Simple search - length of query
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
- Subject: [EP-tech] Re: Simple search - length of query
- From: John Salter <J.Salter@leeds.ac.uk>
- Date: Wed, 5 Mar 2014 12:55:06 +0000
Hi Jan/Seb, It looks like my indexes are the same as yours (see below). Which version of MySQL are you running? What is your ' optimizer_search_depth' set to? Seb, When I took a snapshot of the process', the query was in the 'statistics' stage - for a long time - so I'll see if tweaking the optimizer_search_depth helps! Longer term we'll be switching to Xapian :o) Cheers, John mysql> SHOW INDEXES FROM eprint__rindex\G *************************** 1. row *************************** Table: eprint__rindex Non_unique: 0 Key_name: PRIMARY Seq_in_index: 1 Column_name: field Collation: A Cardinality: 49 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 2. row *************************** Table: eprint__rindex Non_unique: 0 Key_name: PRIMARY Seq_in_index: 2 Column_name: word Collation: A Cardinality: 1238267 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 3. row *************************** Table: eprint__rindex Non_unique: 0 Key_name: PRIMARY Seq_in_index: 3 Column_name: eprintid Collation: A Cardinality: 14859215 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 4. row *************************** Table: eprint__rindex Non_unique: 1 Key_name: eprint__rindex_eprintid_2 Seq_in_index: 1 Column_name: eprintid Collation: A Cardinality: 15994 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 5. row *************************** Table: eprint__rindex Non_unique: 1 Key_name: eprint__rindex_eprintid_2 Seq_in_index: 2 Column_name: field Collation: A Cardinality: 154783 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: 5 rows in set (0.00 sec) -----Original Message----- From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Jan Ploski Sent: 05 March 2014 12:20 To: eprints-tech@ecs.soton.ac.uk Cc: John Salter Subject: [EP-tech] Re: Simple search - length of query Could it be that you're missing some essential indexes on eprint__rindex? Your example query is very hard on my 3.0.5 database while it returns quickly on 3.3.12. The difference is that the eprint__rindex primary key in 3.3.12 consists of (field, word, eprintid) and there is also a non-unique index on (eprintid, field). In the 3.0.5 database I only have a non-unique index on eprintid. John Salter wrote: > Hi, > Yesterday we had a bit of an issue with our repository when someone pasted a full citation into the simple search box. > This produced an impressive SQL query that locked things up and made users unhappy... > > Is there a way to sanitise what a 'simple' search might try to handle? e.g. would restricting it to a certain number of words be acceptable? > Would the Xapian search handle a request like the one below any better? > > Details below/attached if you're interested! > Cheers, > John > > GET /cgi/search/simple?full=%E2%80%98Families%2C+Domesticity+and+Intimacy%3A+Changing+Relationships+in+Changi > ng+Times%E2%80%99%2C+in+Richardson%2C+D%2C+and+Robinson%2C+V.+%28eds%29+Introducing+Women%27s+Studies%2C+third+edition.+Basingstoke%3A+Palgrave%2C+2008+pp.+1 > 25-143.+&_action_search=Search&_order=bytitle&basic_srchtype=ALL&_satisfyall=ALL > > searchexp created in cache table: > 0|1|-date/creators_name/title|archive|-|full:abstract/creators_name/date/documents/title:ALL:IN:▒Families, Domesticity and Intimacy%3A Changing Relationships in Changing Times▒, in Richardson, D, and Robinson, V. (eds) Introducing Women's Studies, third edition. Basingstoke%3A Palgrave, 2008 pp. 125-143. |-|eprint_status:eprint_status:ANY:EQ:archive|metadata_visibility:metadata_visibility:ANY:EQ:show > > The SQL generated by search is attached (get ready for this - it's a thing of beauty ;o) - you can see why it took a while to run! > > > > > *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech > *** Archive: http://www.eprints.org/tech.php/ > *** EPrints community wiki: http://wiki.eprints.org/ > *** EPrints developers Forum: http://forum.eprints.org/ *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech *** Archive: http://www.eprints.org/tech.php/ *** EPrints community wiki: http://wiki.eprints.org/ *** EPrints developers Forum: http://forum.eprints.org/
- References:
- [EP-tech] Simple search - length of query
- From: John Salter <J.Salter@leeds.ac.uk>
- [EP-tech] Re: Simple search - length of query
- From: Jan Ploski <jpl@plosquare.com>
- [EP-tech] Simple search - length of query
- Prev by Date: [EP-tech] Re: Simple search - length of query
- Next by Date: [EP-tech] Spam filtering on request emails
- Previous by thread: [EP-tech] Re: Simple search - length of query
- Next by thread: [EP-tech] Re: Simple search - length of query
- Index(es):