EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #09880



RE: [EP-tech] sort by date in a simple search with xapian does not sort correctly


CAUTION: This e-mail originated outside the University of Southampton.

Hi David

 

I applied the modification to the file "lib/cfg.d/search_xapian.pl", as you described, from this commit:

https://github.com/eprints/eprints3.4/commit/b4d6f20d7d328efc47afb19d818a861becbb4a20

Then I ran these commands:

 

systemctl stop epindexer

rm -rI archives/eprints_id/var/xapian

bin/epadmin reindex eprints_id eprint --verbose

systemctl restart httpd

systemctl start epindexer
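
For anyone repeating the steps above on several repositories, they could be collected into a small script along these lines. This is only a sketch: the install path in EPRINTS_ROOT, the DRY_RUN switch, the function names and the repository IDs repo_one and repo_two are all assumptions made for illustration and are not part of EPrints. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. EPRINTS_ROOT, DRY_RUN, run, reindex_repo and reindex_all are
# illustrative names, not part of EPrints.
EPRINTS_ROOT="${EPRINTS_ROOT:-/opt/eprints3}"   # assumed install path
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print commands, do not run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove the old Xapian index of one repository and request a re-index.
reindex_repo() {
    repo_id="${1:?repository id required}"
    run rm -r -- "$EPRINTS_ROOT/archives/$repo_id/var/xapian"
    run "$EPRINTS_ROOT/bin/epadmin" reindex "$repo_id" eprint --verbose
}

# Same order as the commands above: stop the indexer, handle each
# repository, restart Apache, then start the indexer again.
reindex_all() {
    run systemctl stop epindexer
    for id in "$@"; do
        reindex_repo "$id"
    done
    run systemctl restart httpd
    run systemctl start epindexer
}

# Hypothetical repository IDs; replace with your own.
reindex_all repo_one repo_two
```

Setting DRY_RUN=0 would execute the commands instead of printing them, so it is worth checking the printed list first.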

 

The sort by date is now perfect.

Thanks a lot.

I will do the same for all our other repositories (each eprints_id).

 

Mario

 

From: David R Newman <drn@ecs.soton.ac.uk>
Sent: 17 November 2024 12:55
To: eprints-tech@ecs.soton.ac.uk; Beaudoin, Mario <Mario.Beaudoin@uqtr.ca>
Subject: Re: [EP-tech] sort by date in a simple search with xapian does not sort correctly

 

Hi Mario,

I have found the issue.  It was due to the original fix for the following issue:

https://github.com/eprints/eprints3.4/issues/246

As Xapian works differently from the database-based simple search, it was text indexing non-text fields.  The original fix turns out not to be appropriate, as the database search uses both its index (e.g. eprint__rindex) and the ordervalues tables (e.g. eprint__ordervalues_en and eprint__ordervalues_fr) to process a search.  Therefore, fields that do not have text_index set to 1 can still be used in simple search to support ordering.  The following further fix resolves this issue:

https://github.com/eprints/eprints3.4/commit/b4d6f20d7d328efc47afb19d818a861becbb4a20

This commit also reverts a no-longer-required fix in perl_lib/EPrints/DataObj/Subject.pm for https://github.com/eprints/eprints3.4/issues/411, which was a post-3.4.6 release change, so you should only need to apply the changes to lib/cfg.d/search_xapian.pl.  Unfortunately, to fix the sorting issue you will need to perform a complete re-index after updating search_xapian.pl.  Be sure to restart the indexer as well (and it is probably best to also reload Apache so it is consistent with the indexer).

Regards

David Newman

On 17/11/2024 10:07 am, David R Newman wrote:

Hi Mario,

Xapian does have a bit of an oddity where it occasionally forgets how to do certain orderings.  I am looking at my development EPrints repository and the orderings by year (both most recent and oldest first) show the same ordering as by author's name, as that is the secondary ordering for those options.  I believe there is a way I have fixed this in the past without having to create a new Xapian index and reindex from scratch, but I cannot remember it offhand.

I was looking back over the mail archive and I noted that you commented out a related warning message.  Back in July you reported the error message:

Search::Xapian can't sort by eprint.byrelevance.fr: unknown sort key

That is a bug in the code.  Some time back we added an actual value for the by-relevance ordering option in search, because otherwise you could not be sure it would be set as the default if specified in the search configuration.  What was not done at the same time was to make sure this option is ignored when building the sorter for Xapian search results.  Other orderings (e.g. by title) specify title as the primary order but use author's name and year as secondary and tertiary orderings; finally, if there is still a tie, relevance is used to determine the order.  By relevance is a built-in ordering mechanism for Xapian search, so it does not need to be specified and does not have a sort key, which means you get a warning message like the one above.  Therefore, I will make sure the code is fixed for a future release, to avoid this unnecessary warning message being generated.
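
As a generic illustration of the tie-breaking described here (this is not EPrints or Xapian code), the sketch below uses the standard sort utility on made-up records in the form title|author|year. The record data and function names are invented for the example. It also shows the effect noted earlier in this message: if the primary key is empty for every record, the result is simply the secondary (author) ordering.

```shell
#!/bin/sh
# Illustration only: made-up records in the form title|author|year.
records='Zebra study|Adams|2021
Alpha study|Brown|2019
Alpha study|Adams|2020
Alpha study|Adams|2018'

# The same kind of records, but with no year value at all.
no_year='Title one|Clark|
Title two|Adams|
Title three|Brown|'

# Title is the primary key, author the secondary, year (numeric) the tertiary.
sort_by_title() {
    LC_ALL=C sort -t'|' -k1,1 -k2,2 -k3,3n
}

# Year is the primary key (most recent first), author breaks ties.
sort_by_year_desc() {
    LC_ALL=C sort -t'|' -k3,3nr -k2,2
}

# Author only, for comparison.
sort_by_author() {
    LC_ALL=C sort -t'|' -k2,2
}

echo "By title, then author, then year:"
printf '%s\n' "$records" | sort_by_title

echo "By year (missing everywhere), then author:"
printf '%s\n' "$no_year" | sort_by_year_desc
```

In the first listing the two "Alpha study" rows by Adams tie on title and author, so the year decides their order. In the second listing every year is empty, so the year key never distinguishes any rows and the output matches a plain sort by author.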

Regards

David Newman

On 15/11/2024 7:48 pm, Beaudoin, Mario wrote:

Hi all,

When I sort by date, the results of a simple search are not ordered correctly.

I have the latest EPrints version and use Xapian for simple search.

Do you have any idea how to fix that?

Alternatively, which file is used for sorting in simple search when Xapian is enabled, and which table does Xapian use for sorting by date? I would like to be sure the date format is correct.

Thanks

 

Mario Beaudoin

 

 



*** Options: https://wiki.eprints.org/w/Eprints-tech_Mailing_List
*** Archive: https://www.eprints.org/tech.php/
*** EPrints community wiki: https://wiki.eprints.org/
 



