EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #08475
[EP-tech] Antwort: Re: Help indexing phrases
- To: <eprints-tech@ecs.soton.ac.uk>, David R Newman <drn@ecs.soton.ac.uk>
- Subject: [EP-tech] Antwort: Re: Help indexing phrases
- From: <martin.braendle@uzh.ch>
- Date: Mon, 25 Jan 2021 12:49:00 +0100
From: "David R Newman via Eprints-tech"
Sent by:
Date: 25.01.2021 10:39
Subject: Re: [EP-tech] Help indexing phrases
Hi Phil,
Unfortunately, I don't think this is possible. I think you would need to create a new field that is a multiple id field and use this. You could probably write a script to map from the uncontrolled keywords field into this new multiple id field. However, even with this new field I am not sure how well Xapian would index these as individual multi-word terms. Advanced search for this field should work as you require. In 3.4.2 I introduced the Idci MetaField, which is basically the same as the Id MetaField but matches case-insensitively. This is useful for matching things like email addresses and usernames, where case does not usually make a functional difference.
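A minimal sketch of the mapping step described above, written in Python rather than EPrints' own Perl and not using any EPrints API. It only illustrates how a comma-separated keywords string could be split into whole phrases, one per value of a multiple field, and how a case-insensitive whole-phrase match (the behaviour described for Idci) differs from matching single words. The function names `split_keywords` and `matches_phrase` are hypothetical, and the sample string is a shortened copy of the keywords quoted later in this thread.

```python
from __future__ import annotations


def split_keywords(raw: str) -> list[str]:
    """Split a comma-separated keywords string into whole phrases.

    Whitespace (including line breaks inside a phrase) is collapsed to
    single spaces, empty entries are dropped, and duplicates are removed
    case-insensitively while keeping the first spelling seen.
    """
    phrases = []
    seen = set()
    for part in raw.split(","):
        phrase = " ".join(part.split())
        key = phrase.casefold()
        if phrase and key not in seen:
            seen.add(key)
            phrases.append(phrase)
    return phrases


def matches_phrase(phrases: list[str], query: str) -> bool:
    """Return True only if the query equals a whole phrase, ignoring case."""
    key = " ".join(query.split()).casefold()
    return any(p.casefold() == key for p in phrases)


# Hypothetical input: part of the keywords string from the original question.
raw = (
    "evacuation lift, part b - fire safety, b5 access and facilities for the fire\n"
    "service, fire risk assessment"
)
phrases = split_keywords(raw)
for phrase in phrases:
    print(phrase)
```

With values stored this way, `matches_phrase(phrases, "Fire Risk Assessment")` is true while `matches_phrase(phrases, "fire")` is false, which is the whole-phrase behaviour being asked for. How Xapian would index such values remains the open question noted above.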
I have been thinking about how best to implement a keywords field that is more effective across simple, advanced and faceted search, particularly for multi-word terms. I have yet to settle on a solution, as I need to better understand how Xapian indexing works to see if it can be set up to allow EPrints to effectively index multiple-word terms.
Regards
David Newman
Using the uncontrolled keywords field, which has phrases separated by commas, I would like to index the whole phrase.
For example:
evacuation lift, part b - fire safety, b5 access and facilities for the fire
service, fire risk assessment, residual risk, building safety, b4 external
fire spread, means of escape, principal works, health & safety strategy
https://eprints.buildvoc.co.uk/id/eprint/865/
Question: how do I configure Xapian or indexing.pl to index the whole phrase instead of the individual terms, for example fire, safety, or building?
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/