EPrints Technical Mailing List Archive


Message: #02383

[EP-tech] indexer/tokenizer config

To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] indexer/tokenizer config
From: Matthew Kerwin <matthew.kerwin@qut.edu.au>
Date: Tue, 12 Nov 2013 10:28:05 +1000

Hi EPrints world,

I was looking at the prevalence of the common typo “seperator” (for “separator”) in EPrints trunk, and discovered that perl_lib/EPrints/Index/Tokenizer.pm defines a hashref, $EPrints::Index::FREETEXT_SEPERATOR_CHARS, which is almost exactly the same as the config value $c->{indexing}->{freetext_seperator_chars} in cfg.d/indexing.pl. The former includes one extra character, though, and I can’t work out how either of them is referenced (if at all) in the codebase.

Could someone clarify which one is used where, and how, and how the two could be cleaned up or better integrated?

Cheers

