EPrints Technical Mailing List Archive

Message: #01702



[EP-tech] Re: Solr package!!


On 11/mar/2013, at 12:05, Tim Brody <tdb2@ecs.soton.ac.uk> wrote:

> On Mon, 2013-03-11 at 10:52 +0100, Stefano Cecere wrote:
>> dear EP techs
>> 
>> we were already testing migrations to DSpace, because of the poor search engine in EPrints,
>> when today i discovered this: http://bazaar.eprints.org/273/ (a Solr integration package)
>> 
>> has anybody already tested it?
>> Denis: what limitations does this integration have?
>> reading the code now, it seems to index just the titles.
>> has anyone tried to index the files' contents (PDF)?
> 
> Hi,
> 
> It would be helpful to expand on "poor search engine". Have you tried
> using the Xapian engine? What search activities do you need to do that
> you can't at the moment? Do you have an existing corporate search
> appliance that you could integrate with EPrints?

Hi Tim
yes, i have been using Xapian since it became available in EPrints.


> It is most helpful to us (EPrints) to know the problem you're trying to
> solve, rather than focusing on a specific feature. For instance, you may
> be trying to solve a query with a search, that may be better solved by
> creating a custom report (Export).

uh.. i had always assumed you knew that the search engine is the weakest point of EPrints, and that you were ok with that.

i have already reported suggestions and comments about the EPrints search experience over the last few years.

what i really need is a Google-like search across thousands of texts.
and with Solr that is possible!
i have used Solr integrated with other platforms (PHP CMSs), and it is not difficult to implement advanced faceting, synonyms, "did you mean..." suggestions, autocomplete, similar-document suggestions, result grouping and highlighting in the summary listing.
since Solr does all the mechanics, its client just has to inject each document with the relevant metadata.
it already comes with parsers for most document types.
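
To make the "client just injects the document" point concrete, here is a minimal sketch in Python of what such a client could look like. It only builds the HTTP request for Solr's JSON update handler and does not send it. The Solr URL, the core name "eprints", the field names and the shape of the record dict are all assumptions made for illustration; none of them come from the Bazaar package or from EPrints itself.

```python
import json
import urllib.request

# Hypothetical Solr location and core name; adjust for a real installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/eprints/update?commit=true"


def eprint_to_solr_doc(eprint):
    """Map a (hypothetical) record dict to a flat Solr document.

    The field names are illustrative and would have to match the
    schema of the target Solr core.
    """
    return {
        "id": "eprint-%d" % eprint["eprintid"],
        "title": eprint.get("title", ""),
        "creators": eprint.get("creators", []),
        "abstract": eprint.get("abstract", ""),
        "fulltext": eprint.get("fulltext", ""),
    }


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST request carrying the documents as JSON."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up sample record, only to show the shape of the data.
record = {
    "eprintid": 42,
    "title": "An example title",
    "creators": ["A. Author"],
    "abstract": "A short abstract.",
    "fulltext": "Text extracted from the attached PDF would go here.",
}

request = build_update_request([eprint_to_solr_doc(record)])
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`. For PDF contents, Solr's extracting request handler (Solr Cell, built on Apache Tika) is the usual route; whether the Bazaar package uses it is not stated in this thread.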

this is something that DSpace has begun to do (with 3.0), and many platforms are abandoning their custom search engines in favour of good integrations with Solr (or ElasticSearch).

PS: recently i have been able to live with EPrints because i injected all its data into VuFind 2.0 (which uses Solr as its base search engine),
but it's a mess to explain the 2 platforms to the users!

so we have begun to experiment with alternatives!

ciao

stefano







____    ___   __  _ _______________________

Stefano Cecere
Multimedia Archive developer
Centro Studi Umanisti KRUR - Firenze
stefano.cecere@krur.com


