EPrints Technical Mailing List Archive


Message: #01980



[EP-tech] Re: Documentation for EPrint searches


Ian,

The syntax below works on 3.2+:

$session->dataset( 'eprint' )->search( filters => [ ...... ] )->map( ... );

or even, to search the entire dataset:

$session->dataset( 'eprint' )->search->map( ... );


But the "match => 'SET'" filter will not, as it was added in 3.3 (you'd need perl_lib/EPrints/Search/Condition/IsNotNull.pm and some "glue" in between). In other words, you cannot filter on a "defined" value in 3.2; you'll need to filter your subset down by hand, e.g.:

Condition 1: userid = 1234
_AND_
Condition 2: creators_id is set

$session->dataset( 'eprint' )->search(
    # Condition 1
    filters => [ { meta_fields => [ 'userid' ], value => '1234', match => 'EX' } ]
)->map( sub {

    my( undef, undef, $eprint ) = @_;

    # _AND_ Condition 2
    return if( !$eprint->is_set( 'creators_id' ) );

    # do what you need to do here...

} );
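
To make the "filter by hand" pattern concrete, here is a minimal, self-contained sketch that runs without EPrints. StandInList is a hypothetical stand-in for the list object returned by search(); it only mimics the ( $session, $dataset, $item ) callback arguments, and plain hashes stand in for eprints. It shows that a "return" inside the callback skips only the current record, and that matches can be collected in an array declared outside the callback.

```perl
use strict;
use warnings;

# StandInList is a hypothetical stand-in for the list object returned by
# search(); it is NOT part of EPrints and only mimics the callback arguments.
package StandInList;

sub new {
    my( $class, @items ) = @_;
    return bless { items => [ @items ] }, $class;
}

# Call $fn once per item, passing ( $session, $dataset, $item ).
sub map {
    my( $self, $fn ) = @_;
    $fn->( undef, undef, $_ ) for @{ $self->{items} };
    return;
}

package main;

# Plain hashes stand in for eprints here (made-up data).
my $list = StandInList->new(
    { eprintid => 1, creators_id => 'a@example.org' },
    { eprintid => 2 },
    { eprintid => 3, creators_id => 'b@example.org' },
);

my @with_id;
$list->map( sub {
    my( undef, undef, $item ) = @_;

    # Condition 2, applied by hand: skip records without creators_id
    return if( !defined $item->{creators_id} );

    push @with_id, $item->{eprintid};
} );

print "Kept: @with_id\n";
```

With a real EPrints list you would test $eprint->is_set( 'creators_id' ) instead of looking at a hash key, as in the snippet above.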

You may adapt your code to the above syntax (preferred since 3.2):

my $results1 = $session->dataset( 'archive' )->search(
    filters => [
        { meta_fields => [ 'foo' ], value => 'Ian' },
        { meta_fields => [ 'lastmod' ], value => $date }
] );

print $results1->count." records modified\n" if ( $noise > 0 );


# ...->reorder( 'eprintid' ) is equivalent to your sort()

$results1->reorder( 'eprintid' )->map( sub {

    my( undef, undef, $eprint ) = @_;

    print $eprint->export( 'Text' );
    print "\n------------------\n";

} );

# You don't need to call $results1->dispose() anymore


Note that I haven't tested the code above, but that's the general idea. Also, Date::Calc works pretty well for time differences ("last 3 months"), but I'm not sure it comes with the standard Perl install.
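
If an extra module is a concern, Time::Piece (which ships with Perl since 5.10) can also build a "last N months" range. The sketch below is untested against EPrints itself: date_range() is a hypothetical helper name, and the "start-end" string format is assumed from the lastmod value used in the quoted script further down.

```perl
use strict;
use warnings;
use Time::Piece;

# Build a "YYYY-MM-DD-YYYY-MM-DD" range that ends on $end (a Time::Piece
# object) and starts $months calendar months earlier.
# date_range() is a hypothetical helper, not part of EPrints.
sub date_range {
    my( $end, $months ) = @_;

    # add_months() with a negative count steps back by calendar months
    my $start = $end->add_months( -$months );

    return join '-', $start->ymd, $end->ymd;
}

my $today = localtime;    # Time::Piece overrides localtime to return an object
print date_range( $today, 3 ), "\n";
```

Unlike subtracting ONE_MONTH from Time::Seconds, which is an average month length in seconds, add_months() works on calendar months. Dates near the end of a month may still roll over into the following month when the target month is shorter.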

Seb.


On 28/05/13 16:13, Ian Stuart wrote:
Cheers Seb....

How about 3.2 (half of us are still old-fashioned that way :chuckle: )

This is the very basics that I already know:

# Right... which datasets are we searching?
my $dso = $session->get_repository->get_dataset("archive");
# Set up a search object
my $searchexp1 = EPrints::Search->new(
    satisfy_all => 1,
    session     => $session,
    dataset     => $dso,
);
# Find all the eprints where "foo" is "Ian"
$searchexp1->add_field( $dso->get_field("foo"), 'Ian' );
# now search for all eprints with a "last modified" date within the
# last $diff months (using Time::Piece & Time::Seconds here)
my $time  = localtime;
my $diff  = 3;  # Number of months the report covers
my $sdate = $time->strftime("%Y-%m-%d");
$time -= ( ONE_MONTH * $diff );
my $edate = $time->strftime("%Y-%m-%d");
my $date  = "${edate}-${sdate}";
$searchexp1->add_field( $dso->get_field("lastmod"), $date );

# do the search
my $results1 = $searchexp1->perform_search;

my $total_new = $results1->count;
print "$total_new records modified\n" if ( $noise > 0 );

# This is where we do something with the results from our search
# (sort numerically; a plain sort() would compare the IDs as strings)
foreach my $epid ( sort { $a <=> $b } @{ $results1->ids } ) {
    my $ep = $dso->dataobj($epid);
    print $ep->export('Text');
    print "\n---------------\n";
} ## end foreach my $epid ( sort @{ ...})
# and tidy up
$results1->dispose;
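
One detail about sorting IDs in Perl: sort() compares its arguments as strings unless it is given a comparison block, so numeric IDs can come out in an unexpected order. A minimal sketch with made-up IDs:

```perl
use strict;
use warnings;

my @ids = ( 10, 9, 100, 2 );    # made-up IDs for illustration

my @as_strings = sort @ids;                  # default: string comparison
my @as_numbers = sort { $a <=> $b } @ids;    # numeric comparison

print "string order:  @as_strings\n";
print "numeric order: @as_numbers\n";
```

The numeric form is the one that keeps record IDs in ascending order.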


I'd still like to know how to restrict the search to only include
records where a particular field is defined (such as a creators id
[email address])


On 28/05/13 15:37, Sebastien Francois wrote:
Hi Ian,

On recent versions of EPrints you can use a 'SET' match (this was introduced
in the 3.3 branch, but I'm not sure exactly when; I tested on 3.3.11):

my $list = $session->dataset( 'eprint' )->search(
    filters => [
        { meta_fields => [ 'abstract' ], match => 'SET' },
        { meta_fields => [ 'title' ], value => 'Fred', match => 'IN' }
    ] );

print STDERR "Found ".$list->count." results\n";

I personally like to nest the calls:

$session->dataset( 'eprint' )->search(
    filters => [
        { meta_fields => [ 'abstract' ], match => 'SET' },
        { meta_fields => [ 'title' ], value => 'Fred', match => 'IN' }
    ] )->map( sub {

    my( undef, undef, $eprint ) = @_;

    # do something with $eprint

} );

otherwise use $list->map( ... );

Seb.

On 28/05/13 15:17, Ian Stuart wrote:
I'm sure I ask this every 6 months... and I keep searching the net and
not finding anything.

Is there any documentation on creating searches in EPrints?

I want to find all the eprints where field A is not empty and field B
has the value "Fred"
