EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #09951



Re: [EP-tech] How to check deleted/missing uploaded files?


Hi Agung,

I am not sure what you mean by "missing":

1. An eprint record that has not had any documents uploaded to it.
2. An eprint record that has had one or more documents uploaded to it but the files are not present on the filesystem where they are expected to be.

For 1, finding a negative using the EPrints API is a little tricky.  I would assume you only care about items in the live archive and review buffer: retired items are no longer relevant, and items still in a user's inbox may simply not have had a file uploaded yet.  So first I would use a search to find all these eprints:
my $ds = $repo->dataset( "eprint" );
my $list = $ds->search(filters => [{
	meta_fields => [qw( eprint_status )], value => "archive buffer",
	match => "EQ", merge => "ANY", # match either status rather than both
}]);
I would then run the map function over the list to collect the eprints that have no documents, so that their IDs can be printed out.
sub fn {
    my( $session, $dataset, $eprint, $eprints ) = @_;
    # get_all_documents returns a list, so count it via an array
    my @docs = $eprint->get_all_documents;
    push @$eprints, $eprint unless @docs;
}
my $eprints = [];
$list->map( \&fn, $eprints );
You can then use the set of eprints in $eprints as you choose.  If you want editors/admins to be able to view this, then you probably want to install the Generic Reporting Framework plugin [A] and build a custom report.  I would advise using the Example report [B] as a template to create your own report.  Be sure to enable your new report with the following in a configuration file under your archive's cfg/cfg.d/ directory:
$c->{plugins}{"Screen::Report::YOUR_REPORT_NAME"}{params}{disable} = 0;
As you cannot search directly for a negative, you will need to just use the "filters" function to get all eprints in the live archive and review buffer, and then narrow these down with an items function like:
sub items
{
	my( $self ) = @_;

	my $items = $self->SUPER::items;

	my $eprint_ids = [];
	$items->map( \&no_documents, $eprint_ids );
	
	my $order = defined $self->{processor}->{sort} ? $self->{processor}->{sort} : $self->param( 'custom_order' );
	my $new_items = EPrints::List->new( repository => $self->repository, dataset => $self->{processor}->{dataset}, ids => $eprint_ids, order => $order );

	return $new_items;
}

sub no_documents {
    my( $session, $dataset, $eprint, $eprint_ids ) = @_;
    # get_all_documents returns a list, so count it via an array
    my @docs = $eprint->get_all_documents;
    push @$eprint_ids, $eprint->id unless @docs;
}
I have only written this off the top of my head, so it is completely untested and therefore will likely need a bit of tidying up to make it work.
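
If you just want a quick check from the command line before building a report, the search and map steps could be combined into a small script along these lines.  This is equally untested and only a sketch: the /opt/eprints3 path, the script layout and the helper name ids_without_documents are assumptions of mine, as are the match/merge settings used to match either status.

```perl
#!/usr/bin/perl
# Sketch: print the IDs of live/review eprints that have no documents attached.
use strict;
use warnings;

# Assumption: default EPrints install location; adjust for your server.
use lib '/opt/eprints3/perl_lib';

# Collect the IDs of eprints in a list that have no documents.
# (ids_without_documents is a hypothetical helper name, not part of EPrints.)
sub ids_without_documents
{
    my( $list ) = @_;

    my @ids;
    $list->map( sub {
        my( $session, $dataset, $eprint ) = @_;
        # get_all_documents returns a list, so count it via an array
        my @docs = $eprint->get_all_documents;
        push @ids, $eprint->id unless @docs;
    } );

    return @ids;
}

sub main
{
    my( $repoid ) = @_;

    # Loaded here so the helper above can be exercised without EPrints installed
    require EPrints;
    my $repo = EPrints->new->repository( $repoid )
        or die "Could not load repository '$repoid'\n";

    # All eprints in the live archive or the review buffer
    my $list = $repo->dataset( "eprint" )->search( filters => [{
        meta_fields => [qw( eprint_status )],
        value => "archive buffer",
        match => "EQ",
        merge => "ANY",
    }] );

    print "$_\n" for ids_without_documents( $list );
}

# Only run against a repository when called as a script with a repository ID
if( !caller )
{
    if( @ARGV ) { main( @ARGV ) } else { warn "Usage: $0 <repository_id>\n" }
}
```

You would run this as the eprints user, passing your repository ID as the only argument, and it should print one eprint ID per line.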

If by missing you mean 2, there are a number of different approaches you may need to use depending on the issue.  Sometimes the file is recorded in the database but missing on disk; sometimes it is missing in both.  I don't think it is worth going into this until I know what you mean by missing, as what I have explained above may have already answered your question.

Regards

David Newman

[A] https://bazaar.eprints.org/1105/
[B] https://bazaar.eprints.org/1105/1/plugins/EPrints/Plugin/Screen/Report/Example.pm

On 26/01/2025 2:43 am, Agung Prasetyo W. wrote:
CAUTION: This e-mail originated outside the University of Southampton.
Hi,

Is there a way to check for missing uploaded files, so that they can be re-uploaded by the administrator/editor?

Thank you.

Regards,
Agung PW

*** Options: https://wiki.eprints.org/w/Eprints-tech_Mailing_List
*** Archive: https://www.eprints.org/tech.php/
*** EPrints community wiki: https://wiki.eprints.org/