EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #08576



Re: [EP-tech] OAI2 deleted vs destroyed records


Hi John,

I will look into whether this can be incorporated into a future version of EPrints.

https://github.com/eprints/eprints3.4/issues/145

I think we need something a little more sophisticated for general purposes: a permission (e.g. eprint/remove_been_in_archive) that can be set, probably just on an individual admin user basis, for the very rare cases when privacy or confidentiality require full removal of a live item.  It would also be useful to make even this 'super' admin user aware that they really do not want to fully remove a (once) live item unless there is no other option.

Regards

David Newman

On 15/04/2021 17:00, John Salter via Eprints-tech wrote:
CAUTION: This e-mail originated outside the University of Southampton.

One option to prevent this happening is to alias the EPrints::Plugin::Screen::EPrint::Remove plugin with something like the plugin below, which checks whether the datestamp is set. If it is set, then the item has been live at some point and should be retained for OAI-PMH incremental harvesting purposes.

 

######################################
package EPrints::Plugin::Screen::EPrint::WreoRemove;

our @ISA = ( 'EPrints::Plugin::Screen::EPrint::Remove' );

use strict;

sub can_be_viewed
{
        my( $self ) = @_;

        return 0 unless $self->could_obtain_eprint_lock;

        # 2016-11-01 JLRS Add check for this item ever being live.
        return 0 if( $self->{processor}->{eprint}->exists_and_set( "datestamp" ) );

        return $self->allow( "eprint/remove" );
}

1;
######################################
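
The aliasing step itself is not shown above. A minimal sketch of what it might look like in a repository cfg.d file follows. The file name is hypothetical, and the sketch assumes the usual EPrints 3 plugin_alias_map and plugins configuration keys; it has not been tested against any particular EPrints version.

```perl
# Hypothetical file: archives/<repoid>/cfg/cfg.d/z_wreo_remove.pl
# Config fragment: $c is supplied by the EPrints configuration loader.
# Assumption: the standard EPrints 3 plugin aliasing keys are available.

# Make sure the replacement screen plugin is enabled.
$c->{plugins}->{"Screen::EPrint::WreoRemove"}->{params}->{disable} = 0;

# Serve WreoRemove wherever Screen::EPrint::Remove is requested...
$c->{plugin_alias_map}->{"Screen::EPrint::Remove"} = "Screen::EPrint::WreoRemove";

# ...and stop WreoRemove also appearing under its own name.
$c->{plugin_alias_map}->{"Screen::EPrint::WreoRemove"} = undef;
```

With something like this in place, the stock Remove screen should no longer be offered for any item whose datestamp is set, while items that have never been live can still be removed.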

Cheers,

John

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Alan.Stiles via Eprints-tech
Sent: 15 April 2021 16:38
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] OAI2 deleted vs destroyed records

 


Thanks all for the responses – I’ve suggested to our Primo folks that they need to set a full re-harvest running, and we’re considering how tightly to restrict the destroy option. As you say, David, it shouldn’t be a problem for items that have never been made live, but repository admins have been known to use it when they should really be using ‘delete’, so removing temptation or confusion may be simpler.

 

Possibly remove it for everyone and only allow people the option to ‘delete’ stuff. Then schedule a job to check items ‘deleted’ within a particular timeframe (e.g. between 6 and 7 months ago?) and only destroy a record which has never been put live.
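
The decision such a scheduled job would make can be sketched as a small standalone function. This is only an illustration of the rule described above, not EPrints code: the record is a plain hash, the key names (eprint_status, datestamp, status_changed) are assumed to mirror the corresponding EPrints fields, and the function name may_destroy and the day-based window are made up for the example.

```perl
use strict;
use warnings;
use Time::Piece;

# Decide whether a record may be destroyed by the scheduled job.
# $record is a plain hash standing in for an eprint (NOT the EPrints API);
# $now is a Time::Piece object; the window is given in days.
sub may_destroy
{
        my( $record, $now, $min_days, $max_days ) = @_;

        # Only records already flagged as deleted are candidates.
        return 0 unless ( $record->{eprint_status} // '' ) eq 'deletion';

        # A set datestamp means the item has been live at some point,
        # so it must be kept for OAI-PMH harvesters.
        return 0 if defined $record->{datestamp} && length $record->{datestamp};

        # Without a status change time the age cannot be checked.
        return 0 unless defined $record->{status_changed};

        my $changed = Time::Piece->strptime( $record->{status_changed}, '%Y-%m-%d %H:%M:%S' );
        my $age_days = ( $now - $changed )->days;

        return ( $age_days >= $min_days && $age_days <= $max_days ) ? 1 : 0;
}
```

A window of roughly 180 to 210 days would approximate the "between 6 and 7 months ago" suggestion; a real job would still need to look the records up through EPrints and perform the removal itself.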

 

Alan

 

From: David R Newman <drn@ecs.soton.ac.uk>
Date: Thursday, 15 April 2021 at 16:27
To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>, "Alan. Stiles" <alan.stiles@open.ac.uk>
Subject: Re: [EP-tech] OAI2 deleted vs destroyed records

 

CAUTION: This mail comes from outside the University. Please consider this before opening attachments, clicking links, or acting on the content.

Hi Alan,

Items that have never been live should not appear in any sets.  Even if they do appear in a set, the individual item request should return idDoesNotExist if they have never been live. 

I was not sure whether Primo periodically proactively checks individual items it has imported to see if their metadata / status has changed.  From what Iain says, this sounds like it is not the case.

Regards

David Newman

On 15/04/2021 16:16, Alan.Stiles via Eprints-tech wrote:


Thanks David,

Unfortunately, it looks like historically the destroy option has been available to the repository admins alongside the delete option, so we have quite a few where that option has been taken, though I think mostly they are not records that have ever been live, which is probably okay.  I have a feeling I might be changing the permissions on the destroy option in the near future…

 

I’ll check with our folks who deal with Primo as to whether they can do anything at their end of things with regard to how Primo treats those records (if it even sees them – it might not see the destroyed record if it’s harvesting a specific set).

 

Alan

 

 


Hi Alan,

If previously live records have been completely removed rather than retired then you can expect bad things to happen.  If there is a specific privacy issue that means the record cannot even be retained in a restricted form (retired), then removing may be the only option.  However, the need for this should be vanishingly small and therefore issues like you describe with Primo should be few and far between and may require manual intervention.

The error code idDoesNotExist is deliberately returned by the EPrints OAI interface when a record is removed, as in effect the record never existed.  All I can suggest is that Primo should treat getting back idDoesNotExist the same as getting back an item that is marked as deleted.  Obviously, you may want to be a bit more careful about what to do when getting back idDoesNotExist, in case there is some error in the request that mangles the ID so it cannot be found.  I have no idea how Primo could be configured to do this but as far as I can tell EPrints is behaving as it should; reporting completely removed items as not existing whereas retired items are reported as 'Deleted'.
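
To illustrate the distinction from the harvesting side, here is a rough sketch of how a local checking script (not Primo itself) might classify a GetRecord response body. It assumes XML::LibXML is installed; the function name classify_get_record and the four return labels are invented for this example. Only the OAI-PMH namespace, the status="deleted" header attribute and the idDoesNotExist error code come from the protocol.

```perl
use strict;
use warnings;
use XML::LibXML;

my $OAI_NS = 'http://www.openarchives.org/OAI/2.0/';

# True if the XPath expression matches at least one node.
sub _has_node
{
        my( $xpc, $path ) = @_;
        my @nodes = $xpc->findnodes( $path );
        return scalar @nodes;
}

# Classify the XML body of an OAI-PMH GetRecord response.
# Returns 'deleted', 'missing', 'live' or 'error' (labels are hypothetical).
sub classify_get_record
{
        my( $xml ) = @_;

        # Unparseable responses are reported as errors, not as missing records.
        my $doc = eval { XML::LibXML->load_xml( string => $xml ) };
        return 'error' unless defined $doc;

        my $xpc = XML::LibXML::XPathContext->new( $doc );
        $xpc->registerNs( oai => $OAI_NS );

        # Retired items: the record header carries status="deleted".
        return 'deleted'
                if _has_node( $xpc, '//oai:GetRecord/oai:record/oai:header[@status="deleted"]' );

        # Completely removed (or never-live) items: idDoesNotExist.
        return 'missing'
                if _has_node( $xpc, '//oai:error[@code="idDoesNotExist"]' );

        return 'live' if _has_node( $xpc, '//oai:GetRecord/oai:record' );

        # Any other error code needs a person to look at it.
        return 'error';
}
```

Keeping 'missing' separate from 'deleted' leaves room for the extra care mentioned above: a script could, for example, only drop a 'missing' record after confirming the identifier it requested was well formed.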

Regards

David Newman

On 15/04/2021 15:27, Alan.Stiles via Eprints-tech wrote:


Hi all,

I feel like there was a discussion about this here a year or two ago but I can’t find it now.

 

Records in our repository that get flagged as deleted show up in the OAI feed as ‘Deleted’, but records that get completely removed (destroyed) show up as error code ‘idDoesNotExist’.

 

It appears that Primo (our library search product) isn’t doing anything about updating records from the feed (configured within Primo to explicitly harvest our repository), at least where it’s getting the error response.

 

Any clues as to whether this is a Primo problem or my problem to sort out?

 

Cheers,

Alan



*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/

 
