EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #01526



[EP-tech] Re: Cleaning database up


Hi David,

Try doing it in Perl using the EPrints API rather than deleting from the tables directly...

Since you're using 3.1.3 at Lincoln, the script will look like:


#!/usr/bin/perl -w -I/opt/eprints3/perl_lib

use strict;
use EPrints;

# open an offline (command-line) session on the repository ID given as the first argument
my $session = new EPrints::Session( 1, $ARGV[0] ) or die( "Usage: $0 repository_id\n" );

my $userds = $session->get_repository->get_dataset( 'user' );

$userds->map( $session, sub {

    my( undef, undef, $user ) = @_;

    ## do your own tests here on $user...
    # return if ...

    ## remove all eprints that are in the "inbox" (user area):
    # my $eprints = $user->get_eprints;
    # $eprints->map( sub {
    #
    #    my( undef, undef, $eprint ) = @_;
    #
    #    return if $eprint->get_value( 'eprint_status' ) ne 'inbox';
    #
    #    $eprint->remove();
    #
    # } );

    ## and delete the account if the user matches (note that this also removes the saved searches for that user)
    # $user->remove();

} );

$session->terminate;
exit;


Note that I've commented out the bits that actually remove the eprint/user objects - I'll let you customise that!

If I were you, I wouldn't remove users that still have eprints in the repository - I'm not sure how EPrints will cope with orphaned objects.
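As a rough sketch of the kind of test that could go in the "do your own tests here" slot: the helper below is plain Perl and purely hypothetical (is_removable is not part of EPrints). It assumes you have already read the user's join date (as a 'YYYY-MM-DD ...' string) and the number of eprints they own from the $user object; the exact field names and accessors for that depend on your EPrints version, so check them before relying on this.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the EPrints API): decide whether an
# account looks safe to delete, given plain values read from the user.
#   $joined       - join date as 'YYYY-MM-DD', optionally followed by a time
#   $eprint_count - number of eprints owned by the user, in any status
#   $cutoff       - 'YYYY-MM-DD'; users who joined on or after this are kept
sub is_removable
{
    my( $joined, $eprint_count, $cutoff ) = @_;

    # keep anyone who still owns eprints, to avoid orphaned objects
    return 0 if $eprint_count > 0;

    # keep accounts with no recorded join date rather than guess
    return 0 if !defined $joined || $joined eq '';

    # ISO-style dates sort correctly when compared as strings
    return ( substr( $joined, 0, 10 ) lt $cutoff ) ? 1 : 0;
}

# old account with no eprints: candidate for removal
print is_removable( '2009-03-01 10:00:00', 0, '2011-01-01' ), "\n";   # prints 1

# old account that still owns eprints: kept
print is_removable( '2009-03-01 10:00:00', 2, '2011-01-01' ), "\n";   # prints 0

# recent account with no eprints: kept
print is_removable( '2012-05-05', 0, '2011-01-01' ), "\n";            # prints 0
```

Inside the map callback this would be used as an early "return unless is_removable( ... );" before any remove() call. Keeping the decision in one small function makes it easy to do a dry run first, printing the matching users instead of deleting them.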

Seb.


On 06/02/13 14:12, David Whitehead wrote:
Hi All,

At the moment our EPrints database is getting quite big, with many users who are never going to use the EPrints site again. I am looking at writing a quick script that will go through the database and remove users if they are no longer attached to an EPrint or do not meet certain criteria, e.g. join_date. I just wanted to ask: how many different tables would I need to remove the user entry from, or at least check, to ensure it's a clean deletion from the database? The tables I currently know I will need to look at and clean are as follows:

User table
eprints table
saved_search table

Is there any tool or method, in /bin/epadmin for example, that will clean up the rest of the records if I delete from the tables listed above? Any help or information you could supply would be greatly appreciated. Thanks all,

Kind regards,

David Whitehead.



*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/