EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #04569

[EP-tech] duplicate detection in EPrints 3.3

I would like to run a script that will go through my repository (3.3.12) and report any likely duplicates based on title (and possibly author).

What is the best way of doing this?
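
As a rough illustration of the kind of report described above, here is a minimal Python sketch. It is not an EPrints script and does not use the EPrints API: it assumes the records have already been exported as (eprint id, title) pairs, and the function names are hypothetical. It flags only titles that match after removing case, accents and punctuation; author comparison is left out.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_title(title):
    """Reduce a title to a comparison key without accents, case or punctuation."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = re.findall(r"[a-z0-9]+", no_accents.lower())
    return " ".join(words)


def find_likely_duplicates(records):
    """Group record ids whose normalized titles are identical.

    `records` is an iterable of (eprint_id, title) pairs; this export
    format is an assumption made for the sketch.
    Returns a list of id lists, one per group of likely duplicates.
    """
    groups = defaultdict(list)
    for eprint_id, title in records:
        key = normalize_title(title)
        if key:  # skip empty or punctuation-only titles
            groups[key].append(eprint_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        (1, "Open Access: A Survey"),
        (2, "open access - a survey"),
        (3, "Something Else Entirely"),
    ]
    for ids in find_likely_duplicates(sample):
        print("Likely duplicates:", ids)
```

Exact matching on a normalized key is the simplest choice; titles that differ by a typo or a subtitle would need a fuzzy comparison instead.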


I found the following two scripts in EPrints Files:

· Sebastien Francois’ EPrints 2 script: http://files.eprints.org/107/

· Jon Hallet’s EPrints 3>3.1 script: http://files.eprints.org/640/


In addition,

· There is a title_duplicates script in /cgi/users/lookup/ (http://wiki.eprints.org/w/Cgi/users/lookup/).

· Page 40 of this file (http://www.eprints.org/software/training/programming/api_techniques.pdf) refers to a duplicate detection script in the bin folder as an example. I couldn’t find this script, so it is probably just an example of what could be done.



Is Jon Hallett’s script in EPrints Files the most up-to-date version available?

Has anyone created a Bazaar version for duplicate detection, or is there something more recent that I am missing?

Tomasz Neugebauer
Digital Projects & Systems Development Librarian
Libraries / Bibliothèques
Concordia University / Université Concordia