EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #03794



[EP-tech] Reply: Re: ORCiD


Users are not the same as creators! User accounts are in most cases submitter accounts, and the submitters (usually research group secretaries, e.g. in a research environment) are not the same persons as the authors (researchers). There are usually far more distinct creators in a repo than user accounts (by a factor of 10-100).

The BORIS repo of University of Bern has an ORCID implementation written by Peter West.

However, as John or Lizz suggested, storing ORCID in the creators_id or a subfield doesn't solve the normalization problem the current EPrints data model has.

The current data model supports only a 1:n relation between bibliographic (eprint) records and authors.
This may result in duplicate and mismatching names (and e-mail addresses), leading to various problems (name searches finding only subsets of publications, creation of incomplete publication lists, IRStats2 statistics with non-aggregated subsets, to name a few). To give some indication of the severity of the problem: The ZORA repo has about 75K records with 280K authors, of which 110K are unique names (which are not unique persons). Using some matching algorithms and additional criteria, we can reduce these to about 77K persons with unique names.
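To illustrate how spelling variants split one author's publications, here is a minimal Python sketch. The author entries, the `name_key` function and its matching rule (accent-folded family name plus first initial) are all hypothetical; this is not the matching algorithm used for ZORA.

```python
import unicodedata
from collections import defaultdict

def fold(text):
    """Strip accents and case so that 'Müller' and 'Muller' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower().strip()

def name_key(family, given):
    """Crude, hypothetical match key: folded family name plus first initial."""
    return (fold(family), fold(given)[:1])

# Made-up author entries as they might be typed on separate eprint records:
# (eprint_id, family name, given name)
entries = [
    (1, "Müller", "Anna"),
    (2, "Muller", "A."),
    (3, "Müller", "A"),
    (4, "Salter", "John"),
]

def group_by_key(rows):
    """Collect the eprint ids that share a match key."""
    groups = defaultdict(set)
    for eprint_id, family, given in rows:
        groups[name_key(family, given)].add(eprint_id)
    return dict(groups)

exact_names = {(family, given) for _, family, given in entries}
groups = group_by_key(entries)

print(len(exact_names), "distinct name strings")
print(len(groups), "candidate persons after matching")
```

A search on any single spelling finds only a subset of records 1 to 3. A key this coarse also merges different people who share a family name and initial, which is why additional criteria are needed before treating a group as one person.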

With the current data model, the ORCID of a given author would also have to be entered again on every one of that author's publications. This is wrong.

The relation between publications and authors is n:m (many-to-many).

Library catalogs solve this by providing an authority file for authors and a join table that connects publications and authors. The authority file can store attributes such as name variants (e.g. Sir Elton Hercules John = Reginald Kenneth Dwight), dates of birth and death, and IDs such as ORCID. The ORCID then has to be entered only once per author.
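The authority-file approach could be sketched roughly as follows, using Python's built-in `sqlite3` module. All table and column names (`person`, `name_variant`, `eprint`, `eprint_author`) and the sample data are assumptions for illustration only; none of this is part of the EPrints schema, and the ORCID value is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Authority file: one row per person, ORCID stored exactly once.
    CREATE TABLE person (
        person_id      INTEGER PRIMARY KEY,
        preferred_name TEXT NOT NULL,
        orcid          TEXT UNIQUE
    );
    -- Name variants all point at the same person.
    CREATE TABLE name_variant (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        name      TEXT NOT NULL
    );
    CREATE TABLE eprint (
        eprint_id INTEGER PRIMARY KEY,
        title     TEXT NOT NULL
    );
    -- Join table: the n:m relation between publications and authors.
    CREATE TABLE eprint_author (
        eprint_id INTEGER NOT NULL REFERENCES eprint(eprint_id),
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        position  INTEGER NOT NULL,
        PRIMARY KEY (eprint_id, person_id)
    );
""")

# Placeholder data; the ORCID below is not a real identifier.
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [
    (1, "Example, Alice", "0000-0000-0000-0001"),
    (2, "Sample, Bob", None),
])
conn.executemany("INSERT INTO name_variant VALUES (?, ?)", [
    (1, "Example, Alice"), (1, "Example, A."), (2, "Sample, Bob"),
])
conn.executemany("INSERT INTO eprint VALUES (?, ?)", [
    (1, "Paper One"), (2, "Paper Two"),
])
conn.executemany("INSERT INTO eprint_author VALUES (?, ?, ?)", [
    (1, 1, 1), (1, 2, 2), (2, 1, 1),
])

def publications_for_name(db, name):
    """All titles by the person(s) known under the given name variant."""
    rows = db.execute("""
        SELECT DISTINCT e.eprint_id, e.title
        FROM name_variant v
        JOIN eprint_author ea ON ea.person_id = v.person_id
        JOIN eprint e ON e.eprint_id = ea.eprint_id
        WHERE v.name = ?
        ORDER BY e.eprint_id
    """, (name,)).fetchall()
    return [title for _, title in rows]

def authors_of(db, eprint_id):
    """Preferred author names of one publication, in author order."""
    rows = db.execute("""
        SELECT p.preferred_name
        FROM eprint_author ea
        JOIN person p ON p.person_id = ea.person_id
        WHERE ea.eprint_id = ?
        ORDER BY ea.position
    """, (eprint_id,)).fetchall()
    return [name for (name,) in rows]

print(publications_for_name(conn, "Example, A."))
print(authors_of(conn, 1))
```

In this sketch a search on any name variant reaches the complete publication list, and correcting a name or adding an ORCID touches one `person` row instead of every eprint record.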

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Winterthurerstr. 190
CH-8057 Zürich

mail: martin.braendle@id.uzh.ch
phone: +41 44 63 56705
fax: +41 44 63 54505
http://www.id.uzh.ch


From: Justin Bradley <jb4@ecs.soton.ac.uk>
To: eprints-tech@ecs.soton.ac.uk
Date: 16/01/2015 15:16
Subject: [EP-tech] Re: ORCiD
Sent by: eprints-tech-bounces@ecs.soton.ac.uk





I’d add it to the user records instead, so any creators_id value can be associated with the user and the user.orcid.
Obviously that depends on whether you are modelling users, or just creators without user IDs.
Justin

On 16 Jan 2015, at 14:09, John Salter <J.Salter@leeds.ac.uk> wrote:

--
Justin Bradley
Senior Software Consultant

jb4@ecs.soton.ac.uk
EPrints Services
Bay 2, 3081, B32
University of Southampton




*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive:
http://www.eprints.org/tech.php/
*** EPrints community wiki:
http://wiki.eprints.org/
*** EPrints developers Forum:
http://forum.eprints.org/