EPrints Technical Mailing List Archive
Message: #07655
Re: [EP-tech] Thesis Bulk Upload/Import
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>, "James Kerwin" <jkerwin2101@gmail.com>
- Subject: Re: [EP-tech] Thesis Bulk Upload/Import
- From: "Newman D.R." <drn@ecs.soton.ac.uk>
- Date: Thu, 17 Jan 2019 10:55:48 +0000
Hi James,

In answer to your questions:

1) I think there are various reasons that institutions have separate theses repositories:

- They want to showcase their theses with different branding/theme to their normal repository. Although different branding can be applied to different archives within one repository, it requires more advanced knowledge of configuring EPrints. It may also still lead to issues if you change a core element and do not realise that the change will be inherited by the thesis branding/theme.

- Some institutions restrict access to their main repository so only staff can submit. As theses may be submitted by the students who have written them, having a separate repository can facilitate this. It can also more easily facilitate a different process of review, as I have observed in several repositories.

- Sometimes I think this just comes down to a political decision that the institution wants to keep theses separate. I can imagine that in the UK, although it should not be a problem, in some people's eyes it will simplify the REF (Research Excellence Framework) process, as theses would not generally be REF returnable.

2+3) The best way to import as much metadata with the highest accuracy possible is to use EPrints XML import. This allows you to submit multiple publications at once. It will also import documents submitted by URL in the metadata, as long as those URLs are freely accessible.

The problem with using EPrints XML import (assuming you are exporting from one EPrints repository to import into another) is that you may lose metadata or even have trouble importing if the fields are either not in the importing repository or are of a different type (e.g. a free text field vs a multiple value field). EPrints XML import provides a facility to test whether an import would be successful without actually importing. Also, the XML schema for a repository can be found at /cgi/schema if you want to craft your own EPrints XML from the metadata source you already have.
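As a rough illustration of crafting your own EPrints XML, here is a minimal Python sketch that turns a list of thesis records into one import file, including a document referenced by URL. Several things in it are assumptions rather than anything confirmed in this thread: the input dictionary keys (title, date, authors, pdf_url) are hypothetical, the namespace is the one I would expect from an EPrints 3 export, and the field names (type, title, date, creators, documents) are the usual defaults. Your repository may name or type fields differently, or require extra ones, so check the output against /cgi/schema and run a test import first.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace as seen in typical EPrints 3 XML exports.
# Confirm it against your own repository's /cgi/schema.
EP_NS = "http://eprints.org/ep2/data/2.0"


def thesis_to_eprint(record):
    """Build one <eprint> element from a plain dict of thesis metadata.

    The dict keys are hypothetical; map them from whatever your
    digitisation metadata source actually provides.
    """
    eprint = ET.Element("eprint")
    ET.SubElement(eprint, "type").text = "thesis"
    ET.SubElement(eprint, "title").text = record["title"]
    ET.SubElement(eprint, "date").text = record["date"]

    # creators is a multiple-value field: one <item> per author.
    creators = ET.SubElement(eprint, "creators")
    for family, given in record["authors"]:
        name = ET.SubElement(ET.SubElement(creators, "item"), "name")
        ET.SubElement(name, "family").text = family
        ET.SubElement(name, "given").text = given

    # A document referenced by URL; the URL must be freely accessible
    # for the importing repository to fetch it.
    pdf_url = record.get("pdf_url")
    if pdf_url:
        documents = ET.SubElement(eprint, "documents")
        files = ET.SubElement(ET.SubElement(documents, "document"), "files")
        file_el = ET.SubElement(files, "file")
        ET.SubElement(file_el, "filename").text = pdf_url.rsplit("/", 1)[-1]
        ET.SubElement(file_el, "url").text = pdf_url
    return eprint


def build_eprints_xml(records):
    """Wrap all records in a single <eprints> root and return XML text."""
    root = ET.Element("eprints", {"xmlns": EP_NS})
    for record in records:
        root.append(thesis_to_eprint(record))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [{
        "title": "An Example Thesis",
        "date": "2019",
        "authors": [("Smith", "Alex")],
        "pdf_url": "https://example.org/files/thesis1.pdf",
    }]
    print(build_eprints_xml(sample))
```

Writing the result to a file and running it through the test import mentioned above, before a real import, should show quickly whether any field is missing or of the wrong type in the target repository.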

Formats other than EPrints XML may be better suited to your purpose if you are importing from a non-EPrints source. I do not have a huge amount of experience working with these other formats. Maybe others could advise.

You can use the EPrints CRUD API to import in any format if you want to automate the process:

https://wiki.eprints.org/w/API:EPrints/Apache/CRUD

This also allows you to push documents rather than have them pulled from an accessible URL.

Regards

David Newman

On Thu, 2019-01-17 at 10:09 +0000, James Kerwin via Eprints-tech wrote:
> Hi All,
>
> The University I work at is currently exploring options for
> digitising our collection of theses, with an aim of them going into
> the institutional repository and I have some questions if anybody
> could lend me some of their experience and opinions.
>
> 1) I've noticed some organisations have a separate instance of
> EPrints for theses. We currently put each thesis into the
> institutional repository along with all other types of item. Is there
> a benefit to separating them out?
>
> 2) Does EPrints facilitate any sort of bulk upload of Documents and
> EPrint record creation? I've had a quick look around and found the
> following from Tomasz Neugebauer and Bin Han:
>
> https://www.researchgate.net/publication/291251891_Batch_Ingesting_into_EPrints_Digital_Repository_Software
>
> I'm curious to see if this is still relevant (it's very thorough) or
> if there are any other methods or potential pitfalls to avoid.
>
> 3) Following on from Q2, is there a preferred/ideal format of
> metadata? The article makes it clear that many different formats are
> supported, but again I'm wondering if there are any pros and cons to
> any particular format.
>
> The digitising won't be complete for some time so I'm taking the
> opportunity to get ahead of it and be ready.
>
> Thanks,
> James