EPrints Technical Mailing List Archive
Message: #08795
Re: [EP-tech] Altmetric explorer harvest
- To: <eprints-tech@ecs.soton.ac.uk>, Ranju Upadhyay <Ranju.Upadhyay@mu.ie>
- Subject: Re: [EP-tech] Altmetric explorer harvest
- From: David R Newman <drn@ecs.soton.ac.uk>
- Date: Wed, 24 Nov 2021 23:57:27 +0000
Hi Ranju,
It looks like your abstracts contain ASCII control characters that are not allowed in XML, such as Vertical Tab (0x0B). You could apply the changes in the following Git commit to fix the issue for the oai_dc format, which is the one Altmetric would use to import from your repository's OAI-PMH interface:
https://github.com/eprints/eprints3.4/commit/94b2b57bb13796b812f516f0f457b43dccd047c2
It also introduces a new EPrints::XML::remove_invalid_chars function to more generally tidy up ASCII characters that cannot be represented in XML, mainly control codes [1]. This does not fix the rdf and mets formats in OAI-PMH, but I don't think these are generally used. If I get a chance, I may look into whether these can be similarly fixed.
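To illustrate the general idea of stripping characters that XML cannot represent, here is a minimal sketch in Python. It is not a port of the Perl function in the commit above: the name remove_invalid_xml_chars, the replacement parameter and the choice to delete rather than substitute are assumptions made for illustration. The character ranges are those of the "Char" production in the XML 1.0 specification.

```python
import re

# Anything NOT matched by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def remove_invalid_xml_chars(text: str, replacement: str = "") -> str:
    """Return text with characters that are illegal in XML 1.0 replaced.

    Hypothetical helper for illustration only; by default the offending
    characters are simply deleted.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # Vertical Tab (0x0B) is removed; tab and newline are legal and kept.
    cleaned = remove_invalid_xml_chars("An abstract\x0bwith a\tstray control code\n")
    print(repr(cleaned))
```

Passing replacement=" " instead would keep words separated where a control code had been standing in for whitespace.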
Regards
David Newman
[1] https://en.wikipedia.org/wiki/C0_and_C1_control_codes
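As for spotting which items are affected, a rough sketch along the following lines might help. It assumes the abstracts have already been exported from the repository as (eprint id, text) pairs by some other means; the function name find_invalid_xml_chars and the report format are hypothetical and are not part of EPrints.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Anything NOT matched by the XML 1.0 "Char" production.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(
    records: Iterable[Tuple[int, str]],
) -> Dict[int, List[str]]:
    """Map each affected record id to the distinct illegal code points found.

    `records` is assumed to be an iterable of (eprint id, field text) pairs.
    Records without illegal characters are left out of the report.
    """
    report: Dict[int, List[str]] = {}
    for record_id, text in records:
        hits = sorted({f"U+{ord(ch):04X}" for ch in _INVALID_XML_CHARS.findall(text)})
        if hits:
            report[record_id] = hits
    return report


if __name__ == "__main__":
    # Made-up sample data, purely for demonstration.
    sample = [
        (101, "A clean abstract."),
        (102, "An abstract with a\x0bvertical tab."),
    ]
    for eprint_id, code_points in find_invalid_xml_chars(sample).items():
        print(eprint_id, ", ".join(code_points))
```

The resulting list of ids could then be used to correct the affected records by hand or in bulk.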
Hi Team,
Our University has purchased Altmetric Explorer, and they now want to harvest metadata from our IR.
But they seem to have come across some items (old ones, I think) in which some fields (mostly the Abstract field) contain characters that are corrupt or not encoded properly, which gives an invalid XML error when they try to harvest. There are several thousand items in our IR, so it is a bit difficult for me to check which items might have the issue. Is there a script or something similar that I could run against our items to spot them?
Any help is appreciated.
Best regards,
Ranju
Ranju Upadhyay Rai
Library Programmer
University Library
Ollscoil Mhá Nuad, Maigh Nuad, Co. Chill Dara, Éire, W23 VP22
Maynooth University, Maynooth, Co. Kildare, Ireland, W23 VP22
T: +353 1 708 3378 M: +353 87 98 43811
- References:
- [EP-tech] Altmetric explorer harvest
- From: Ranju Upadhyay <Ranju.Upadhyay@mu.ie>