EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #08795



Re: [EP-tech] Altmetric explorer harvest


Hi Ranju,


It looks like your abstracts contain ASCII control characters that are not permitted in XML, such as Vertical Tab (0x0B).  You could apply the changes in the following Git commit to fix the issue for the oai_dc format, which is the one Altmetric would use to import from your repository's OAI-PMH interface:


https://github.com/eprints/eprints3.4/commit/94b2b57bb13796b812f516f0f457b43dccd047c2


It also introduces a new EPrints::XML::remove_invalid_chars function to tidy up, more generally, ASCII characters that cannot be represented in XML, mainly control codes [1].  This does not fix the rdf and mets formats in OAI-PMH, but I don't think these are generally used.  If I get a chance, I may look into whether these can be similarly fixed.
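
For illustration only, here is a minimal Python sketch of the general idea: find, and optionally strip, characters that fall outside the XML 1.0 "Char" production. This is not the EPrints code from the commit above (EPrints is written in Perl); the function names find_invalid_xml_chars and strip_invalid_xml_chars are hypothetical, and how you get the abstract text out of your repository is assumed and not shown.

```python
import re

# Anything NOT matched by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, hex code) pairs for characters XML 1.0 cannot carry."""
    return [
        (match.start(), "0x%02X" % ord(match.group()))
        for match in _INVALID_XML_CHARS.finditer(text)
    ]


def strip_invalid_xml_chars(text):
    """Return a copy of text with the XML-invalid characters removed."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Hypothetical abstract containing a Vertical Tab (0x0B).
    abstract = "First line\x0bSecond line"
    print(find_invalid_xml_chars(abstract))
    print(strip_invalid_xml_chars(abstract))
```

Running the finder over each abstract and printing the item ID alongside any hits would be one way to spot affected records before deciding whether to clean the stored data or only the exported XML.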


Regards


David Newman

[1] https://en.wikipedia.org/wiki/C0_and_C1_control_codes


On 24/11/2021 15:00, Ranju Upadhyay via Eprints-tech wrote:
Hi Team,

Our University has purchased Altmetric Explorer, and they now want to harvest metadata from our IR.

But they seem to have come across some items (I think old ones) where characters in some fields (mostly the Abstract field) are corrupt or not encoded properly, and that gives an invalid XML error when they try to harvest. There are several thousand items in our IR, so it is a bit difficult for me to check which items might have that issue. Is there any script or something that I could run against our items to spot them?


Any help is appreciated.

Best regards,
Ranju

Ranju Upadhyay Rai

Library Programmer

University Library

 

Ollscoil Mhá Nuad, Maigh Nuad, Co. Chill Dara, Éire, W23 VP22

Maynooth University, Maynooth, Co. Kildare, Ireland, W23 VP22


T: +353 1 708 3378  M: +353 87 98 43811

Ranju.Upadhyay@mu.ie W: www.maynoothuniversity.ie/library





*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
