EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #08398


Re: [EP-tech] DataCite/minting problem

CAUTION: This e-mail originated outside the University of Southampton.
Hi David,

Thank you so much for your help and advice, you're an EPrints hero!

It was the hyphen after all. I've gone through my XML-generating script and regexed the "bad" hyphen for a "good" hyphen. I like to think I would have worked it out myself, but in all likelihood I wouldn't have. I'll look at a way to properly fix it in the new year. Encoding problems are not my favourite at all, as it's a very grey area for me.
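A minimal sketch of that kind of replacement, in Python, for anyone hitting the same problem. It assumes the import script handles titles as Unicode strings; the function name `normalise_dashes` and the particular set of dash characters are illustrative choices, not taken from the script discussed in this thread.

```python
import re

# Assumption: these are the "long" dash characters most likely to arrive
# from a spreadsheet or word processor; extend the set if others turn up.
# U+2010 hyphen, U+2011 non-breaking hyphen, U+2012 figure dash,
# U+2013 en dash, U+2014 em dash, U+2015 horizontal bar, U+2212 minus sign.
DASH_PATTERN = re.compile("[\u2010\u2011\u2012\u2013\u2014\u2015\u2212]")


def normalise_dashes(text):
    """Replace Unicode dash look-alikes with an ASCII hyphen-minus (U+002D)."""
    return DASH_PATTERN.sub("-", text)


# Hypothetical title containing an en dash.
title = "Survey data 2019\u20132020"
print(normalise_dashes(title))
```

This only swaps the characters; it does not address why the downstream XML was rejected, so it is a workaround rather than a proper encoding fix.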

Not a bad thing to get good at considering all the APIs available...

Thanks again for your help, as usual it's genuinely appreciated. I'm hoping I might contribute something useful to this list one day!

On Fri, Dec 4, 2020 at 11:57 AM David R Newman <drn@ecs.soton.ac.uk> wrote:

Hi James,

At a very cursory glance, the hyphen used in 1189 looks like a long hyphen (e.g. an em dash), whereas 1199 looks to be a regular hyphen (i.e. typically on the key to the right of the 0 key on a regular keyboard). I know that EPrints wraps up the metadata it sends to DataCite in XML. It is possible that an encoding issue caused a problem with generating valid XML to send to DataCite's API.

I would try fixing the long hyphen in the title (for 1189) and also checking authors' names and other fields that may have special characters (i.e. ones that cannot be typed from a standard QWERTY UK/US keyboard) and replacing these if you can. I am not exactly sure offhand which fields you need to fix, although the abstract may be worth a check as well.
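One way to do that check in bulk is sketched below in Python. It assumes the metadata is available as a plain mapping of field name to string before import; the helper name `find_non_ascii` and the sample record are hypothetical, and this is not an EPrints or DataCite API.

```python
import unicodedata


def find_non_ascii(fields):
    """Return (field, position, character, Unicode name) for each non-ASCII character."""
    hits = []
    for name, value in fields.items():
        for pos, ch in enumerate(value):
            if ord(ch) > 127:
                # unicodedata.name() falls back to "UNKNOWN" for unnamed code points.
                hits.append((name, pos, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical record; the field names and values are placeholders.
record = {
    "title": "Rainfall 2010\u20142015",
    "creators": "Muller, A.",
    "abstract": "Plain ASCII text.",
}

for field, pos, ch, uname in find_non_ascii(record):
    print(f"{field}[{pos}]: {ch!r} {uname}")
```

Non-ASCII characters are not wrong in themselves (accented names are legitimate), so the output is a list of places to look rather than a list of errors.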

It is likely there is a failed event queue task for minting a DOI for 1189, so rather than clicking the "Coin DOI" button in the actions menu for this item, it is worth going into the event queue via the admin menu, resetting the appropriate task to "Waiting" and seeing whether it runs successfully a second time.

In general, I have encountered encoding issues a lot when working with third-party applications. I have been trying to do a few things to address encoding issues in general for the next release of EPrints 3.4 (i.e. the wide character error message that often occurs in the Apache error logs) but I have not had a chance to see if these improvements help with third-party applications like DataCite and Twitter.


David Newman

On 04/12/2020 11:17, James Kerwin via Eprints-tech wrote:
Hi everyone,

I hope we are all in good health and good spirits.

I'm having some difficulty with DataCite minting DOIs for a batch of records I uploaded to our data repository. I was awake until around 3am trying to sort it and did not get far.

These are records I imported myself by parsing an Excel spreadsheet, writing the EPrints XML and importing them. I am almost 100% sure that the fault lies with me.

Some are minting and some aren't. I did suspect that it may be characters in the title field such as brackets, colons, hyphens etc. but that doesn't appear to be the case. (Please excuse the state of the abstract pages, I'm working on it.) 

Oddly it populates the DOI field and shows on the abstract. The unminted DOIs are not on the DataCite dashboard.

Any pointers would be welcome, otherwise I might need to mint 100 DOIs manually, which I'm not super excited about.


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
