EPrints Technical Mailing List Archive

Message: #05042



[EP-tech] Re: Possible importing BibTeX bug in monograph


I think I've worked it out!
Does this: https://github.com/eprints/eprints/pull/356/files
solve it (without the previous fix in place)?

The gory details
--------------------
The $epdata hash is populated in the convert_input method.
At the end of this method there is this:
https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Plugin/Import/BibTeX.pm#L508
 - which calls '_decode_bibtex' on $epdata.

$epdata is a hash ref - so will be caught here: https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Plugin/Import/BibTeX.pm#L524-L531

On line 528, the 'type' key is treated as a special case (I'm guessing because it's created in the convert_input method and doesn't come directly from the BibTeX file itself).
I propose that 'monograph_type' should also be a special case (it's always created in the convert_input method, never copied from the BibTeX), and should not be passed to the '_decode_bibtex' function on line 529 either.

The previous suggestion just defines the monograph type in a way that it will decode to the right value.
Anything that uses $entry->field( "xxx" ) for its value *should* be decoded.
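
To illustrate the proposal, here is a minimal, self-contained sketch of a recursive decode that skips both keys. It is not the code from BibTeX.pm: decode_epdata, fake_tex_decode and %SKIP are hypothetical names, and the stand-in decoder only mimics the symptom seen in this thread (an unescaped underscore vanishes, an escaped one survives).

```perl
use strict;
use warnings;

# Stand-in for the real TeX decoding step. This is an assumption made for
# illustration only; it just reproduces the symptom discussed in the thread.
sub fake_tex_decode {
    my ($s) = @_;
    $s =~ s/(?<!\\)_//g;    # an unescaped underscore disappears
    $s =~ s/\\_/_/g;        # an escaped underscore becomes a plain one
    return $s;
}

# Keys that are set by the importer itself rather than copied from the
# BibTeX entry, so they should not be decoded (hypothetical skip list).
my %SKIP = map { $_ => 1 } qw( type monograph_type );

# Walk the structure: recurse into arrays and hashes, decode plain scalars.
sub decode_epdata {
    my ($data) = @_;
    if ( ref($data) eq 'ARRAY' ) {
        $_ = decode_epdata($_) for @$data;
    }
    elsif ( ref($data) eq 'HASH' ) {
        for my $key ( keys %$data ) {
            next if $SKIP{$key};
            $data->{$key} = decode_epdata( $data->{$key} );
        }
    }
    elsif ( defined $data ) {
        $data = fake_tex_decode($data);
    }
    return $data;
}

my $epdata = decode_epdata( {
    type           => 'monograph',
    monograph_type => 'technical_report',    # set by the importer, skipped
    title          => 'A\_B report',         # from the BibTeX entry, decoded
    keywords       => ['x_y'],               # nested values are decoded too
} );

print "$_ => ", ( ref $epdata->{$_} ? "@{ $epdata->{$_} }" : $epdata->{$_} ), "\n"
    for sort keys %$epdata;
```

With 'monograph_type' in the skip list the value stays 'technical_report', so it still matches the option defined in eprint_fields.pl, while values that came from the BibTeX entry still go through the decoder.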

These lines:
https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Plugin/Import/BibTeX.pm#L452-L456
are worth a mention too, in case someone (in a few years' time) maps something directly to 'monograph_type' from $entry->field and doesn't realise that it will be skipped in the decoding.

Cheers,
John

-----Original Message-----
From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of George Mamalakis
Sent: 30 October 2015 11:09
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: Possible importing BibTeX bug in monograph

Hi John,

I tried importing the BibTeX file through the web form, as you 
suggested, and it breaks the same way. Good question though!

On 30/10/2015 11:43 AM, John Salter wrote:
> George,
> One quick question: does the import break in the same way whether you import from the command line or from the Import screen?
>
> Cheers,
> John
>
> -----Original Message-----
> From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of George Mamalakis
> Sent: 29 October 2015 16:54
> To: eprints-tech@ecs.soton.ac.uk
> Subject: [EP-tech] Re: Possible importing BibTeX bug in monograph
>
> Bravo John!
>
> That was the problem. I tested it with your "patch" and it worked.
>
> The only thing that remains to be answered is why this was
> deliberately changed...
>
> I'll file the bug report tomorrow (unless somebody else wishes to do
> it). I will propose your patch and mention your comments, and when
> they post an answer I'll come back with feedback.
>
> Good afternoon everybody, and thanks again John!
>
> On 29/10/2015 06:37 PM, John Salter wrote:
>> The default EPrints config:
>> https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/eprint_fields.pl#L105-L118
>> uses 'technical_report', not 'technicalreport' - which is the root of this issue.
>>
>> These lines:
>> https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Plugin/Import/BibTeX.pm#L340-L344
>> should be mapping that value OK.
>>
>> It looks like somehow, the underscore is being stripped out?
>> In this commit: https://github.com/eprints/eprints/commit/cf70ab4ee1ce48e88c2eaee2a2b369b331ce4de9
>> the underscores were escaped with a '\\'.
>> I'm not sure why - or why this was removed.
>>
>> Can you try altering this:
>> $epdata->{monograph_type} = "technical_report";
>> to this:
>> $epdata->{monograph_type} = "technical\\_report";
>>
>> to see if it helps?
>>
>> NB This is guess-work. I've only got a 3.3.10 repo to tinker with at the moment, which predates this change.
>>    
>> Cheers,
>> John
>>
>>
>>
>> -----Original Message-----
>> From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Thomas Lauke
>> Sent: 29 October 2015 16:08
>> To: eprints-tech@ecs.soton.ac.uk
>> Subject: [EP-tech] Re: Possible importing BibTeX bug in monograph
>>
>> Hi George,
>>
>>> you must see it's empty.
>> no, it's filled :)
>> Please check your XML export carefully (not in the morning ;)):
>> It should contain the language independent
>> <monograph_type>technicalreport</monograph_type> ...!?
>>
>>> If you confirm this (or anybody else), I'll file a bug report to EPrints
>> the one and only thing that's missing now:
>> <EprintsPath>/archives/<repoID>/cfg/lang/en/phrases/local.xml
>> should contain a line with
>> <epp:phrase id="eprint_fieldopt_monograph_type_technicalreport">Technical report</epp:phrase>
>> Probably all other options are missing also ...!?
>>
>> This line should be integrated into (maybe implied by these lines ...)
>> <EprintsPath>/lib/lang/en/phrases/system.xml.
>>
>> Hth
>> TL
>>
>> BTW: Some additional lines missing in <EprintsPath>/lib/lang/en/phrases/system.xml follow:
>> eprint_fieldhelp_editors (en) at line 477 in /usr/share/eprints3/perl_lib/EPrints/MetaField.pm
>> <epp:phrase id="eprint_fieldhelp_editors">Specify the responsible person for publication.</epp:phrase>
>>
>> lib/searchfield:help_counter (en) at line 757 in /usr/share/eprints3/perl_lib/EPrints/Search/Field.pm
>> <epp:phrase id="lib/searchfield:help_counter">Enter a specific number or a range (Min.., Min..Max, ..Max) for the unique database identifier.</epp:phrase>
>>
>> ... and maybe (unfortunately I couldn't recover my modifications) due to phrase renaming
>> <epp:phrase id="eprint:workflow:stage:type:title">Type</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:files:title">Upload</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:core:title">Details</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:pubinfo:title">Publication</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:status:title">Status</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:event:title">Event</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:abstract:title">Abstract</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:subjects:title">Subjects</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:notes:title">Extras</epp:phrase>
>> <epp:phrase id="eprint:workflow:stage:local:title">Custom</epp:phrase>
>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>> *** Archive: http://www.eprints.org/tech.php/
>> *** EPrints community wiki: http://wiki.eprints.org/
>> *** EPrints developers Forum: http://forum.eprints.org/
>>
>


-- 
George Mamalakis

IT and Security Officer,
Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
PhD (Aristotle Univ. of Thessaloniki),
MSc (Imperial College of London)

School of Electrical and Computer Engineering
Aristotle University of Thessaloniki

phone number : +30 (2310) 994379


