EPrints Technical Mailing List Archive


Message: #04330


[EP-tech] Re: DSpace import plugin doesn't seem to parse UTF-8 correctly


Hi George,

Try this patch:

--- a/perl_lib/EPrints/Plugin/Import/DSpace.pm
+++ b/perl_lib/EPrints/Plugin/Import/DSpace.pm
@@ -235,7 +235,7 @@ sub retrieve_dcq
                return undef;
        }
 
-       my $dc = $self->find_dc_pairs( $r->content );
+       my $dc = $self->find_dc_pairs( $r->decoded_content );
        return undef unless defined $dc;
 
        $self->{errurl} = $self->{errmsg} = undef;
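
For context on why this one-line change might help, here is a minimal sketch (not taken from the plugin) of the difference between the two accessors on an HTTP::Response object. The response below is built by hand as a hypothetical stand-in for a DSpace page, and it assumes the server declares charset=UTF-8 in its Content-Type header; $title, $raw and $decoded are illustrative names only.

```perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Response;

# Hypothetical stand-in for a DSpace record page: a Greek word
# ("Ellada", 6 characters) sent over the wire as UTF-8 octets.
my $title = "\x{0395}\x{03BB}\x{03BB}\x{03AC}\x{03B4}\x{03B1}";
my $r = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=UTF-8' ],
    encode( 'UTF-8', $title ),
);

# content() returns the body as raw, undecoded octets.
my $raw = $r->content;

# decoded_content() undoes any Content-Encoding and, for text types,
# decodes the declared charset into Perl characters.
my $decoded = $r->decoded_content;

printf "content:         %d octets\n",     length $raw;      # 12
printf "decoded_content: %d characters\n", length $decoded;  # 6
```

If the raw octets are parsed as though they were already characters, each two-byte Greek letter can end up as two unrelated characters, which would be consistent with the garbled import described in the original report below.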

If it works, please pay it forward by submitting a bug report at http://github.com/eprints/eprints and adding the patch to the ticket as a proposed fix.

Thanks,

Tim

Timothy Miles-Board
Web & Repositories Development Specialist, University of London Computer Centre
020 7863 1342  |  07742 970 351  | timothy.miles-board@london.ac.uk | @drtjmb
The University of London is an exempt charity in England and Wales

________________________________________
From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of George Mamalakis <mamalos@eng.auth.gr>
Sent: 18 June 2015 12:16 PM
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] DSpace import plugin doesn't seem to parse UTF-8 correctly

Hello everybody,

I am trying to use the DSpace Import plugin from EPrints by giving a URL
to the web interface. While the system imports the record, the characters
do not seem to be imported correctly. My import is performed from a
Greek DSpace server, and a test record that should be imported is this:
https://dspace.lib.uom.gr/handle/2159/323.

If you enter the above URL in the DSpace import form, you'll see that
the record is imported but the characters are garbled.

Thanks all for your help in advance,

George.

--
George Mamalakis

IT and Security Officer,
Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
PhD (Aristotle Univ. of Thessaloniki),
MSc (Imperial College of London)

School of Electrical and Computer Engineering
Aristotle University of Thessaloniki

phone number : +30 (2310) 994379

