EPrints Technical Mailing List Archive


Message: #03830


[EP-tech] Reply: Re: {Disarmed} International characters in advanced search fail for XML-Export

Dear Adam,

Because Jens is out of the office for a few days, I am stepping in.

I took one of the records found by the saved search that has a German umlaut in the creator's name (Krüger, G). Exporting it on the command line with

bin/export zora eprint XMLforCMS2 95663

works fine, and we obtain the expected XML:

<?xml version="1.0" encoding="utf-8" ?>
<eprints xmlns="http://eprints.org/ep2/data/2.0">
    <eprint id="http://www.zora.uzh.ch/id/eprint/95663">
      <title>Krabben, Würmer, Schwein und Hund. Wie machen Tiere Geschichte?</title>
      <creators__editors_if_edited_scientific_work>Krüger, Gesine</creators__editors_if_edited_scientific_work>
      <first_creator__or__first_editor_if_edited_scientific_work>Krüger, Gesine</first_creator__or__first_editor_if_edited_scientific_work>
      <type_in_text>Book Section</type_in_text>
      <citation>Krüger, Gesine (2014). &lt;a href="" target="_blank" class="uzh" title="zoracitationlink 95663"&gt;Krabben, Würmer, Schwein und Hund. Wie machen Tiere Geschichte?&lt;/a&gt; In: Grumblies, Florian; Weise, Anton. Unterdrückung und Emanzipation in der Weltgeschichte. Zum Ringen um Freiheit, Kaffee und Deutungshoheit. Hannover, 26-41. ISBN 978-3-944342-47-4.</citation>

If we open the saved search with the "offending" umlaut (by clicking the link in the "Name of search" column), the search is executed and yields a result list.
You can then export the results by choosing an export plugin from the drop-down menu. All export plugins (including XMLforCMS2) work this way.

In the last column of the saved search table there is a special button that calls cgi/saved_search with savedsearch_id as a parameter.
This button and the saved_search CGI script seem to have been extended for us by EPrints Services.

Jens has opened a support case with Justin to check this script. We assume that the problem arises somewhere in the line

print $saved_search->make_searchexp->perform_search->export( $format );

when a "virtual" dataset is passed to the export plugin and there is an umlaut in the originating query.

This problem is not limited to the XMLforCMS2 export; it happens with any export format that is passed to the extended saved_search CGI script.
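
One way to probe this assumption outside EPrints is to check how the stored search value decodes. The following is only a minimal sketch in Python, not the Perl code of cgi/saved_search, and the Latin-1 mis-decoding is a hypothetical failure mode chosen for illustration, not a confirmed diagnosis. It shows that `Kr%C3%BCger%2C+G` is valid percent-encoded UTF-8, and that reading the same bytes with the wrong charset gives a string that no longer equals the creator's name, so a search on it could plausibly match nothing.

```python
from urllib.parse import unquote_plus, unquote_to_bytes

# Search value exactly as it appears in the stored savedsearch spec.
encoded = "Kr%C3%BCger%2C+G"

# Correct handling: percent-decode and interpret the bytes as UTF-8.
as_utf8 = unquote_plus(encoded, encoding="utf-8")

# Assumed (hypothetical) failure mode: the same bytes read as Latin-1.
raw = unquote_to_bytes(encoded.replace("+", " "))
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
print("same string:", as_utf8 == as_latin1)
```

If a comparable mismatch exists between the query string and the indexed names, it would affect every export format equally, which is consistent with what we observe.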

Best regards,


Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Winterthurerstr. 190
CH-8057 Zürich

mail: martin.braendle@id.uzh.ch
phone: +41 44 63 56705
fax: +41 44 63 54505


From: "Field A.N." <af05v@ecs.soton.ac.uk>
To: eprints-tech@ecs.soton.ac.uk
Date: 22/01/2015 15:56
Subject: [EP-tech] Re: {Disarmed} International characters in advanced search fail for XML-Export
Sent by: eprints-tech-bounces@ecs.soton.ac.uk

What happens if you export the record on the command line?

Adam Field
Business Relationship Manager and Community Lead
EPrints Services

On 20 Jan 2015, at 16:13, jens.vieler@id.uzh.ch wrote:

> Hi all,
> (using EPrints v3.3.12)
> We found strange behaviour in the combination Advanced Search / Saved Search / XML export in the context of international characters: if we use a saved search on an author/creator with German umlauts (international encoding), the XML export plugin returns an empty XML dataset. The database entry savedsearch|spec looks like valid UTF-8 to us (see the details below).
> Does anybody know this behaviour, or better, know how to fix it? :)
> Cheers
>  Jens
> In detail:
> 1.) Creating an Advanced Search for an author/creator WITHOUT German umlauts (e.g. "Vieler")
> - Database shows spec:
> ?plugin=Internal&searchid=advanced&dataset=archive&exp=0%7C1%7C-date%2Fcreators_name%2Ftitle%7Carchive%7C-%7Ccreators_name%3Acreators_name%3AALL%3AEQ%3AVieler%7C-%7Ceprint_status%3Aeprint_status%3AANY%3AEQ%3Aarchive%7Cmetadata_visibility%3Ametadata_visibility%3AANY%3AEQ%3Ashow
> - Screen-View:
> will be redirected to
> http://www.<eprint-server>.ch/cgi/search/archive/advanced?_action_search=1&dataset=archive&exp=0|1|-date%2Fcreators_name%2Ftitle|archive|-|creators_name%3Acreators_name%3AALL%3AEQ%3AVieler|-|eprint_status%3Aeprint_status%3AANY%3AEQ%3Aarchive&order=-date%2Fcreators_name%2Ftitle
> and works!
> - XML-Export for our CMS:
> will be redirected to
> https://www.<eprint-server>.ch/cgi/saved_search/export_zora_XMLforCMS2.xml?savedsearchid=<savedsearch_id>&_action_export=1&_output=XMLforCMS2
> and works!
> 2.) Creating an Advanced Search for an author/creator WITH German umlauts (e.g. "Krüger,G")
> - Database shows spec:
> ?plugin=Internal&searchid=advanced&dataset=archive&exp=0%7C1%7C-date%2Fcreators_name%2Ftitle%7Carchive%7C-%7Ccreators_name%2Feditors_name%3Acreators_name%2Feditors_name%3AALL%3AEQ%3AKr%C3%BCger%2C+G%7C-%7Ceprint_status%3Aeprint_status%3AANY%3AEQ%3Aarchive%7Cmetadata_visibility%3Ametadata_visibility%3AANY%3AEQ%3Ashow
> (so "Kr%C3%BCger" looks like plain, correct UTF-8 to me)
> - Screen-View:
> will be redirected to
> http://www.<eprint-server>.ch/cgi/search/archive/advanced?_action_search=1&dataset=archive&exp=0|1|-date%2Fcreators_name%2Ftitle|archive|-|creators_name%2Feditors_name%3Acreators_name%2Feditors_name%3AALL%3AEQ%3AKr%C3%BCger%2C+G|-|eprint_status%3Aeprint_status%3AANY%3AEQ%3Aarchive&order=-date%2Fcreators_name%2Ftitle
> and works!
> - XML-Export for our CMS:
> will be redirected to
> https://www.<eprint-server>.ch/cgi/saved_search/export_zora_XMLforCMS2.xml?savedsearchid=<savedsearch_id>&_action_export=1&_output=XMLforCMS2
> and fails, or rather: the result is empty:
> <?xml version="1.0" encoding="utf-8" ?>
> <eprints xmlns="
> </eprints>
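
To see what the saved search in case 2 actually asks for, the stored spec can be decoded and split into its parts. This is a minimal sketch in Python, not EPrints code: the helper name `split_search_spec` is hypothetical, and treating `|` as the separator inside `exp` is inferred from the URLs quoted above, not taken from EPrints documentation.

```python
from urllib.parse import parse_qs

def split_search_spec(spec: str) -> list[str]:
    """Decode a savedsearch spec string and split its 'exp' value on '|'."""
    params = parse_qs(spec.lstrip("?"), encoding="utf-8")
    return params["exp"][0].split("|")

# Spec copied from case 2 (search with an umlaut in the creator's name).
spec = (
    "?plugin=Internal&searchid=advanced&dataset=archive"
    "&exp=0%7C1%7C-date%2Fcreators_name%2Ftitle%7Carchive%7C-%7C"
    "creators_name%2Feditors_name%3Acreators_name%2Feditors_name"
    "%3AALL%3AEQ%3AKr%C3%BCger%2C+G%7C-%7C"
    "eprint_status%3Aeprint_status%3AANY%3AEQ%3Aarchive%7C"
    "metadata_visibility%3Ametadata_visibility%3AANY%3AEQ%3Ashow"
)

parts = split_search_spec(spec)
for part in parts:
    print(part)

# The last colon-separated item of the name condition is the search value.
name_value = parts[5].split(":")[-1]
print(repr(name_value))
```

Decoded this way, the name condition carries the value "Krüger, G", which supports the observation that the stored spec itself is well-formed UTF-8.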
> --
> Jens Vieler
> Informatikdienste
> Universität Zürich
> Winterthurerstr. 190
> CH-8057 Zürich
> mail:  jens.vieler@id.uzh.ch
> phone: +41 44 63 56777
> *** Options:
> *** Archive:
> *** EPrints community wiki:
> *** EPrints developers Forum:
