EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #02695



[EP-tech] Re: harvester (question)


Hi,

No, unfortunately, I already tried:
/appli/eprints/bin/harvest oatao --plugin=OAIPMH::OAI_TEF --conf=stub4
Are you sure you want to make bulk changes to the eprint table in the oatao repository [yes/no] ? yes
XPath error : Undefined namespace prefix
 error : xmlXPathCompiledEval: evaluation failed

Thank you for the idea.

Jean-Marie

On 03/03/2014 10:07, John Salter wrote:

Hi Jean-Marie,

I think it’s probably a namespace problem (see references below). If you try

$xml->findnodes( "//tef:auteur/tef:nom/*" )

do you get any results?

 

You could also do this via XSLT, if you have any experience with it.

I’m guessing it’s something like this you’re starting with: http://www.abes.fr/abes/documents/tef/recommandation/ex1_theseSimplePDF.xml

These might explain a bit more about namespaces:

http://stackoverflow.com/a/4083929/2455451

http://stackoverflow.com/questions/2673370/why-should-i-use-xpathcontext-with-perls-xmllibxml/2673452#2673452
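
As a rough illustration of the XPathContext approach those answers discuss, here is a minimal sketch. It assumes the tef prefix is bound to http://www.abes.fr/abes/documents/tef (check the xmlns:tef declaration in the harvested record) and uses a small inline record rather than real harvester output. The local-name() query at the end is an alternative that ignores namespaces altogether.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Assumption: namespace URI for the "tef" prefix; verify against the real record.
my $TEF_NS = 'http://www.abes.fr/abes/documents/tef';

# Hypothetical, trimmed-down record used only for this sketch.
my $record = <<"XML";
<tef:thesisAdmin xmlns:tef="$TEF_NS">
  <tef:auteur>
    <tef:nom>nom1</tef:nom>
  </tef:auteur>
</tef:thesisAdmin>
XML

my $doc = XML::LibXML->load_xml( string => $record );

# Bind the prefix explicitly so the XPath engine can resolve "tef:".
my $xpc = XML::LibXML::XPathContext->new( $doc );
$xpc->registerNs( tef => $TEF_NS );

my @noms = map { $_->textContent } $xpc->findnodes( '//tef:auteur/tef:nom' );

# Alternative: match on local names only, with no prefix to register.
my @fallback = map { $_->textContent }
    $doc->findnodes( '//*[local-name()="auteur"]/*[local-name()="nom"]' );

print "$_\n" for @noms;
```

Registering the prefix keeps the XPath readable; the local-name() form is more verbose but does not depend on any prefix binding.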

 

Cheers,

John

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Jean-Marie Le Bechec
Sent: 03 March 2014 08:18
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] harvester (question)

 

Hi Seb,

I need to harvest an OAI server in a format other than Dublin Core (the TEF format). I cannot work out how to extract specific metadata when several elements share the same name.

For example:
...
<tef:thesisAdmin>
                    <tef:auteur>
                      <tef:nom>nom1</tef:nom>
...

and
...
<tef:directeurThese>
                      <tef:nom>nom2</tef:nom>
                      <tef:prenom>Carine</tef:prenom>
                      <tef:autoriteInterne>MADS_DIRECTEUR_DE_THESE_1</tef:autoriteInterne>
                      <tef:autoriteExterne autoriteSource="Sudoc">073367826</tef:autoriteExterne>
                    </tef:directeurThese>
                    <tef:directeurThese>
                      <tef:nom>nom3</tef:nom>
                      <tef:prenom>Louise</tef:prenom>
                      <tef:autoriteInterne>MADS_DIRECTEUR_DE_THESE_2</tef:autoriteInterne>
                      <tef:autoriteExterne autoriteSource="Sudoc">035036672</tef:autoriteExterne>
                    </tef:directeurThese>
...
in the same record!

I need to extract all this data.

I tried things like:

my $nom;
foreach my $node ( $xml->findnodes( "//auteur/nom/*" ) )
{
    $nom = $node->textContent;
}

but it does not work (no results).

Any ideas?
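
One possible way to collect every author and thesis supervisor from a single record is sketched below. It is only an illustration under stated assumptions: the tef prefix is taken to map to http://www.abes.fr/abes/documents/tef, the inline record is a trimmed-down stand-in with an assumed nesting, and the hash keys family and given are names chosen for this example. Each tef:directeurThese element is used as the context node for relative queries, so repeated tef:nom elements stay attached to the right person.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Assumption: namespace URI for the "tef" prefix; verify against the real record.
my $TEF_NS = 'http://www.abes.fr/abes/documents/tef';

# Hypothetical, trimmed-down record; the nesting is assumed for illustration.
my $record = <<"XML";
<tef:thesisAdmin xmlns:tef="$TEF_NS">
  <tef:auteur>
    <tef:nom>nom1</tef:nom>
  </tef:auteur>
  <tef:directeurThese>
    <tef:nom>nom2</tef:nom>
    <tef:prenom>Carine</tef:prenom>
  </tef:directeurThese>
  <tef:directeurThese>
    <tef:nom>nom3</tef:nom>
    <tef:prenom>Louise</tef:prenom>
  </tef:directeurThese>
</tef:thesisAdmin>
XML

my $doc = XML::LibXML->load_xml( string => $record );
my $xpc = XML::LibXML::XPathContext->new( $doc );
$xpc->registerNs( tef => $TEF_NS );

# Select <tef:nom> itself: a trailing "/*" would ask for its child
# elements, and it contains only text.
my @authors = map { $_->textContent } $xpc->findnodes( '//tef:auteur/tef:nom' );

# Keep one entry per supervisor instead of overwriting a single scalar.
my @supervisors;
foreach my $node ( $xpc->findnodes( '//tef:directeurThese' ) ) {
    push @supervisors, {
        family => $xpc->findvalue( 'tef:nom',    $node ),
        given  => $xpc->findvalue( 'tef:prenom', $node ),
    };
}

print "author: $_\n" for @authors;
print "supervisor: $_->{given} $_->{family}\n" for @supervisors;
```

Pushing each result onto an array avoids the situation in which a single scalar such as $nom only ever holds the last match.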


Thanks!

Jean-Marie


-- 
 
***********************************************
Jean Marie Le Bechec
Service Commun de la Documentation
Responsable ingenierie documentaire
&
Direction du Systeme d'Information
Referent Etudes
 
Institut National Polytechnique de Toulouse
6 allee Emile Monso - bp 34038 -
31029 Toulouse cedex 4
Tel : 05 34 32 31 16
Mail : lebechec@inp-toulouse.fr
*********************************************** 


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/
