EPrints Technical Mailing List Archive
Message: #05498
[EP-tech] Antwort: Speed up citation
- To: eprints-tech@ecs.soton.ac.uk
- Subject: [EP-tech] Antwort: Speed up citation
- From: jens.vieler@id.uzh.ch
- Date: Wed, 16 Mar 2016 08:47:24 +0100
Dear List,
In addition to my last post: we pre-rendered the citation with render_citation() for all EPrints and stored the result in a DB field "citation" on each EPrint. Then we ran a small benchmark comparing the time needed to export the DB field against calling render_citation() on the fly. Here are the results for different collections:
- collection id 11124 (ca. 100 EPrints): buildlist 0.004 sec, render_citation() 1.4 sec, DB field "citation" 0.19 sec
- collection id 10046 (ca. 1000 EPrints): buildlist 0.02 sec, render_citation() 13 sec, DB field "citation" 1.4 sec
- collection id 10170 (ca. 25000 EPrints): buildlist 0.39 sec, render_citation() 357.9 sec, DB field "citation" 38.9 sec
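The ratio between on-the-fly rendering and reading the stored field can be recomputed from the timings listed above. The following is only a small arithmetic sketch; the `speedup` helper and the `@runs` layout are made up for illustration, and the buildlist time is left out because it is negligible in all three runs.

```perl
use strict;
use warnings;

# Timings in seconds, copied from the three benchmark lines above:
# [ collection id, render_citation() on the fly, stored DB field ]
my @runs = (
    [ 11124, 1.4,   0.19 ],
    [ 10046, 13,    1.4  ],
    [ 10170, 357.9, 38.9 ],
);

# Speed-up factor = on-the-fly time / stored-field time.
# (Hypothetical helper, named for this sketch only.)
sub speedup {
    my ($on_the_fly, $stored) = @_;
    return $on_the_fly / $stored;
}

for my $run (@runs) {
    my ($id, $on_the_fly, $stored) = @$run;
    printf "collection %d: %.1fx faster\n", $id, speedup($on_the_fly, $stored);
}
# prints:
# collection 11124: 7.4x faster
# collection 10046: 9.3x faster
# collection 10170: 9.2x faster
```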
All in all, exporting pre-generated citations is roughly 7 to 9 times faster by the figures above. So we are doing an initial pre-fill of the DB field "citation", and we still have to add an update of that field to every workflow edit step.
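The pre-fill plus update-on-edit idea can be sketched independently of EPrints. This is a minimal illustration, not the actual implementation: `%citation_field` stands in for the stored "citation" DB field, and `render_slow`, `citation_for` and `update_record` are hypothetical names. In a real repository the body of `render_slow` would be the render_citation() call and the store would be the EPrint's own field.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the stored "citation" field: eprint id => text.
my %citation_field;

# Counts how often the expensive renderer actually runs.
my $render_calls = 0;

# Hypothetical slow renderer; in EPrints this is where
# EPrints::Utils::tree_to_utf8($dataobj->render_citation("default")) would go.
sub render_slow {
    my ($record) = @_;
    $render_calls++;
    return "$record->{creators} ($record->{year}) $record->{title}.";
}

# Read path used by the export: return the stored text, render only on a miss.
sub citation_for {
    my ($record) = @_;
    my $id = $record->{eprintid};
    $citation_field{$id} = render_slow($record)
        unless defined $citation_field{$id};
    return $citation_field{$id};
}

# Write path, to be called from every edit step: change the record and
# refresh the stored citation so the export never sees stale text.
sub update_record {
    my ($record, %changes) = @_;
    $record->{$_} = $changes{$_} for keys %changes;
    $citation_field{ $record->{eprintid} } = render_slow($record);
    return $record;
}

# Usage with made-up sample data.
my $rec = { eprintid => 1, creators => "Doe, J.", year => 2016, title => "An example" };
citation_for($rec) for 1 .. 1000;    # renders once, then reads the stored text
update_record($rec, year => 2017);   # one more render, triggered by the edit
print citation_for($rec), "\n";      # prints "Doe, J. (2017) An example."
```

The point of the design is that the cost of rendering moves from the export (once per item per request) to the edit step (once per change), which is the trade-off described above.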
Does anybody have comments, experience or hints on how to speed up render_citation() in another way? Does anybody use COinS in a similar way?
Cheers
Jens
--
Jens Vieler
Informatikdienste
Universität Zürich
Stampfenbachstrasse 73
CH-8006 Zürich
mail: jens.vieler@id.uzh.ch
phone: +41 44 63 56777
http://www.id.uzh.ch
From: Jens-Patrick Vieler/at/UZH
To: eprints-tech@ecs.soton.ac.uk
Date: 24.02.2016 13:51
Subject: Speed up citation
Dear List
We're building a new cgi/export-plugin solution to support publication lists on our website. It runs some searches, builds some lists, combines them (remainder/union/intersect) and finally exports a kind of XML that includes metadata and a citation.
Everything works quite well, but when the result turns into a long list we run into performance problems.
We did some benchmarking; here is the result for a typical 1000-item list:
- building up the result takes 15 sec in total:
  - search and generate lists: 0.015 sec
  - merge lists: 0.08 sec
  - generation of export data: 14 sec
Congrats: dealing with searches and lists is very fast within EPrints :-)
So we took a closer look at what happens while building the output.
First of all: the time grows linearly with the number of render_citation() calls.
- 1 call of '$citation = EPrints::Utils::tree_to_utf8($dataobj->render_citation("default"));' per item takes 14 sec
- 2 calls of '$citation = EPrints::Utils::tree_to_utf8($dataobj->render_citation("default"));' per item take 29 sec
- 4 calls of '$citation = EPrints::Utils::tree_to_utf8($dataobj->render_citation("default"));' per item take 54 sec
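With 14 sec for 1000 items, one render_citation() call costs roughly 14 ms. A measurement like the one above could be reproduced with a small harness along these lines. It is only a sketch: `time_per_item` and the placeholder `$render` workload are hypothetical, and the placeholder body would need to be replaced with the real render_citation() call to get meaningful numbers.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Run $code $repeat times for every item and return the elapsed seconds.
sub time_per_item {
    my ($items, $repeat, $code) = @_;
    my $start = [gettimeofday];
    for my $item (@$items) {
        $code->($item) for 1 .. $repeat;
    }
    return tv_interval($start);
}

# Hypothetical placeholder workload; replace the body with the real
# render_citation() call to reproduce the benchmark.
my $calls  = 0;
my $render = sub {
    my ($item) = @_;
    $calls++;
    return "citation for item $item";
};

my @items = (1 .. 1000);
for my $repeat (1, 2, 4) {
    my $elapsed = time_per_item(\@items, $repeat, $render);
    printf "%d call(s) per item: %.3f sec\n", $repeat, $elapsed;
}
```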
Second: it gets faster when the citation XML file (default) is reduced to a minimum; when it only includes the title, the export takes 2 sec for 1000 items.
So my question is: is there a way to speed up render_citation()? Does it always interpret the whole citation XML file? Has anybody thought about a way to compile the XML to a Perl routine, or to cache things like this?
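To make the "compile to a Perl routine" idea concrete, here is a minimal sketch of parsing a template once and reusing the resulting closure for every item. It does not show how EPrints processes citation files: the `{field}` placeholder syntax and the `compile_template` function are invented for this illustration, and real citation XML (with conditionals and nested rendering) would need a far richer compiler.

```perl
use strict;
use warnings;

# Compile a template such as "{creators} ({year}) {title}." into a closure.
# The {field} syntax is invented for this sketch; it is not the EPrints
# citation XML format.
sub compile_template {
    my ($template) = @_;
    my @parts;
    # Parse once: split into literal text and {field} references.
    for my $chunk (split /(\{\w+\})/, $template) {
        next if $chunk eq '';
        if ($chunk =~ /^\{(\w+)\}$/) {
            push @parts, { field => $1 };
        }
        else {
            push @parts, { text => $chunk };
        }
    }
    # The returned routine only walks the pre-parsed parts.
    return sub {
        my ($record) = @_;
        my $out = '';
        for my $part (@parts) {
            $out .= exists $part->{field}
                ? ($record->{ $part->{field} } // '')
                : $part->{text};
        }
        return $out;
    };
}

# Compile once, then apply to as many records as needed (sample data is made up).
my $cite = compile_template("{creators} ({year}) {title}.");
print $cite->({ creators => "Doe, J.", year => 2016, title => "An example" }), "\n";
# prints "Doe, J. (2016) An example."
```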
Any help is welcome
Jens