EPrints Technical Mailing List Archive
Message: #02085
[EP-tech] Re: Memory usage in 3.2, Sword 1.3 and epdata packages
- To: eprints-tech@ecs.soton.ac.uk
- Subject: [EP-tech] Re: Memory usage in 3.2, Sword 1.3 and epdata packages
- From: Ian Stuart <Ian.Stuart@ed.ac.uk>
- Date: Fri, 12 Jul 2013 08:26:45 +0100
With no real knowledge, and certainly no investigation... I would suspect the problem is actually with how the base64 files are handled, rather than being an EPrints memory leak per se.
From the SWORD importers I've written, the process seems to be to
1) read in the deposit
2) unpack the deposit (zip into disk space, XML into memory)
3) create the eprint object
4) attach the files
5) write everything out

So I would suspect that what's happening is that all your base64 files are created (in memory) from the XML (which is also in memory).
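To illustrate the suspicion above, here is a rough sketch in Python (not EPrints code, which is Perl, and not how the SWORD importer is actually written). The function names `encoded_size` and `decode_base64_stream` are made up for this example. It shows how much larger a base64 payload is than the raw file, and how a decoder could write to disk in small chunks instead of building the whole decoded file in memory.

```python
import base64
import io


def encoded_size(raw_bytes: int) -> int:
    """Length of standard padded base64 output (no line breaks) for raw_bytes of input."""
    return 4 * ((raw_bytes + 2) // 3)


def decode_base64_stream(src, dst, chunk_chars=64 * 1024):
    """Decode base64 from binary file-like src into dst using bounded memory.

    Only one chunk (plus up to 3 carried-over characters) is held at a time.
    Returns the number of decoded bytes written.
    """
    carry = b""
    total = 0
    while True:
        chunk = src.read(chunk_chars)
        if not chunk:
            break
        # Drop newlines and other whitespace, then prepend leftover characters.
        data = carry + b"".join(chunk.split())
        # base64 decodes in groups of 4 characters; keep any remainder for later.
        usable = len(data) - (len(data) % 4)
        carry = data[usable:]
        decoded = base64.b64decode(data[:usable])
        dst.write(decoded)
        total += len(decoded)
    if carry:
        raise ValueError("truncated base64 input")
    return total


if __name__ == "__main__":
    raw = 200 * 1024 * 1024  # the ~200MB file from the original report
    print("base64 size in bytes:", encoded_size(raw))

    payload = b"example document bytes" * 1000
    out = io.BytesIO()
    decode_base64_stream(io.BytesIO(base64.encodebytes(payload)), out)
    print("round trip ok:", out.getvalue() == payload)
```

The encoded form is about a third larger than the raw file, so if the XML, the base64 text and the decoded file all sit in memory together, a single deposit is held several times over. Whether the importer really does this would need checking against the code.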
On 12/07/13 03:57, Mark Gregson wrote:
We’re using SWORD with epdata packages to deposit documents and multimedia into our repository (3.2). This works fine for small file sizes, but CPU and memory usage increases quickly until, with a ~200MB file, the httpd process consumes all available memory and dies. This is on a RHEL5 64-bit box with 8GB memory and a separate DB server. Clearly, the epdata format is not the most appropriate for files of this size, because base64 encoding increases the file size and because the document is embedded within the XML. Changing package format may alleviate or resolve the problem, but as this is definitely going to be a challenge in our environment I’m hoping it will be easier to deal with the issue within EPrints. Note, I’ve already ascertained that it is not related to libxml2’s XML_PARSE_HUGE option being disabled; the failure occurs trying to run df. I’m about to start hunting for memory leaks and then doing additional memory profiling. If anyone has any suggestions about likely locations for memory leaks in the code, information about expected memory usage for SWORD with epdata packages, data from previous profiling, etc., it would be very valuable.
--
Ian Stuart.
Developer: ORI, RJ-Broker, and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.
http://edina.ac.uk/
- Follow-Ups:
- [EP-tech] Re: Memory usage in 3.2, Sword 1.3 and epdata packages
- From: Tim Brody <tdb2@ecs.soton.ac.uk>
- References:
- [EP-tech] Memory usage in 3.2, Sword 1.3 and epdata packages
- From: Mark Gregson <mark.gregson@qut.edu.au>