EPrints Technical Mailing List Archive
Message: #06382
[EP-tech] Scripted XML download?
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
- Subject: [EP-tech] Scripted XML download?
- From: Andy Reid <Andy.REID@lshtm.ac.uk>
- Date: Mon, 27 Mar 2017 13:51:32 +0000
Hi,

I do some checking, analysis and visualisation of our repository in a third-party package, which I have set up to ingest EPrints XML. I'd like to update this once a week or so, but if I download everything in one go it takes about 3 hours, comes to 1.5GB, and tends to fail halfway through. I have been doing it manually one year at a time, but that means 17 separate manual search-and-download operations, each taking ten minutes or so. I don't have shell access to the server, so I can't script it from the command line.
I have looked at the search page, but after a search the download form references a cached search ID, so I can't just copy the URL from the download form.
Can anyone give me a template for a URL that would work in a single pass in wget or libwww, which I could then run from cron to fetch the EPXML? Obviously I would have to be able to authenticate as well.

Andy Reid
Research Information Manager
Executive Office, Room G40a
London School of Hygiene and Tropical Medicine
Keppel St, LONDON, WC1E 7HT
0207-927-2618 (Internal/Teleworker x2618)
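To make the request concrete, here is a minimal sketch of what a scripted year-at-a-time fetch might look like. It is not a confirmed EPrints recipe: the host name is a placeholder, the per-year export path is an assumption about how the repository exposes its "browse by year" exports, and HTTP Basic authentication is assumed only for illustration (a repository that logs users in through a form and a cookie would need a different approach). The helper names year_urls, fetch and download_all are hypothetical.

```python
import base64
import urllib.request

# Placeholder host; replace with the real repository.
BASE_URL = "https://repository.example.org"
# ASSUMPTION: a per-year export URL of this shape exists on the repository.
EXPORT_PATH = "/cgi/exportview/year/{year}/XML/{year}.xml"


def year_urls(first_year, last_year, base_url=BASE_URL):
    """Build one export URL per year, inclusive of both end years."""
    return [
        base_url + EXPORT_PATH.format(year=year)
        for year in range(first_year, last_year + 1)
    ]


def fetch(url, dest, user=None, password=None, chunk_size=1 << 20):
    """Stream one export to disk in chunks, so a large file is never held in memory."""
    request = urllib.request.Request(url)
    if user is not None:
        # ASSUMPTION: the server accepts HTTP Basic credentials.
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=600) as response, open(dest, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)


def download_all(first_year, last_year, user=None, password=None):
    """Fetch each year into its own file, e.g. 2016.xml; a failure costs one year, not the whole run."""
    for url in year_urls(first_year, last_year):
        fetch(url, url.rsplit("/", 1)[-1], user, password)
```

Calling download_all(2001, 2017) from a small script run by cron would replace the 17 manual operations with 17 short requests, provided the assumed URL pattern and authentication scheme match the server.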