EPrints Technical Mailing List Archive
Message: #06718
Re: [EP-tech] Making a static copy of an EPrints repo
- To: eprints-tech@ecs.soton.ac.uk
- Subject: Re: [EP-tech] Making a static copy of an EPrints repo
- From: Matthew Kerwin <matthew@kerwin.net.au>
- Date: Tue, 18 Jul 2017 20:43:29 +1000
On 18 July 2017 at 19:04, Ian Stuart &lt;Ian.Stuart@ed.ac.uk&gt; wrote:
> I need to make a read-only, static copy of an old repo (the hardware is
> dying, the installation was heavily tailored for the environment, and I
> don't have the time to re-create it in a new environment.)
>
> I can grab all the active pages:
>
>   wget --local-encoding=UTF-8 --remote-encoding=UTF-8 --no-cache \
>     --mirror -nc -k http://my.repo/
>
> This is good; however, it doesn't edit all the absolute URLs in the view
> pages, so we need to modify them:
>
>   find my.repo -type f -exec sed -i 's_http://my.repo/_/_g' {} +
>
> However, this leaves me with the problem that the http://my.repo/nnn/
> pages haven't been pulled down!
>
> Any suggestions on how to do this?
>
> Cheers

Depends how many records there are, and how sparse. Do you have a
sitemap? It might be worth parsing that and fetching them one by one.

If you're desperate, there's always:

  for id in {1..12345} ; do wget --etc http://my.repo/$id ; done

Cheers
--
Matthew Kerwin
http://matthew.kerwin.net.au/
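[Editor's note: a minimal sketch of the sitemap approach suggested above,
assuming the repository exposes a standard sitemap at
http://my.repo/sitemap.xml and that GNU grep (with -P / PCRE support) is
available. It reuses the encoding flags from the original wget call;
adjust the sitemap URL and flags to suit the actual repository.]

  # Extract every <loc> URL from the sitemap, then mirror each page,
  # fetching its requisites (-p) and converting links for local use (-k).
  wget -qO- http://my.repo/sitemap.xml \
    | grep -oP '(?<=<loc>)[^<]+' \
    | while read -r url ; do
        wget --local-encoding=UTF-8 --remote-encoding=UTF-8 --no-cache \
          -nc -k -p "$url"
      done

This avoids guessing at record IDs: only pages the repository itself
advertises get fetched, so sparse ID ranges cost nothing.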
- References:
- [EP-tech] Making a static copy of an EPrints repo
- From: Ian Stuart <Ian.Stuart@ed.ac.uk>