EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #06717
Re: [EP-tech] Making a static copy of an EPrints repo
- To: eprints-tech@ecs.soton.ac.uk
- Subject: Re: [EP-tech] Making a static copy of an EPrints repo
- From: Yuri <yurj@alfa.it>
- Date: Tue, 18 Jul 2017 12:33:29 +0200
I would use:

    wget --no-parent \
         --no-check-certificate \
         --html-extension \
         --convert-links \
         --restrict-file-names=windows \
         --recursive \
         --level=inf \
         -N \
         --page-requisites \
         -e robots=off \
         --wait=0 \
         --quota=inf \
         http://my.repo/

I think --convert-links will take care of converting the links.

On 18/07/2017 11:04, Ian Stuart wrote:
I need to make a read-only, static, copy of an old repo (the hardware is dying, the installation was heavily tailored for the environment, and I don't have the time to re-create in a new environment.)

I can grab all the active pages:

    wget --local-encoding=UTF-8 --remote-encoding=UTF-8 --no-cache --mirror -nc -k http://my.repo/

This is good, however it doesn't edit all the absolute URLs in the view pages, so we need to modify them:

    find my.repo -type f -exec sed -i 's_http://my.repo/_/_g' {} +

However this leaves me with the problem that the http://my.repo/nnn/ pages haven't been pulled down!

Any suggestions on how to do this?

Cheers
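
If the crawl still misses the http://my.repo/nnn/ pages, one possible workaround is to build the list of those URLs yourself and hand it to wget. This is only a rough sketch, not something tested against this repo: the function name eprint_urls, the file name eprint-urls.txt and the upper ID of 5000 are all made up for illustration, and it assumes the eprint IDs are plain integers starting at 1. IDs that do not exist should simply come back as 404 and be skipped.

```shell
#!/bin/sh
# Hypothetical helper: print one abstract-page URL per eprint ID.
# $1 = base URL of the repo, $2 = highest eprint ID (an assumption;
# look it up in your own repo before running this).
eprint_urls() {
    base=${1%/}          # drop a trailing slash, if present
    max_id=$2
    i=1
    while [ "$i" -le "$max_id" ]; do
        printf '%s/%s/\n' "$base" "$i"
        i=$((i + 1))
    done
}

# Usage (needs network access, so it is left commented out here):
#   eprint_urls http://my.repo 5000 > eprint-urls.txt
#   wget --input-file=eprint-urls.txt \
#        --force-directories \
#        --page-requisites \
#        --convert-links \
#        --adjust-extension \
#        --wait=1
```

--force-directories keeps each page under its own my.repo/nnn/ directory instead of piling up index.html copies in one place. To also fetch the documents attached to each eprint, adding --recursive --level=1 --no-parent may be enough, but that depends on how the abstract pages link to the files.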
- References:
- [EP-tech] Making a static copy of an EPrints repo
- From: Ian Stuart <Ian.Stuart@ed.ac.uk>
- [EP-tech] Making a static copy of an EPrints repo