EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #08877
Re: [EP-tech] apostrophe in file names of uploaded/deposited file
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
- Subject: Re: [EP-tech] apostrophe in file names of uploaded/deposited file
- From: Tomasz Neugebauer <Tomasz.Neugebauer@concordia.ca>
- Date: Tue, 1 Mar 2022 16:03:49 +0000
Hi everyone,
This might be useful for others. I solved the issue with a couple of regular expressions:
$filename =~ s/\x27/=0027/g;
$filename =~ s/\x22/=0022/g;
to replace the single quote and double quote in the value returned by this call:
file->get_value("filename")
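As a minimal, self-contained sketch of those two substitutions, the snippet below wraps them in a helper and runs them on a sample string. The sub name escape_quotes and the sample filename are hypothetical and are not part of EPrints; in a real repository the input would be whatever get_value("filename") returns.

```perl
use strict;
use warnings;

# Hypothetical helper (not an EPrints function): applies the two
# substitutions quoted above to a filename string.
sub escape_quotes
{
    my ($filename) = @_;

    $filename =~ s/\x27/=0027/g;    # apostrophe (U+0027)
    $filename =~ s/\x22/=0022/g;    # double quote (U+0022)

    return $filename;
}

# Sample value standing in for the "filename" field of a file object.
my $filename = qq{author's "final" draft.pdf};
print escape_quotes($filename), "\n";
```

Because each quote is replaced by its code point rather than dropped, the original character can still be read back from the escaped name.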
From a digital preservation perspective, I think it is significant to note that "filename" in this object does not necessarily refer to the "filename" on disk.
Is there a function or property in EPrints objects that returns the filename exactly as it is on the filesystem? If so, what is it?
Tomasz
From: David R Newman <drn@ecs.soton.ac.uk>
Sent: Sunday, February 20, 2022 4:28 PM
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>; Tomasz Neugebauer <Tomasz.Neugebauer@concordia.ca>
Subject: Re: [EP-tech] apostrophe in file names of uploaded/deposited file
Hi Tomasz,

There are two ways to work round this issue. One has been in EPrints for quite a while; the other I introduced in 3.4.3 to help deal with this issue retrospectively.

1. https://wiki.eprints.org/w/Optional_filename_sanitise.pl allows you to set characters that should be removed before a filename is recorded in the database or saved to disk. I have to admit I did not know about this until fairly recently, so I have not tested how well it will work or solve your problem. If you look at /opt/eprints3/lib/cfg.d/optional_filename_sanitise.pl there is a function that can be added under $c->{optional_filename_sanitise}. The default (albeit commented out) function will convert white space, brackets and @ signs into underscores. You could add a line like the one below to deal with apostrophes.

$filepath =~ s!\x27!_!g;

2. The new functionality I added for 3.4.3 allows files on disk to be found under the filename <fileid>.bin. This allows you to fix this sort of issue by renaming the file on disk to <fileid>.bin. You can also enable it so that future files are automatically saved in the format <fileid>.bin by setting:

$c->{generic_filenames} = 1;

I would probably advise against doing this on a live repository, especially if you have unusual uploads, such as uploading multiple files at once through "Upload from URL". If you want to test this on a development repo, then please do, as any real-world-ish feedback on this feature would be useful.

Regards

David Newman

On 20/02/2022 20:32, Tomasz Neugebauer via Eprints-tech wrote: