EPrints Technical Mailing List Archive
Message: #00655
[EP-tech] Re: Base64 decoding in 3.3
- To: eprints-tech@ecs.soton.ac.uk
- Subject: [EP-tech] Re: Base64 decoding in 3.3
- From: Tim Brody <tdb2@ecs.soton.ac.uk>
- Date: Wed, 30 May 2012 15:07:22 +0100
Hi,

EPrints assumes a line length of 77 (76 chars + LF). %4 will break if the returned data happens to fall over a line break + 2 chars.

Here is hopefully a comprehensive fix:
http://trac.eprints.org/eprints/changeset/7764

It ignores all whitespace and consumes modulo 4 chars for each chunk.

If you are talking to an EPrints instance that doesn't have this fix you will need to format your Base64 into 76+LF lines. I would like to say you should be doing this anyway, but I missed the CR that the spec defines!:
http://en.wikipedia.org/wiki/Base64#Implementations_and_history

(So it ought to be 76+CR+LF, i.e. modulo 78)

/Tim.

On Wed, 2012-05-30 at 14:06 +0100, James Colhoun wrote:
> Hi Tim,
>
> I have sent you the files. I have also been able to fix it. Within
> File.pm I have changed "sub characters" FROM:
>
> print $tmpfile MIME::Base64::decode_base64( substr($_,0,length($_) - length($_)%77) );
>
> TO
>
> print $tmpfile MIME::Base64::decode_base64( substr($_,0,length($_) - length($_)%4) );
>
> This seems to stop the chunking from breaking up individual bytes and
> causing the problem. I am still testing this but it would be great to
> know what you think.
>
> Jim
>
> -----eprints-tech-bounces@ecs.soton.ac.uk wrote: -----
> To: eprints-tech@ecs.soton.ac.uk
> From: Tim Brody
> Sent by: eprints-tech-bounces@ecs.soton.ac.uk
> Date: 05/29/2012 03:50PM
> Subject: [EP-tech] Re: Base64 decoding in 3.3
>
> On Tue, 2012-05-29 at 12:18 +0100, James Colhoun wrote:
> > Hi,
> >
> > I am uploading publications via SWORD. Full-text files are added to
> > the upload XML and encoded in Base64. This worked fine until we
> > upgraded to 3.3. Now we get errors in the log:
> >
> > failed: expected 3151 bytes but actually got 3149 bytes
> >
> > So it seems the decoding of Base64 is no longer working correctly.
> > Inside EPrints/DataObj/File.pm the functions end_element, characters
> > and start_element seem to create a tmp file that is corrupt. If I
> > add a write to file inside "sub characters" (see below) the PDF is
> > created correctly, so I know the data is passed in correctly. There
> > seems to be something fundamentally broken with the way the decoding
> > to tmpfile is working. Has anyone seen this or have a fix for it?
>
> Hi,
>
> I can't replicate this. I did find a bug in XMLFiles for *producing*
> base64 encoded files, fixed by this:
> http://trac.eprints.org/eprints/ticket/4057
>
> This could be an edge case - can you post your XML somewhere or email
> it to me directly (if not too big)?
>
> --
> All the best,
> Tim
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
>
> [attachment "signature.asc" removed by James Colhoun/sisjc5/CardiffUniversity]
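To illustrate the point about chunk boundaries, here is a minimal Perl sketch of the approach described above (ignore all whitespace, decode only whole groups of 4 characters, carry the remainder into the next chunk). It is not the code from changeset 7764: `make_chunk_decoder`, the sample payload and the chunk split are all hypothetical and exist only for this demonstration. The only library calls used are `encode_base64` and `decode_base64` from the core `MIME::Base64` module.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper, not the EPrints implementation: returns a closure that
# decodes Base64 text delivered in arbitrary chunks.
sub make_chunk_decoder {
    my $pending = '';    # characters not yet decoded, carried between calls
    return sub {
        my ($chunk) = @_;
        $chunk =~ s/\s+//g;    # drop CR, LF and any other whitespace
        $pending .= $chunk;
        # Decode only complete 4-character groups; keep the rest for later.
        my $usable = length($pending) - length($pending) % 4;
        return decode_base64( substr( $pending, 0, $usable, '' ) );
    };
}

# Sample payload: 300 bytes, encoded as 76-character lines ending in LF.
my $data    = join '', map { chr( $_ % 256 ) } 0 .. 299;
my $encoded = encode_base64($data);

# Split so the first chunk ends 2 characters past a line break (76 + LF + 2).
my @chunks = ( substr( $encoded, 0, 79 ), substr( $encoded, 79 ) );

# Trimming each chunk to a multiple of 4 counts the LF and discards data.
my $naive = join '',
    map { decode_base64( substr( $_, 0, length($_) - length($_) % 4 ) ) } @chunks;

# Stripping whitespace first and carrying the remainder keeps every character.
my $decode = make_chunk_decoder();
my $fixed  = join '', map { $decode->($_) } @chunks;

printf "naive: %s\n", $naive eq $data ? 'intact' : 'corrupted';
printf "fixed: %s\n", $fixed eq $data ? 'intact' : 'corrupted';
```

Run as written, the naive per-chunk `% 4` trim should report the payload as corrupted, while the carry-over decoder should report it as intact. Because whitespace is removed before counting, the same closure copes with LF, CR+LF, or unwrapped input.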