EPrints Technical Mailing List Archive
Message: #04547
[EP-tech] digital preservation - indexing errors
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
- Subject: [EP-tech] digital preservation - indexing errors
- From: Tomasz Neugebauer <Tomasz.Neugebauer@concordia.ca>
- Date: Thu, 30 Jul 2015 15:42:55 +0000
We just completed an upgrade of our repository, which included re-indexing all of the contents. It was a good opportunity to take note of the errors that come up during indexing.

Here is a list of the common errors that occurred during indexing:

1. Error: Illegal entry in bfrange block in ToUnicode CMap
2. Error: Invalid Font Weight
3. Error (##): Illegal character <##> in hex string
4. Error: Can't create transform
5. Error: Couldn't link the profiles

There are also some of these:

6. Use of uninitialized value $data in substr at /opt/eprints3/tools/../perl_lib/Text/Extract/Word.pm line 68.
   Use of uninitialized value $magic in numeric eq (==) at /opt/eprints3/tools/../perl_lib/Text/Extract/Word.pm line 69.
   Use of uninitialized value $magic in sprintf at /opt/eprints3/tools/../perl_lib/Text/Extract/Word.pm line 69.
   This does not seem to be a Word document, but it is pretending to be one: 0 at /opt/eprints3/tools/doc2txt line 68
   Error 255 from doc2txt command: […]

Errors #1 and #3 look to be the most common.

Have you encountered these types of indexing errors? How serious are they in terms of digital preservation? Do you use any specific strategies or workflows for dealing with them? Do the EPrints preservation plugins (http://files.eprints.org/696/) help with identifying or solving these issues?

Thanks for any comments or suggestions about this.

Tomasz