EPrints Technical Mailing List Archive
Message: #06253
Re: [EP-tech] Import problems!
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
- Subject: Re: [EP-tech] Import problems!
- From: Andrew Beeken <anbeeken@lincoln.ac.uk>
- Date: Wed, 8 Feb 2017 11:11:51 +0000
Further down the rabbit hole… I’ve tried a full archive import (without files! I don’t have an infinite amount of space on my dev environment!) and it’s imported around 6500 records from about 14500 – so less than half!

I’ve also done some searching on the tech list and found this post: http://www.eprints.org/tech.php/13616.html which suggests that the errors are coming from non-ASCII characters in the filenames etc.

Going back to my text export: https://drive.google.com/open?id=0B67FaE28LeB-c21LZ0Y5YmNHRTQ I’ve taken a look at the three records that won’t import. Now, on the basis of that post I removed the document details from one and – the record imported. So the issue lies in the documents but not ALL of the documents. The document from that test batch that imported had no spaces in the filename so – perhaps that’s something?

The one annoying factor in this is that even with web imports off the import script still fails at these records – I would have expected it to have created the metadata record but ignore the file, but that’s not the case.

Has anyone encountered this particular snafu? I feel like I’m getting close to figuring this all out, though!

Andrew
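One way to test the filename theory before re-running the import is to list every document filename in the export that contains a space or a non-ASCII character. The following is only a rough sketch: it assumes the export is EPrints XML with `<eprint>`, `<eprintid>` and `<filename>` elements (matched by local name, so the namespace does not matter), and the sample records and IDs are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real export would be read from a file instead.
SAMPLE = """<eprints xmlns="http://eprints.org/ep2/data/2.0">
  <eprint><eprintid>1</eprintid><documents><document><files>
    <file><filename>paper.pdf</filename></file></files></document></documents></eprint>
  <eprint><eprintid>2</eprintid><documents><document><files>
    <file><filename>Proof copy.pdf</filename></file></files></document></documents></eprint>
  <eprint><eprintid>3</eprintid><documents><document><files>
    <file><filename>résumé.pdf</filename></file></files></document></documents></eprint>
</eprints>"""


def local(tag):
    """Return the tag name with any '{namespace}' prefix removed."""
    return tag.rsplit("}", 1)[-1]


def suspicious_filenames(xml_text):
    """List (eprintid, filename, reasons) for names with spaces or non-ASCII."""
    root = ET.fromstring(xml_text)
    hits = []
    for eprint in root.iter():
        if local(eprint.tag) != "eprint":
            continue
        # The eprintid is assumed to be a direct child of <eprint>.
        eprintid = next(
            (child.text for child in eprint if local(child.tag) == "eprintid"), None
        )
        for node in eprint.iter():
            if local(node.tag) != "filename" or not node.text:
                continue
            name = node.text.strip()
            reasons = []
            if " " in name:
                reasons.append("space")
            if not name.isascii():
                reasons.append("non-ascii")
            if reasons:
                hits.append((eprintid, name, reasons))
    return hits


if __name__ == "__main__":
    for eprintid, name, reasons in suspicious_filenames(SAMPLE):
        print(eprintid, repr(name), ",".join(reasons))
```

If the failing records all show up in this list and the importing ones do not, that would support the filename theory; it would not by itself prove it.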
On Behalf Of Andrew Beeken

Ah, cool! Thanks! All sorted on that front and added “Export/Import Subjects…” into my checklist for migration! Now, if I could figure out what’s up with those records I think I’m about there…
On Behalf Of Lizz Jennings

Not sure on how you work on the first half of that, but there’s a helpful video on working with subject trees: https://wiki.eprints.org/w/Training_Video:Subject_Trees

You’ll probably need to copy your subject tree from the old repository over, and then run import_subjects pointing at that file.

Lizz

--
Lizz Jennings BA MSc ACLIP MCLIP (Revalidated 2015)
Research Data Librarian (Systems)
The Library 4.10, University of Bath, Bath, BA2 7AY UK
Ext. 3570 (External 01225 383570)
Research Data Management: http://www.bath.ac.uk/research/data
On Behalf Of Andrew Beeken

I’m back! Progress has been made. So, I’ve tried this with clearing out the eprints table and starting the import from fresh – success! Of a kind… It’s imported one record; this one – http://eprints.lincoln.ac.uk/25828. This is a complete record with related documents. Hurrah! However there should be a total of four records imported. The import threw up some new and interesting errors:

    Starting EPrints Repository.
    Connecting to DB ... done.
    Failed to retrieve http://eprints.lincoln.ac.uk/25934/1/1602_Global%20Goals_French_Art6Clean.docx: 401 Authorization Required
    document.52811 failed to create subdataobj on document.files at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.
    eprint.25934 failed to create subdataobj on archive.documents at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.
    Failed to retrieve http://eprints.lincoln.ac.uk/25845/1/BehavEcol2016.pdf: 401 Authorization Required
    document.52360 failed to create subdataobj on document.files at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.
    eprint.25845 failed to create subdataobj on archive.documents at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.
    Failed to retrieve http://eprints.lincoln.ac.uk/25940/1/25940%20Proof_APPLAN_4392.pdf: 401 Authorization Required
    document.52899 failed to create subdataobj on document.files at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.
    eprint.25940 failed to create subdataobj on archive.documents at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.
    Number of records imported: 1 25828
    Ending EPrints Repository.

Now, my interpretation of this is that it’s unable to get the documents that are not open access – which is fine, however how can I get it to grab any protected documents? Can I provide username and password alongside the import? Also, is this likely to be why the records are not being imported correctly or is there something else here I’m misinterpreting?

Finally, looking at the repository I can access the record directly through the URL, however I can’t see anything through the browse by… view. I’ve regenerated views/abstracts and started the indexer. The search gives this error:

    The Lincoln Repository has encountered an error: The top level subject (id=jacs) for field "subjects" does not exist. The site admin probably has not run import_subjects. See the documentation for more information.

As always, any thoughts are appreciated!

Andrew
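With a larger import it may help to summarise which records failed on document retrieval instead of reading the log by eye. This is a minimal sketch that assumes the log lines have the same shape as the three `Failed to retrieve` lines quoted in the message, and that the first path segment of each URL is the eprint ID (true for the URLs shown, assumed for others). The function name is illustrative only.

```python
import re
from urllib.parse import unquote, urlsplit

# Log excerpt copied from the import output quoted in the thread.
LOG = """Failed to retrieve http://eprints.lincoln.ac.uk/25934/1/1602_Global%20Goals_French_Art6Clean.docx: 401 Authorization Required
Failed to retrieve http://eprints.lincoln.ac.uk/25845/1/BehavEcol2016.pdf: 401 Authorization Required
Failed to retrieve http://eprints.lincoln.ac.uk/25940/1/25940%20Proof_APPLAN_4392.pdf: 401 Authorization Required
"""

# URL, then a colon, then a three-digit HTTP status code.
FAILED = re.compile(r"Failed to retrieve\s+(\S+):\s+(\d{3})")


def failed_downloads(log_text):
    """Return one dict per failed document fetch found in the log text."""
    results = []
    for match in FAILED.finditer(log_text):
        url, status = match.group(1), int(match.group(2))
        parts = urlsplit(url).path.strip("/").split("/")
        # Assumption: the first path segment is the numeric eprint ID.
        eprintid = parts[0] if parts[0].isdigit() else None
        results.append(
            {
                "eprintid": eprintid,
                "status": status,
                "filename": unquote(parts[-1]),  # decode %20 and friends
                "url": url,
            }
        )
    return results


if __name__ == "__main__":
    for item in failed_downloads(LOG):
        print(item["eprintid"], item["status"], repr(item["filename"]))
```

Decoding the filenames also makes it easy to see which of the failed documents have spaces in their names, which is relevant to the filename question raised later in the thread.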
On Behalf Of Andrew Beeken

Hi all! So, a week or so later and I’m back onto this. I’ve double verbosed it and exposed a bunch of extra debug stuff. One line in particular stands out for me on all 5 test records I’m trying to import:

    Database execute debug: INSERT INTO `eprint` (`eprintid`) VALUES (?)

So… it looks like it’s not inserting anything into the database? I’ve put the error log as well as the XML snapshot I’m trying to import up onto Google Drive here if anyone has time to take a look: https://drive.google.com/open?id=0B67FaE28LeB-Z3NZVGtRQkFsbVU

I am trying to get through this without shouting for help at every turn, I promise! ;P

Andrew
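One possible reading of that debug line, offered as a sketch rather than a diagnosis: the `?` is a bind placeholder, so the logged statement text never shows the value that is actually sent. The example uses Python's built-in `sqlite3` as a stand-in for MySQL and Perl DBI, and a hypothetical two-column table in place of the real `eprint` table.

```python
import sqlite3


def insert_with_placeholder(eprintid):
    """Insert one row using a bind placeholder and return the table contents."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical cut-down table; the real eprint table has many more columns.
    conn.execute(
        "CREATE TABLE eprint (eprintid INTEGER PRIMARY KEY, eprint_status TEXT)"
    )
    sql = "INSERT INTO eprint (eprintid) VALUES (?)"
    # The value travels separately from the SQL text, so a log of `sql`
    # alone shows only the '?'.
    conn.execute(sql, (eprintid,))
    rows = conn.execute("SELECT eprintid, eprint_status FROM eprint").fetchall()
    conn.close()
    return sql, rows


if __name__ == "__main__":
    sql, rows = insert_with_placeholder(25940)
    print(sql)
    print(rows)
```

The row is created with only its ID set and every other column NULL. Whether EPrints then fills in the remaining columns with later statements is not something this sketch demonstrates.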
On Behalf Of Andrew Beeken

That’s interesting – I’ll have a check although this action is only being done on a subset of 4 or 5 records, all of which are fairly recent, so I would have hoped they’d have a status! The only other alternative could possibly be missing status definitions etc? I shall take a look at bringing phrases over as well.
On Behalf Of John Salter

'ne' means 'not equal' - when comparing strings ('!=' means not equal when comparing numbers). At a guess, something in your export doesn't have an eprint_status set (all EPrints should have this set). You may be able to find these in the database by trying something like:

    mysql> SELECT COUNT(*), eprint_status FROM eprint GROUP BY eprint_status;

This should result in the eprint_status values of inbox, buffer, archive and deletion, with a count next to each. Any count for 'NULL' would show where the eprint_status is not set, and therefore where the uninitialized string is on the import.

Cheers, John
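To see what the output of that query looks like when a status is missing, here is a small self-contained sketch. It runs the same `GROUP BY` against an in-memory SQLite table instead of the real MySQL database (an assumption made so the example runs anywhere), and the four rows are invented.

```python
import sqlite3


def status_counts(rows):
    """Count eprints per eprint_status; unset statuses appear under None."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical cut-down version of the eprint table.
    conn.execute(
        "CREATE TABLE eprint (eprintid INTEGER PRIMARY KEY, eprint_status TEXT)"
    )
    conn.executemany("INSERT INTO eprint VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT COUNT(*), eprint_status FROM eprint GROUP BY eprint_status"
    ).fetchall()
    conn.close()
    return {status: count for count, status in result}


if __name__ == "__main__":
    # Invented data: one record has no status set.
    sample = [(1, "archive"), (2, "archive"), (3, "inbox"), (4, None)]
    for status, count in status_counts(sample).items():
        print(count, status)
```

NULL values are grouped together, so records with no status show up as a single row of their own. A follow-up `SELECT eprintid FROM eprint WHERE eprint_status IS NULL` would then list the affected IDs.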
On Behalf Of Andrew Beeken

Cheers Adam, That seems to have sorted things out. However, I have another new and interesting error!

    Use of uninitialized value in string ne at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1090.

The line in question is:

    if( $dataobj->get_value( "eprint_status" ) ne $self->{id} )

and the ne in that line is an operator if I’m not mistaken? So is this an issue with Perl running on the server?

Andrew
On Behalf Of Adam Field

If you’ve added all your metadata fields to the new repository’s configuration, you need to do:

    <eprints_root>/bin/epadmin update <repositoryid>

This will update your database structure.

From:
<eprints-tech-bounces@ecs.soton.ac.uk> on behalf of Andrew Beeken <anbeeken@lincoln.ac.uk>

Hello all! Thanks for the pointers yesterday regarding import/export. I’m at a point where I now have exported data and am trying to import it. Initially I was getting errors relating to missing metadata fields which I’d thought I’d corrected, however I am now finding errors relating to missing SQL tables:

    SQL ERROR (execute): SELECT `eprintid`,`pos`,`subjects_loc` FROM `eprint_subjects_loc` WHERE `eprintid` IN (25940)
    SQL ERROR (execute): Table 'lirolem.eprint_subjects_loc' doesn't exist
    DBD::mysql::st fetchrow_array failed: fetch() without execute() at /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 2674.
    DBD::mysql::st execute failed: Table 'lirolem.eprint_creators_browse_id' doesn't exist at /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 3211.
    SQL ERROR (execute): SELECT `eprintid`,`pos`,`creators_browse_id` FROM `eprint_creators_browse_id` WHERE `eprintid` IN (25940)
    SQL ERROR (execute): Table 'lirolem.eprint_creators_browse_id' doesn't exist
    DBD::mysql::st fetchrow_array failed: fetch() without execute() at /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 2674.

Now, my question is, do I need to do something to rebuild the tables relating to the metadata in some way prior to an import? Or is there something else I’m missing here? Maybe not all the field definitions?

Andrew
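When the import log is long, a quick list of the distinct tables it reports as missing can show which fields still lack database structure, for example before and after running the `epadmin update` step suggested in Adam's reply. The following is a sketch that assumes the log uses the same `Table '<database>.<table>' doesn't exist` wording as the excerpt above; the function name is illustrative.

```python
import re

# Log excerpt taken from the errors quoted in the message above.
LOG = """SQL ERROR (execute): Table 'lirolem.eprint_subjects_loc' doesn't exist
DBD::mysql::st execute failed: Table 'lirolem.eprint_creators_browse_id' doesn't exist at /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 3211.
SQL ERROR (execute): Table 'lirolem.eprint_creators_browse_id' doesn't exist
"""

# Captures the database name and the table name separately.
MISSING = re.compile(r"Table '([^'.]+)\.([^']+)' doesn't exist")


def missing_tables(log_text):
    """Return the sorted, de-duplicated table names reported as missing."""
    return sorted({table for _db, table in MISSING.findall(log_text)})


if __name__ == "__main__":
    for table in missing_tables(LOG):
        print(table)
```

In this excerpt both table names start with `eprint_` followed by a field name (`subjects_loc`, `creators_browse_id`), which suggests those two fields are the ones whose tables have not yet been created.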