EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #09906



Re: [EP-tech] Bulk PDF Upload Issue


CAUTION: This e-mail originated outside the University of Southampton.

  1. How I found this email address:
    1. I followed the instructions for subscribing at this link: https://wiki.eprints.org/w/Eprints-tech_Mailing_List
    2. I sent a "SUBSCRIBE eprints-tech" message to sympa@ecs.soton.ac.uk
    3. In the welcome message, I did not see how to post to the list, so I sent a message to the email address the welcome message came from.
  2. The upload issue:
    1. I do include an --enable-file-imports argument in the command, which is supposed to:

Thanks,
Joel


From: David R Newman <drn@ecs.soton.ac.uk>
Sent: Saturday, December 21, 2024 2:46 AM
To: Joel Brown <jbrown@crsvarc.com>; eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Subject: Re: Bulk PDF Upload Issue
 
Hi Joel,

I think you need to set (to 1) the enable_file_imports option:

https://wiki.eprints.org/w/Miscellaneous_Config_Options#E

I did spend some time back in July this year going through the codebase to make sure that all configuration options not set in the configuration files of a default installation were documented (as well as any of those that are defined in configuration files [1]). After quite a few years, I was still discovering configuration options I did not know existed, and in some cases I had created very similar functionality and configuration to solve the same problem.

Regards

David Newman
[1] https://wiki.eprints.org/w/Config_Options_by_File


On 20/12/2024 6:57 pm, Joel Brown wrote:
Hey all,

We're in the process of setting up a new repository and are preparing to bulk import ~1300 articles to get it started. We are using XML to bulk import the entries (and this works fine). However, it has been tricky to find how to import a PDF along with each entry. I imagine the workflow would look something like this:
  1. Upload PDFs to an accessible location on the server.
  2. Include a URL reference to each individual PDF in the XML upload.
  3. During the upload eprints will pull the PDF from the specified location and add it to the eprints3/archives/<archive>/documents folder 
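
Step 2 of this workflow could be scripted rather than written by hand for ~1300 entries. The following is only a minimal sketch in Python, not a confirmed EPrints recipe: the element names are copied from the test XML in this message, the helper names (document_element, documents_block) are hypothetical, and filling the href attribute with a file:// URI is an assumption that nothing in this thread confirms the importer will honour.

```python
import mimetypes
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath


def document_element(server_path: str, pos: int) -> ET.Element:
    """Build one <document> element shaped like the test XML in this message.

    server_path must be an absolute path on the repository server.
    """
    path = PurePosixPath(server_path)
    mime, _ = mimetypes.guess_type(path.name)

    doc = ET.Element("document")
    ET.SubElement(doc, "pos").text = str(pos)
    ET.SubElement(doc, "format").text = mime or "application/octet-stream"
    ET.SubElement(doc, "language").text = "en"
    ET.SubElement(doc, "security").text = "public"
    ET.SubElement(doc, "main").text = path.name

    files = ET.SubElement(doc, "files")
    file_el = ET.SubElement(files, "file")
    ET.SubElement(file_el, "filename").text = path.name
    # ASSUMPTION: the importer resolves a file:// URI given in href.
    # This is not confirmed anywhere in the thread.
    ET.SubElement(file_el, "data", href=path.as_uri())
    return doc


def documents_block(server_paths) -> str:
    """Wrap one <document> per path in a <documents> element, as a string."""
    docs = ET.Element("documents")
    for pos, server_path in enumerate(server_paths, start=1):
        docs.append(document_element(server_path, pos))
    return ET.tostring(docs, encoding="unicode")


if __name__ == "__main__":
    # Path taken from the test described in this message.
    print(documents_block(["/opt/eprints3/foo.png"]))
```

The sketch leaves out docid and rev_number, and the resulting block would still need to be placed inside each eprint record of the import file.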

I've performed a simple test to do an XML upload that includes "foo.png". Here's what I've done:
  • Include foo.png in the /opt/eprints3 folder.
  • Include /opt/eprints3/foo.png in the document heading of the XML file:
<documents>
  <document>
    <docid>1</docid>
    <rev_number>1</rev_number>
    <pos>1</pos>
    <format>image/png</format>
    <language>en</language>
    <security>public</security>
    <main>foo.png</main>
    <files>
      <file>
        <filename>foo.png</filename>
        <data href=""/>
      </file>
    </files>
  </document>
</documents>
  • Run the following command:
./bin/import --user 6 --enable-file-imports crsq eprint XML eprint.xml

Results: The upload works (the entry is created), but the foo.png doesn't actually link to anything. What am I missing?


Thanks,
Joel