EPrints Technical Mailing List Archive

Message: #04933


[EP-tech] Re: Check uploaded document for PDF/A compatibility


Hello,

I added two fields in document_fields.pl (val_report and val_good), but the code below (in document_validate.pl) fails to populate them.

As a check, I also tried writing to the document field formatdesc, again without success.

$document->set_value doesn't seem to work here....


  # CHECKS IN HERE
  #
  # PDF/A check
  my $pdf_file_path = $document->file_path;
  my $cmd = '/srv/bin/pdf-check/is-it-pdfa.sh';
  my $output = `$cmd $pdf_file_path`;         # whatever you need to do to call the command
  #my $output = "XXX";
  $document->set_value( "val_report", $output );
  #$document->set_value( "val_report", "TESTwrite to val_report NOW" );
  $document->set_value( "formatdesc", "TESTwrite to formatdesc NOW" ); # check if I can write to formatdesc
  if ($output && $output =~ m/INVALID/)       # match output indicating failure
  {
    $document->set_value( "val_good", 'FALSE' );
    push @problems, $repository->html_phrase('validate:pdf_not_ideal');
  } else {
    $document->set_value( "val_good", 'TRUE' );
  }

.......................................
Roland Roth-Steiner
M.Sc. Wirtsch.-Inf., Dipl.-Bibl.
. Univ.- und Landesbibliothek
... Elektronische Informationsdienste
... Leitung Digitalisierungszentrum
... Fachreferat Wirtschaft
. Magdalenenstr. 8, 64289 Darmstadt
+49 (0)6151 16-76280
.......................................

________________________________________
From: eprints-tech-bounces@ecs.soton.ac.uk [eprints-tech-bounces@ecs.soton.ac.uk] on behalf of Field A.N. [af05v@ecs.soton.ac.uk]
Sent: Tuesday, 13 October 2015 10:39
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: Check uploaded document for PDF/A compatibility

The cfg.d/document_validate.pl configuration file is probably what you want

https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/document_validate.pl


Don't forget you can use backticks to call command-line utilities if there isn't a Perl library to do the work.  Something like:


my $pdf_file_path = $document->get_main->path;
my $cmd = '/usr/local/bin/thingy';
my $output = `$cmd --file=$pdf_file_path --verbose`; #whatever you need to do to call the command
if ($output && $output =~ m/this is a bad file/) #match output indicating failure
{
        push @problems, $repository->html_phrase('validate:pdf_not_ideal');
}
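
A minimal sketch of a variation on the snippet above, assuming you would rather not pass the file path through the shell: backticks interpolate the path into a shell command line, so a path containing spaces or shell metacharacters could break or alter the command. The helper name run_checker is hypothetical (it is not an EPrints function), and the demo uses the running perl as a stand-in for the real checker.

```perl
use strict;
use warnings;

# Hypothetical helper (not an EPrints function): run an external command
# with its arguments passed as a list, so no shell parses the file path.
# Returns the command's standard output as one string,
# or undef if the command could not be started.
sub run_checker
{
    my( $cmd, @args ) = @_;

    # List-form pipe open: each argument reaches the program unchanged.
    my $pid = open( my $fh, '-|', $cmd, @args );
    return undef unless $pid;

    my $output = do { local $/; <$fh> };   # slurp everything the command prints
    close( $fh );

    return defined $output ? $output : '';
}

# Stand-in for the real checker: the current perl prints a failure line.
my $output = run_checker( $^X, '-e', 'print "this is a bad file\n"' );
if( defined $output && $output =~ m/this is a bad file/ )
{
    print "checker reported a problem\n";
}
```

In document_validate.pl the two arguments would presumably be the checker path and the document's file path, with the match on the output kept exactly as in the snippet above.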


--
Adam Field
Business Relationship Manager and Community Lead
EPrints Services
+44 (0)23 8059 8814





On 13 Oct 2015, at 09:20, Roth-Steiner, Roland wrote:

> Hi,
>
> I would like to have it checked directly after the upload, so we can inform the user that we need valid PDF/A and point them to an FAQ with the right how-to.
>
> Since there will be a huge number of uploads, this needs to be fully automated and, as mentioned, run immediately after the upload stage.
>
> So where is the best place to call the PDF/A validation script so that it runs directly after document upload?
>
> Thanks again
>
> .......................................
> Roland Roth-Steiner
> M.Sc. Wirtsch.-Inf., Dipl.-Bibl.
> . Univ.- und Landesbibliothek
> ... Elektronische Informationsdienste
> ... Leitung Digitalisierungszentrum
> ... Fachreferat Wirtschaft
> . Magdalenenstr. 8, 64289 Darmstadt
> +49 (0)6151 16-76280
> .......................................
>
> ________________________________________
> From: eprints-tech-bounces@ecs.soton.ac.uk [eprints-tech-bounces@ecs.soton.ac.uk] on behalf of Field A.N. [af05v@ecs.soton.ac.uk]
> Sent: Monday, 12 October 2015 17:37
> To: eprints-tech@ecs.soton.ac.uk
> Subject: [EP-tech] Re: Check uploaded document for PDF/A compatibility
>
> You could also define a new issue and have it run by the issues infrastructure.
>
> --
> Adam Field
> Business Relationship Manager and Community Lead
> EPrints Services
> +44 (0)23 8059 8814
>
>
>
>
>
> On 12 Oct 2015, at 15:53, Roth-Steiner, Roland wrote:
>
>> Hello,
>>
>> VeraPDF looks really promising and highly configurable.
>>
>> Where would be the best place to hook in the PDF checker script?
>>
>> In documents_fields_automatic.pl, document_upload.pl, document_validate.pl ?
>>
>> eprint_validate or eprint_warnings.pl ?
>>
>> Or maybe in the deposit-stage... ?
>>
>>
>> Thanks
>>
>> .......................................
>> Roland Roth-Steiner
>> M.Sc. Wirtsch.-Inf., Dipl.-Bibl.
>> . Univ.- und Landesbibliothek
>> ... Elektronische Informationsdienste
>> ... Leitung Digitalisierungszentrum
>> ... Fachreferat Wirtschaft
>> . Magdalenenstr. 8, 64289 Darmstadt
>> +49 (0)6151 16-76280
>> .......................................
>>
>> ________________________________________
>> From: eprints-tech-bounces@ecs.soton.ac.uk [eprints-tech-bounces@ecs.soton.ac.uk] on behalf of John Salter [J.Salter@leeds.ac.uk]
>> Sent: Thursday, 8 October 2015 14:14
>> To: 'eprints-tech@ecs.soton.ac.uk'
>> Subject: [EP-tech] Re: Check uploaded document for PDF/A compatibility
>>
>> Hi,
>> I haven't, but I am keeping track of the development of VeraPDF: http://verapdf.org/home/ - which you may be interested in.
>>
>> Their roadmap is here: http://verapdf.org/roadmap/ - which looks like December 2016 for the first Release - but if anyone wants to get to grips with one of the beta versions - do (and let us know how it goes!).
>>
>> Cheers,
>> John
>>
>>
>> -----Original Message-----
>> From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Roth-Steiner, Roland
>> Sent: 08 October 2015 12:23
>> To: Eprints Tech Mailing List
>> Subject: [EP-tech] Check uploaded document for PDF/A compatibility
>>
>> Hi list,
>>
>> I would like to check a document directly after upload (for PDF/A compatibility).
>>
>> Has anyone already done this?
>>
>> Thanks
>>
>> Roland
>>
>> .......................................
>> Roland Roth-Steiner
>> M.Sc. Wirtsch.-Inf., Dipl.-Bibl.
>> . Univ.- und Landesbibliothek
>> ... Elektronische Informationsdienste
>> ... Leitung Digitalisierungszentrum
>> ... Fachreferat Wirtschaft
>> . Magdalenenstr. 8, 64289 Darmstadt
>> +49 (0)6151 16-76280
>> .......................................
>>
>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>> *** Archive: http://www.eprints.org/tech.php/
>> *** EPrints community wiki: http://wiki.eprints.org/
>> *** EPrints developers Forum: http://forum.eprints.org/