EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #07006
Re: [EP-tech] REF Compliance Checker Plugin
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
- Subject: Re: [EP-tech] REF Compliance Checker Plugin
- From: John Salter <J.Salter@leeds.ac.uk>
- Date: Fri, 1 Dec 2017 14:06:10 +0000
In answer to your first email: yes, run it once to populate, then let 'normal' running do the rest.

In answer to this, I think the idea behind the REF_CC tool was that it was a 'standard' implementation, and didn't allow for people to 'invent' data to help compliance. We could allow humans to put dates in. We can intrinsically trust them to do an honest job. But why do that when the data we need is already in the system and can be obtained?

There was a post on one of the other repo mailing lists about the REF_CC tool, in relation to the new exception (that will take effect from Apr '18). If a script should be included with the tool to 'populate FOA dates from history dataset', then 'we' (not quite sure who that is) could add it.

The 'proper' answer to this question won't be apparent until HEFCE announce how they will be measuring this compliance data. If they require some form of 'proof', then the history dataset route will become more valuable than a human-filled date input.

As a first step, I'd suggest checking whether you do have the 'lift embargos' entries in the history database table. If it makes the powers-that-be happy, maybe add a date field to the document to store the data, but also suggest that there may be a *better* way. If they want a better way, make sure they engage with the discussion around issues/improvements to the REF_CC tool.

Cheers,
John

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

What about the idea of adding an override for the Open Access date to the REF tab and allowing repository staff to override that value? (I’m trying to consider the path of least resistance here as I’m running out of time!)

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of John Salter

I have some magic (verging on voodoo) that may help here :o)

Changes to an EPrint are logged in the 'history' dataset. Normally the 'actor' recorded is the user who was logged in when the change was made. In the case of a script making the change, the name/path of the script is used as the actor.

In the database, run a query like:

    SELECT DISTINCT actor FROM history WHERE actor LIKE '%lift_embargo%';

I get a few values like:

    /opt/eprints3/bin/lift_embargos
    /opt/eprints/bin/lift_embargos
    bin/lift_embargos wrro
    /usr/share/eprints/bin/lift_embargos wrro

The variations come from moving from Solaris to RHEL, and from different ways the script has been run in the past.

For a given EPrint, you can construct a search of the history dataset for an actor returned in the query above. From the search results, you should be able to:

- get the revision XML file for this change
- get the revision XML file for the previous change
- run an XPath query to see if the document that has had an embargo lifted is a document version (in your situation, 'stage') that you're interested in (accepted or published).

This outline process was the result of a discussion with LSE on a similar theme. I think this could be implemented as a bin script. I would not recommend that this forms part of the commit trigger, as I think it might be a bit heavy/slow for that process. It hasn't been turned into a 'full' solution, but this could be done if there is a need for it.

Hope that glimpse of the dark-arts has helped!

John

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

Just looking at this again, in its raw form how is the plugin expected to handle any deposits that are valid but, for example, have come out of embargo before the plugin has been installed?

For example, if I have a record that was added to the repository on 21/2/2016 and left embargo on 6/11/2016, I’d want the Date of First Compliant Deposit to be 21/2/2016 and the Date of First Compliant Open Access to be 6/11/2016. It seems that if I install the plugin on 1/12/2017 and recommit the records, it would set both dates to 1/12/2017? Am I getting this backwards, as my tests on existing data are showing that to be the case…

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

So what we need is some way of checking that open access rule for the embargo date – I guess detecting whether it’s the embargo date or the published date that we need is going to be tricky? Maybe adding a field to the REF tab would be better and letting the repository staff manually enter that?

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of John Salter

As long as you're not poking data straight into the database, then everything should be handled without any regular jobs. The calculations all happen when an EPrint is committed (which is also cascaded when a document is committed).

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

Yeah, I thought so – so in a realistic scenario we’d never be able to use the embargo date as the date of first open access – BUT, I guess that’s where setting the date on the trigger to the CURRENT date comes in. With this plugin should we be running a regular recommit (i.e. cron job) to catch updated information?

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Alan.Stiles

It removes it in the lift_embargos script (<EPRINTS_ROOT>/bin/lift_embargos)

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

Okay… so would my logic work in this instance to use the embargo date for the open access date? But it would only activate that POST embargo? Or does EPrints remove the embargo date once the embargo has been lifted?

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of John Salter

This is why the 'pending compliance' measure was introduced. This will be displayed if the embargo period is appropriate for the panel selected but the embargo is still active.

Cheers,
John

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

Thanks Andy, I’ve passed this to our repository manager for her thoughts as to how she was expecting it to work. I don’t know if she wasn’t hoping to see the embargo date appear here regardless…

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andy Reid

Hi,

My understanding is that it does not set the date of first compliant open access until it IS open access, i.e. when the embargo has passed and the lift_embargos script has set security to public. It is not intended to set that ahead of time on the assumption the embargo will be lifted at the date projected.

Andy

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

Ah, so if it’s setting the value incorrectly based on our set up then that value will always be incorrect unless we put in something to override that.

Okay, so I’ve tested some things by commenting out that part of the logic and it’s definitely running my script. I think the issue is certainly that I was testing it on records that had already had their values set. So, what we need to do, I guess, is make sure our custom logic is right, put the plugin on live, add our custom code and then hit recommit before anyone does anything that’s going to add these two dates?

The logic we’re working on is:

For First Compliant Deposit:
- Is the document accepted or published?
- Set the date based on the datestamp of the record

For First Compliant Open Access:
- Is the document a Whole Document?
- Is it accepted or published?
- Is it public?
- Set the date based on the embargo date of the document OR the datestamp of the record if there is no embargo date on the document

And my code is:

    # date of first compliant deposit - this uses our value of "stage"
    $c->add_dataset_trigger( 'eprint', EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub {
        my( %args ) = @_;
        my( $repo, $eprint, $changed ) = @args{qw( repository dataobj changed )};

        # trigger only applies to repos with hefce_oa plugin enabled
        return unless $eprint->dataset->has_field( "hoa_compliant" );
        return if $eprint->is_set( "hoa_date_fcd" );
        return if $eprint->value( "eprint_status" ) eq "inbox";

        for( $eprint->get_all_documents )
        {
            next unless $_->is_set( "stage" );
            next unless $_->value( "stage" ) eq "accepted" || $_->value( "stage" ) eq "published";
            $eprint->set_value( "hoa_date_fcd", $eprint->value( "datestamp" ) );
            $eprint->set_value( "hoa_version_fcd", $_->value( "stage" ) eq "accepted" ? "AM" : "VoR" );
        }
    }, priority => 100 );

    # date of first compliant open access
    $c->add_dataset_trigger( 'eprint', EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub {
        my( %args ) = @_;
        my( $repo, $eprint, $changed ) = @args{qw( repository dataobj changed )};

        # trigger only applies to repos with hefce_oa plugin enabled
        return unless $eprint->dataset->has_field( "hoa_compliant" );
        return unless $eprint->is_set( "hoa_date_fcd" );
        return if $eprint->is_set( "hoa_date_foa" );

        for( $eprint->get_all_documents )
        {
            next unless $_->is_set( "content" );
            # compare against the bare value (as posted, the string had a leading space)
            next unless $_->value( "content" ) eq "whole_document";
            next unless $_->is_set( "stage" );
            next unless $_->value( "stage" ) eq "accepted" || $_->value( "stage" ) eq "published";
            next unless $_->is_public;
            # is_set(), as used elsewhere in this trigger (as posted, this call read isset())
            if( $_->is_set( "date_embargo" ) )
            {
                $eprint->set_value( "hoa_date_foa", $_->value( "date_embargo" ) );
            }
            else
            {
                $eprint->set_value( "hoa_date_foa", $eprint->value( "datestamp" ) );
            }
        }
    }, priority => 200 );

However I can’t seem to get the trigger for the foa to register anything. I’ll try your suggestion of dropping some outputs into the loop to see what steps it’s hitting but if you can see anything glaring in my logic please let me know. I’m also not sure if the logic will really work with embargo dates as they need to be in the future – if that is the case then the document won’t be public and won’t satisfy that step of the logic…

I think I’m finally starting to understand some of this, though… literally weeks before I’m due to move away from it… :rolleyes:

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Alan.Stiles

It does exactly that – it sets it the first time and then doesn’t change it.

The easiest (dirty) way I’ve found to check it’s processing my override code is to put a

    print STDERR "My Local Override_Function_Name: Doing this bit now\n";

in the function, which should then appear in the Apache error log.

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

Actually, looking at the code in more detail, for example this trigger (my modified version):

    # date of first compliant deposit - this uses our value of "stage"
    $c->add_dataset_trigger( 'eprint', EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub {
        my( %args ) = @_;
        my( $repo, $eprint, $changed ) = @args{qw( repository dataobj changed )};

        # trigger only applies to repos with hefce_oa plugin enabled
        return unless $eprint->dataset->has_field( "hoa_compliant" );
        return if $eprint->is_set( "hoa_date_fcd" );
        return if $eprint->value( "eprint_status" ) eq "inbox";

        for( $eprint->get_all_documents )
        {
            next unless $_->is_set( "stage" );
            next unless $_->value( "stage" ) eq "accepted" || $_->value( "stage" ) eq "published";
            $eprint->set_value( "hoa_date_fcd", $_->value( "datestamp" ) );
            $eprint->set_value( "hoa_version_fcd", $_->value( "stage" ) eq "accepted" ? "AM" : "VoR" );
        }
    }, priority => 100 );

Does the line

    return if $eprint->is_set( "hoa_date_fcd" );

mean that if the date is already set, it won’t be recalculated?

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of Andrew Beeken

Thanks John!

Okay, I’ve made some progress on this but I’m coming a little unstuck on the Date of First Compliant Deposit and Date of First Compliant Open Access. So, for both of these I’ve needed to make a slight adjustment as we store our “accepted”, “published” etc. in a variable called “stage”. No problem, I’ve switched “content” up for “stage” and I’m getting a date through now based, looking at the code, on EPrints::Time::get_iso_date(). The issue I have is that in all instances this is showing as today’s date.

Speaking to Bev about how she wants this to work, she’s expecting to see:

- First Compliant Deposit – the date the record was created, so I assume the datestamp value of the record?
- First Compliant Open Access – the date the record was created OR the embargo date if one is set.

Once I’ve got these setting as expected I think I can get this deployed. Any thoughts?

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
On Behalf Of John Salter

Hi Andrew,

From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk>
on behalf of Andrew Beeken <anbeeken@lincoln.ac.uk>

Hi all!

Putting some final bits into place before I retire from the repository work in 2018, I’m now looking at the REF Compliance Checker Plugin (http://eprintsug.github.io/hefce_oa/). One of the things I’ve noticed is that the plugin doesn’t seem to be picking up some of our specific workflow modifications, which is to be expected. With the RIOXX plugin I was able to change the fields that were being looked at; however, I can’t seem to see an obvious way to do this in the documentation. Any thoughts?

Cheers!
Andrew
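
Editorial sketch: John's suggestion earlier in this thread (find history entries whose actor is the lift_embargos script, then take the date of the change) can be illustrated in plain Perl. This is only a sketch under stated assumptions, not code from EPrints or the hefce_oa plugin. The helper name first_embargo_lift_date is hypothetical, the rows are plain hashes whose keys (actor, datasetid, objectid, timestamp) are assumed to mirror the history dataset, and the sample rows are invented around the dates in Andrew's example. It does not perform the revision XML or XPath checks John describes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of EPrints or the hefce_oa plugin.
# Returns the date (YYYY-MM-DD) of the earliest change made to the given
# eprint by a lift_embargos run, or undef if no such change is recorded.
sub first_embargo_lift_date
{
    my( $eprintid, @rows ) = @_;

    my @lifted =
        sort { $a cmp $b }                    # ISO-style timestamps sort as strings
        map  { $_->{timestamp} }
        grep { $_->{datasetid} eq 'eprint'
            && $_->{objectid} == $eprintid
            && $_->{actor} =~ /lift_embargos/ }   # matches every path variant
        @rows;

    return unless @lifted;
    return substr( $lifted[0], 0, 10 );       # keep only the date part
}

# Invented sample rows (assumed shape), based on the dates in the thread.
my @history = (
    { actor => 'A. User',                         datasetid => 'eprint', objectid => 42, timestamp => '2016-02-21 09:15:00' },
    { actor => '/opt/eprints3/bin/lift_embargos', datasetid => 'eprint', objectid => 42, timestamp => '2016-11-06 02:00:00' },
    { actor => 'bin/lift_embargos wrro',          datasetid => 'eprint', objectid => 42, timestamp => '2017-03-01 02:00:00' },
    { actor => '/opt/eprints3/bin/lift_embargos', datasetid => 'eprint', objectid => 7,  timestamp => '2015-01-01 02:00:00' },
);

my $foa = first_embargo_lift_date( 42, @history );
print defined $foa ? "$foa\n" : "no embargo lift recorded\n";
```

In a real bin script the rows would come from a search of the history dataset rather than a hard-coded list, and each match would still need checking against the revision XML to confirm that the document concerned is an accepted or published version.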