EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #07006



Re: [EP-tech] REF Compliance Checker Plugin


In answer to your first email, yes, run it once to populate, let 'normal' running do the rest.

 

In answer to this, I think the idea behind the REF_CC tool was that it was a 'standard' implementation - and didn't allow for people to 'invent' data to help compliance.

We could allow humans to put dates in. We can intrinsically trust them to do an honest job.

But why do that when the data we need is already in the system, and can be obtained?

 

There was a post on one of the other repo mailing lists about the REF_CC tool - in relation to the new exception (that will take effect from Apr'18).

If a script should be included with the tool to 'populate FOA dates from history dataset', then 'we' (not quite sure who that is) could add it.

 

The 'proper' answer to this question won't be apparent until HEFCE announce how they will be measuring this compliance data.

If they will require some form of 'proof', then the history dataset route will become more valuable than a human-filled date input.

 

As a first step, I'd suggest checking whether you do have the 'lift embargos' entries in the history database table.

If it makes the powers-that-be happy, maybe add a date field to the document to store the data - but also suggest that there may be a *better* way.

If they want a better way, make sure they engage with the discussion around issues/improvements to the REF_CC tool.

 

Cheers,

John

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 01 December 2017 13:27
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

What about the idea of adding an override for the Open Access date to the REF tab and allowing repository staff to override that value? (I’m trying to consider the path of least resistance here as I’m running out of time!)

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of John Salter
Sent: 01 December 2017 12:55
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

I have some magic (verging on voodoo) that may help here :o)

 

Changes to an EPrint are logged in the 'history' dataset.

Normally the 'actor' recorded is the user who was logged in when the change was made.

In the case of a script making the change, the name/path of the script is used as the actor.

 

In the database, run a query like:

SELECT DISTINCT actor from history WHERE actor LIKE '%lift_embargo%';

 

I get a few values like:

/opt/eprints3/bin/lift_embargos

/opt/eprints/bin/lift_embargos

bin/lift_embargos wrro

/usr/share/eprints/bin/lift_embargos wrro

 

-the variations come from moving from Solaris to RHEL, and different ways the script has been run in the past.
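
As a rough sketch, a single pattern can recognise all of the actor variations listed above. The helper name is_lift_embargos_actor and the 'jsmith' entry are made up for illustration; other installations may record different strings.

```perl
use strict;
use warnings;

# Actor strings as quoted above, plus one ordinary user name
# ('jsmith' is hypothetical and should not match).
my @actors = (
    '/opt/eprints3/bin/lift_embargos',
    '/opt/eprints/bin/lift_embargos',
    'bin/lift_embargos wrro',
    '/usr/share/eprints/bin/lift_embargos wrro',
    'jsmith',
);

# Hypothetical helper: true if the actor looks like the lift_embargos script,
# i.e. the script name at the start or after a slash, optionally followed by arguments.
sub is_lift_embargos_actor
{
    my( $actor ) = @_;
    return $actor =~ m{(?:^|/)lift_embargos(?:\s|$)} ? 1 : 0;
}

my @matched = grep { is_lift_embargos_actor( $_ ) } @actors;
print "$_\n" for @matched;
```

This prints the four script paths and skips the user name.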

 

For a given EPrint, you can construct a search of the history dataset, for an actor returned in the query above.

From the search results, you should be able to:

- get the revision XML file for this change

- get the revision XML file for the previous change

- run an XPath query to see if the document that has had an embargo lifted is a document version (in your situation, 'stage') that you're interested in (accepted or published).

 

This outline process was the result of a discussion with LSE on a similar theme.

I think this could be implemented as a bin script. I would not recommend that this forms part of the commit trigger - I think it might be a bit heavy/slow for that process.

It hasn't been turned into a 'full' solution - but this could be done if there is a need for it.

 

Hope that glimpse of the dark-arts has helped!

John

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 01 December 2017 12:22
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Just looking at this again, in its raw form how is the plugin expected to handle any deposits that are valid but, for example, have come out of embargo before the plugin has been installed? For example, if I have a record that was added to the repository on 21/2/2016 and left embargo on 6/11/2016, I’d want the Date of First Compliant Deposit to be 21/2/2016 and the Date of First Compliant Open Access to be 6/11/2016. It seems that if I install the plugin on 1/12/2017 and recommit the records, it would set both dates to 1/12/2017? Am I getting this backwards? My tests on existing data are showing that to be the case…
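
To make the expected outcome concrete, here is a small sketch of the rule described in this message. It is not what the plugin does: expected_dates and its argument names are hypothetical, and dates are written in ISO form.

```perl
use strict;
use warnings;

# Hypothetical helper: the dates Andrew would like to see for a record.
#   datestamp      - date the record was added to the repository
#   embargo_lifted - date the embargo ended, or undef if there was no embargo
sub expected_dates
{
    my( %args ) = @_;

    my $fcd = $args{datestamp};
    my $foa = defined $args{embargo_lifted} ? $args{embargo_lifted} : $args{datestamp};

    return ( $fcd, $foa );
}

# The example from the message: added 21/2/2016, left embargo 6/11/2016
my( $fcd, $foa ) = expected_dates( datestamp => '2016-02-21', embargo_lifted => '2016-11-06' );
print "FCD $fcd, FOA $foa\n";
```

With no embargo, both dates fall back to the record's datestamp.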

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 01 December 2017 09:15
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

So what we need is some way of checking that open access rule for the embargo date – I guess detecting whether it’s the embargo date or the published date that we need is going to be tricky? Maybe adding a field to the REF tab would be better, and letting the repository staff manually enter that?

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of John Salter
Sent: 30 November 2017 16:32
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

As long as you're not poking data straight into the database, then everything should be handled without any regular jobs.

The calculations all happen when an EPrint is committed (which is also cascaded when a document is committed).

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 30 November 2017 16:22
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Yeah, I thought so – so in a realistic scenario we’d never be able to use the embargo date as the date of first open access – BUT, I guess that’s where setting the date on the trigger to the CURRENT date comes in.

 

With this plugin should we be running a regular recommit (i.e. cron job) to catch updated information?

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Alan.Stiles
Sent: 30 November 2017 16:00
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

It removes it in the lift_embargos script (<EPRINTS_ROOT>/bin/lift_embargos)

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 30 November 2017 15:39
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Okay… so would my logic work in this instance to use the embargo date for the open access date? But it would only activate that POST embargo? Or does EPrints remove the embargo date once the embargo has been lifted?

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of John Salter
Sent: 30 November 2017 15:27
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

This is why the 'pending compliance' measure was introduced.

This will be displayed if the embargo period is appropriate for the panel selected, but the embargo is still active.

 

Cheers,

John

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 30 November 2017 14:04
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Thanks Andy,

 

I’ve passed this to our repository manager for her thoughts as to how she was expecting it to work. I don’t know if she wasn’t hoping to see the embargo date appear here regardless…

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andy Reid
Sent: 30 November 2017 12:53
To: 'eprints-tech@ecs.soton.ac.uk' <eprints-tech@ecs.soton.ac.uk>
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Hi,

My understanding is that it does not set the date of first compliant open access until it IS open access, i.e. when the embargo has passed and the lift_embargos script has set security to public. It is not intended to set that ahead of time on the assumption the embargo will be lifted at the date projected.

 

Andy

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 30 November 2017 12:38
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Ah, so if it’s setting the value incorrectly based on our set up then that value will always be incorrect unless we put in something to override that.

 

Okay, so I’ve tested some things by commenting out that part of the logic and it’s definitely running my script. I think the issue is certainly that I was testing it on records that had already had their values set. So, what we need to do, I guess, is make sure our custom logic is right, put the plugin on live, add our custom code and then hit recommit before anyone does anything that’s going to add these two dates?

 

The logic we’re working on is:

 

For First Compliant Deposit –

Is the document accepted or published?

Set the date based on the datestamp of the record

 

For First Compliant Open Access –

Is the document a Whole Document?

Is it accepted or published?

Is it public?

Set the date based on the embargo date of the document OR the datestamp of the record if there is no embargo date on the document

 

And my code is:

 

# date of first compliant deposit - this uses our value of "stage"
$c->add_dataset_trigger( 'eprint', EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub
{
        my( %args ) = @_;
        my( $repo, $eprint, $changed ) = @args{qw( repository dataobj changed )};

        # trigger only applies to repos with hefce_oa plugin enabled
        return unless $eprint->dataset->has_field( "hoa_compliant" );

        return if $eprint->is_set( "hoa_date_fcd" );
        return if $eprint->value( "eprint_status" ) eq "inbox";

        for( $eprint->get_all_documents )
        {
                next unless $_->is_set( "stage" );
                next unless $_->value( "stage" ) eq "accepted" || $_->value( "stage" ) eq "published";
                $eprint->set_value( "hoa_date_fcd", $eprint->value( "datestamp" ) );
                $eprint->set_value( "hoa_version_fcd", $_->value( "stage" ) eq "accepted" ? "AM" : "VoR" );
        }
}, priority => 100 );

 

# date of first compliant open access
$c->add_dataset_trigger( 'eprint', EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub
{
        my( %args ) = @_;
        my( $repo, $eprint, $changed ) = @args{qw( repository dataobj changed )};

        # trigger only applies to repos with hefce_oa plugin enabled
        return unless $eprint->dataset->has_field( "hoa_compliant" );

        return unless $eprint->is_set( "hoa_date_fcd" );
        return if $eprint->is_set( "hoa_date_foa" );

        for( $eprint->get_all_documents )
        {
                next unless $_->is_set( "content" );
                next unless $_->value( "content" ) eq "whole_document";
                next unless $_->is_set( "stage" );
                next unless $_->value( "stage" ) eq "accepted" || $_->value( "stage" ) eq "published";
                next unless $_->is_public;
                if( $_->is_set( "date_embargo" ) )
                {
                        $eprint->set_value( "hoa_date_foa", $_->value( "date_embargo" ) );
                }
                else
                {
                        $eprint->set_value( "hoa_date_foa", $eprint->value( "datestamp" ) );
                }
        }
}, priority => 200 );

 

However I can’t seem to get the trigger for the foa to register anything. I’ll try your suggestion of dropping some outputs into the loop to see what steps it’s hitting but if you can see anything glaring in my logic please let me know. I’m also not sure if the logic will really work with embargo dates as they need to be in the future – if that is the case then the document won’t be public and won’t satisfy that step of the logic…

 

I think I’m finally starting to understand some of this, though… literally weeks before I’m due to move away from it…

 

:rolleyes:

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Alan.Stiles
Sent: 30 November 2017 11:02
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

It does exactly that – it sets it the first time and then doesn’t change it.

 

The easiest (dirty) way I’ve found to check it’s processing my override code is to put a

 

print STDERR "My Local Override_Function_Name: Doing this bit now\n";

 

in the function, which should then appear in the apache error log.

 

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 30 November 2017 10:34
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Actually, looking at the code in more detail, for example this trigger (my modified version):

 

# date of first compliant deposit - this uses our value of "stage"
$c->add_dataset_trigger( 'eprint', EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub
{
        my( %args ) = @_;
        my( $repo, $eprint, $changed ) = @args{qw( repository dataobj changed )};

        # trigger only applies to repos with hefce_oa plugin enabled
        return unless $eprint->dataset->has_field( "hoa_compliant" );

        return if $eprint->is_set( "hoa_date_fcd" );
        return if $eprint->value( "eprint_status" ) eq "inbox";

        for( $eprint->get_all_documents )
        {
                next unless $_->is_set( "stage" );
                next unless $_->value( "stage" ) eq "accepted" || $_->value( "stage" ) eq "published";
                $eprint->set_value( "hoa_date_fcd", $_->value( "datestamp" ) );
                $eprint->set_value( "hoa_version_fcd", $_->value( "stage" ) eq "accepted" ? "AM" : "VoR" );
        }
}, priority => 100 );

 

Does the line return if $eprint->is_set( "hoa_date_fcd" ); mean that if the date is already set, it won’t be recalculated?

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 29 November 2017 15:44
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Thanks John!

 

Okay, I’ve made some progress on this but I’m coming a little unstuck on the Date of First Compliant Deposit and Date of First Compliant Open Access.

 

So, for both of these I’ve needed to make a slight adjustment as we store our “accepted”, “published” etc. in a variable called “stage”. No problem, I’ve switched “content” up for “stage” and I’m getting a date through now based, looking at the code, on EPrints::Time::get_iso_date(). The issue I have is that in all instances this is showing as today’s date.

 

Speaking to Bev about how she wants this to work, she’s expecting to see:

 

First Compliant Deposit – the date the record was created, so I assume the datestamp value of the record?

 

First Compliant Open Access – the date the record was created OR the embargo date if one is set.

 

Once I’ve got these setting as expected I think I can get this deployed.

 

Any thoughts?

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of John Salter
Sent: 29 November 2017 12:38
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] REF Compliance Checker Plugin

 

Hi Andrew,
Some parts of the REF_CC plugin can be overridden in the configuration.
Have a look at the stuff in ~/lib/cfg.d/ - anything that starts $c->{'something'} can be overridden in an archive specific config file.

If there are specific things you're trying to do that don't seem to be covered by this, you know where to ask!

Cheers,
John


From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of Andrew Beeken <anbeeken@lincoln.ac.uk>
Sent: 29 November 2017 12:12:41
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] REF Compliance Checker Plugin

 

Hi all!

 

Putting some final bits into place before I retire from the repository work in 2018, I’m now looking at the REF Compliance Checker Plugin (http://eprintsug.github.io/hefce_oa/). One of the things I’ve noticed is that the plugin doesn’t seem to be picking up some of our specific workflow modifications, which is to be expected. With the RIOXX plugin I was able to change the fields that were being looked at; however, I can’t see an obvious way to do this in the documentation. Any thoughts?

 

Cheers!

Andrew

