EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #09201



Re: [EP-tech] Adding 301 redir_permanent for some migrated items before 404 kicks in.


CAUTION: This e-mail originated outside the University of Southampton.
Hi Matt,
Where/how are you curating the list of items that do need to be redirected, and how many will there be?

I've used various URL_REWRITE triggers over the years, and not had any issues when keeping to these guides:
  • if the trigger is not relevant to the request being made, return quickly (e.g. does it match a request for an EPrintID)
  • if possible, keep the logic away from the database for as long as possible, e.g. $eprint->value( "blah" ) is 'heavier' than matching just an EPrintID via regex.
If you know your redirected records exist in a specific range of EPrintIDs, e.g. 1-10000, but you need info from the database to see which ones specifically, my logic would be:
  • does the incoming URL match a pattern with an EPrintID in it (think about cgi/export URLs too!)? If not, return.
  • is that EPrintID in the range of possible redirects? If not, return.
  • ...do logic based on eprint field values
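
The first two steps above can be sketched in plain Perl, with no EPrints or database calls involved. This is only an illustration: the helper names, the ID range and the exact URL patterns (including the cgi/export form) are assumptions and should be adjusted to the URLs your repository actually serves.

```perl
use strict;
use warnings;

# Hypothetical bounds: the range of EPrintIDs that may have been migrated.
my( $MIN_ID, $MAX_ID ) = ( 1, 10000 );

# Return the EPrintID embedded in a request path, or undef if there is none.
# The three patterns are assumptions about the URL shapes in use.
sub eprintid_from_uri
{
	my( $uri ) = @_;

	return $1 if $uri =~ m#^/(\d+)(?:/|$)#;
	return $1 if $uri =~ m#^/id/eprint/(\d+)(?:/|$)#;
	return $1 if $uri =~ m#^/cgi/export/eprint/(\d+)(?:/|$)#;
	return undef;
}

# Steps 1 and 2 only: return an EPrintID worth a closer (database) look,
# or undef so the caller can return straight away.
sub candidate_eprintid
{
	my( $uri ) = @_;

	my $id = eprintid_from_uri( $uri );
	return undef if !defined $id;
	return undef if $id < $MIN_ID || $id > $MAX_ID;
	return $id;
}
```

Only when candidate_eprintid() returns a value would the trigger go on to load the eprint and inspect its field values.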
Depending on how many redirects you need to put in place, storing them in a config value and checking it with a grep (or a direct hash lookup) is probably more than efficient enough:
$c->{eprints_to_redirect} = {
      1234 => 'https://somewhere.else/2345',
      1236 => 'https://new.record/321',
      ...
};

In addition to the above, the triggers are a 'stack', so setting a priority on them lets you order them as you want. Having said that, if you can do this without hitting the database (using a config hash instead), I would suggest making it the first trigger in the stack and making sure it sets the trigger return value, so the request does not pass through the rest of the stack.
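
A minimal sketch of such a trigger as a cfg.d fragment, assuming a config hash of EPrintID => new URL. The file name and the priority value are hypothetical, and the sketch assumes that triggers with lower priority numbers run first; check this against your EPrints version before relying on it.

```perl
# cfg/cfg.d/zz_redirect_migrated.pl (hypothetical file name)

# Hypothetical config hash: EPrintID => new location.
$c->{eprints_to_redirect} = {
	1234 => 'https://somewhere.else/2345',
	1236 => 'https://new.record/321',
};

# Keep a lexical reference so the trigger does not need the repository object.
my $redirects = $c->{eprints_to_redirect};

$c->add_trigger( EP_TRIGGER_URL_REWRITE, sub {
	my( %o ) = @_;

	# Cheap exit: most requests (CSS, images, ...) carry no EPrintID.
	my $urlpath = $o{urlpath};
	return unless $o{uri} =~ m#^\Q$urlpath\E/(?:id/eprint/)?(\d+)(?:/|$)#;

	# Hash lookup only; no database access.
	my $target = $redirects->{$1};
	return unless defined $target;

	# Set the return code and stop the rest of the trigger stack.
	${$o{return_code}} = EPrints::Apache::Rewrite::redir_permanent( $o{request}, $target );
	return EP_TRIGGER_DONE;
}, priority => -100 );	# assumption: negative priority runs before the defaults
```

Requests that do not match simply return without a value, so they continue through the remaining triggers and normal handling unchanged.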

It's also worth noting that there are document-specific and eprint-specific URL rewrite triggers.
If you do need to get data from the database, these would be preferable to use, as they are already passed the $eprint and $doc objects.

Hope that helps a bit - happy to provide examples if you want.

Cheers,
John



From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of Matthew Brady via Eprints-tech <eprints-tech@ecs.soton.ac.uk>
Sent: 15 February 2023 01:13
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Adding 301 redir_permanent for some migrated items before 404 kicks in.
 
Hi All,

Running 3.3.16.

We are looking to migrate some records from our primary system to a secondary system, and to redirect all the bots we've collected over the years to the new location of each migrated item via redir_permanent (301), rather than have the primary show a 404.

My initial attempt, which is working, is based on the logic in cfg/cfg.d/rewrite_url_demo.pl, but it fires for every URI the server sees: valid records, CSS, etc.

Is there a way to let the system function as usual, fire this particular redirect logic only if the requested URI cannot be found, and then present the 404 as usual if the request falls through that as well?

The aim is primarily to keep the trigger from firing until needed, since I don't want it to look up potentially thousands of EPrintIDs for every URI request coming through.

Original logic:

# $c->add_trigger( EP_TRIGGER_URL_REWRITE, sub {
#     my( %o ) = @_;
#
#     if( $o{uri} eq $o{urlpath}."/testpath01" )
#     {
#         ${$o{return_code}} = EPrints::Apache::Rewrite::redir( $o{request}, "https://totl.net/" );
#         return EP_TRIGGER_DONE;
#     }
# } );

 

Updated to:

# $c->add_trigger( EP_TRIGGER_URL_REWRITE, sub {
#     my( %o ) = @_;
#     my $url = "">
#     if( $url =~ m#^/(\d+)/# || $url =~ m#^/id/eprint/(\d+)/# )    # <-- This snippet taken from cgi/handle_404
#     {
#         if( $1 eq "99999" )
#         {
#             ${$o{return_code}} = EPrints::Apache::Rewrite::redir_permanent( $o{request}, "http://not.the.real.link.net/" );
#             return EP_TRIGGER_DONE;
#         }
#     }
# } );

 

 

Cheers,
Matt.
