EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #09386



RE: [EP-tech] Question on export formats in HTML header


CAUTION: This e-mail originated outside the University of Southampton.

There is already some control, using the 'advertise' and/or 'visible' options for an export plugin.

This is the code that compiles the list to add to the <head> of a summary page:
v3.3: https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/DataObj/EPrint.pm#L1424-L1428
v3.4: https://github.com/eprints/eprints3.4/blob/master/perl_lib/EPrints/DataObj/EPrint.pm#L1554-L1558

Currently the export plugin 'visible' parameter can be 'all' or 'staff'.
Extending this to cover more cases might be an option - e.g. adding 'logged_in_user'.

'advertise' is a true/false option - changing its behaviour would take a bit more engineering.

The visible/advertise options also control whether the export format appears in the search results export.
With the current options, I don't think we can exclude it from the <head> but still have it visible in the 'Export' formats list on the search results page (I've not traced it all the way through - so could be wrong!).
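
To make the interaction between the two options concrete, here is a minimal sketch in Python. It is a toy model, not EPrints code: the class, the function names and the user-type labels ('user', 'staff') are hypothetical, and 'logged_in_user' is the proposed extension rather than an existing value. It assumes, as suggested above but not traced through, that the <head> links and the search results export list are filtered by the same two options.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExportPlugin:
    """Toy stand-in for an export plugin's two relevant options."""
    name: str
    visible: str = "all"    # 'all' or 'staff' today; 'logged_in_user' is the proposal
    advertise: bool = True  # true/false only


def is_visible(plugin: ExportPlugin, user_type: Optional[str]) -> bool:
    """user_type is None for anonymous visitors (hypothetical labels otherwise)."""
    if plugin.visible == "all":
        return True
    if plugin.visible == "logged_in_user":
        return user_type is not None
    if plugin.visible == "staff":
        return user_type == "staff"
    return False  # unknown values hide the plugin


def listed_formats(plugins: List[ExportPlugin], user_type: Optional[str]) -> List[str]:
    """Formats shown to this user; assumed to feed both <head> and the export list."""
    return [p.name for p in plugins if p.advertise and is_visible(p, user_type)]


plugins = [
    ExportPlugin("BibTeX"),
    ExportPlugin("XML", visible="staff"),
    ExportPlugin("JSON", visible="logged_in_user"),
    ExportPlugin("Hidden", advertise=False),
]

print(listed_formats(plugins, None))     # anonymous visitor (and any robot)
print(listed_formats(plugins, "user"))   # logged-in, non-staff
print(listed_formats(plugins, "staff"))  # staff
```

Because a single filter feeds both places in this model, hiding a format from <head> also hides it from the 'Export' list; separating the two would need an extra option or a second filter.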

Cheers,
John

-----Original Message-----
From: eprints-tech-request@ecs.soton.ac.uk <eprints-tech-request@ecs.soton.ac.uk> On Behalf Of Matthew Kerwin
Sent: 06 September 2023 00:31
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Question on export formats in HTML header

On Tue, 5 Sept 2023 at 23:59, Martin Brändle <martin.braendle@uzh.ch> wrote:
>
> Hi David,
>
> thanks for the explanation. As bots (good or bad ones) may follow these links, we think that for a large repository this may offer quite a large attack surface (number of items * number of export links).
>
> Kind regards,
>
> Martin

I think a general summary could be: anything that shouldn't be discoverable, shouldn't be discoverable. If export plugins are exposed to (unauthenticated) users through a public interface, then they're discoverable by robots the same way. Maybe we could look into what configurable options there are (or could be) about which plugins are exposed in which ways.

At least good robots can be influenced using robots.txt, rel=nofollow, etc.
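
As a rough illustration of the robots.txt route, the sketch below uses Python's standard urllib.robotparser to show how a well-behaved crawler would interpret a rule that disallows export URLs. The host name, the bot name and the /cgi/export/ path prefix are assumptions for the example; check the export URLs your own repository actually emits before relying on a rule like this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path prefix is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi/export/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a robots.txt-respecting crawler may fetch this path."""
    return parser.can_fetch(agent, "https://repository.example.org" + path)


print(may_crawl("/id/eprint/123/"))                       # summary page
print(may_crawl("/cgi/export/eprint/123/BibTeX/x.bib"))   # export link
```

This only influences robots that choose to honour robots.txt; badly behaved ones will still follow any link that is exposed.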

Cheers
--
  Matthew Kerwin
  https://matthew.kerwin.net.au/