EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #09812



Re: [EP-tech] Ubuntu - Indexer as a service


CAUTION: This e-mail originated outside the University of Southampton.
Hi David,

Thanks for this. I've decided to go for a totally different option than either mine or yours: a script that runs every half an hour through cron. It grabs the indexer status. If it receives "Indexer is not running", it starts the indexer. If it receives something along the lines of "...but the next index is overdue...", it stops the indexer, waits two seconds, asks for the status again and then starts it if it gets "Indexer is not running".

I realised that what I had done was preventing me from ever stopping it. As soon as I stopped it, a new PID fired up.

I had a think after you mentioned multiple instances of the indexer running and decided "ask and then do" rather than "just do it" might be better for this. I'll see how we go... My thinking is that the indexer being down for half an hour isn't the end of the world, but having multiple indexers running might feel very much like the end of the world.
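
The "ask and then do" cron check described above could be sketched roughly as follows. This is a minimal illustration, not the script James actually wrote: the indexer path is taken from the unit file quoted below, the status phrases are the ones quoted in this message (the real wording may differ), and the function names and the `--run` argument are hypothetical.

```shell
#!/bin/bash
# Hypothetical watchdog sketch. The path and the status phrases matched
# below are assumptions based on this thread, not confirmed behaviour.
INDEXER="${INDEXER:-/opt/eprints3/bin/indexer}"

# Map the status text to an action: start, restart or none.
decide_action() {
    local status="$1"
    case "$status" in
        *[Ii]"ndexer is not running"*) echo start ;;
        *overdue*)                     echo restart ;;
        *)                             echo none ;;
    esac
}

# Ask for the status first, then act on it.
check_indexer() {
    local status action
    status="$("$INDEXER" status 2>&1)"
    action="$(decide_action "$status")"
    if [ "$action" = "restart" ]; then
        # Stalled: stop it, wait two seconds, then ask again.
        "$INDEXER" stop
        sleep 2
        status="$("$INDEXER" status 2>&1)"
        action="$(decide_action "$status")"
    fi
    # Only start once the indexer reports that it is not running.
    if [ "$action" = "start" ]; then
        "$INDEXER" start
    fi
}

# Nothing happens unless the script is called with --run.
if [ "${1:-}" = "--run" ]; then
    check_indexer
fi
```

A crontab entry for the eprints user along the lines of `*/30 * * * * /path/to/indexer_watchdog --run` (path hypothetical) would run the check every half an hour. Because the indexer is only started after it reports that it is not running, a failed stop leaves things untouched until the next run rather than risking a second instance.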

Thanks,
James

On Wed, Aug 7, 2024 at 4:19 PM David R Newman <drn@ecs.soton.ac.uk> wrote:

Hi James,

This is what I currently use on Rocky Linux 8 and 9:

[Unit]
Description=EPrints Indexer
After=crond.service mariadb.service
Requires=mariadb.service

[Service]
Type=forking
PIDFile=/opt/eprints3/var/indexer.pid
TimeoutStartSec=0
User=eprints
ExecStart=/opt/eprints3/site_lib/bin/epindexer_systemd_start
ExecStop=/opt/eprints3/bin/indexer stop
Restart=on-failure
RestartSec=60

[Install]
WantedBy=multi-user.target

/opt/eprints3/site_lib/bin/epindexer_systemd_start looks as follows:

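#!/bin/bash
# Move to the EPrints root (two levels above this script's directory).
cd "$(dirname "$0")" || exit 1
cd ../../ || exit 1
if [ -e var/indexer.pid ]; then
        # Despite its name, $pid holds the first command name (ps column 11)
        # containing "indexer"; it is only tested for being non-empty.
        pid=$(ps aux | awk 'BEGIN{FS="[\t ]+"}{ print $11 }' | grep "indexer" | head -n 1)
        if [ -n "$pid" ]; then
                # An indexer still appears to be running: stop it cleanly first.
                bin/indexer stop
                sleep 5
        fi
        # Clear leftover state files so the start below is not blocked.
        rm -f var/indexer.pid var/indexer.tick
fi
bin/indexer start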


I have run this on tens of EPrints repository servers for years now and it is pretty reliable, if a bit hacky with epindexer_systemd_start. It could probably be better written, but it has been a work of incremental improvement.

There are two points to note when using a systemd service for the indexer:

1. Disable control of the indexer from the admin interface or you can end up with multiple instances running.
2. I still need a cron job that runs on restart and deletes indexer.pid if a stale version still exists (i.e. the PID it holds does not point to an active indexer process based on /proc/PID/cmdline). That could probably be integrated into epindexer_systemd_start.
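
The stale PID file check in point 2 could be sketched as below. This is an illustration under assumptions, not David's actual cron job: it assumes indexer.pid contains just the process ID and that a live indexer has "indexer" somewhere in its /proc/PID/cmdline. The function name and the PROC_ROOT variable (there only so the check can be tried against a fake directory) are hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch: delete a PID file whose PID no longer belongs to
# an indexer process. PROC_ROOT is an assumption added for testing.
PROC_ROOT="${PROC_ROOT:-/proc}"

remove_stale_pidfile() {
    local pidfile="$1" pid cmdline
    [ -e "$pidfile" ] || return 0          # no PID file, nothing to do
    # Assumes the file holds only the PID; keep the digits.
    pid="$(tr -cd '0-9' < "$pidfile")"
    cmdline="$PROC_ROOT/$pid/cmdline"
    if [ -n "$pid" ] && [ -r "$cmdline" ]; then
        # cmdline is NUL-separated; turn NULs into spaces before matching.
        if tr '\0' ' ' < "$cmdline" | grep -q "indexer"; then
            return 0                       # a live indexer owns this PID
        fi
    fi
    # Empty file, dead PID, or PID reused by another program: stale.
    rm -f "$pidfile"
}
```

Calling `remove_stale_pidfile /opt/eprints3/var/indexer.pid` at boot, or at the top of epindexer_systemd_start, would cover the case where the server went down without the indexer removing its own PID file.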

Regards

David Newman


On 07/08/2024 16:00, James Kerwin wrote:
Hi everyone,

Wondering if anyone can check if this is a sane thing to do.

This has been bugging me for some time since we upgraded from Ubuntu 14 to 20/22 (I've lost track at the moment).

Whenever the server was restarted, the EPrints indexer would stall. Then, after an hour or so, it would say "Indexer has stopped..."

I've been learning about systemd and turning /opt/eprints3/epindexer into a service.

I wanted it to be bound to Apache, which hasn't quite worked. I got it to a point where issuing "systemctl [stop/start] epindexer" on the command line would give the desired action. When Apache stopped, the indexer would stop (e.g. the indexer.pid file in /opt/eprints3/var would disappear). Unfortunately it didn't restart when Apache was started (although it did at least tell me it wasn't running, which is a help in itself). I did think of binding it to MySQL, since that seems to restart without a problem after Apache goes down and comes back up.

I've now got it to a point where Apache can be restarted, the indexer.pid file remains in place, but the indexer resumes its actions when Apache is back up. This is the service file I've made in /etc/systemd/system/epindexer.service:

[Unit]
Description=EP Indexer Service
After=network.target apache2.service

[Service]
Type=forking
ExecStart=/usr/bin/perl -T /opt/eprints3/bin/epindexer start
ExecStop=/usr/bin/perl -T /opt/eprints3/bin/epindexer stop
PIDFile=/opt/eprints3/var/indexer.pid
Restart=always
User=root
Group=root

[Install]
WantedBy=multi-user.target

It did previously have either "BindsTo=apache2.service" or "PartOf=apache2.service" on line 4, but this appeared to hinder more than help. I have just realised that, depending on how I start the indexer, indexer.pid and indexer.tick have different owners: eprints when started from the command line and www-data when I click the button. Both files have rw-rw-r-- permissions, eprints is in the www-data group and www-data is in the eprints group. I don't know if I'm on to something with this. Maybe change the user in epindexer to www-data?

Does this look sufficient? It's only on a Test server at the moment, but would look to put this on the live server if it doesn't raise any red flags. If any fellow Ubuntu users have had success with this I am all ears.

Appreciate this is not exactly EPrints, but it's such a niche question I thought it right to ask here.

Thanks,
James


*** Options: https://wiki.eprints.org/w/Eprints-tech_Mailing_List
*** Archive: https://www.eprints.org/tech.php/
*** EPrints community wiki: https://wiki.eprints.org/