EPrints Technical Mailing List Archive

Message: #10078


RE: [EP-tech] HTTPS-only


CAUTION: This e-mail originated outside the University of Southampton.

Yuri

Many thanks for getting back to me. Interesting question! I am not exactly sure of the answer, though, as I am somewhat out of my depth with this. I have attached most of the .conf files that I think are involved. As well as keeping them all in their original places in the installation, I have copy-pasted them into this one text file, to help me see more easily what is where, especially when checking for accidental duplications or mismatches. I suspect that something is missing from the configuration. Does the answer to your question lie in here, I wonder?

Best wishes

Will
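
The bundling described in the message above (one text file holding every conf file under a numbered heading) could be scripted along these lines. This is only a sketch: the function name bundle_confs and the four-space indent are assumptions for illustration, not part of EPrints or of the original attachment.

```shell
#!/bin/sh
# bundle_confs OUT FILE...
# Concatenate the given config files into OUT. Each file is preceded by a
# separator line and a numbered heading giving its path, and its content is
# indented so headings stand out from directives.
# (bundle_confs is a hypothetical helper name.)
bundle_confs() {
    out=$1
    shift
    n=0
    : > "$out"    # create or truncate the output file
    for f in "$@"; do
        n=$((n + 1))
        {
            printf '%s\n' '--------------------------------------------------------------------------------'
            printf '%d. %s\n' "$n" "$f"
            sed 's/^/    /' "$f"    # indent every line by four spaces
            printf '\n'
        } >> "$out"
    done
}

# Example call (paths taken from the listing in this message):
#   bundle_confs all-confs.txt /etc/apache2/apache2.conf /opt/eprints3/cfg/apache.conf
```

Regenerating the bundle after each change keeps the single-file view from drifting away from the files Apache actually reads.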

-----Original Message-----
From: eprints-tech-request@ecs.soton.ac.uk <eprints-tech-request@ecs.soton.ac.uk> On Behalf Of Yuri
Sent: 08 April 2025 15:59
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] HTTPS-only

Seems like you're using a different perl environment. Here it is just doing use EPrints;

What are the cgi configs in your apache confs?
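
One way to answer that question is to search the configuration tree for the directives that usually govern CGI handling. The following is a rough sketch: find_cgi_directives is a made-up helper name, the list of directives is an assumption about what counts as "cgi config", and grep -r is a common extension rather than strict POSIX.

```shell
#!/bin/sh
# find_cgi_directives PATH...
# List every active (uncommented) line that starts with a CGI-related
# directive, prefixed with file name and line number.
# (find_cgi_directives is a hypothetical helper name.)
find_cgi_directives() {
    grep -rnE '^[[:space:]]*(ScriptAlias|AddHandler|SetHandler|PerlResponseHandler|Options[[:space:]].*ExecCGI)' "$@"
}

# Example call (directories taken from the paths in this thread):
#   find_cgi_directives /etc/apache2 /opt/eprints3/cfg /opt/eprints3/archives/arcom/ssl
```

Lines that are commented out are deliberately skipped, so the output shows only what Apache would act on.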

On 08/04/25 16:25, Will Hughes wrote:
> /opt/eprints3/cgi/latest line 33
Regenerate the auto-generated conf files as follows:
/etc/apache2$ /opt/eprints3/bin/generate_apacheconf --system --replace
(However, might this remove all the individual edits in these files? It does not seem to, though.)
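
Since the generated files themselves warn that changes are lost on regeneration, one cautious approach is to snapshot the directory first and diff it afterwards. A minimal sketch, assuming the generated files live under /opt/eprints3/cfg as the listings below suggest; the helper names and the backup path are invented for illustration.

```shell
#!/bin/sh
# snapshot_conf SRC_DIR BACKUP_DIR
# Copy SRC_DIR recursively to BACKUP_DIR, preserving modes and timestamps.
# BACKUP_DIR must not exist yet, otherwise cp nests the copy inside it.
# (Both helper names are hypothetical.)
snapshot_conf() {
    cp -Rp "$1" "$2"
}

# diff_against_snapshot SRC_DIR BACKUP_DIR
# Print a unified diff of snapshot versus live directory.
# Exit status follows diff: 0 if identical, 1 if anything differs.
diff_against_snapshot() {
    diff -ru "$2" "$1"
}

# Example sequence (the backup path is an arbitrary choice):
#   snapshot_conf /opt/eprints3/cfg /tmp/eprints-cfg.before
#   /opt/eprints3/bin/generate_apacheconf --system --replace
#   diff_against_snapshot /opt/eprints3/cfg /tmp/eprints-cfg.before
```

An empty diff would confirm the impression that no manual edits were overwritten; any output shows exactly which lines were regenerated.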

--------------------------------------------------------------------------------
1. /etc/apache2/apache2.conf
	# This is the main Apache server configuration file.  It contains the
	# configuration directives that give the server its instructions.
	# See http://httpd.apache.org/docs/2.4/ for detailed information about
	# the directives and /usr/share/doc/apache2/README.Debian about Debian specific
	# hints.
	#
	#
	# Summary of how the Apache 2 configuration works in Debian:
	# The Apache 2 web server configuration in Debian is quite different to
	# upstream's suggested way to configure the web server. This is because Debian's
	# default Apache2 installation attempts to make adding and removing modules,
	# virtual hosts, and extra configuration directives as flexible as possible, in
	# order to make automating the changes and administering the server as easy as
	# possible.

	# It is split into several files forming the configuration hierarchy outlined
	# below, all located in the /etc/apache2/ directory:
	#
	#	/etc/apache2/
	#	|-- apache2.conf
	#	|	`--  ports.conf
	#	|-- mods-enabled
	#	|	|-- *.load
	#	|	`-- *.conf
	#	|-- conf-enabled
	#	|	`-- *.conf
	# 	`-- sites-enabled
	#	 	`-- *.conf
	#
	#
	# * apache2.conf is the main configuration file (this file). It puts the pieces
	#   together by including all remaining configuration files when starting up the
	#   web server.
	#
	# * ports.conf is always included from the main configuration file. It is
	#   supposed to determine listening ports for incoming connections which can be
	#   customized anytime.
	#
	# * Configuration files in the mods-enabled/, conf-enabled/ and sites-enabled/
	#   directories contain particular configuration snippets which manage modules,
	#   global configuration fragments, or virtual host configurations,
	#   respectively.
	#
	#   They are activated by symlinking available configuration files from their
	#   respective *-available/ counterparts. These should be managed by using our
	#   helpers a2enmod/a2dismod, a2ensite/a2dissite and a2enconf/a2disconf. See
	#   their respective man pages for detailed information.
	#
	# * The binary is called apache2. Due to the use of environment variables, in
	#   the default configuration, apache2 needs to be started/stopped with
	#   /etc/init.d/apache2 or apache2ctl. Calling /usr/bin/apache2 directly will not
	#   work with the default configuration.


	# Global configuration
	#

	#
	# ServerRoot: The top of the directory tree under which the server's
	# configuration, error, and log files are kept.
	#
	# NOTE!  If you intend to place this on an NFS (or otherwise network)
	# mounted filesystem then please read the Mutex documentation (available
	# at <URL:http://httpd.apache.org/docs/2.4/mod/core.html#mutex>);
	# you will save yourself a lot of trouble.
	#
	# Do NOT add a slash at the end of the directory path.
	#
	#ServerRoot "/etc/apache2"

	#
	# The accept serialization lock file MUST BE STORED ON A LOCAL DISK.
	#
	#Mutex file:${APACHE_LOCK_DIR} default

	#
	# The directory where shm and other runtime files will be stored.
	#

	DefaultRuntimeDir ${APACHE_RUN_DIR}

	#
	# PidFile: The file in which the server should record its process
	# identification number when it starts.
	# This needs to be set in /etc/apache2/envvars
	#
	PidFile ${APACHE_PID_FILE}

	#
	# Timeout: The number of seconds before receives and sends time out.
	#
	Timeout 300

	#
	# KeepAlive: Whether or not to allow persistent connections (more than
	# one request per connection). Set to "Off" to deactivate.
	#
	KeepAlive On

	#
	# MaxKeepAliveRequests: The maximum number of requests to allow
	# during a persistent connection. Set to 0 to allow an unlimited amount.
	# We recommend you leave this number high, for maximum performance.
	#
	MaxKeepAliveRequests 100

	#
	# KeepAliveTimeout: Number of seconds to wait for the next request from the
	# same client on the same connection.
	#
	KeepAliveTimeout 5

	# These need to be set in /etc/apache2/envvars
	User ${APACHE_RUN_USER}
	Group ${APACHE_RUN_GROUP}

	#
	# HostnameLookups: Log the names of clients or just their IP addresses
	# e.g., www.apache.org (on) or 204.62.129.132 (off).
	# The default is off because it'd be overall better for the net if people
	# had to knowingly turn this feature on, since enabling it means that
	# each client request will result in AT LEAST one lookup request to the
	# nameserver.
	#
	HostnameLookups Off

	# ErrorLog: The location of the error log file.
	# If you do not specify an ErrorLog directive within a <VirtualHost>
	# container, error messages relating to that virtual host will be
	# logged here.  If you *do* define an error logfile for a <VirtualHost>
	# container, that host's errors will be logged there and not here.
	#
	ErrorLog ${APACHE_LOG_DIR}/error.log

	#
	# LogLevel: Control the severity of messages logged to the error_log.
	# Available values: trace8, ..., trace1, debug, info, notice, warn,
	# error, crit, alert, emerg.
	# It is also possible to configure the log level for particular modules, e.g.
	# "LogLevel info ssl:warn"
	#
	LogLevel warn

	# Include module configuration:
	IncludeOptional mods-enabled/*.load
	IncludeOptional mods-enabled/*.conf

	# Include list of ports to listen on
	Include ports.conf


	# Sets the default security model of the Apache2 HTTPD server. It does
	# not allow access to the root filesystem outside of /usr/share and /var/www.
	# The former is used by web applications packaged in Debian,
	# the latter may be used for local directories served by the web server. If
	# your system is serving content from a sub-directory in /srv you must allow
	# access here, or in any related virtual host.
	<Directory />
		Options FollowSymLinks
		AllowOverride None
		Require all denied
	</Directory>

	<Directory /usr/share>
		AllowOverride None
		Require all granted
	</Directory>

	<Directory /var/www/>
		Options Indexes FollowSymLinks
		AllowOverride None
		Require all granted
	</Directory>

	#<Directory /srv/>
	#	Options Indexes FollowSymLinks
	#	AllowOverride None
	#	Require all granted
	#</Directory>


	# AccessFileName: The name of the file to look for in each directory
	# for additional configuration directives.  See also the AllowOverride
	# directive.
	#
	AccessFileName .htaccess

	#
	# The following lines prevent .htaccess and .htpasswd files from being
	# viewed by Web clients.
	#
	<FilesMatch "^\.ht">
		Require all denied
	</FilesMatch>

	#
	# The following directives define some format nicknames for use with
	# a CustomLog directive.
	#
	# These deviate from the Common Log Format definitions in that they use %O
	# (the actual bytes sent including headers) instead of %b (the size of the
	# requested file), because the latter makes it impossible to detect partial
	# requests.
	#
	# Note that the use of %{X-Forwarded-For}i instead of %h is not recommended.
	# Use mod_remoteip instead.
	#
	LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
	LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
	LogFormat "%h %l %u %t \"%r\" %>s %O" common
	LogFormat "%{Referer}i -> %U" referer
	LogFormat "%{User-agent}i" agent

	# Include of directories ignores editors' and dpkg's backup files,
	# see README.Debian for details.

	# Include generic snippets of statements
	IncludeOptional conf-enabled/*.conf

	# Include the virtual host configurations:
	# Include /opt/eprints3/archives/arcom/ssl/securevhost.conf
	Include /opt/eprints3/archives/*/ssl/securevhost.conf
	ServerName 139.59.180.56
	# SSLStaplingCache shmcb:/var/run/apache2/ssl_stapling_cache(512000)



--------------------------------------------------------------------------------
2. /opt/eprints3/cfg/apache.conf
	#
	# apache.conf include file for EPrints
	#
	# Any changes made here will be lost if you run generate_apacheconf
	# with the --replace --system options
	#

	# Load the perl modules & repository configurations
	PerlSwitches -I/opt/eprints3/perl_lib
	Include /opt/eprints3/cfg/perl_module_isolation.conf

	# Load the per-repository apache configuration
	Include /opt/eprints3/cfg/apache/*.conf

--------------------------------------------------------------------------------
3. /opt/eprints3/cfg/apache_ssl.conf
	#
	# apache_ssl.conf include file for EPrints
	#
	# Any changes made here will be lost if you run generate_apacheconf
	# with the --replace --system options
	#

	# Note that PerlTransHandler can't go inside
	# a "Location" block as it occurs before the
	# Location is known.
	PerlTransHandler +EPrints::Apache::Rewrite

	# Load the per-repository apache configuration
	Include /opt/eprints3/cfg/apache_ssl/*.conf


--------------------------------------------------------------------------------
4. /opt/eprints3/cfg/perl_module_isolation.conf
	##This file is included by apache.conf -- Do not edit this file directly.
	##You should edit the perl_module_isolation flag in /opt/eprints3/perl_lib/EPrints/SystemSettings.pm, then run /opt/eprints3/bin/generate_apacheconf --system --replace to regenerate this file.

	##The following two lines turn the perl_module_isolation OFF (All repositories now sharing a single perl interpreter and have access to all perl modules.)
	#PerlModule EPrints
	#PerlPostConfigHandler +EPrints::post_config_handler

--------------------------------------------------------------------------------
5. /opt/eprints3/cfg/perl_module_isolation_vhost.conf
	##This file is included by each repository's apache conf -- Do not edit this file directly.
	##You should edit the perl_module_isolation flag in /opt/eprints3/perl_lib/EPrints/SystemSettings.pm, then run /opt/eprints3/bin/generate_apacheconf --system --replace to regenerate this file.

	##The following three lines are commented out to turn the perl_module_isolation OFF (All repositories now sharing a single perl interpreter and have access to all perl modules.)
	#PerlOptions +Parent
	#PerlSwitches -I/opt/eprints3/perl_lib
	#PerlModule EPrints
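
In the two perl_module_isolation listings above, every Perl directive is commented out. Whether that is intended is not something the listing can settle, but a quick check for whether a directive is active anywhere can be sketched as follows. The helper name is_active is an assumption, and the match is case-sensitive even though Apache directive names are not.

```shell
#!/bin/sh
# is_active DIRECTIVE FILE...
# Print each uncommented line that starts with DIRECTIVE and succeed if at
# least one exists. /dev/null is appended so grep always prints file names.
# (is_active is a hypothetical helper name.)
is_active() {
    directive=$1
    shift
    grep -nE "^[[:space:]]*${directive}([[:space:]]|\$)" "$@" /dev/null
}

# Example call (paths taken from the listing in this message):
#   is_active 'PerlModule EPrints' /opt/eprints3/cfg/perl_module_isolation.conf \
#       /opt/eprints3/cfg/perl_module_isolation_vhost.conf
```

On the files exactly as pasted here, that call should print nothing and return a non-zero status, because both occurrences begin with #.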


--------------------------------------------------------------------------------
6. /opt/eprints3/cfg/apache/arcom.conf
	#
	# apache.conf include file for arcom
	#
	# Any changes made here will be lost if you run generate_apacheconf
	# with the --replace option
	#
	# This file manually created after guidance from https://wiki.eprints.org/w/HTTPS-only_and_HSTS

	# The main virtual host for this repository
	<VirtualHost *:80>
	  RedirectPermanent / https://arcomabstracts.com/
	</VirtualHost>


--------------------------------------------------------------------------------
7. /opt/eprints3/cfg/apache_ssl/arcom.conf
	#
	# secure.conf include file for arcom
	#
	# Any changes made here will be lost if you run generate_apacheconf
	# with the --replace option
	#
	  # Set by $c->{max_upload_filesize}
	  LimitRequestBody 1073741824
	  
	Include /opt/eprints3/cfg/perl_module_isolation_vhost.conf

	  <Location "">
		PerlSetVar EPrints_ArchiveID arcom
		PerlSetVar EPrints_Secure yes
		PerlSetVar ArchiveDocRoot /opt/eprints3/archives/arcom/html

		Options +ExecCGI
		<IfModule mod_authz_core.c>
		   Require all granted
		</IfModule>
		<IfModule !mod_authz_core.c>
		   Order allow,deny
		   Allow from all
		</IfModule>
	  </Location>


--------------------------------------------------------------------------------
8. /opt/eprints3/archives/arcom/ssl/securevhost.conf
	<VirtualHost *:443>
	  ServerName arcomabstracts.com
	  Header set Strict-Transport-Security "max-age=15780000"

	  # EPrints core config
	  Include /opt/eprints3/cfg/apache_ssl/arcom.conf
	  Include /opt/eprints3/cfg/perl_module_isolation_vhost.conf

	  # SSL Configuration
	  SSLEngine on
	  SSLProtocol all -SSLv2 -SSLv3 -TLSv1 -TLSv1.1
	  SSLHonorCipherOrder on
	  SSLCipherSuite HIGH:!aNULL:!MD5:!3DES
	  SSLCertificateFile /opt/eprints3/archives/arcom/ssl/arcomabstracts.com.crt
	  SSLCertificateKeyFile /opt/eprints3/archives/arcom/ssl/arcomabstracts.com.key
	  SSLCertificateChainFile /opt/eprints3/archives/arcom/ssl/arcomabstracts.com.ca-bundle

	  # Content handling
	  DocumentRoot /opt/eprints3/archives/arcom/html
	  
	  # CGI configuration
	  ScriptAlias /cgi/ "/opt/eprints3/cgi/"
	  <Directory "/opt/eprints3/cgi">
		  Options +ExecCGI
		  SetHandler perl-script
		  PerlResponseHandler ModPerl::Registry
		  PerlOptions +ParseHeaders
		  Require all granted
	  </Directory>
	  
	  <Directory "/opt/eprints3/archives/arcom/html">
		Options +Indexes +ExecCGI FollowSymLinks
		AllowOverride None
		Require all granted
		DirectoryIndex index.xpage index.html
	  </Directory>

	  # EPrints-specific
	  <Location "">
		  PerlSetVar EPrints_ArchiveID arcom
		  PerlSetVar EPrints_Secure yes
		  Options +ExecCGI
		  Require all granted
	  </Location>

	  # IE workarounds
	  SetEnvIf User-Agent ".*MSIE.*" \
		nokeepalive ssl-unclean-shutdown \
		downgrade-1.0 force-response-1.0

	  # Logging
	  ErrorLog logs/ssl_error_log
	  TransferLog logs/ssl_access_log
	  LogLevel warn
	  CustomLog logs/ssl_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
	</VirtualHost>
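
As pasted, securevhost.conf includes apache_ssl/arcom.conf, which already includes perl_module_isolation_vhost.conf, and then includes that same file again directly. Accidental duplications of this kind can be spotted with a short script. This is a sketch only: repeated_includes is an invented name, and it compares Include targets as literal strings without expanding wildcards.

```shell
#!/bin/sh
# repeated_includes FILE...
# Print each Include/IncludeOptional target that appears on more than one
# active line across the given files. Commented lines are ignored because
# their first field starts with '#'.
# (repeated_includes is a hypothetical helper name.)
repeated_includes() {
    awk '$1 == "Include" || $1 == "IncludeOptional" { print $2 }' "$@" |
        sort | uniq -d
}

# Example call (paths taken from the listing in this message):
#   repeated_includes /opt/eprints3/cfg/apache_ssl/arcom.conf \
#       /opt/eprints3/archives/arcom/ssl/securevhost.conf
```

The same idea could be extended to other repeated blocks, such as the two Location "" sections, but that would need a real parser rather than a one-line awk filter.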



Also, see these files in /opt/eprints3/archives/arcom/cfg/cfg.d/

--------------------------------------------------------------------------------
9. /opt/eprints3/archives/arcom/cfg/cfg.d/10_core.pl
		# This file was edited from that created by bin/epadmin
		# You can regenerate this file by doing ./bin/epadmin config_core arcom but it will overwrite these settings

		 $c->{host} = 'arcomabstracts.com'; 
		 $c->{base_url} = 'https://arcomabstracts.com';
		 $c->{perl_url} = 'https://arcomabstracts.com/cgi';
		 $c->{https_cgiurl} = 'https://arcomabstracts.com/cgi';
		 
		$c->{securehost} = $c->{host};
		$c->{secureport} = 443;
		$c->{port} = 443;
		$c->{protocol} = 'https';
		$c->{http_root} = undef;

		$c->{aliases} = [];
		$c->{urlpath} = "/";
		$c->{userhome} = "/cgi/users/home";

--------------------------------------------------------------------------------
10. /opt/eprints3/archives/arcom/cfg/cfg.d/20_baseurls.pl
	 # 20_baseurls.pl - Constructs base URLs for EPrints (HTTPS-only)

	{
		my $uri = URI->new("https://");

		# Ensure securehost is set for HTTPS-only configuration
		unless (EPrints::Utils::is_set($c->{securehost})) {
			die "securehost is not set! This repository is configured to be HTTPS-only.";
		}

		# Set up the URI for HTTPS
		$uri->scheme("https");
		$uri->host($c->{securehost});
		$uri->port($c->{secureport} || 443);  # Default to port 443 if not set
		$uri = $uri->canonical;

		# Ensure the path does not start or end with a slash
		my $path = $c->{https_root} || "";
		$path =~ s{^/|/$}{}g;  # Remove leading and trailing slashes
		$uri->path($path);

		# EPrints base URL without trailing slash
		$c->{base_url} = "$uri";
		# CGI base URL without trailing slash
		$c->{perl_url} = "$uri/cgi";
		
			### DEBUG OUTPUT ###
		use Data::Dumper;
		local $Data::Dumper::Indent = 1;  # Compact output
		print "\n--- Final Variable Values ---\n";
		print "URI: " . $uri->as_string . "\n";
		print "Path: " . ($path || "(empty)") . "\n";
		print "\$c->{base_url}: " . ($c->{base_url} || "(unset)") . "\n";
		print "\$c->{perl_url}: " . ($c->{perl_url} || "(unset)") . "\n";
		print "\nFull \$c dump (relevant parts):\n";
		print Dumper({
			securehost => $c->{securehost},
			secureport => $c->{secureport} // '(default 443)',
			https_root => $c->{https_root} // '(unset)',
			base_url   => $c->{base_url},
			perl_url   => $c->{perl_url},
		});
		print "--- End Debug ---\n\n";
		
	}

	# Set EPrints abstract page URL to /id/eprint/XX instead of the shorter version /XX, for search engine optimisation.
	$c->{use_long_url_format} = 1;
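
Both 10_core.pl and 20_baseurls.pl assign base_url and perl_url, so (assuming cfg.d files are loaded in filename order) the values computed in 20_baseurls.pl are the ones that take effect. To check by hand what that computation should yield, its logic can be mirrored in a few lines of shell. This is an approximation for illustration, not EPrints code: build_base_url is an invented name, and dropping port 443 imitates what URI's canonical form does for the default https port.

```shell
#!/bin/sh
# build_base_url HOST [PORT] [PATH]
# Mirror the URL construction in 20_baseurls.pl: https scheme, default port
# 443 omitted, one leading and one trailing slash stripped from the path.
# (build_base_url is a hypothetical helper name.)
build_base_url() {
    host=$1
    port=${2:-443}
    path=${3:-}
    path=${path#/}    # strip one leading slash
    path=${path%/}    # strip one trailing slash
    url="https://$host"
    [ "$port" = "443" ] || url="$url:$port"
    [ -z "$path" ] || url="$url/$path"
    printf '%s\n' "$url"
}

# Example call with the values from 10_core.pl:
#   base_url=$(build_base_url arcomabstracts.com 443 "")
#   perl_url="$base_url/cgi"
```

With the values shown in 10_core.pl this gives https://arcomabstracts.com and https://arcomabstracts.com/cgi, the same strings that 10_core.pl sets explicitly, so the two files agree.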
