EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #01580



[EP-tech] Re: RFC access log table


On 15/02/2013 10:59, Tim Brody wrote:
IRStats builds summary tables, so it doesn't need the data once the
processing has been run.

But ... unless you are really tight on space

Can bigger tables be a problem, or is that nothing to worry about?

  I would always keep the
original data. If you have Apache logs you could reverse-engineer them into
the access-log equivalent (URL matching).

I would keep them as long as possible :)
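
The URL-matching idea mentioned above (rebuilding access records from Apache logs) could be sketched roughly as follows. This is only an illustration in Python, not EPrints code: it assumes the Apache "combined" log format and assumes that abstract pages live at /<eprintid>/ and full-text downloads at /<eprintid>/<position>/<filename>. The function name and the output field names are made up for the example and do not reflect the real access table schema.

```python
import re
from datetime import datetime

# Apache common/combined log line; the referrer and user agent are optional.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# ASSUMPTION: URL layout of the repository (adjust to the local setup).
ABSTRACT_RE = re.compile(r'^/(\d+)/?$')
FULLTEXT_RE = re.compile(r'^/(\d+)/(\d+)/[^/]+$')


def log_line_to_access(line):
    """Return an access-like dict for one log line, or None if it is not
    a successful GET of an abstract page or a full-text document."""
    m = LOG_RE.match(line)
    if m is None or m.group("method") != "GET" or m.group("status") != "200":
        return None

    # Ignore any query string when matching the URL.
    path = m.group("path").split("?", 1)[0]

    full = FULLTEXT_RE.match(path)
    abstract = ABSTRACT_RE.match(path)
    if full:
        eprintid, service_type = int(full.group(1)), "fulltext"
    elif abstract:
        eprintid, service_type = int(abstract.group(1)), "abstract"
    else:
        return None

    when = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return {
        "datestamp": when.isoformat(),
        "requester": m.group("ip"),
        "eprintid": eprintid,
        "service_type": service_type,
        "referrer": m.group("referrer") or "",
    }
```

Lines that do not look like an abstract or document request (stylesheets, images, search pages) simply return None, so the function can be mapped over a whole log file and the None results filtered out.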


/Tim.

On Fri, 15 Feb 2013 10:59:15 +0100, Yuri <yurj@alfa.it> wrote:
Great! 5.9 GB freed :-)

Does IRStats use them only once, or does it always need them?

On 15/02/2013 10:32, Tim Brody wrote:
Hi,

Yes, there is nothing in the core that relies on data in access*.
IRStats 1 & 2 use access to create their summary data.

It looks like the best solution is to provide a tool to periodically dump
historic access data to files, but that it is still useful to keep
"current" (defined by config) data in the database.
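
A rough sketch of what such a dump tool might do is given below. It is an illustration in Python only, not an existing EPrints tool. It assumes a simplified, hypothetical access table (accessid, datestamp, referent_id, service_type_id) whose datestamp is stored as ISO text ("YYYY-MM-DD HH:MM:SS"), and it uses sqlite3 so that the example is self-contained; a real installation would talk to MySQL and would take the cutoff from configuration.

```python
import csv
import sqlite3
from collections import defaultdict
from pathlib import Path

# ASSUMPTION: simplified column set, not the real EPrints access schema.
COLUMNS = ["accessid", "datestamp", "referent_id", "service_type_id"]


def archive_old_accesses(conn, cutoff, out_dir):
    """Append rows older than `cutoff` to <out_dir>/access-YYYY-MM.csv,
    then delete them from the table. Returns the number of rows archived."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    rows = conn.execute(
        "SELECT accessid, datestamp, referent_id, service_type_id "
        "FROM access WHERE datestamp < ? ORDER BY datestamp",
        (cutoff,),
    ).fetchall()

    # Group rows by the "YYYY-MM" prefix of the datestamp.
    by_month = defaultdict(list)
    for row in rows:
        by_month[row[1][:7]].append(row)

    for month, month_rows in by_month.items():
        path = out_dir / f"access-{month}.csv"
        is_new = not path.exists()
        with path.open("a", newline="") as fh:
            writer = csv.writer(fh)
            if is_new:
                writer.writerow(COLUMNS)
            writer.writerows(month_rows)

    # Remove the rows only after every file has been written.
    conn.execute("DELETE FROM access WHERE datestamp < ?", (cutoff,))
    conn.commit()
    return len(rows)
```

Rows newer than the cutoff stay in the database, which corresponds to keeping the "current" data available for queries.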

All the best,
Tim.

On Fri, 15 Feb 2013 08:13:52 +0100, Yuri <yurj@alfa.it> wrote:
We have a test server which is a clone of the production server. Can I
safely empty those access tables to save space? :) Can I do a
"delete from access" without any issue? And the same for
access__ordervalues_en and all the other languages?

On 15/02/2013 03:13, Mark Gregson wrote:
Hi Tim

Because of the DB backup issues we invested some time a while ago in some
scripts for archiving the access data off to monthly dumps and for
restoring it (if required, say, by the need to have IRStats reprocess all
data). These scripts are not actually in production use because I haven't
had time to test them to my satisfaction (sorry Nick!).

CSV is a more accessible format than a MySQL dump, which may be a benefit.

We are using IRStats for statistics, which uses the access table, but I
guess it could easily be updated with a new parser. We also do some
custom logging to the access table for reporting on outbound link clicks
via IRStats. This logging is handled via EPrints::Apache::LogHandler.

Cheers
Mark


-----Original Message-----
From: eprints-tech-bounces@ecs.soton.ac.uk
[mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Tim Brody
Sent: Thursday, 14 February 2013 8:01 PM
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] RFC access log table

Hi All,

I'm thinking about the access log table and how it can be made sustainable.

What I'm suggesting is to write accesses to CSV-formatted log files, one
file per month. What I don't know is whether anyone is relying on the
database table for generating statistics?
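
To make the one-file-per-month idea concrete, here is a minimal sketch in Python. It is not how EPrints (which is written in Perl) implements or would implement this; the file naming scheme (access-YYYY-MM.csv), the field list and the function names are all assumptions chosen for illustration.

```python
import csv
from datetime import datetime
from pathlib import Path

# ASSUMPTION: illustrative field list, not the real access table columns.
FIELDS = ["datestamp", "requester", "eprintid", "service_type"]


def monthly_log_path(log_dir, when):
    """Return the CSV file that holds accesses for the month of `when`."""
    return Path(log_dir) / f"access-{when:%Y-%m}.csv"


def log_access(log_dir, when, requester, eprintid, service_type):
    """Append one access record to that month's CSV file, writing the
    header row first if the file does not exist yet."""
    path = monthly_log_path(log_dir, when)
    path.parent.mkdir(parents=True, exist_ok=True)
    write_header = not path.exists()
    with path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow({
            "datestamp": when.isoformat(sep=" "),
            "requester": requester,
            "eprintid": eprintid,
            "service_type": service_type,
        })
    return path
```

Because each month ends up in its own file, completed months never change again and can be compressed or backed up once, separately from the database.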

The problem the access log table creates is in backing up the EPrints
database.

I'd appreciate any thoughts/comments.

--
All the best,
Tim

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/