EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #08557
Re: [EP-tech] Partitioning access table (INNODB)
- To: eprints-tech@ecs.soton.ac.uk, "Juan C. Herraiz Regidor" <jcherraiz@ucm.es>
- Subject: Re: [EP-tech] Partitioning access table (INNODB)
- From: dago salas <dago.salas@gmail.com>
- Date: Sun, 28 Mar 2021 21:32:19 -0600
Hello everybody,
Our EPrints repository has an access table with more than 133 million records, and it is still growing.
We recently updated the IRStats module (version 1.1) in our test repository, and the reindexing has been running for 9 days and is still not finished.
We have also converted the access table to InnoDB with compression (the conversion took 7 hours).
The MySQL slow queries log reports:
- At first it took 2 seconds per query: Query_time: 2.068357
# Time: 190304 0:13:52
# User@Host: eprintsdbo[eprintsdbo] @ [10.147.128.44] Id: 80
# Query_time: 2.068357 Lock_time: 0.000141 Rows_sent: 100000 Rows_examined: 700000
SET timestamp=1551654832;
SELECT `accessid`,`datestamp_year`,`datestamp_month`,`datestamp_day`,`datestamp_hour`,`datestamp_minute`,`datestamp_second`,`requester_id`,`requester_user_agent`,`referring_entity_id`,`service_type_id`,`referent_id`,`referent_docid` FROM `access` LIMIT 100000 OFFSET 600000;
- Nine days later, it took about 660 seconds per query: Query_time: 661.963604
# Time: 190312 12:33:07
# User@Host: eprintsdbo[eprintsdbo] @ [10.147.128.44] Id: 1077
# Query_time: 661.963604 Lock_time: 0.000180 Rows_sent: 99787 Rows_examined: 123899787
SET timestamp=1552390387;
SELECT `accessid`,`datestamp_year`,`datestamp_month`,`datestamp_day`,`datestamp_hour`,`datestamp_minute`,`datestamp_second`,`requester_id`,`requester_user_agent`,`referring_entity_id`,`service_type_id`,`referent_id`,`referent_docid` FROM `access` LIMIT 100000 OFFSET 123800000;
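In both log entries, Rows_examined equals the OFFSET plus Rows_sent, which suggests each page re-reads every row it skips. The following is a minimal sketch of that arithmetic, not anything taken from IRStats. The function names are hypothetical, and the table size of 123,899,787 rows is an assumption inferred from the short final page in the second entry.

```python
def rows_examined(offset: int, rows_sent: int) -> int:
    """Rows read for one LIMIT/OFFSET page, assuming every skipped row is read."""
    return offset + rows_sent


def total_rows_examined(table_rows: int, batch: int) -> int:
    """Rows read in total when walking the whole table page by page."""
    total = 0
    offset = 0
    while offset < table_rows:
        sent = min(batch, table_rows - offset)  # the last page may be short
        total += rows_examined(offset, sent)
        offset += batch
    return total


# The two slow-log entries quoted above.
print(rows_examined(600_000, 100_000))       # 700000
print(rows_examined(123_800_000, 99_787))    # 123899787

# Whole walk, assuming the table held 123,899,787 rows at that time.
print(total_rows_examined(123_899_787, 100_000))
```

Under these assumptions the cost per page grows linearly with the offset, so the cost of the full walk grows roughly with the square of the table size.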
I was wondering whether anyone has partitioned the access table (for example, one partition per 10 million access records) and whether this would speed up the generation of statistics.
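For comparison with partitioning, one alternative that might be worth testing is to seek on accessid instead of using OFFSET, so each page starts where the previous one ended. The sketch below only illustrates the idea on an in-memory SQLite table. It is not IRStats code, it assumes accessid is a positive, indexed unique key, and the helper name fetch_batches and the reduced column list are hypothetical.

```python
import sqlite3


def fetch_batches(conn, batch_size):
    """Yield pages ordered by accessid, seeking past the last id seen
    instead of skipping rows with OFFSET."""
    last_id = 0  # assumption: accessid values are positive
    while True:
        rows = conn.execute(
            "SELECT accessid, referent_id FROM access "
            "WHERE accessid > ? ORDER BY accessid LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # remember where this page ended


# Small stand-in for the real table: 1000 rows, two columns only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access (accessid INTEGER PRIMARY KEY, referent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO access VALUES (?, ?)", [(i, i % 7) for i in range(1, 1001)]
)

batches = list(fetch_batches(conn, 300))
print([len(b) for b in batches])  # [300, 300, 300, 100]
```

Whether the same query shape can be used in practice depends on how the reindexing code builds its queries, which would need to be checked in the module itself.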
Regards,
JC
Juan Carlos Herraiz Regidor
Gobierno TI
Servicios Informáticos · Gobierno TI. Avenida Complutense s/n. 28040 Madrid
Telephone: +34 91 394 5130, Fax: +34 91 394 4773