EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #08664
Re: [EP-tech] archive statistics
- To: Yuri via Eprints-tech <eprints-tech@ecs.soton.ac.uk>
- Subject: Re: [EP-tech] archive statistics
- From: David R Newman <drn@ecs.soton.ac.uk>
- Date: Tue, 22 Jun 2021 10:16:27 +0100
Hi Yuri (off list),

Here is the listing for the ingredient. As you can see below, although there are a few files, you should be able to map these into your local archive or into your current lib or site_lib directories of EPrints (i.e. one path or another, not some in lib, some in site_lib and some in your archive).

I think we want to avoid making it a Bazaar plugin, as it then becomes something that requires ongoing maintenance and potentially releases of new versions, which is a bit excessive for something quickly whipped up as a useful analysis tool and really only intended for the original developer's use.
Regards

David Newman

total 24
drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 bin
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 cgi
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 lang
-rw-rw-r-- 1 eprints eprints 3715 Jun 12 11:28 src.index.html
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 plugins
-rw-rw-r-- 1 eprints eprints  995 Jun 12 11:28 readme.txt
[eprints@demo disk_report]$ ls -ltR
.:
total 24
-rw-rw-r-- 1 eprints eprints  995 Jun 12 11:28 readme.txt
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 plugins
-rw-rw-r-- 1 eprints eprints 3715 Jun 12 11:28 src.index.html
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 lang
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 cgi
drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 bin

./plugins:
total 4
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 EPrints

./plugins/EPrints:
total 4
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 Plugin

./plugins/EPrints/Plugin:
total 4
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 Screen

./plugins/EPrints/Plugin/Screen:
total 4
drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 Staff

./plugins/EPrints/Plugin/Screen/Staff:
total 4
-rw-rw-r-- 1 eprints eprints 1036 Jun 12 11:28 EPrintDiskReport.pm

./lang:
total 4
drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 en

./lang/en:
total 4
drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 phrases

./lang/en/phrases:
total 4
-rw-rw-r-- 1 eprints eprints 445 Jun 12 11:28 disk_report.xml

./cgi:
total 4
drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 disk_report

./cgi/disk_report:
total 12
-rw-rw-r-- 1 eprints eprints 655 Jun 12 11:28 data.json
-rw-rw-r-- 1 eprints eprints 656 Jun 12 11:28 show
-rw-rw-r-- 1 eprints eprints 743 Jun 12 11:28 data.csv

./bin:
total 20
-rwxrwxr-x 1 eprints eprints  170 Jun 12 11:28 get_fs_data
-rwxrwxr-x 1 eprints eprints  580 Jun 12 11:28 new_report
-rwxrwxr-x 1 eprints eprints  454 Jun 12 11:28 get_db_data
-rwxrwxr-x 1 eprints eprints  827 Jun 12 11:28 get_db_fdata
-rwxrwxr-x 1 eprints eprints 2134 Jun 12 11:28 combine

On 22/06/2021 10:04, Yuri via Eprints-tech wrote:
CAUTION: This e-mail originated outside the University of Southampton.

Wow, great work! I'm still on 3.3 and I just need this info for a migration, to get some data to put in a report.

On 22/06/21 10:49, David R Newman via Eprints-tech wrote:

Hi Yuri,

I was just saying to my colleague Justin, I wonder if you were a plant (to ask this question). He has just been working on a tool for this very purpose, so that we can analyse repositories that are running out of disk space to see whether they genuinely need more space or whether things could be tidied up to free up sufficient space. The tool is a bit rough around the edges but he is happy to make it available as a 3.4 ingredient in an EPrints GitHub repository, when he has had a chance to tidy it up (over the next few days). If you are still on 3.3 it may be possible to map the various files into directories in your archive and enable it as a plugin, but that is not something either Justin or I have tried to do.

The tool uses a cron job to produce monthly disk reports rather than a live status. If there is wider interest in the tool we could look at making it customisable to allow a greater reporting frequency. Unfortunately, it is not just a case of running the cron job more frequently, although the adaptation required to the tool as a whole should be fairly minor.

I have deployed this rough version on tryme.demo.eprints-hosting.org if you want to take a look. The disk reports are only available under the Admin menu, so you would need to give me the username you used for the account you create on tryme, so I can upgrade this account to a repository admin one.

Regards

David Newman

On 22/06/2021 07:52, Yuri via Eprints-tech wrote:

Hi!
What is the best way to get archive statistics, such as how many records are in the archive, or to get some indication of their size (for example, to find how many objects use at least 10MB of space, 20MB and so on), how much total space the archive uses (for example, counting only records with archive status), maybe grouping them by type (for example, theses use 10GB, articles use 40GB and so on)? I've done rough statistics using du and some Unix tools but I would like to refine them.

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
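As a rough illustration of the kind of refinement asked about in the original question (this is not the disk_report tool discussed in this thread, whose internals are not shown here), the following Python sketch counts files at or above given size thresholds and sums sizes per type. The function names, the thresholds and the (type, size) input pairs are all assumptions made for illustration. In a real repository the record type would have to come from the EPrints database rather than from the filesystem.

```python
import os
from collections import defaultdict

MB = 1024 * 1024  # assumption: "MB" is taken to mean 1024 * 1024 bytes


def file_sizes(root):
    """Yield the size in bytes of every regular, non-symlink file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield os.path.getsize(path)


def count_at_least(sizes, thresholds_mb=(10, 20, 50)):
    """Return {threshold_mb: how many sizes are >= that threshold}.

    The default thresholds are illustrative only.
    """
    sizes = list(sizes)  # allow a generator to be scanned once per threshold
    return {t: sum(1 for s in sizes if s >= t * MB) for t in thresholds_mb}


def total_by_type(records):
    """Sum sizes per type from (type, size_in_bytes) pairs.

    The pairs are hypothetical input, e.g. exported from a database query.
    """
    totals = defaultdict(int)
    for record_type, size in records:
        totals[record_type] += size
    return dict(totals)
```

A call such as `count_at_least(file_sizes(some_directory))` would give the per-threshold counts for one directory tree, where `some_directory` is whatever path holds the stored documents on the system in question.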
- References:
- [EP-tech] archive statistics
- From: Yuri <yurj@alfa.it>
- Re: [EP-tech] archive statistics
- From: David R Newman <drn@ecs.soton.ac.uk>
- Re: [EP-tech] archive statistics
- From: Yuri <yurj@alfa.it>
- [EP-tech] archive statistics