EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #09790
Re: [EP-tech] Ask about search result and reindex
- To: eprints-tech@ecs.soton.ac.uk
- Subject: Re: [EP-tech] Ask about search result and reindex
- From: "Agung Prasetyo W." <prazetyo@gmail.com>
- Date: Thu, 25 Jul 2024 14:39:22 +0700
eprints@repo16-04:~$ cat archives/[archive_ID]/var/eprint_rindex_unindexed.txt
1101
7238
7417
7488
However, when I searched for one of the keywords in item 1101, the item did appear in the search results. Meanwhile, the items I asked about yesterday still do not appear in search. For example, item ID 6789 has the author "Rahmat Setyo". When I search for that name using the search, the item does not appear, but when I search by author, the item is listed.
Next, I ran the "epadmin reindex" command on item ID 6789. Once the reindex had completed, searching for the author "Rahmat Setyo" returned the item in the search results.
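For anyone wanting to apply the same fix to every ID in the results file, a minimal Bash sketch follows. It assumes epadmin lives at /opt/eprints3/bin/epadmin (override with the EPADMIN variable) and that the file has one numeric eprint ID per line. The function name reindex_from_file and the DRY_RUN switch are made up for illustration and are not part of EPrints.

```shell
#!/usr/bin/env bash
# Sketch: reindex every eprint ID listed in a file, one ID per line.
# EPADMIN is an assumed default path; adjust it to your installation.
EPADMIN="${EPADMIN:-/opt/eprints3/bin/epadmin}"

# reindex_from_file ARCHIVE_ID ID_FILE (hypothetical helper)
reindex_from_file() {
    local archive_id="$1" id_file="$2" id
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS= read -r id || [ -n "$id" ]; do
        # Skip blank lines and anything that is not a plain number.
        case "$id" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ -n "${DRY_RUN:-}" ]; then
            # Only print the command that would be run.
            echo "$EPADMIN reindex $archive_id eprint $id"
        else
            # </dev/null stops epadmin from consuming the rest of the ID list.
            "$EPADMIN" reindex "$archive_id" eprint "$id" </dev/null
        fi
    done < "$id_file"
}

# Example, printing the commands first before running them for real:
#   DRY_RUN=1 reindex_from_file my_archive \
#       /opt/eprints3/archives/my_archive/var/eprint_rindex_unindexed.txt
```

Running with DRY_RUN=1 first makes it easy to check the list before reindexing anything.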
Hi Agung,
I have made some improvements to the script at:
http://files.eprints.org/3065/
Here are the installation/usage instructions:
Download the Bash script and run as follows to check that all eprint records in the live archive have titles, abstracts and creators indexed (if they exist for that record):
./find_eprint_rindex_unindexed
If your EPrints installation's archives are not under /opt/eprints3/archives then specify the path with the -p flag:
./find_eprint_rindex_unindexed -p /usr/share/eprints/archives
If you want to check a specific archive rather than the first one the script finds then specify it with the -a flag:
./find_eprint_rindex_unindexed -a my_archive
Results are output to the following file, or run with the -v flag to output to the screen:
EPRINTS_PATH/archives/ARCHIVE_ID/var/eprint_rindex_unindexed.txt
If you have un-indexed results you want to ignore, you can provide a newline-separated list of their IDs in:
EPRINTS_PATH/archives/ARCHIVE_ID/var/ignore_eprint_rindex_unindexed.txt
Regards
David Newman
On 24/07/2024 12:05, David R Newman wrote:
Did you specify the ARCHIVE ID as a parameter in the command:
./find_eprint_rindex_unindexed ARCHIVE_ID
Did you make sure you updated EP_PATH in the script to match your EPrints path if this is not /opt/eprints3?
Did you update USER_PASS to the username and password for your EPrints database? The default assumes that the root user can access the database without needing a password. You will probably need to change:
USER_PASS="-u root"
To something like:
USER_PASS="-u USERNAME -pPASSWORD"
Where USERNAME is $c->{dbuser} and PASSWORD is $c->{dbpass} in your archive's cfg/cfg.d/database.pl.
I could probably improve the script to pull these out by default when looking up the database name, which it already does by grabbing dbname from this file.
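As a rough illustration of that idea (not the code in the actual script), the sketch below reads dbuser and dbpass from database.pl and builds a USER_PASS value. It assumes each setting sits on one line in the usual form $c->{dbuser} = 'value'; with a quoted value that contains no quote characters. The function names db_setting and build_user_pass are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pull a quoted setting such as
#   $c->{dbuser} = 'eprints';
# out of an archive's cfg/cfg.d/database.pl.
# Assumes one setting per line and no quote characters inside the value.

# db_setting KEY FILE (hypothetical helper): prints the first matching value.
db_setting() {
    local key="$1" file="$2"
    sed -n "s/^[[:space:]]*\\\$c->{$key}[[:space:]]*=[[:space:]]*['\"]\\([^'\"]*\\)['\"].*/\\1/p" "$file" | head -n 1
}

# build_user_pass FILE (hypothetical helper): prints mysql client options.
build_user_pass() {
    local file="$1" user pass
    user="$(db_setting dbuser "$file")"
    pass="$(db_setting dbpass "$file")"
    if [ -n "$pass" ]; then
        printf '%s\n' "-u $user -p$pass"
    else
        # No password configured: pass only the username.
        printf '%s\n' "-u $user"
    fi
}

# Example:
#   USER_PASS="$(build_user_pass /opt/eprints3/archives/my_archive/cfg/cfg.d/database.pl)"
```

Note that a password passed with -p on the command line may be visible to other users on the same machine while the command runs.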
Regards
David Newman
On 24/07/2024 11:49, Agung Prasetyo W. wrote:
Hi David,
How do I know whether we use the EPrints database or Xapian? After I ran your script, it showed nothing. When I opened the file /var/eprint_rindex_unindexed.txt, it showed the following:
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Is my step wrong?
Thank you.
Regards,
Agung PW
On Wed, 24 Jul 2024 at 17:22, David R Newman <drn@ecs.soton.ac.uk> wrote:
Hi Agung,
If you are using the database (i.e. the eprint__rindex table), then I wrote the following (rather hacky) Bash script to test this:
https://files.eprints.org/3065/
The script will ignore items whose metadata visibility is not set to show. It is worth manually checking your database for items you expect to be able to find in search but cannot, to see if the metadata_visibility field has been changed. If you create new versions of items this will automatically set the current (now old) version to hide. (This is a far from ideal situation but it is quite difficult to determine a better way to ensure users only find the latest versions, especially when the "New Version" button gets used in the wrong circumstances).
If you are using a Xapian index (typically used for simple search), then I did write a different script for this but it is a lot more complex to deploy.
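A manual check of that field could look something like the sketch below. It assumes the standard EPrints schema, where the eprint table has eprintid, eprint_status and metadata_visibility columns. The function name visibility_query is made up for illustration, and USERNAME and DBNAME are placeholders for the values in your archive's database.pl.

```shell
#!/usr/bin/env bash
# Sketch: print an SQL query showing the status and metadata visibility
# of one eprint. Assumes the standard EPrints "eprint" table layout.

# visibility_query EPRINT_ID (hypothetical helper)
visibility_query() {
    local id="$1"
    # Only accept plain numbers so nothing unexpected ends up in the SQL.
    case "$id" in
        ''|*[!0-9]*) echo "eprint ID must be a number" >&2; return 1 ;;
    esac
    printf 'SELECT eprintid, eprint_status, metadata_visibility FROM eprint WHERE eprintid = %s;\n' "$id"
}

# Example (placeholders, replace with your own credentials and database):
#   visibility_query 6789 | mysql -u USERNAME -p DBNAME
```

If metadata_visibility for the item is not "show", the item is being skipped deliberately rather than missing from the index.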
Regards
David Newman
On 24/07/2024 10:51, Agung Prasetyo W. wrote:
Hi,
Sometimes there are items that don't appear when I do a search, even though they are in the repository. But after I run the command:
epadmin reindex [archive_id] eprint [item_id]
these items appear in search results.
Is there a way to find out the item IDs that have not been indexed so that we can reindex the item IDs?
Thank you.
Regards,
Agung Prasetyo W.
*** Options: https://wiki.eprints.org/w/Eprints-tech_Mailing_List *** Archive: https://www.eprints.org/tech.php/ *** EPrints community wiki: https://wiki.eprints.org/