EPrints Technical Mailing List Archive

Message: #08806



Re: [EP-tech] Fulltext (PDF) index


CAUTION: This e-mail originated outside the University of Southampton.
Thanks John

I am not sure whether this was the cause.

I just changed the timezone of the server (because the timestamps in indexer.log were not correct).

Then I re-ran ./epadmin erase_fulltext_index repo --verbose.

That suddenly solved it; the PDF can now be indexed.

Regards

Izwan
UiTM Digital Library

On Mon, Dec 6, 2021 at 11:23 PM John Salter <J.Salter@leeds.ac.uk> wrote:
Hi Mohd,
I would check to see if the indexer is running, and if the task queue has anything in it.

The quickest way to do this is to visit:
https://[your repository URL]/cgi/counter

This should present a text response. Look for 'event_queue' and 'indexer'.

In EPrints, the fulltext indexing jobs are placed in the event_queue.
The 'indexer' works through this queue.

Normally, the 'indexer' should report as 'running', and the event_queue should be close to zero - meaning the indexer is doing what is needed.
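
The same check can be done from a terminal. This is only a minimal sketch: the function names are made up for illustration, and it assumes the counter page returns plain text with the words 'indexer' and 'event_queue' each appearing on their own line.

```shell
#!/bin/sh
# Sketch: pick the 'indexer' and 'event_queue' lines out of /cgi/counter.
# Both function names are hypothetical helpers, not part of EPrints.

# Read counter text on stdin and keep only the lines of interest.
counter_summary() {
    grep -iE 'indexer|event_queue'
}

# Fetch the counter page for a repository base URL given as $1.
# -f: fail on HTTP errors, -sS: quiet but still show error messages.
fetch_counter() {
    curl -fsS "$1/cgi/counter"
}
```

For example: fetch_counter 'https://[your repository URL]' | counter_summary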

If the indexer is either 'stopped' or 'stalled', try running the indexer with one of these parameters:
~/bin/indexer [status | stop | start]
The indexer writes a log to ~/var/indexer.log - if something is causing the indexer to stop, there may be some information in there.
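
To look at the status and the log together, something like the following could be used. It is a sketch under assumptions: indexer_report is a hypothetical helper, and its first argument stands for the directory that "~" refers to above (the EPrints user's home directory). Seeing the current system time next to the log also makes it easier to spot log timestamps that look wrong.

```shell
#!/bin/sh
# Sketch: print the system time, the indexer status and the end of its log.
# indexer_report is a hypothetical helper, not part of EPrints.
#   $1 = EPrints root directory (defaults to $HOME)
#   $2 = number of log lines to show (defaults to 20)
indexer_report() {
    root="${1:-$HOME}"
    lines="${2:-20}"

    echo "== system time =="
    date

    echo "== indexer status =="
    "$root/bin/indexer" status

    echo "== last $lines lines of indexer.log =="
    tail -n "$lines" "$root/var/indexer.log"
}
```

For example: indexer_report ~ 50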

To see what is actually in the event_queue (rather than just how many items are waiting), in the web interface, go to the Admin menu -> Manage Records -> Tasks.
If there are a lot of items, you can sort the list, or filter on the start time, status etc.

Hopefully that helps!

Cheers,
John


From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of MOHD.IZWAN SALIM via Eprints-tech <eprints-tech@ecs.soton.ac.uk>
Sent: 06 December 2021 04:04
To: EDER Norbert via Eprints-tech <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Fulltext (PDF) index
 
Dear EPrints Community

I have just set up a new repo with the latest EPrints version.

However, searching for a word inside a PDF (full text) does not return any results.

The PDF has already been OCRed and is searchable.

I have already run ./epadmin erase_fulltext_index repo --verbose

Is there anything else I should look at?

Regards

Mohd Izwan Bin Salim
UiTM Digital Library

DISCLAIMER: This e-mail and any files transmitted with it ("Message") is intended only for the use of the recipient(s) named above and may contain information that is non-public, proprietary, privileged, confidential and exempt from disclosure under applicable law including the Official Secrets Act 1972.
