EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #08228



Re: [EP-tech] Experimental integration with MS Azure Cognitive Services


Hi David,

Absolutely! This wasn't really meant to be production ready; it was just a quick experiment. There are multiple ways to do this: you could, for instance, make some sort of AJAX call and show the results as suggested keywords or, as you say, add them via the event queue. One thing I would probably want to do before putting it into production is to have some way of monitoring the number of Azure API calls that have been made because, as with many cloud services, the price you pay depends on how much you use it. I'd probably also want to abstract the service used so that people can use their favoured provider (if it offers an equivalent service).
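
As a rough illustration of those two ideas (counting calls and abstracting the provider), here is a minimal sketch in Python rather than the Perl used by the actual integration. Everything in it is hypothetical: `KeywordProvider`, `MeteredProvider` and `fake_provider` are illustrative names, not part of EPrints or any Azure SDK, and the fake provider simply stands in for a real key-phrase service.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical provider interface: any callable that takes text and
# returns a list of key phrases. A real Azure (or other) client would
# be wrapped in a function with this shape.
KeywordProvider = Callable[[str], List[str]]


@dataclass
class MeteredProvider:
    """Wraps any provider and counts calls against a fixed budget."""
    provider: KeywordProvider
    max_calls: int
    calls_made: int = 0

    def extract(self, text: str) -> List[str]:
        # Refuse to call the paid service once the budget is used up.
        if self.calls_made >= self.max_calls:
            raise RuntimeError("API call budget exhausted")
        self.calls_made += 1
        return self.provider(text)


def fake_provider(text: str) -> List[str]:
    """Stand-in for a real service: returns words longer than 8 characters."""
    words = (w.strip(".,") for w in text.split())
    return [w for w in words if len(w) > 8]


metered = MeteredProvider(fake_provider, max_calls=2)
phrases = metered.extract("Institutional repositories store publications.")
```

Because the wrapper only depends on the callable's shape, swapping in a different provider would not change the counting logic. In a real deployment the counter would need to be persisted somewhere rather than held in memory.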

Thanks
Liam

From: David R Newman <drn@ecs.soton.ac.uk>
Sent: 29 June 2020 10:05
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>; Liam Green-Hughes <L.E.Green-Hughes@kent.ac.uk>
Subject: Re: [EP-tech] Experimental integration with MS Azure Cognitive Services
 

One thing I forgot to mention is that this call is probably better as an event queue task than as a before-commit trigger. If you are relying on a third-party application, you don't want to be waiting on it before the page can reload. If you are only sending the abstract, the response is probably going to be fairly rapid. However, abstracts can sometimes be rather long, and I can imagine the service being less responsive at times, so you may wait a while for a response before the page can reload. Obviously, the disadvantage of an event queue task is that you will end up with two revisions rather than one. However, your code at the moment suggests this service will only be called once, because once there are keywords it cannot be run again.


Just another thought: if you were to change this so that keywords could be updated, you would want to check whether the fields being sent for keyword analysis had changed, and only call the service if they had. You would also need some code to parse the current and returned keywords and merge them together.
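
A minimal sketch of that change check and merge, written in Python for illustration rather than as EPrints Perl. The function names (`fingerprint`, `needs_analysis`, `merge_keywords`) and the idea of storing a hash alongside the record are assumptions, not anything taken from the existing integration. The merge here is case-insensitive and keeps the first spelling seen, which is only one possible policy.

```python
import hashlib
from typing import Iterable, List, Optional


def fingerprint(fields: Iterable[str]) -> str:
    """Hash the field values that would be sent for analysis."""
    digest = hashlib.sha256()
    for value in fields:
        digest.update(value.encode("utf-8"))
        digest.update(b"\x00")  # separator so ["ab", "c"] != ["a", "bc"]
    return digest.hexdigest()


def needs_analysis(fields: Iterable[str], stored: Optional[str]) -> bool:
    """True if the fields differ from those last analysed (or never were)."""
    return fingerprint(fields) != stored


def merge_keywords(current: List[str], returned: List[str]) -> List[str]:
    """Combine both lists, dropping case-insensitive duplicates.

    Existing keywords come first, so a curated spelling wins over the
    spelling returned by the service.
    """
    seen = set()
    merged = []
    for keyword in list(current) + list(returned):
        cleaned = keyword.strip()
        key = cleaned.casefold()
        if key and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged


stored = fingerprint(["An abstract about open access."])
merged = merge_keywords(
    ["Open Access", "repositories"],
    ["open access", "Metadata ", "Repositories"],
)
```

With a stored fingerprint, the service would only be called when `needs_analysis` returns true, which also helps keep the number of paid API calls down.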


On 29/06/2020 09:54, David R Newman via Eprints-tech wrote:

Hi Liam,


Looks interesting. I have been working on improving keyword search within EPrints by introducing a new Keywords MetaField type that is backwards compatible with the Longtext MetaField type currently used for the keywords field. I have also introduced an Idci (short for ID Case Insensitive) field that could be used for keywords when set to be a multiple field. Hopefully, I will be able to make the official release of 3.4.2, which includes these, available this week.


Regards


David Newman


On 29/06/2020 09:44, Liam Green-Hughes via Eprints-tech wrote:
Hi everyone,

I've been experimenting with integrating EPrints with an off-the-shelf AI solution to generate keywords. The service I used was the Text Analytics element of Microsoft Azure Cognitive Services. I've only gone as far as experimenting with the Key Phrases endpoint so far, but the Named Entities endpoint looks like it could add real value to EPrints records too.

The integration file is here: https://github.com/liamgh/eprints-ai-expt/blob/master/z_azure_keywords.pl. It isn't production ready, but it is a starting point.

Let me know your thoughts!

Thanks
Liam

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
