EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #06962



Re: [EP-tech] Experimental Schema.org support for EPrints


Hi Christopher,

Nice to see you are back! Concerning the Schema.org support, I did
something "custom" here:

http://en.unesco.org/mediabank

I think the proper way to go would be a plugin, though...

Denis


On 21/11/2017 17:46, Christopher Gutteridge wrote:
> Hi, EPrints-tech, long time no see.
>
> I've recently rejoined the EPrints.soton.ac.uk support team, and was 
> asked about trying out schema.org support (which Google and Bing like). 
> I'm not a huge fan as I like peer-to-peer data, rather than via the big 
> search engines, but I gave it a go anyway.
>
> I have been working on a way to add schema.org support to EPrints. It's 
> using an invisible <div> which may not be everyone's preferred way of 
> doing it, but has the advantage of working well with the citation files.
>
> Other options would be to design the entire abstract page around this 
> feature (possible, but work to add to existing sites) or use JSON-LD 
> which is what I would do if I was doing it for just me, but making a 
> configuration file to generate JSON-LD would be more work for me and 
> more of a learning curve for the EPrints admin.
>
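For comparison, the JSON-LD route mentioned above could look roughly like the following Python sketch. It is not part of the pilot: the `eprint` dict is a hypothetical stand-in for an exported record (not an EPrints API object), the function names are made up for illustration, and the type mapping simply mirrors the one in the schema_org.xml file quoted further down, with anything unlisted falling back to CreativeWork.

```python
import json

# Mapping taken from the <epc:choose> block in the quoted schema_org.xml;
# anything not listed falls back to CreativeWork.
SCHEMA_TYPES = {
    "article": "ScholarlyArticle",
    "book": "Book",
    "conference_item": "ScholarlyArticle",
    "monograph": "ScholarlyArticle",
    "thesis": "Thesis",
    "dataset": "Dataset",
    "mu_item": "MusicComposition",
    "review": "Review",
    "website": "WebSite",
}


def eprint_to_jsonld(eprint):
    """Build a schema.org JSON-LD dict from a plain dict of eprint fields.

    `eprint` is a hypothetical stand-in for an exported record,
    not an EPrints API object.
    """
    doc = {
        "@context": "http://schema.org",
        "@type": SCHEMA_TYPES.get(eprint.get("type"), "CreativeWork"),
        "name": eprint["title"],
        "headline": eprint["title"],
    }
    # Copy optional simple fields only when they are set, like <epc:if>.
    for field, prop in (("abstract", "description"), ("keywords", "keywords"),
                        ("isbn", "isbn"), ("official_url", "url")):
        if eprint.get(field):
            doc[prop] = eprint[field]
    if eprint.get("publisher"):
        doc["publisher"] = {"@type": "Organization", "name": eprint["publisher"]}
    creators = [{"@type": "Person", "name": name}
                for name in eprint.get("creators", [])]
    if creators:
        doc["creator"] = creators
    return doc


def jsonld_script_tag(eprint):
    """Wrap the JSON-LD in the <script> element that goes in the page."""
    # Escape "</" so a field value cannot close the script element early.
    body = json.dumps(eprint_to_jsonld(eprint), indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"
```

The string returned by `jsonld_script_tag` would take the place of the hidden `<div>` on the summary page; how it gets generated inside EPrints itself is left open here.
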
> I've added it as a pilot to https://eprints.soton.ac.uk/ (subject to 
> removal or change at any time)
>
> See the data extracted from a page here: 
> https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Feprints.soton.ac.uk%2F50995%2F
>
> There's lots more work to polish this, but it's worth showing off now.
>
> I've used 3 citation files for this: an outer one to handle the 
> different types (a bit ugly, but it was the solution I came up with), 
> a second one to process fields that come in a standard install of 
> EPrints, and a third for the fields eprints.soton has customised heavily.
>
> In the main summary_page.xml I added:
>
>    <epc:print expr="$item.citation('schema_org')" />
>
> Which links to schema_org.xml:
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <!--
>      Full "abstract page" (or splash page or summary page, depending on 
> your jargon) for an eprint.
> -->
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml" 
> xmlns:epc="http://eprints.org/ep3/control" 
> xmlns:cite="http://eprints.org/ep3/citation" >
>
> <div style='display:none'>
>    <epc:choose>
>      <epc:when test="type = 'article'">
>        <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <epc:when test="type = 'book'">
>        <div itemscope="itemscope" itemtype="http://schema.org/Book">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <!-- book_section -->
>      <epc:when test="type = 'conference_item'">
>        <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <epc:when test="type = 'monograph'">
>        <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <!-- patent -->
>      <epc:when test="type = 'thesis'">
>        <div itemscope="itemscope" itemtype="http://schema.org/Thesis">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <epc:when test="type = 'dataset'">
>        <div itemscope="itemscope" itemtype="http://schema.org/Dataset">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <!-- ad_item // art design item //  -->
>      <epc:when test="type = 'mu_item'">
>        <div itemscope="itemscope" itemtype="http://schema.org/MusicComposition">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <!-- letter -->
>      <!-- editorial -->
>      <epc:when test="type = 'review'">
>        <div itemscope="itemscope" itemtype="http://schema.org/Review">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>      <!-- special_issue -->
>      <!-- meeting_abstract -->
>      <!-- software // SoftwareApplication/ SoftwareSourceCode ?? -->
>      <epc:when test="type = 'website'">
>        <div itemscope="itemscope" itemtype="http://schema.org/WebSite">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:when>
>
>      <epc:otherwise>
>        <div itemscope="itemscope" itemtype="http://schema.org/CreativeWork">
>          <epc:print expr="$item.citation('schema_org_main')" />
>        </div>
>      </epc:otherwise>
>    </epc:choose>
> </div>
>
> </cite:citation>
>
> Each of these options in turn links to the main one, 
> schema_org_main.xml, that uses default EPrints fields:
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml" 
> xmlns:epc="http://eprints.org/ep3/control" 
> xmlns:cite="http://eprints.org/ep3/citation" >
>
> <div itemprop="name"><epc:print expr="title" /></div>
> <div itemprop="headline"><epc:print expr="title" /></div>
> <img itemprop="image" 
> src="http://www.eprints.org/uk/wp-content/uploads/EprintsServices2015icon.jpg" 
> />
> <epc:if test="abstract">
>    <div itemprop="description"><epc:print expr="abstract" /></div>
> </epc:if>
> <epc:if test="keywords">
>    <div itemprop="keywords"><epc:print expr="keywords" /></div>
> </epc:if>
> <epc:if test="isbn">
>    <div itemprop="isbn"><epc:print expr="isbn" /></div>
> </epc:if>
> <epc:if test="id_number">
>    <div itemprop="identifier"><epc:print expr="id_number" /></div>
> </epc:if>
>
> <epc:if test="issn or series">
>    <div itemprop="isPartOf" itemscope="itemscope" itemtype="http://schema.org/Periodical">
>      <epc:if test="issn"><div itemprop="issn"><epc:print expr="issn" 
> /></div></epc:if>
>      <epc:if test="series"><div itemprop="name"><epc:print expr="series" 
> /></div></epc:if>
>    </div>
> </epc:if>
>
> <epc:comment>
>    <!-- pageEnd and pageStart could go here but are more bother to 
> extract. -->
> </epc:comment>
>
> <epc:if test="pagerange">
>    <div itemprop="pagination"><epc:print expr="as_string(pagerange)" 
> /></div>
> </epc:if>
> <epc:if test="publisher">
>    <div itemprop="publisher" itemscope="itemscope" itemtype="http://schema.org/Organization">
>      <div itemprop="name"><epc:print expr="publisher" /></div>
>    </div>
> </epc:if>
> <epc:if test="official_url">
>    <div itemprop="url"><epc:print expr="official_url" /></div>
> </epc:if>
>
> <epc:if test="creators">
>    <epc:foreach expr="creators" iterator="person">
>      <div itemprop="creator" itemscope="itemscope" itemtype="http://schema.org/Person">
>        <div itemprop="name"><epc:print 
> expr="$person.subproperty('name')" /></div>
>        <epc:if test="$person.subproperty('id')">
>          <div itemprop="identifier"><epc:print 
> expr="$person.subproperty('id')" /></div>
>        </epc:if>
>      </div>
>    </epc:foreach>
> </epc:if>
> <epc:if test="editors">
>    <epc:foreach expr="editors" iterator="person">
>      <div itemprop="editor" itemscope="itemscope" itemtype="http://schema.org/Person">
>        <div itemprop="name"><epc:print 
> expr="$person.subproperty('name')" /></div>
>        <epc:if test="$person.subproperty('id')">
>          <div itemprop="identifier"><epc:print 
> expr="$person.subproperty('id')" /></div>
>        </epc:if>
>      </div>
>    </epc:foreach>
> </epc:if>
>
> <epc:if test="corp_creators">
>    <epc:foreach expr="corp_creators" iterator="org">
>      <div itemprop="creator" itemscope="itemscope" itemtype="http://schema.org/Organization">
>        <div itemprop="name"><epc:print expr="$org" /></div>
>      </div>
>    </epc:foreach>
> </epc:if>
>
>
> <epc:comment>
>    ADD IN LOCAL EXTENSIONS USING THIS FILE
> </epc:comment>
> <epc:print expr="$item.citation('schema_org_local')" />
>
> </cite:citation>
>
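The pageStart and pageEnd properties mentioned in the comment inside schema_org_main.xml could be derived from the page range with a small helper. The sketch below is in Python purely for illustration and assumes the stored value looks like "123-145" or a single page; `split_pagerange` is a hypothetical name, not an EPrints function.

```python
import re


def split_pagerange(pagerange):
    """Return (pageStart, pageEnd) from a string such as "123-145".

    A single page gives the same start and end; anything unrecognised
    gives (None, None) so the caller can fall back to "pagination".
    """
    # Accept a hyphen or an en dash between the two page labels.
    match = re.fullmatch(r"\s*(\w+)\s*(?:[-\u2013]\s*(\w+)\s*)?", pagerange or "")
    if not match:
        return (None, None)
    start, end = match.group(1), match.group(2)
    return (start, end if end is not None else start)
```

Values that do not parse are reported as `(None, None)` rather than guessed at, so the existing "pagination" output remains the safe default.
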
> Finally I created schema_org_local.xml for the fields like date and 
> creators which we've heavily messed around with.
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <!--
>      Local extra content for schema.org info on summary page.
>
>      This file can be used to add new fields that are not standard for 
> EPrints.
> -->
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml" 
> xmlns:epc="http://eprints.org/ep3/control" 
> xmlns:cite="http://eprints.org/ep3/citation" >
>
> <epc:if test="dates">
>    <epc:foreach expr="dates" iterator="date">
>      <epc:if test="$date.subproperty('date_type') = 'published'">
>        <div itemprop="datePublished"><epc:print 
> expr="$date.subproperty('date')" /></div>
>      </epc:if>
>      <epc:if test="$date.subproperty('date_type') = 'completed'">
>        <div itemprop="dateCompleted"><epc:print 
> expr="$date.subproperty('date')" /></div>
>      </epc:if>
>    </epc:foreach>
> </epc:if>
>
> <epc:if test="contributors">
>    <epc:foreach expr="contributors" iterator="person">
>      <div itemprop="contributor" itemscope="itemscope" itemtype="http://schema.org/Person">
>        <div itemprop="name"><epc:print 
> expr="$person.subproperty('name')" /></div>
>        <epc:if test="$person.subproperty('id')">
>          <div itemprop="identifier"><epc:print 
> expr="$person.subproperty('id')" /></div>
>        </epc:if>
>      </div>
>    </epc:foreach>
> </epc:if>
>
> </cite:citation>
>
>
> I'm not sure how useful all this is but figured I'd throw it out there. 
> It uses a default image as for some reason the Google checker insisted. 
> It doesn't link to files or mention subjects, doesn't include URIs 
> properly and doesn't link to ORCID etc. (which is data we have in 
> eprints.soton).
>
>
>
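On the ORCID point in the quoted message: one possible way to expose it is a `sameAs` link on each Person item. A minimal Python sketch, assuming the iD is already held as a plain 0000-0000-0000-0000 string (the real field name and storage format in eprints.soton are not given in the message); `person_item` is a hypothetical helper, and only the format is checked, not the checksum.

```python
import re

# Four groups of four digits; the final character may be the check digit X.
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def person_item(name, orcid=None):
    """Build a schema.org Person dict, linking to ORCID when the iD looks valid."""
    person = {"@type": "Person", "name": name}
    if orcid and ORCID_PATTERN.fullmatch(orcid):
        # sameAs is one choice of property here, not the only option.
        person["sameAs"] = "https://orcid.org/" + orcid
    return person
```

A malformed or missing iD simply leaves the link out, which matches the `<epc:if>` style used for the optional `id` subproperty in the citation files.
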

