EPrints Technical Mailing List Archive


Message: #01710



[EP-tech] Re: Fwd: Are Closed Access Deposits Indexed by Google Scholar?


The order of the attributes should never matter as long as the content of each is correct, and they are. I have no idea why the order is swapped.
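
To illustrate why attribute order is harmless, here is a minimal Python sketch using only the standard library's html.parser. The helper names (MetaCollector, meta_pairs) and the sample title are made up for illustration. It does not describe how any particular crawler is implemented; it only shows that an HTML parser looks attributes up by name, so both orderings yield the same data.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag == "meta":
            attr_map = dict(attrs)  # lookup by attribute name, so order is irrelevant
            if "name" in attr_map:
                self.pairs.append((attr_map["name"], attr_map.get("content")))


def meta_pairs(html):
    """Return the list of (name, content) pairs found in the given HTML."""
    collector = MetaCollector()
    collector.feed(html)
    return collector.pairs


# Illustrative sample tags, not copied from the Archipel page.
name_first = '<meta name="eprints.title" content="An example title" />'
content_first = '<meta content="An example title" name="eprints.title" />'

print(meta_pairs(name_first) == meta_pairs(content_first))  # prints True
```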

 

If there are no other problems, you should see new records, at least, being indexed quite quickly, e.g., within weeks. In the past, however, it took Google Scholar months to re-index records.

 

Cheers

Mark

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Nault, Pierre
Sent: Wednesday, 13 March 2013 12:32 AM
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: Fwd: Are Closed Access Deposits Indexed by Google Scholar?

 

Hi Mark,

 

Default.xml now includes <epc:pin ref="head" />. The meta tags are now present in all pages generated by Archipel. However, the “name” and “content” attributes appear in reverse order:

 

http://www.archipel.uqam.ca/4992/

 

I don’t think robot indexers will be affected by this. Any idea what causes this behavior?

 

Thanks,

 

Pierre Nault

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Mark Gregson
Sent: 11 March 2013 19:21
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: Fwd: Are Closed Access Deposits Indexed by Google Scholar?

 

It’s notable that neither the DC nor the Simple metadata is appearing in the output HTML.

 

Check that <archivename>/cfg/lang/<lang>/templates/default.xml includes <epc:pin ref="head" /> within the head element of the html doc and add it if it’s not there. If you have multiple languages you may need to check the default template for each language.

 

Regards

Mark

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Nault, Pierre
Sent: Monday, 11 March 2013 11:50 PM
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: Fwd: Are Closed Access Deposits Indexed by Google Scholar?

 

Hi Mark,

 

Thanks for the reply. In eprint_render.pl I have:

 

my $links = $session->make_doc_fragment();
$links->appendChild( $session->plugin( "Export::Simple" )->dataobj_to_html_header( $eprint ) );
$links->appendChild( $session->plugin( "Export::DC" )->dataobj_to_html_header( $eprint ) );

 

I’m pretty sure that Export::DC is working, based on the responses we get over the OAI-PMH protocol. For Export::Simple I’m not sure: where should I look to see whether plugins are disabled? One other thing: the base code of our repository has been modified substantially. Where can I verify whether the names of the fields in the schema have been changed?

 

Thanks,

 

Pierre Nault

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Mark Gregson
Sent: 10 March 2013 19:50
To: eprints-tech@ecs.soton.ac.uk
Cc: Paul THIRION; Nguyen, Minh-Quang
Subject: [EP-tech] Re: Fwd: Are Closed Access Deposits Indexed by Google Scholar?

 

Hi Pierre

 

Have a look in eprints/<archivename>/cfg/cfg.d/eprint_render.pl in the anonymous function starting:
$c->{eprint_render} = sub {

 

If the EPrints native metadata is being rendered you should be able to find the following lines:

my $links = $session->make_doc_fragment();
$links->appendChild( $session->plugin('Export::Simple')->dataobj_to_html_header($eprint) );

 

If they’re not there, you should be able to add them, reload the configuration or restart httpd, and it will just work. There are caveats, e.g., if you have changed the names of fields in the schema or the plugin has been disabled.

 

Regards

Mark

 

Mark Gregson | Applications and Development Team Leader
Library eServices | Queensland University of Technology
Level 3 | R Block | Kelvin Grove Campus | GPO Box 2434 | Brisbane 4001
Phone: +61 7 3138 3782 | Web:
http://eprints.qut.edu.au/
ABN: 83 791 724 622
CRICOS No: 00213J

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Nault, Pierre
Sent: Saturday, 9 March 2013 1:32 AM
To: eprints-tech@ecs.soton.ac.uk
Cc: Paul THIRION; Nguyen, Minh-Quang
Subject: [EP-tech] Re: Fwd: Are Closed Access Deposits Indexed by Google Scholar?

 

Hi all,

 

We have removed the <meta name="robots" content="noindex,nofollow" /> from default.xml. We confirm that we are running EPrints version 3.1-2008-12-03-r3984. Anurag: you mention that all versions of EPrints from 3.0 onward can generate the machine-readable bibliographic metadata. Obviously this is not the case for us. If your assertion holds, something is probably missing in the configuration of our repository. I’m not too familiar with the configuration of EPrints, so I would very much appreciate any help with activating this feature.

 

Best regards,

 

Pierre Nault

 

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Stevan Harnad
Sent: 7 March 2013 13:24
To: eprints-tech@ecs.soton.ac.uk List
Cc: Paul THIRION; Nguyen, Minh-Quang
Subject: [EP-tech] Fwd: Are Closed Access Deposits Indexed by Google Scholar?

 

 

 

Begin forwarded message:

 

From: Anurag Acharya <acha@google.com>

Subject: Re: [EP-tech] Are Closed Access Deposits Indexed by Google Scholar?

Date: 5 March, 2013 10:30:35 PM EST

To: Stevan Harnad <harnad@ecs.soton.ac.uk>

 

Hi Marc: I took a quick look at the examples you mentioned. I noticed a couple of issues:

 

First things first, you are explicitly asking for these pages to not be indexed. For the two examples you mentioned:

<meta name="robots" content="noindex,nofollow" />

 

<meta name="robots" content="noindex,nofollow" />

 

A noindex robots metatag on an html page asks web search services to not index the page.
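
A quick way to check a page for this directive, sketched in Python with the standard library only. The class and function names are hypothetical, and the check covers only the robots meta tag in the HTML, not robots.txt or HTTP response headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Gather the directives listed in <meta name="robots"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr_map = {key: (value or "") for key, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def blocks_indexing(html):
    """Return True if the page carries a noindex robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Sample built around the tag quoted above; the surrounding markup is made up.
page = '<html><head><meta name="robots" content="noindex,nofollow" /></head><body></body></html>'
print(blocks_indexing(page))  # prints True
```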

 

Second, I don't know if this is an old version of eprints or a custom repository, but it looks like it doesn't include the machine-readable bibliographic metadata that eprints 3.0 and later embed using metatags. E.g.:

 

<meta name="eprints.creators_name" content="Ohka, Seii" />

<meta name="eprints.creators_name" content="Sakai, Mai" />

<meta name="eprints.creators_name" content="Bohnert, Stephanie" />

<meta name="eprints.creators_name" content="Igarashi, Hiroko" />

<meta name="eprints.creators_name" content="Deinhardt, Katrin" />

<meta name="eprints.creators_name" content="Schiavo, Giampietro" />

<meta name="eprints.creators_name" content="Nomoto, Akio" />

[...]
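
As a rough check of whether an abstract page carries such tags, the following Python sketch (standard library only) groups every meta tag whose name starts with "eprints." by field name. The helper names are hypothetical, and the sample reuses two of the tags quoted above inside made-up surrounding markup.

```python
from collections import defaultdict
from html.parser import HTMLParser


class EPrintsMetaParser(HTMLParser):
    """Group the content of eprints.* meta tags by field name (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name") or ""
        if name.startswith("eprints."):
            # A field such as creators_name can repeat, so keep every value.
            self.fields[name].append(attr_map.get("content") or "")


def eprints_fields(html):
    """Return a dict mapping each eprints.* field name to its list of values."""
    parser = EPrintsMetaParser()
    parser.feed(html)
    return dict(parser.fields)


sample = """
<head>
<meta name="eprints.creators_name" content="Ohka, Seii" />
<meta name="eprints.creators_name" content="Sakai, Mai" />
</head>
"""

fields = eprints_fields(sample)
print(fields["eprints.creators_name"])  # prints ['Ohka, Seii', 'Sakai, Mai']
```

An empty result for a page would suggest that the metadata is not being embedded at all.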

 

If you are using an older version of eprints, I would recommend upgrading to a version later than 3.0. If you are using a different repository software, I would recommend http://roar.eprints.org/help/google_scholar.html and http://scholar.google.com/intl/en/scholar/inclusion.html

 

cheers,

anurag

 

 

 

 

 

On Tue, Mar 5, 2013 at 5:17 AM, Stevan Harnad <harnad@ecs.soton.ac.uk> wrote:

On 2013-03-05, at 5:12 AM, Tim Brody <tdb2@ecs.soton.ac.uk> wrote:

 

On Mon, 4 Mar 2013 15:23:06 -0500, Stevan Harnad <harnad@ecs.soton.ac.uk>
wrote:

I have been told that closed access deposits for
http://www.archipel.uqam.ca are not being indexed by Google Scholar: Is
there any way around this?

(I mean the metadata, of course, not the full-text, which I know is
unharvestable till access is re-set as OA).


There's no reason that the metadata pages shouldn't be indexed, but I don't
think (?) Google Scholar will list metadata-only records from repositories.

A specific example would be useful.

 

It's bad news (for the Button) if GS does not index the metadata of Closed Access deposits. (GS certainly indexes plenty of papers that do not have a free full-text version on the web).

 

Could this (if it's true) be fixed by optimizing the way an EPrints IR presents itself to google and GS (levels of embedding or something like that)? I seem to remember Les saying that the depth of documents was important.

 

A DSpace IR, Orbi, has 50% Closed Access contents (for example, here). 

These are all picked up by Google, for example this one: "Tubulin isoforms identified in the brain by MALDI in-source decay"

but they appear very late in the Google hit list (especially for much-sited or multi-cited papers)

and the Orbi version does not seem to be picked up by GS at all.

 

This is extremely important, because it affects the efficacy of the Button, and thereby the power of an immediate-deposit mandate (and the incentive to adopt one).

 

Is there any way to address this problem directly in EPrints (plus advice for our cousins in DSpace)?

 

Many thanks,

 

Stevan

 

 

 

From: Couture Marc <marc.couture@teluq.ca>

Subject: RE: [EP-tech] Are Closed Access Deposits Indexed by Google Scholar?

Date: 4 March, 2013 6:17:13 PM EST

To: Stevan Harnad <harnad@ecs.soton.ac.uk>, Leslie Carr <lac@ecs.soton.ac.uk>

 

Hi,

My belief that Google / Scholar doesn't index closed access documents (more precisely, the HTML page with the metadata) is based upon a simple check with two closed access documents in Archipel:

1. http://www.archipel.uqam.ca/4252 

This is the manuscript of a published article (Title: So into it they forget what time it is?)

If I put the title (between quotes) in Google or Google Scholar, all I see is the published (toll access) version:

http://www.igi-global.com/chapter/into-they-forget-time/67430 

2. http://www.archipel.uqam.ca/4254 

The title is: Discretionary power of project managers in knowledge intensive firms and gender issues

Again, Google Scholar finds only the published version (Google doesn't even find it):

http://onlinelibrary.wiley.com/doi/10.1002/cjas.147/abstract 

On the same results page, one sees another paper, available in open access in Archipel, citing this one.

Both manuscripts have been in Archipel for more than one year (deposit date: Nov 2011).


Marc Couture