EPrints Technical Mailing List Archive
Message: #08212
Re: [EP-tech] DSpace Harvester and OAI_Bibliography.pm
- To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
- Subject: Re: [EP-tech] DSpace Harvester and OAI_Bibliography.pm
- From: Yuri <yurj@alfa.it>
- Date: Fri, 19 Jun 2020 07:51:18 +0200
You're right. OAI plugins work this way: for every archive record, each plugin is asked to produce its metadata format. OAI_Bibliography, however, only works for items that have a bibliography, not for all records. So it should not be used as a generic OAI plugin, which is expected to return valid metadata for every item; it has valid metadata only for items with a bibliography.

Since it is a plugin, you can disable it in the config:

$c->{plugins}{"Export::OAI_Bibliography"}{params}{disable} = 1;

I think this should be the default setting, and it may be worth a pull request on the git repository here: https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/plugins.pl

On 18/06/20 20:50, Tomasz Neugebauer wrote:
Hi Yuri, thank you for the detailed info. Yes, it looks like an issue with the DSpace harvester. The issue did make me think about our oai_bibl metadata prefix, though: is the OAI_Bibliography.pm file doing something useful if it advertises a metadata prefix in the OAI endpoint that returns empty records? If anyone has any comments on this, that's great, but the harvester question is resolved AFAIK: it should be requesting a specific prefix, oai_dc.

Tomasz

-----Original Message-----
From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> On Behalf Of Yuri via Eprints-tech
Sent: June 18, 2020 4:51 AM
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] DSpace Harvester and OAI_Bibliography.pm

I would exclude this format/plugin from oai2 in: https://github.com/eprints/eprints/blob/3.3/cgi/oai2#L559

or you can change the sort here (weak): https://github.com/eprints/eprints/blob/3.3/cgi/oai2#L565

I think this is an issue in DSpace: it should always use oai_dc as the default format (instead of checking the schema; the OAI specs cite oai_dc).
On 17/06/20 20:22, Tomasz Neugebauer via Eprints-tech wrote:

Hi everyone... in attempting to harvest some EPrints repositories using the DSpace harvester, the following issue was reported in 2016: http://dspace.2283337.n4.nabble.com/Harvesting-EPrints-repository-from-DSpace-td4681086.html

"What happens in this case is that EPrints has more than one entry for the supported metadata formats using OAI_DC (oai_bibl and oai_dc prefixes):

.
<metadataFormat>
<metadataPrefix>oai_bibl</metadataPrefix>
<schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
<metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
</metadataFormat>
<metadataFormat>
<metadataPrefix>oai_dc</metadataPrefix>
<schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
<metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
</metadataFormat>
.

DSpace's harvester is then selecting the first metadataPrefix, i.e. oai_bibl, for which EPrints is returning records with no metadata."

Someone is having a similar issue now with EPrints repositories, so I'm wondering: is this still an issue, or was a fix/modification added to EPrints for this? I haven't tried the solution of removing OAI_Bibliography.pm from the core files.

Tomasz

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
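
For illustration only, the selection problem described in the thread can be sketched in a few lines of Python. This is not DSpace's or EPrints's actual code: the function names (parse_formats, pick_by_namespace, pick_by_prefix) and the SAMPLE_RESPONSE wrapper are hypothetical, and the sample simply wraps the two metadataFormat entries quoted in the original report in a minimal OAI-PMH envelope. It shows why picking the first format that matches the oai_dc namespace lands on oai_bibl, while asking for the oai_dc prefix by name does not.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 protocol envelope.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
# Namespace both formats in the quoted report declare.
DC_NAMESPACE = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Hypothetical, trimmed ListMetadataFormats response built from the
# two entries quoted in the thread (oai_bibl listed first).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_bibl</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
"""


def parse_formats(xml_text):
    """Return (prefix, namespace) pairs in the order the server lists them."""
    root = ET.fromstring(xml_text)
    pairs = []
    for fmt in root.iterfind(".//oai:metadataFormat", OAI_NS):
        prefix = fmt.findtext("oai:metadataPrefix", default="", namespaces=OAI_NS)
        namespace = fmt.findtext("oai:metadataNamespace", default="", namespaces=OAI_NS)
        pairs.append((prefix.strip(), namespace.strip()))
    return pairs


def pick_by_namespace(pairs, namespace):
    """First prefix whose namespace matches: the behaviour the report describes."""
    for prefix, ns in pairs:
        if ns == namespace:
            return prefix
    return None


def pick_by_prefix(pairs, preferred="oai_dc"):
    """Ask for the prefix by name; return None if the server does not offer it."""
    for prefix, _ in pairs:
        if prefix == preferred:
            return prefix
    return None


if __name__ == "__main__":
    formats = parse_formats(SAMPLE_RESPONSE)
    print("by namespace:", pick_by_namespace(formats, DC_NAMESPACE))
    print("by prefix:   ", pick_by_prefix(formats))
```

Under these assumptions the namespace-based choice depends on the order in which the repository happens to list its formats, which is why both suggestions in the thread (disable the plugin on the EPrints side, or request oai_dc explicitly on the harvester side) avoid the empty records.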