EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #10114



Re: [EP-tech] Thousands of dataobj.xml files


Hi,

dataobj.xml is the placeholder name for history revision files, which are stored in the revisions subdirectory of the individual EPrints record's document directory. There they appear as 1.xml, 2.xml, etc. rather than dataobj.xml, where the number is the revision number of the history record for that EPrints record.

Each history revision file is a point-in-time snapshot of the metadata of that EPrints record.
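
If you want to check this on disk, a minimal Python sketch (not part of EPrints) for counting such revision files could look like the one below. It assumes only the layout described above: files named 1.xml, 2.xml, etc. sitting directly inside a directory called revisions. The function name and the root path you pass in are hypothetical; point it at wherever your archive stores its documents.

```python
import re
from pathlib import Path

# Revision snapshots are named by revision number: 1.xml, 2.xml, ...
REVISION_NAME = re.compile(r"^\d+\.xml$")


def count_revision_files(root):
    """Count N.xml files located directly inside any 'revisions' directory under root.

    'root' is an assumption: replace it with your archive's document storage path.
    """
    count = 0
    for path in Path(root).rglob("*.xml"):
        # Only count numbered files whose immediate parent is a 'revisions' directory.
        if path.parent.name == "revisions" and REVISION_NAME.match(path.name):
            count += 1
    return count
```

Calling count_revision_files() on the document storage directory should give a figure you can compare against the number of dataobj.xml entries shown under "Manage records->Files". Whether the two match exactly is not something this sketch can guarantee.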

Regards

David Newman


From: eprints-tech-request@ecs.soton.ac.uk <eprints-tech-request@ecs.soton.ac.uk> on behalf of kralizeck@gmail.com <kralizeck@gmail.com>
Sent: Thursday, May 8, 2025 7:05:09 PM
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Thousands of dataobj.xml files
 
CAUTION: This e-mail originated outside the University of Southampton.
Hi.

I have EPrints 3.4.6 on the latest AlmaLinux and Apache. I upgraded from EPrints 3.3.12 on a very old Ubuntu.

I get 77414 files when I go to "Manage records->Files" and filter by name "dataobj.xml" (out of a total of 119968 files without filters).

Modification dates range from 2010 (the first EPrints installation, done by others) until now (I took over a few weeks ago to upgrade from 3.3.12 to 3.4.6).

I've searched for information, but haven't found anything.

All the .xml files have the same content when I export them as Atom (URL edited):
<?xml version="1.0" encoding="utf-8" ?>
<entry>
  <id>https://mysite-url/id/file/136799</id>
  <title>dataobj.xml</title>
  <link rel="alternate"/>
</entry>

There is no dataobj.xml in the filesystem, so I assume they are in the database.

I would appreciate any help or recommendations for investigating this issue, and answers to my questions:
  • Why and how are those .xml files generated?
  • Do they serve any purpose?
  • Can I stop their generation?
  • Can I delete them? Is there a batch method to delete them?

Thanks and best regards.
Fernando Hdez.