EPrints Technical Mailing List Archive


Message: #09858



[EP-tech] Encoding for database tables - utf8mb4



Hi,
Currently my EPrints database still uses utf8, rather than utf8mb4.

There are a couple of fields (title, abstract) that I would like to convert to utf8mb4 now, before converting the whole database at some point in the future.

Has anyone done anything similar, changing only a limited subset of columns, or should I just try to do the whole database at once?

These instructions seem to cover what I need to do for individual columns:

https://wiki.eprints.org/w/Unicode#Managing_32-Bit_Unicode_Characters

and this is a useful guide: https://www.eprints.org/eptech/msg07198.html

If the fields are indexed, these changes are probably also needed: https://www.eprints.org/eptech/msg09275.html.
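
As a rough illustration of the per-column approach (a sketch only, not taken from the linked pages), the statements could be generated rather than typed by hand. The table name `eprint`, the column type `TEXT`, the nullability and the collation `utf8mb4_unicode_ci` are all assumptions here; the real definitions should be checked with SHOW CREATE TABLE first, because MODIFY replaces the whole column definition.

```python
import re

# Only plain identifiers are accepted for names, so nothing unexpected
# ends up inside the generated SQL.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def alter_column_sql(table, column, column_type, nullable=True,
                     charset="utf8mb4", collation="utf8mb4_unicode_ci"):
    """Build an ALTER TABLE ... MODIFY statement for one column.

    MODIFY replaces the whole column definition, so the existing type and
    nullability have to be restated alongside the new character set.
    """
    for name in (table, column, charset, collation):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    null_sql = "NULL" if nullable else "NOT NULL"
    return (
        f"ALTER TABLE `{table}` MODIFY `{column}` {column_type} "
        f"CHARACTER SET {charset} COLLATE {collation} {null_sql};"
    )


# Hypothetical table, columns and types: check SHOW CREATE TABLE first.
COLUMNS = [
    ("eprint", "title", "TEXT"),
    ("eprint", "abstract", "TEXT"),
]

if __name__ == "__main__":
    for table, column, column_type in COLUMNS:
        print(alter_column_sql(table, column, column_type))
```

The script only prints the statements, so they can be reviewed and tried against a copy of the database before anything is run on the live one.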

 

The reasons I want to do it this way are:

  • I have some external systems that want to push utf8mb4 data into those fields
  • If I try to update the whole DB at the moment, I hit index length issues, so I need more time to investigate, reconfigure and resolve these
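
For context on the index length issue, assuming it is the InnoDB index key prefix limit: that limit is 767 bytes with the COMPACT or REDUNDANT row formats and 3072 bytes with DYNAMIC or COMPRESSED. A utf8 character needs at most 3 bytes and a utf8mb4 character at most 4, so an indexed column that fits under utf8 can overflow once converted. Which limit applies to this server is an assumption; the arithmetic is roughly:

```python
# Maximum bytes per character for each MySQL character set.
BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def max_indexable_chars(charset, key_limit_bytes=767):
    """Longest fully indexed column length, in characters, under the limit."""
    return key_limit_bytes // BYTES_PER_CHAR[charset]


def fits_in_index(length, charset, key_limit_bytes=767):
    """True if a column of `length` characters fits within the key limit."""
    return length * BYTES_PER_CHAR[charset] <= key_limit_bytes


if __name__ == "__main__":
    for charset in ("utf8", "utf8mb4"):
        print(charset, max_indexable_chars(charset))
    print(fits_in_index(255, "utf8"))     # 255 * 3 = 765 bytes
    print(fits_in_index(255, "utf8mb4"))  # 255 * 4 = 1020 bytes
```

Under the 767-byte limit, an indexed VARCHAR(255) fits with utf8 but not with utf8mb4, where the ceiling drops to 191 characters.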

 

Any details from anyone who has updated part or all of their database to utf8mb4 would be welcome!

Cheers,

John

John Salter
https://orcid.org/0000-0002-8611-8266

White Rose Libraries Technical Officer
IT - Application Support (Research)
University of Leeds