EPrints Technical Mailing List Archive
Message: #09863
Re: [EP-tech] Encoding for database tables - utf8mb4
- To: <eprints-tech@ecs.soton.ac.uk>
- Subject: Re: [EP-tech] Encoding for database tables - utf8mb4
- From: Matthew Kerwin <matthew@kerwin.net.au>
- Date: Sun, 27 Oct 2024 22:17:34 +1000
Oh, haha, I realised the message you linked to was from an ancient version of me. Well, there you go. It's still working, fwiw.

On Sun, 27 Oct 2024 at 22:14, Matthew Kerwin <matthew@kerwin.net.au> wrote:
>
> Hi John,
>
> On Fri, 25 Oct 2024 at 22:37, John Salter <J.Salter@leeds.ac.uk> wrote:
> >
> > Hi,
> > Currently my EPrints database still uses utf8, rather than utf8mb4.
> >
> > There are a couple of fields (title, abstract) that I would like to update to utf8mb4 – before (at some point in the future) updating the whole database to utf8mb4.
> > Has anyone done anything similar – changing a limited subset of columns - or should I just try and do the whole database at once?
>
> I definitely updated ours at some point, many years ago. I think I did
> the whole database, but if I recall correctly it required a lot of
> typing in a terminal because I had to change each column of each
> table, and collations, and various other things. I also changed the
> code somewhere around EPrints::Database (and possibly ::mysql) to
> ensure it connected with the right encoding/charset as well. I can
> probably dig out some more details once I'm back at work tomorrow, if
> you need it.
>
> > These instructions seem to cover what I need to do for individual columns:
> > https://wiki.eprints.org/w/Unicode#Managing_32-Bit_Unicode_Characters
> > and this is a useful guide: https://www.eprints.org/eptech/msg07198.html
> >
> > If the fields are indexed, these changes are probably also needed: https://www.eprints.org/eptech/msg09275.html.
> >
> > The reason I want to do it this way is:
> > - I have some external systems that want to push utf8mb4 data into those fields
> > - If I try to update the whole DB at the moment, I hit the index length issues, so need more time to investigate/reconfigure/resolve these
> >
> > Any details from anyone who's updated part, or all, of their DB provision to utf8mb4 welcomed!
> >
> > Cheers,
> > John
> >
>
> Cheers
> --
> Matthew Kerwin [he/him]
> https://matthew.kerwin.net.au/

--
Matthew Kerwin [he/him]
https://matthew.kerwin.net.au/
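
The per-column conversion and the index length issue discussed in this message can be sketched as follows. This is a minimal illustration, not the procedure either poster used: the table name eprint, the LONGTEXT column type, the utf8mb4_unicode_ci collation, and the helper names max_index_chars and alter_column_sql are all assumptions made for the example. The 767-byte figure is InnoDB's index key prefix limit for the COMPACT and REDUNDANT row formats; the DYNAMIC and COMPRESSED row formats can allow up to 3072 bytes.

```python
# Sketch only: table name, column types and collation are assumptions,
# not values taken from an actual EPrints schema.

# Maximum bytes MySQL reserves per character for each character set.
BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def max_index_chars(charset: str, key_limit_bytes: int = 767) -> int:
    """Return the longest character prefix that fits in one index key."""
    return key_limit_bytes // BYTES_PER_CHAR[charset]


def alter_column_sql(table: str, column: str, column_type: str,
                     collation: str = "utf8mb4_unicode_ci") -> str:
    """Build an ALTER TABLE ... MODIFY statement converting one column."""
    return (
        f"ALTER TABLE `{table}` MODIFY `{column}` {column_type} "
        f"CHARACTER SET utf8mb4 COLLATE {collation};"
    )


# Hypothetical subset of columns to convert ahead of the whole database.
COLUMNS = [
    ("eprint", "title", "LONGTEXT"),
    ("eprint", "abstract", "LONGTEXT"),
]

if __name__ == "__main__":
    # Why indexed columns break: the same byte limit holds fewer characters.
    for charset in ("utf8", "utf8mb4"):
        print(f"{charset}: {max_index_chars(charset)} indexable characters")

    # Statements to review by hand before running against a database.
    for table, column, column_type in COLUMNS:
        print(alter_column_sql(table, column, column_type))
```

Because MODIFY restates the whole column definition, the real type and any NULL or DEFAULT clauses should be copied from SHOW CREATE TABLE output rather than assumed, and the statements tried on a copy of the database first.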
- Follow-Ups:
- Re: [EP-tech] Encoding for database tables - utf8mb4
- From: David R Newman <drn@ecs.soton.ac.uk>
- Re: [EP-tech] Encoding for database tables - utf8mb4
- References:
- [EP-tech] Encoding for database tables - utf8mb4
- From: John Salter <J.Salter@leeds.ac.uk>
- Re: [EP-tech] Encoding for database tables - utf8mb4
- From: Matthew Kerwin <matthew@kerwin.net.au>
- [EP-tech] Encoding for database tables - utf8mb4