Note
First of all, this question is mostly technical and only tangentially related to manga. However, this kind of image annotation is used mainly by boorus, whose users are mostly manga and anime fans, so I thought it best to ask here. If this is not the right place for it, I apologize. If you happen to know a better place to post this, would you mind telling me?
First Question
How do boorus annotate images so that textual data is associated with a particular region? You can see an example here. For example, if you highlight わたしのコードネームはハーモニー, a pop-up appears that provides the translation: "My codename is Harmonie." It's basically the image equivalent of "softsubs" (subtitles that are not rasterized into the video stream, but are instead loaded dynamically at runtime).
Technical note: It's not important that the text appears when a region is highlighted. The only important part is that the text is associated with a particular region of the image. The user agent decides what to do with this information. (If you don't understand, please don't worry about this part.)
Note: Wikimedia Commons does this as well. You can see this here. Please ensure JavaScript is enabled. Otherwise, the annotations won't appear.
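To make "text associated with a region" concrete, here is a minimal sketch of the kind of data I have in mind. It is only an assumption for illustration, not a description of how any particular booru or Wikimedia Commons actually implements it: each note is a rectangle in pixel coordinates plus a string, kept separately from the image. The names `Note` and `note_at` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Note:
    """A rectangular region of an image (in pixels) with attached text."""
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int
    text: str

    def contains(self, px: int, py: int) -> bool:
        # Left/top edges are inclusive, right/bottom edges are exclusive.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def note_at(notes: List[Note], px: int, py: int) -> Optional[Note]:
    """Return the first note whose region contains the point, if any."""
    for note in notes:
        if note.contains(px, py):
            return note
    return None
```

A viewer could call `note_at` with the pointer position and show the text in a pop-up, but as said above, what the user agent does with the data is up to it.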
Second Question (Very Important)
This is what I want to know the most: Is there an image file format that natively stores regional annotations the way a booru does? If you try to download the image in the link above, the annotations seem to disappear, so I think they are stored separately from the file. I want an image format that can store regional annotations within the file itself, so that the text and the image are never separated.
I thought that PNG files could store annotations natively, but I'm not sure. Maybe JPEG XL can do it? I don't know. Either way, a format like this ought to exist. It would be extremely convenient for manga translation (or any translation, really).
You can read further discussion here: