Output of file(1) for UTF-8 encoded files
I'm trying to figure out if the following happens just to me or for everyone. If you have a file, let's call it "myfile", that you know is encoded in UTF-8, what is your output for:
$ file myfile
?
I have a freshly installed OpenBSD 7.3 system, where I believe I've set the locale to en_US.UTF-8. If I manually create a file containing characters like "åäö", file reports "ISO-8859 text". If I copy and paste some Unicode characters from the web, file reports "Non-ISO extended-ASCII text". If I transfer these same files to a Linux computer and run file there, it reports "UTF-8".
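To rule out the editor or the clipboard as the source of the difference, one option is to write the bytes explicitly and look at them before asking file. The following is only a sketch: the file names sample_utf8.txt and sample_latin1.txt are made up for illustration, and it assumes a POSIX printf and od.

```shell
# Hypothetical file names. The bytes are spelled out as octal escapes so the
# result does not depend on the terminal, editor, or locale settings.

# "åäö" encoded as UTF-8: c3 a5  c3 a4  c3 b6, followed by a newline.
printf '\303\245\303\244\303\266\n' > sample_utf8.txt

# "åäö" encoded as ISO-8859-1: e5 e4 f6, followed by a newline.
printf '\345\344\366\n' > sample_latin1.txt

# Dump the raw bytes of each file as hexadecimal, without the offset column.
od -An -tx1 sample_utf8.txt
od -An -tx1 sample_latin1.txt
```

Running file sample_utf8.txt sample_latin1.txt on both the OpenBSD and the Linux machine would then show how each one classifies bytes that are known in advance. Dumping myfile the same way with od would show which of the two encodings the editor actually wrote.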
Possibly relevant information:
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE=en_US.UTF-8
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_MESSAGES="C"
LC_ALL=
$ env | grep "UTF"
LC_CTYPE=en_US.UTF-8
XTERM_LOCALE=en_US.UTF-8