Try this in your favourite File Manager

nevj · February 25, 2022, 4:44am

There seem to be some issues with filetype identification and choosing the right application to display a file in the Dolphin File Manager
Try this:
Find or make a small text file, eg
vi txt.txt
copy it as follows
cp txt.txt txt.jpg
cp txt.txt txt.pdf
cp txt.txt txt.orig
View these with Dolphin

Dolphin seems to think txt.pdf is a pdf file
and txt.jpg is an image file
So let’s try to display them
It gets txt.txt and txt.orig correctly displayed with KWrite
it tries to display txt.jpg as an image file with Gwenview and fails

it tries to display txt.pdf as a pdf file with Okular and succeeds because okular can display text files.

So what is going on here? We can use Properties to see what Dolphin is doing, for example with txt.pdf.

So Dolphin knows the file’s Content is a plain text document, but it sets the Type to PDF document.
Same for txt.jpg. It gets the other two correct.

So where is Dolphin getting its information from?
Go to command line and do

$ file txt*
txt.jpg:  ASCII text
txt.orig: ASCII text
txt.pdf:  ASCII text
txt.txt:  ASCII text

So the Linux file command gets it right.
Now do

nevj@trinity:~
$ xdg-mime query filetype txt.txt
text/plain
nevj@trinity:~
$ xdg-mime query filetype txt.pdf
application/pdf
nevj@trinity:~
$ xdg-mime query filetype txt.jpg
image/jpeg
nevj@trinity:~
$ xdg-mime query filetype txt.orig
text/plain

But xdg-mime gets it wrong: for txt.pdf and txt.jpg it goes by the extension.

So now we know. What Dolphin shows as Content agrees with file, and what it shows as Type agrees with xdg-mime.
Dolphin then decides on a display application using the xdg-mime result, and ignores the Content.
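
To make the two approaches concrete, here is a rough shell sketch. The function names type_by_extension and type_by_content are made up for illustration, and the content check only knows the PDF signature "%PDF-"; this is an assumption-laden toy, not how Dolphin, file, or xdg-mime are actually implemented.

```shell
#!/bin/sh
# Hypothetical helpers: a sketch only, not the real logic of any tool.

# Guess a MIME type from the file name alone (extension-based).
type_by_extension() {
    case "$1" in
        *.pdf)          echo "application/pdf" ;;
        *.jpg | *.jpeg) echo "image/jpeg" ;;
        *)              echo "text/plain" ;;   # simplification: fall back to plain text
    esac
}

# Guess a MIME type from the first line of the file (content-based).
# Only the PDF signature "%PDF-" is checked; everything else is called text.
type_by_content() {
    IFS= read -r first_line < "$1" || true
    case "$first_line" in
        %PDF-*) echo "application/pdf" ;;
        *)      echo "text/plain" ;;
    esac
}

# Recreate the experiment in a scratch directory.
demo_dir=$(mktemp -d)
printf 'just some text\n' > "$demo_dir/txt.txt"
cp "$demo_dir/txt.txt" "$demo_dir/txt.pdf"

for f in "$demo_dir"/txt.*; do
    printf '%s: by extension %s, by content %s\n' \
        "${f##*/}" "$(type_by_extension "$f")" "$(type_by_content "$f")"
done
```

For txt.pdf the two functions disagree, which is the same split the Properties dialog shows between Type and Content.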

Note this is not peculiar to Dolphin. I found the same in Thunar and Files (GNOME).

Now some questions:

Is this the way you would like your File Manager to behave? My preference would be to have it work on the file Content, or at least offer a choice in its configuration.
Can we find out how general this is? What does your File Manager do?
Can anyone come up with a File Manager that displays a file according to Content?

Final note:
There is a good article on file types here
https://www.baeldung.com/linux/file-mime-types

and a wiki on using MIME here

and also consult the Unix man pages for file and xdg-mime

Akito · February 25, 2022, 5:24am

Not an issue. Just a question of perspective.

Of course. If I, the user, name it *.pdf I want that to be respected, because the user should have the last word.

Choice is always best. But if there is none, I prefer having the extension be the primary criterion in determining file types.
Actually, it always annoyed me how smart Linux tried to be and in the end shot itself in the foot, so many times. I prefer the Windows style of just respecting file extensions. It’s easier for the system and easier for the human to control behaviour.

How many times was my shell script labeled as “plain text”, when it was a shell file. Oh, because the operating system wanted to be so smart and determine the file by its content. Well, if I choose something to be perceived as a pdf file by the system, I want that to be respected.

There are so many use-cases for that. For example, some programs only accept certain file types as input. Let’s say you just want to test if reading a file, which should be a PDF, works, you could quickly rename a text file to make it seem like a PDF and voilà, the test works. Of course, the file won’t be parsed as a PDF, but if I wanted to do this short test, I had the choice to.

Back when I used Linux GUI more frequently, I experienced the problems described above. My file extension choices were not respected and I had to bow to whatever the OS thinks the file should be. Most of the time it bit itself in the ass with the shell scripts, as described above.

To my knowledge this is usually the default behaviour, as this is the default way of how Linux does it. Perhaps, some file managers changed the default. However, I assume there should be a way to re-configure that option in Dolphin, as it’s a file manager with numerous options available.

MIME should be a thing for software, not for humans. If a file manager is software which primarily is translating what is saved on your storage for the human brain, then the logic behind the way information is displayed, should be as human as possible.
In my opinion, it’s more human to just respect the file extensions. They are ultra transparent and everyone understands that. However, file content is purely visible from the computer’s perspective. It’s hard for a human to determine the actual content, without using tools like file or whatever.
File extensions are the easiest and most human way to handle the file storage information shown to the user.

nevj · February 25, 2022, 9:44am

Except when you download files from the internet, and they come with the wrong extension.
I am sure we have all experienced this with Android phones or tablets. If the extension is wrong, it will not display the file at all.

And on Dolphin
I looked, there is no option to alter whether it uses Content or Extension to determine how to display a file. It seems to know both, but use the Extension-determined filetype.

As you say, the traditional Unix way is Content, using the file command. It is rarely wrong.

nevj · February 25, 2022, 10:18am

Here is a really simple Julia file
[nevj@mary Julia]$ cat add.jl
function add(a,b)
    x = a + b
    return x
end
Let’s see what Thunar does. It says it is a Matlab file,
so which command is it using?

[nevj@mary Julia]$ file add.jl
add.jl: ASCII text
[nevj@mary Julia]$ xdg-mime query filetype add.jl
text/x-matlab

So this time, neither command is respecting the extension, and Thunar goes with the xdg-mime result.

All very confusing.

Akito · February 25, 2022, 2:11pm

Android has diverged a long way from Linux, so it’s not exactly a Linux-to-Linux comparison, unless you are actually using Termux on Android.

It’s also possible to change the file extension in a file manager.

That said, I actually had more issues with Android trying to be as smart as Linux and shooting itself in the foot with it. I had trouble testing JSON file content, because it always tried to detect by content, instead of by file extension. Very annoying and cumbersome.

That’s actually a very great example for how determination by file extension is superior. To have the file content matching work, it needs to detect content. That’s especially hard with short scripts, which are not formatted the official way and perhaps lack information. For example, the Hash-Bang of a bash script might be missing and it still might be a valid bash script. That’s another issue file sucks at…

So, why is this a great example for how determination by file extension is superior, you ask?
Because, MATLAB is a 1000 times closer to reality than the stupid “ASCII text” provided by file. file is so stupid, it cannot even point me in the right direction. If I have a folder filled with different lesser known shell script types, file would only show plain text files… Very helpful!
However, when I have a folder filled with text files, except a single file is a Julia file, then I can at least distinguish the Julia file from the text files, as it would have a MATLAB icon attached to it, in case the matching was done by file extension.

So, I prefer a MATLAB detection over a stupid and absolutely unhelpful “ASCII text” detection, which is simply just wrong. Absolutely wrong. It doesn’t even get detected as a shell script.

That said, there are ambiguities with file extensions. Short extensions especially share several meanings. So, this might also be the reason why the Julia file is wrongly matched. However, it’s still better than relying on file, which needs enough properly formatted file content to detect anything remotely useful.

Please, test this script file and you will see how determination by file extension is superior:

l="$(base64 -w0 <<<"hello")"
echo $l

This is a valid bash script. Nothing wrong with it. It just misses a Hash-Bang, but that’s not required to make it run like bash test. So, the stupid file tells me it is ASCII text. That’s wrong!
If I append a .sh to the file name and the matching is done by file extension, you get the correct result! Okay, it might detect sh instead of bash, but if you want to be explicit you can still append a .bash to the file name, instead. Then it should be able to detect it just fine, while file is still dreaming around in the nothing-to-know world…
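
A small sketch of why content-based detection has so little to go on here. The helper has_shebang is a made-up name, and checking only for a leading "#!" is a deliberate simplification of what real content sniffers do.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file's first line starts with "#!".
has_shebang() {
    IFS= read -r first_line < "$1" || true
    case "$first_line" in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}

script_dir=$(mktemp -d)

# The two-line script from the post, with no shebang.
cat > "$script_dir/test" <<'EOF'
l="$(base64 -w0 <<<"hello")"
echo $l
EOF

# The same script with a shebang line added in front.
{ echo '#!/bin/bash'; cat "$script_dir/test"; } > "$script_dir/test-with-shebang"

for f in "$script_dir"/test*; do
    if has_shebang "$f"; then
        echo "${f##*/}: content names an interpreter"
    else
        echo "${f##*/}: nothing in the content says which shell to use"
    fi
done
```

Without the first line, the only remaining hint about the file type is whatever the name carries, such as a .sh or .bash suffix.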

nevj · February 25, 2022, 11:39pm

Sometimes being stupid is actually an advantage
All the file manager needs to know to display this file is that it is ASCII

If a user wants a very sophisticated choice of what application to open a file with, it is up to the user to configure xdg-mime or use the menus in Dolphin to configure. In that case user controlled use of extensions would be an advantage or even essential. The price of sophistication is eternal configuration and being open to sophisticated mistakes.

But for the average user who just wants to see the file contents automatically, for any file with or without extension, I think the simple file command approach is superior. MIME may improve, but it is too tricky at the moment.

Akito · February 26, 2022, 12:32am

That’s not the point, however. The point is, that a file manager like Dolphin is supposed to visualise what files appear in front of the user’s eyes. If all files get mistakenly detected as “ASCII text” files, then half the purpose of the file manager is gone.

You assume that’s what the average user wants. I do not share that assumption. I assume people want stuff to work, first and foremost. Having a shell script opened in a text editor, because it has been mistakenly identified as a text file, makes the situation worse.

One of the most common issues that appears after users switch from Windows to Linux is precisely that. People want to run “something” and they don’t know it’s a script. Then they double click it and a text editor opens and they don’t know what they are supposed to do with that.

However, if the file manager would respect the .sh extension, it would at least attempt to run the damn file. But just opening the shell script in a text editor, does not help anyone really.
It does not help anyone, because beginners wouldn’t know what to do and cannot script anyway and advanced users, like e.g. myself, would not start editing any script or program in a simple text editor, except for editing a seriously tiny detail. However, whenever I edit anything beyond human text, I always use VS Code and others probably use their favourite code editor. Starting with Emacs, over Visual Studio Code, to giants like IntelliJ IDEA. However, I don’t know anyone who is happily editing major script parts in a simple text editor, without code highlighting and any of the convenience features an IDE or code editor has to offer.

So, to apply this thought to your previous example: I prefer having my Julia file opened in the MATLAB IDE over being opened in a simple text editor. Ideally, it would actually run that script, without needing to chmod +x. Another major culprit in the Linux world… But that’s a topic for another discussion.

jimofadel · February 26, 2022, 1:19am

I’m somewhere in the middle between “beginner” and “expert”, I guess it could be called the “dangerous zone”. I see a lot of files on my system that I have no idea what they do. One thing I would NOT want is a file that ends in .txt but if you double-click on it, or OPEN it with the default application, it runs some script that does something to the system. If it says “.txt” or .doc or .html I wouldn’t hesitate to double click on it! So I depend on the system not to let me destroy it by opening files that have innocuous extensions. If it’s an unknown file with no extension, I would not try to open it. I sometimes open unrecognized file extensions with Bluefish or Text Editor just to see “what’s in there” and if it is all gibberish I know it’s not a file I can read or edit and I close it right away.
Does the Open Source Initiative have a “Standards Committee” on file naming?

nevj · February 26, 2022, 12:22pm

Hi Jim,
There is the MIME standard, but that is an internet creation, not Open Source.

The security issue never occurred to me. I am wary in email, and when downloading files. I think files that originate in your distro are fairly safe, as long as you only download packages from the official repository.

You can always look at a non-text file with od -a, and it will quite often show what it is, because any bytes that are ASCII display as characters while control bytes show up as short names (od -c prints them as escapes or octal instead).
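
A quick way to try the od suggestion on a throwaway file. The ten sample bytes below are invented for the demonstration and do not form a valid file of any real format.

```shell
#!/bin/sh
# A made-up sample: two letters, two control bytes, then some text.
sample=$(mktemp)
printf 'PK\003\004hello\n' > "$sample"

# -a prints named characters: control bytes appear as names such as etx and eot.
od -a "$sample"

# -c prints printable characters as themselves and other bytes as escapes or octal.
od -c "$sample"
```

The readable letters stand out in both views, which is often enough to recognise what kind of file you are looking at.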

Sometimes the man system will tell you what a file is for, especially if it is in /etc

Cheers
Neville

nevj · February 26, 2022, 12:36pm

I thought a File Manager was for managing files, and an Application Launcher was for running files.

I would never, ever, want my File Manager to run a file when I click on it.
That is dangerous.

Another case like systemd. Dolphin is trying to be everything to everybody instead of concentrating on doing a good single job of managing files.

Managing files should not extend beyond shuffling them into folders and viewing them.

And see the comment from @jimofadel. Executing things you don’t know about is a security issue and a system hazard.

Cheers
Neville

Did not think I was launching a security debate

Akito · February 26, 2022, 1:39pm

I mean, that’s what the Linux people say. But all UX and design experience people gained in their life, especially when they are average users, scream their lungs out against that “that’s dangerous” crap. If you have a dangerous file visible in your file manager, then the damage is basically already done. There shouldn’t be a “dangerous” file to begin with.

That’s just outright unrealistic. It just does not conform to reality. Period.
Even I found it sometimes cumbersome and difficult to properly add something to the application launcher manually. I don’t want to imagine how hard it must be for an average non-techy user to do that.
Secondly, in your assumption you are basically saying that people should add all kinds of unnecessary stuff to the application launcher, even setup.sh scripts and things they only use once in 3 years.
Isn’t that going against your complaints from weeks ago, when you complained that too much stuff is in the application launcher and it should actually be slimmer by default?
How does that work, when you put every single script into the application launcher, just to run it. That does not make much sense.

Additionally, it’s just reality, that people run/execute something from the file manager. It’s real life and not only non-techy people do that. I did that, too and all the techy friends I know do that, too. It’s normal.
So, requesting all users to add every single mini one-shot script to the application launcher, just to be able to run it, is just not depicting reality.

That’s again, your perspective. It’s valid. However, there are again several definitions for “doing a good single job of managing files”. Especially on Linux system, you will probably have to run a script at least once a week, if you do a lot on that Linux system. So, using the file manager to do that, is most of the time the easiest and quickest way to achieve that.
I assume, that is supposed to be part of the “single job managing files” part. It just is one of the few parts of managing files.
Just having a frontend for copying stuff from A to B is not a file manager but just a frontend for cp, which is less than “doing a good single job of managing files”.

My definition of “file manager” includes all types of file management. Creating, moving, mounting, deleting, changing attributes and properties, running scripts, deleting a line from a repository file, etc. That is my definition of “managing files”. A script is just a another file, after all.

I think that’s an unfair comparison. First of all, I already explained why file managers manage files and why part of managing means that you should be able to run a script quickly and easily through the file manager’s interface.

Systemd on the other hand is a completely different story. First of all, the bloat is immense. It’s huge! It’s not comparable with a file manager which is more than just a stupid cp frontend.
Secondly, systemd maintainers are pricks. They are utter idiots.
Thirdly, systemd bloat has major security implications.
The type of file manager you were addressing does not have that and probably couldn’t have that, because it does not run in the background to manage security critical low-level kernel-like operations.

From reading his comment, I received the impression, that relying on file content rather than file extension is a problem. And if you see a .sh file you just won’t run it to be safe, but if you see a .txt file it should be safe. But if the content is the major factor in determining file type, then suddenly each and every .txt file becomes dangerous, because the system is not respecting the file extension.

So, if you care about security, you absolutely love file type determination by file extension, because it is transparent and safe, even when half the user’s brain fails to think before double clicking.

Just imagine the havoc, a script file is appended with .txt and when you double click, it runs automatically… What a security hell!

File extension file type determination makes this happening next to impossible.

nevj · February 27, 2022, 12:11am

Well Akito, I agree with you totally on systemd.

On file management…
One thing that helps with security is to follow the original Unix approach and put all executable files in a bin directory.
Then it does not matter what their name is, ie with or without extension.
That is my practice: when I write a script I make it executable and put it in ~/bin. Then I don’t need a launcher, I can execute it just by naming it.
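
The ~/bin practice can be sketched like this. A scratch directory stands in for ~/bin so nothing in a real home directory is touched, and the script name greet is just an example.

```shell
#!/bin/sh
# A scratch directory plays the role of ~/bin in this sketch.
bin_dir=$(mktemp -d)

# Write a small script with no extension and make it executable.
cat > "$bin_dir/greet" <<'EOF'
#!/bin/sh
echo "hello from greet"
EOF
chmod +x "$bin_dir/greet"

# Put the directory on the search path. In practice this line would name ~/bin
# and live in a shell startup file.
PATH="$bin_dir:$PATH"
export PATH

# The script now runs by name alone, whatever it is called.
greet
```

No file type guessing is involved at any point: the shell finds the file through PATH and the first line of the script says how to run it.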

So I wonder could we do something like that in a GUI?
Mixing executables with other files is hazardous, so could we have some means of automatically moving them? Then we could have
a file manager
a script manager
and the normal app launcher.

And it is not just scripts, although they are especially nasty because they are ASCII. If I compile a C program and make an ELF file, the same applies, although it is less likely to be mistyped.

So how does a GUI user run an executable file? Seems like an awful amount of fuss, just to avoid typing its name in a command line, or having to remember its name. But if we must have it, can we have a simple Run Manager that reminds us of names, keeps all the runnable files separate, and allows the user to drag and drop input and output files into the potential run candidate, instead of having to open I/O files, and allows the user to set options?
If we had that, the GUI user would really be able to avoid the command line and still access the full capabilities of Unix.

I like your reply, especially the systemd bit.
Neville

Akito · February 27, 2022, 12:35am

Yes. That’s a good practice. But, that involves an extra step, many users don’t want to take or they don’t think about it.

Good idea.

I understand that to us, who are so familiar with the CLI, this might seem that way. However, for a normal user, it’s not an everyday thing. In fact, most people never had to use that. Especially people coming from Windows almost always have a huge barrier between them and just opening a terminal. Just opening it and thinking about using it is actually a big no-go for many users and, to be honest, I cannot blame them. The CLI is naturally an advanced user’s tool.

There is something like that in KDE, but not sure if it covers all cases. It’s basically a search bar that can appear anytime on the desktop, if you wish so. It’s easy to run things like that. However, never used that a lot so don’t know if it would work so smoothly, as needed.

Thanks. The discussion we are having is a pleasure.

nevj · February 27, 2022, 12:45am

Will have a look.
Just got latest KDE into Void… was not easy. It looks rather different to previous version I used.

4dandl4 · February 27, 2022, 1:56pm

I have been following this conversation and still fail to see the issue, especially for one
coming from Windows. Please explain “briefly” in layman language.

Akito · February 27, 2022, 5:03pm

There are two ways to define what a file is:

by

file extension
file content

If a file is determined by its extension in its name, like e.g. a file called myname.txt would be identified as a text file, then it never matters what is actually in that file. It could be a binary, e.g. a myname.exe file on Windows, and if you rename it to myname.exe.txt, then the system would try to open it in your default text editor on double clicking the file in your file manager.
So, in theory, if you have a script file and you want to open it in your IDE, then it might still open in a simple text editor, when it is called myname.kt.txt, because the file extension is txt, leading to the system’s assumption of this being a simple text file.

If a file is determined by its actual content, then you can append whatever file extension you want, the OS will always try to outsmart you and ignore the file extension. If you, for whatever reason, want to open a file strictly with a text editor, even though the default app for it is an IDE, then no matter what you do, your Linux OS will still open that file in the IDE, when you double click on it. To open it in a different app, you would need to add a couple of extra steps, i.e. more effort is needed, i.e. the OS increases your work amount.

So, in theory, file type determination by file content sounds pretty fine. However, in reality, it’s pretty hard for the detection to get things right. It needs to understand the file content. This is basically like your Linux OS parsing (reading) the file and then deciding what the file type is. This is very error prone and I have demonstrated numerous examples in what ways that behaviour can fail.
The second big reason why I know that file determination by file extension is superior, is that file extensions are ultra transparent and every user gets the point. If you see a txt file you know what’s going on. You know why it’s been opened in a text editor, after double clicking that file in your file manager. There are no questions. It’s all obvious.
However, imagine you have a txt file and your file manager opens it in an IDE or even tries to execute it. This is opaque. The user doesn’t know what’s going on. Plus, the OS tries to be smarter than the user. If I, the user, decide that that particular file should be considered a txt file, I want that to be respected by the OS and not be ignored to the point where the Linux OS thinks it’s smarter than me. No, sometimes it’s not smarter, so I want to have the last word on what the file type of the particular file should be.

This is why I advocate for file type determination by file extension, instead of by content.

Linux users talk a lot about power and having the OS under control. File type determination by file extension is one example of such power, because the OS does not try to outsmart you, when it’s often wrong in that specific situation.

4dandl4 · February 27, 2022, 7:08pm

So, in Windows, I can associate different file type extensions, to be opened with a certain program,
like notepad, word or the windows media player.
Does not the “open with” accomplish the same, in both Windows and Linux? I will say that I do not
understand how to make a Linux script file executable, unless it has a certain file ext, and use
./xxxxxxx.run in Linux, or using .exe file in Windows.
I do understand that a file content can be harmful, if it is made to run, without user permission, I guess
that is why it is called a virus.
Any new Windows to Linux user needs to realize that the Command Prompt, in both Windows and Linux, is a very powerful tool and should be used with respect.

Akito · February 27, 2022, 7:17pm

Yes. But the file type is determined by the file extension, so essentially you are associating the app with the extension, not the content. So it’s different.

It’s not so much about security, as some Linux users try to portray it. Running any file is a potential security risk.

This discussion solely focuses on the desktop environment’s GUI file manager. CLI usage is a different story.

nevj · February 27, 2022, 11:10pm

You can make any file name executable in Linux…something like
chmod 755 myfile
will do it.
That does not mean it will run, it just means the system will allow you to try to run it.
But, to run it successfully, the file content has to be something that will execute, eg a binary for a program or a script of a known type.
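
Here is a small sketch of that difference between being allowed to run a file and the file actually being runnable. The file name myfile matches the chmod example above; the scratch directory and the sample text are made up.

```shell
#!/bin/sh
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

# Plain prose, not a program or a script.
printf 'Dear reader, this is only a note.\n' > myfile

[ -x myfile ] || echo "before chmod: not executable"
chmod 755 myfile
[ -x myfile ] && echo "after chmod: executable bit set"

# The system now lets us try, but the content is not something that can run,
# so the attempt fails.
./myfile 2>/dev/null || echo "running it failed with status $?"
```

The permission bits only open the door. Whether anything useful happens still depends on what is inside the file.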

Open with does indeed do the same thing in Windows and Linux. It declares the file to be an input file for some application, eg an editor.
That is a different thing from telling Linux or Windows that the file is an executable and that you want to run the file itself.

When you click on a file in a file manager, you get open with and it tries to ‘guess’ the file type so it can make a sensible choice of what app to open it with. That ‘guess’ is what this discussion started out debating.

The debate has drifted into execution (ie run) and security. That is really a separate topic. You can’t make the mistake of executing a file with a simple left button click. That means open with. Unless you set it up to open it with some app that will execute its input file (eg the loader or bash or julia). That is the bit that is dangerous. I don’t do that but some people like to, especially with scripts.

I would recommend that users keep executable files away from other files (eg in a bin directory), and run them either with the run GUI or from the command line. That takes the ‘guess filetype’ bit out of it and puts you in control.

Neville

4dandl4 · February 28, 2022, 12:47am

@nevj

And if I am using my Linux distro and browsing my email and a file pops up without any extension, how is it recognized as a Linux executable file that runs with only a mouse click?