This is misleading because you use plural for both and I'm sure most of these UX missteps were _each_ made by a _single_ person, and there were >1 users even at the time.
Yeah but now they're using npm to install a million packages to do things like tell if a number is greater than 10000. The chances of the programmer wanting to understand the underlying system they are using is essentially nil.
> At least allow us to use names instead of numbers.
You can for the destination. That's the whole reason you need the "&": to tell the shell the destination is not a named file (which itself could be a pipe or socket). And by default you don't need to specify the source fd at all. The intent is that stdout is piped along but stderr goes directly to your tty. That's one reason they are separate.
And for those saying "<" would have been better: that is used to read from the RHS and feed it as input to the LHS so it was taken.
Bash syntax is anything but simple or logical. Just look at the insane if-statement syntax. Or how the choice of quotes fundamentally changes behavior. Argument parsing, looping, the list goes on.
> Also the choice of quotes changing behavior is a thing in:
In those languages they change what's contained in the string. Not how many strings you get. Or what the strings from that string look like. ($@ being an extreme example)
> @ Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" ... If the double-quoted expansion occurs within a word, the expansion of the first parameter is joined with the beginning part of the original word, and the expansion of the last parameter is joined with the last part of the original word. When there are no positional parameters, "$@" and $@ expand to nothing (i.e., they are removed).
That’s…a lot. I think Bash is interesting in the “I’m glad it works but I detest having to work with it” kind of way. Like, fine if I’m just launching some processes or tail’ing some logs, but I’ve rarely had a time when I had to write an even vaguely complex bash script where I didn’t end up spending most of my time relearning how to do things that should be basic.
Shellcheck was a big game changer at least in terms of learning some of the nuance from a “best practice” standpoint. I also think that the way bash does things is just a little too foreign from the rest of my computing life to be retained.
Complex and bash script should not be in the same sentence.
If a script you have is becoming complex, that’s an hint to use an anemable programming language with proper data types and structures.
Uh, reading a bash script shouldn't be as hard as doing your taxes. Bash syntax has to be simple because bash code is going to be read and reasoned by humans. Reading just a simple if statement in bash syntax requires a TON of knowledge to avoid shooting yourself in the foot. That's a massive failure of usability just to save a couple of keystrokes.
This is like saying "what's wrong with brainfuck??? makes sense to me!" Every syntax can be understood, that does not automatically make them all good ideas.
Bash syntax is the pinnacle of Chesterton's Fence. If you can't articulate why it was done that way, you have no right to remove it. Python would be an absolutely unusable shell language.
I didn't say that there wasn't a reason. I said it was absolute trash to use. It's so bad that the moment I need even the slightest bit of complexity, I will switch away from bash. Can't really say that for any other language.
Which is about the same as `2>&1` but with a friendlier name for STDOUT. And this way `2> /dev/stdout`, with the space, also works, whereas `2> &1` doesn't which confuses many. But it's behavior isn't exactly the same and might not work in all situations.
And of course I wish you could use a friendlier name for STDERR instead of `2>`
The situation where this is going to cause confusion is when you do this for multiple commands. It looks like they're all writing to a single file. Of course, that file is not an ordinary file - it's a device file. But even that isn't enough. You have to know that each command sees its own incarnation of /dev/stdout, which refers to its own fd1.
I quite like how archaic it is. I am turned off by a lot of modern stuff. My shell is nice and predictable. My scripts from 15 years ago still work just fine. No, I don't want it to get all fancy, thanks.
For a while, there was a strong trend of "I want to do everything in one singular language". Your coding is in language XYZ. Your build tools will be configured/written in XYZ. Your UI frontend will be generated from XYZ. Everything will be defined in XYZ.
Shell is from a time when you had a huge selection of languages, each for different purposes, and you picked the right one for the job. For complex applications, you would have multiple languages working together.
People look at Bash and think, "I would never dare do $Task with that language!". And you'd be right, because you're thinking you only have one tool in the toolbox.
Counter reason in favor is that you can always count on it being there and working the same way. Perl is too out of fashion and python has too many versioning/library complexities.
You have to write the crappy sh script once but then you get simple, easy usage every time. (If you're revising the script frequently enough that sh/bash are the bottleneck, then what you have is a dev project and not a script, use a programming language).
You're not wrong, but there's fairly easy ways to deal with filenames containing spaces - usually just enclosing any variable use within double quotes will be sufficient. It's tricker to deal with filenames that contain things such as line breaks as that usually involves using null terminated filenames (null being the only character that is not allowed in filenames). e.g find . -type f -print0
You're not wrong, but at my place, our main repository does not permit cloning into a directory with spaces in it.
Three factors conspire to make a bug:
1. Someone decides to use a space
2. We use Python
3. macOS
Say you clone into a directory with a space in it. We use Python, so thus our scripts are scripts in the Unix sense. (So, Python here is replacable with any scripting language that uses a shebang, so long as the rest of what comes after holds.) Some of our Python dependencies install executables; those necessarily start with a shebang:
#!/usr/bin/env python3
Note that space.
Since we use Python virtualenvs,
#!/home/bob/src/repo/.venv/bin/python3
But … now what if the dir has a space?
#!/home/bob/src/repo with a space/.venv/bin/python3
Those look like arguments, now, to a shebang. Shebangs have no escaping mechanism.
As I also discovered when I discovered this, the Python tooling checks for this! It will instead emit a polyglot!
#!/bin/bash
# <what follows in a bash/python polyglot>
# the bash will find the right Python interpreter, and then re-exec this
# script using that interpreter. The Python will skip the bash portion,
# b/c of cleverness in the polyglot.
Which is really quite clever, IMO. But, … it hits (2.). It execs bash, and worse, it is macOS's bash, and macOS's bash will corrupt^W remove for your safety! certain environment variables from the environment.
Took me forever to figure out what was going on. So yeah … spaces in paths. Can't recommend them. Stuff breaks, and it breaks in weird and hard to debug ways.
My practical view is to avoid spaces in directories and filenames, but to write scripts that handle them just fine (using BASH - I'm guilty of using it when more sane people would be using a proper language).
My ideological view is that unix/POSIX filenames are allowed to use any character except for NULL, so tools should respect that and handle files/dirs correctly.
I suppose for your usage, it'd be better to put the virtualenv directory into your path and then use #!/usr/bin/env python
For the BSDs and Linux, I believe that shebang are intepreted by the kernel directly and not by the shell. /usr/bin/env and /bin/sh are guaranteed by POSIX to exists so your solution is the correct one. Anything else is fragile.
These are part of the rituals of learning how a system works, in the same way interns get tripped up at first when they discover ^S will hang an xterm, until ^Q frees it. If you're aware of the history of it, it makes perfect sense. Unix has a personality, and in this case the kernel needs to decide what executable to run before any shell is involved, so it deliberately avoids the complexity of quoting rules.
This won't do what you're thinking it does. If I run that, I get:
env: No terminating quote for string: /path/with"/path/with
… because the string you've given env -S on my system is malformed, and lacks a terminating quote. (You can test this w/ just giving an unterminated quoted string to env … I have no idea why the messaging is so funky looking, but that's sort of par for the course here.)
As I alluded to in my post, shebangs don't handle escaping. Now, yes, you're thinking that env will do it, here. The other problem with shebangs is that they're ridiculously unstandardized. On Linux, for example, that shebang will parse out to:
This is because Linux passes everything after the first space as a single arg. macOS splits on spaces, but does no further processing (such as some form of backslash escapes) beyond that.
It works fine with GNU env with -S support, and a GNU-compatible kernel. I'm aware that won't work on some other systems, hence the 9 other examples. I said I would try that first and see how it goes, and low and behold it works fine on the systems I use.
$ cat bbb.ml
#!/usr/bin/env -S "/home/user/.local/bin/o c a m l" -no-version
print_endline "ok";;
$ ls -lh ~/.local/bin/"o c a m l"
lrwxrwxrwx 1 user user 14 Feb 27 07:26 '/home/user/.local/bin/o c a m l' -> /usr/bin/ocaml
$ chmod a+rx bbb.ml
$ ./bbb.ml
ok
$
But if it didn't work, you can get pretty good mileage out of abusing sh to get the job done for many popular languages.
They're more like capabilities or handles than pointers. There's a reason in Rust land many systems use handles (indices to a table of objects) in absence of pointer arithmetic.
In the C API of course there's symbolic names for these. STDIN_FILENO, STDOUT_FILENO, etc for the defaults and variables for the dynamically assigned ones.
What they point to are capabilities, but the integer handles that user space gets are annoyingly like pointers. In some respects, better, since we don’t do arithmetic on them, but in others, worse: they’re not randomized, and I’ve never come across a sanitizer (in the ASan sense) for them, so they’re vulnerable to worse race condition and use-after-free issues where data can be quietly sent to the entirely wrong place. Unlike raw pointers’ issues, this can’t even be solved at a language level. And maybe worst of all, there’s no bug locality: you can accidentally close the descriptor backing a `FILE*` just by passing the wrong small integer to `close` in an unrelated part of the program, and then it’ll get swapped out at the earliest opportunity.
BITD the one "fd sanitizer" I ever encountered was "try using the code on VxWorks" which at the time was "posix inspired" at best - fds actually were pointers, so effectively random and not small integers. It didn't catch enough things to be worth the trouble, but it did clean up some network code (ISTR I was working on SNTP and Kerberos v4 and Kerberized FTP when I ran into this...)
Well, if we reduce it enough I supposed they can be seen as the same concept through a certain kind of philosophical lense. True and false also belong in the same class, they're handles to a pool of two possible boolean values.
The difference is in scope. Pointers (aka memory addresses) are an ordered set of numbers enumerating all the memory locations, it enables unique powerful properties that enable a large set of uses that you cannot do with handles. And also make them quite unsafe and harder to understand.
It's a waste of time unless you're specifically targeting and testing mac, all of the BSDs, various descendants of Solaris, and other flavors of Unix. I wrote enough "portable shell" to run into so many quirks and slight differences in flags, in how different tools handle e.g. SIGPIPE.
Adding a new feature in a straightforward way often makes it work only on 4/7 of the operating systems you're trying to support. You then rewrite it in a slightly different way (because it's shell — there's always 50 ways to do the same thing). This gets you to 5/7 working systems, but breaks one that previously worked. You rewrite it yet another way, fixing the new breakage, but another one breaks. Repeat this over and over again, trying to find an implementation that works everywhere, or start adding workarounds for each system. Spend an hour on a feature that should have taken two minutes.
If it's anything remotely complicated, and you need portability, then use perl/python/go.
Actually, while the Actual Nodes are a linux thing, bash itself implements (and documents) them directly (in redirections only), along with /dev/tcp and /dev/udp (you can show with strace that bash doesn't reference the filesystem for these, even if they're present.)
> At least allow us to use names instead of numbers
Many people probably think in terms of "fd 0" and "fd 1" instead of "standard in" and "standard out", but should you wish to use names at least on modern Linux/BSD systems do:
Bash and zsh also allow, and modern Borne-compatible shells (sh) might too:
echo >&2 error_message
On Linux, /dev/std* requires the kernel to do file name resolution in the virtual file system because it could point to something nonstandard that isn't a symlink to something like /proc/self/fd/XX and then the kernel has to check that that should hopefully point to a special character device.
I don't have macos right now but I think that it doesn't have these files. What's worse is that bash emulates these files so they might even somewhat work, but not in all situations. I distinctly remember issues with this command:
I've done quite a bit of unix historical work ... Not enough for a talk at the CHM but decent enough that I have interviewed dozens of people.
I really think some basic stuff was just left in a hacky state and we never revisited the primitives right.
I've been trying to do that in my own projects
For instance I should be able to do something like
Command || processor
And not have processor hijack the input without hacky pty stuff. I am intentionally using || here.
There's lots of use cases to this: llms are the best, logging, rendering text, readline, translation, accessibility, it'd be a very useful primitive and it's impossible to do without a full pty wrapper or some kind of voodoo heuristic wrangling.
Nushell, Powershell, Python, Ruby, heck even Perl is better. Shell scripting is literally the worst language I've ever seen in common use. Any realistic alternative is going to be better.
But it isn't portable, unless you stick to posix subset which kinda sucks. You'll use some feature that some dude using an ancient shell doesn't have then he'll complain to you. And that list of features is LONG: https://oneuptime.com/blog/post/2026-02-13-posix-shell-compa...
If you're using shell specific features in a tightly controlled environment like a docker container then yeah, go wild. If you're writing a script for personal use, sure. If you're writing something for other people to run then your code will be working around all the missing features posix hasn't been updated to include. You can't use arrays, or arithmetic context, nothing. It sucks to use.
Besides, if you're writing a script it is likely that it will grow, get more complicated, and you will soon bump up against the limitations of the language and have to do truly horrible workarounds.
This is why if I need something for others to run then I just use python from the beginning. The code will be easier to read and more portable. At this point the vast majority of OS's and images have it available anyway so it's not as big a barrier as it used to be.
Anything that is infected by UCS-2 / UTF-16 garbage should be revised and reconsidered... Yeah UTF-8 has carve outs for those escape sequences... However JSON is even worse, you _have_ to use UTF-16 escapes. https://en.wikipedia.org/wiki/JSON#Character_encoding
Trying to be language agnostic: it should be as self-explanatory as possible. 2>&1 is all but.
Why is there a 2 on the left, when the numbers are usually on the right. What's the relationship between 2 and 1? Is the 2 for std err? Is that `&` to mean "reference"? The fact you only grok it if you know POSIX sys calls means it's far from self explanatory. And given the proportion of people that know POSIX sys calls among those that use Bash, I think it's a bit of an elitist syntax.
POSIX has a manual for shell. You can read 99% of it without needing to know any syscalls. I'm not as familiar with it but Bash has an extensive manual as well, and I doubt syscall knowledge is particularly required there either.
If your complaint is "I don't know what this syntax means without reading the manual" I'd like to point you to any contemporary language that has things like arrow functions, or operator overloading, or magic methods, or monkey patching.
No, the complaint is that "the syntax is not intuitive even knowing the simpler forms of redirection": this one isn't a competition of them, but rather an ad-hoc one.
I know about manuals, and I have known this specific syntax for half of my life.
Arrow functions etc are mechanisms in the language. A template you can build upon. This one is just one special operator. Learn it and use it, but it will serve no other purpose in your brain. It won't make anything easier to understand. It won't help you decipher other code. It won't help you draw connections.
> the syntax is not intuitive even knowing the simpler forms of redirection
The MDN page for arrow functions in JS has, I shit you not, 7 variations on the syntax. And your complaint is these are not intuitively similar enough?
Is it more sane, or is it just what you are used to?
Python doesn't really have much that makes it a sensible choice for scripting.
Its got some basic data structures and a std-lib, but it comes at a non-trivial performance cost, a massive barrier to getting out of the single thread, and non-trivial overhead when managing downstream processes. It doesn't protect you from any runtime errors (no types, no compile checks). And I wouldn't call python in practice particularly portable...
Laughably, NodeJS is genuinely a better choice - while you don't get multithreading easily, at least you aren't trivially blocked on IO. NodeJS also has pretty great compatibility for portability; and can be easily compiled/transformed to get your types and compile checks if you want. I'd still rather avoid managing downstream processes with it - but at least you know your JSON parsing and manipulation is trivial.
Go is my goto when I'm reaching for more; but (ba)sh is king. You're scripting on the shell because you're mainly gluing other processes together, and this is what (ba)sh is designed to do. There is a learning curve, and there are footguns.
File descriptors are like handing pointers to the users of your software. At least allow us to use names instead of numbers.
And sh/bash's syntax is so weird because the programmer at the time thought it was convenient to do it like that. Nobody ever asked a user.