Why is the DOS path character "\"? (msdn.com)
269 points by thibaut_barrere on March 19, 2012 | 66 comments


Goes deeper than just IBM. Prior to DOS there was CP/M and CP/M had many programs with an older heritage. For example, it was possible to copy from a source (such as a file) to a destination (such as another file, or an I/O port) using a program called PIP (http://en.wikipedia.org/wiki/Peripheral_Interchange_Program). On CP/M, for example, you could print a file with the command (PRN: is the name of the printer device; at the time that was likely a parallel connected printer):

    PIP PRN:=FOO.TXT
PIP itself predates CP/M, having been present on machines such as the PDP-11. Wikipedia tells me that it was first implemented on the PDP-6. For example, on the PDP-10 you could do:

    PIP DTA1:/X=DSK:*.*
which copied all the files from the disk onto tape. The /X there modifies the destination and indicates that file attributes (modification time etc.) should be preserved.

PS Now that my memory is fully restored from 1980s backing store, I believe that CP/M actually replaced the / in PIP with single-character switches inside [] and that the route to / being the switch character in MS-DOS most likely comes through DEC heritage from the PDP series. For example, the W switch would ignore the read-only attribute set on a file and overwrite it. This would copy all .COM files (i.e. executables) from B: to A: and overwrite without warning:

    PIP A:=B:*.COM[W]


You could still do that with copy in MS-DOS. I have a feeling my mom still does it, since she has a printer on the parallel port. So if she wants to print a text file it's just

     copy file.txt LPT1
She can also print multiple files, or the results of print-to-file, using the /b (binary) switch, which ignores EOF.

That also explains why you shouldn't use LPT1 as a file name: it was a special name that meant the first parallel port. I remember seeing a thread ridiculing it on HN some time ago.


"And that is why, in 2009, when developing in Microsoft .NET 3.5 for ASP.NET MVC 1.0 on a Windows 7 system, you cannot include /com\d(\..)?, /lpt\d(\..)?, /con(\..)?, /aux(\..)?, /prn(\..)?, or /nul(\..)? in any of your routes."

Source: http://news.ycombinator.com/item?id=655397
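A rough sketch of the check implied by the quoted patterns, in Python. The function name `is_reserved_device_name` is made up for illustration, and the regex simply mirrors the names in the quote (com plus a digit, lpt plus a digit, con, aux, prn, nul, each optionally followed by an extension); it is not an official list.

```python
import re

# Mirrors the device names listed in the quote above; case-insensitive,
# with an optional ".extension" suffix. Hypothetical helper, not a real API.
_RESERVED = re.compile(r"^(con|prn|aux|nul|com\d|lpt\d)(\..*)?$", re.IGNORECASE)


def is_reserved_device_name(segment: str) -> bool:
    """Return True if a single path segment looks like a DOS device name."""
    return _RESERVED.match(segment) is not None


print(is_reserved_device_name("LPT1"))      # device name on its own
print(is_reserved_device_name("lpt1.txt"))  # an extension does not help
print(is_reserved_device_name("file.txt"))  # ordinary file name
```

Names that merely start with a device name, such as "console" or "com10", do not match, because the pattern requires the segment to end or continue with a dot right after the device name.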


I'm pretty sure you could still do this up until at least Win2k+ if your mapped printer had a port associated.


It still works on Windows 7. Sadly, I know that offhand due to having to deal with some weird legacy stuff from our corporate HQ.


Interestingly, if you ever come to Japan you will find many people expect path separators to be yen symbols (¥). Their Windows paths look like:

¥docs¥finance¥

To understand the reason, just check out the table on the Shift-JIS encoding:

http://en.wikipedia.org/wiki/Shift_JIS

Specifically, look at the character at 0x5C where the backslash sits in ASCII. Learning this was one of those mind-blowing moments when you try to imagine what a world where that was normal would be like.
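To make the point concrete, here is a small Python sketch. It assumes the path is stored as plain 7-bit bytes and that only the glyph drawn for byte 0x5C differs by locale; the table and the `render_path` helper are illustrative, not a real codec.

```python
# Glyph drawn for byte 0x5C under different national conventions.
# Illustrative table only; the labels are not codec names.
GLYPH_AT_0X5C = {
    "us": "\\",         # backslash in US-ASCII
    "japan": "\u00a5",  # yen sign
    "korea": "\u20a9",  # won sign
}


def render_path(raw: bytes, variant: str) -> str:
    """Show how the same 7-bit bytes would be displayed in a given locale."""
    glyph = GLYPH_AT_0X5C[variant]
    return "".join(glyph if b == 0x5C else chr(b) for b in raw)


raw_path = b"\\docs\\finance\\"  # the bytes on disk never change
print(render_path(raw_path, "us"))
print(render_path(raw_path, "japan"))
```

The bytes are identical in every case; only the rendering differs, which is why the yen sign still works as a path separator.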


Interesting! I wonder why they changed that particular character and not one that is literally everywhere in DOS.

Using ¥ as backslash still works in Windows and legacy apps (in OSX option+¥ gives a "real" \.)

This leads to some interesting situations: Word documents filled with ¥ where there should be \ and vice versa. By interesting I mean extremely annoying.

Another caveat is that anything hardcoded to use the tilde ~ will probably break because it is mapped to a completely different key on Japanese keyboards. This causes endless fun in DOSBox or even newer games: to get the console in Doom 3 I had to edit the source and recompile it with the console key mapped to another character.


Several positions in the table ({ | } [ \ ] @) were meant for "national variants". I guess those characters were deemed unimportant at the time.

Here is some more info: http://www.cs.tut.fi/~jkorpela/chars.html#national-ascii


Thus, the characters that appear in those positions - including those in US-ASCII - are somewhat "unsafe" in international data transfer

...

Systems that support ISO Latin 1 in principle may still reflect the use of national variants of ASCII in some details; for example, an ASCII character might get printed or displayed according to some national variant. Thus, even "plain ASCII text" is thereby not always portable from one system or application to another.

That's awesome! Hard to believe that systems have worked as well as they have. It's pretty obvious nobody was expecting that one day all computers would have to read each other's data pretty much constantly.

So I guess the question is: why did the DOS developers pick the character that happens to vary from region to region? I guess they didn't know?


As a wild guess, I would say that in the days before the internet made "international data transfer" something a regular person would encounter frequently, the thought just never occurred to them.


> I wonder why they changed that particular character and not one that is literally everywhere in DOS.

I've wondered about that as well. My guess is that they looked at it and thought 'who needs two kinds of slash anyway?' I would love to know the actual thought behind that choice.


Obligatory link for this fact to the One True Source for oddities like these:

https://blogs.msdn.com/b/michkap/archive/2005/09/17/469941.a...


The Korean won symbol ₩ also occupies 0x5C. I never got totally used to seeing directory paths separated by ₩.


I have the korean QSENN DT-35 keyboard. The first time I opened up the cmd, I couldn't find the '\' and had to click random keys until I found the ₩ key :-D


Same for Korean with the ₩. And you can change the default system locale to Japanese or Korean and see it on any language version of Windows.


That was one of my awkward debugging moments. We had a path issue for a Japanese client and I was completely unaware of this peculiarity. The path issue was somewhere else, but understanding that having ¥ instead of \ was a feature and not a bug took me a while.


I've always felt the better question is "Why is the UNIX path character '/'?"

People use "/" as a date separator, as in 12/25/1979, so effectively banning that character from filenames seems a really poor choice. By that metric, choosing "\" is a much better choice because "\" is used nowhere I know of outside of computers. I've never encountered it anywhere else, so arguably it's the perfect choice for a path character.


> I've always felt the better question is "Why is the UNIX path character "/"?

Just a guess, but it's probably because \ is used for character escapes in C (and the Bourne shell). Given that the history of Unix is inextricably tied to C, it kind of makes sense to use / over \ so you don't have to do lots of \\ to escape path names.
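A quick illustration of the escaping problem, using Python as a stand-in since its string literals follow the same backslash-escape convention as C. The path is invented for the example.

```python
# Backslashes must be doubled in an ordinary literal...
escaped = "C:\\temp\\new"

# ...or the literal must be marked raw so backslashes are kept as-is.
raw = r"C:\temp\new"

# Forgetting to escape silently turns \t into a tab and \n into a newline.
broken = "C:\temp\new"

print(escaped == raw)            # both spell the same 11-character path
print(len(raw), len(broken))     # the broken one is two characters shorter
print(repr(broken))              # shows the tab and newline that crept in
```

With "/" as the separator none of this arises, which fits the guess above.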


Other people use '.' in dates. Or '-'. The latter especially would end up with the same problem: why are - and -- used as switch characters if - is used in dates (and is even quite common in normal text and names)?

And given that the history of '\' as a path separator is that long, maybe you haven't seen it elsewhere _because_ it is now forever taken by the world's most popular OS?


> Why are - and -- used as switch characters if - is used in dates (and is even quite common in normal text and names)?

Those uses rarely have it next to whitespace, and almost never immediately after whitespace.

If someone’s named Jean-Claude, it is not customary to refer to them as -Claude for short, and if I refer to items dated 2012-03-19 and 2012-03-20, it is not customary to call one of them -20.

But that last example shows something that Unixy switches do conflict with: negative numbers.
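For what it's worth, here is how one real parser copes with that conflict. This is a small sketch using Python's standard `argparse`; the option and argument names are invented for the example.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="store_true")
parser.add_argument("offset", type=int)  # hypothetical positional argument

# argparse treats "-20" as a positional value here because the parser
# defines no option that itself looks like a negative number.
args = parser.parse_args(["-v", "-20"])
print(args.verbose, args.offset)

# The conventional "--" marker says "everything after this is not a switch".
explicit = parser.parse_args(["-v", "--", "-20"])
print(explicit.offset)
```

Both calls end up with an offset of -20; the second form is the one that keeps working even if the parser later gains an option that looks like a negative number.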


Well responded.

Two (not entirely serious) counterarguments:

- Lists like this

- Abusing the char as em-dash

- Weird languages. In German it's quite common to use that character to shorten the monster words. Pulling samples from my rear (note the disclaimer about not being too serious):

  Lebens- und Krankenversicherung
  (Shorthand for Lebensversicherung und Krankenversicherung) 
The inverse is possible, but less common as far as I am aware

  Arbeitsplatz und -Umfeld
  (Arbeitsplatz und Arbeitsumfeld?) 
Looks ugly to me and the samples are pre-coffee, but this is definitely possible.


All good points. I would say that lists like that are also abuse; the hyphens there ought to be · or something similar. But it certainly appears in the wild. (Similarly, in formal contexts, the negative sign is − and not -, but very few people care.)

The last usage is also possible in English. A construction like “face-huggers and -eaters” is rare but not outlandish.


Of course, Windows bans the use of both / and \ (along with : and other useful things) in filenames anyway so you still can't enter dates in that format. I guess back when they made the decision on the path separator the max path length in DOS was so short it may not have seemed relevant.


It seems from http://cm.bell-labs.com/who/dmr/hist.html that Multics used > as the path separator. Unix decided to use < and > for redirection (in the PDP-7 era), so something else was chosen for pathnames when they were introduced later on the PDP-11.


This is a good thing though, since it encourages more people to use the ISO date formats. They are unambiguous, easy to parse, and sort correctly.
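The sorting claim is easy to check. A minimal Python sketch with made-up dates: plain string sorting of ISO dates matches chronological order, while the slash-separated US style does not.

```python
from datetime import date

iso_names = ["2012-03-19", "1979-12-25", "1980-01-15"]
us_names = ["03/19/2012", "12/25/1979", "01/15/1980"]

# Chronological order, computed from real date objects.
chronological = [d.isoformat() for d in sorted(date.fromisoformat(s) for s in iso_names)]

# Plain string sort of ISO dates gives the same order...
print(sorted(iso_names) == chronological)

# ...but a string sort of MM/DD/YYYY puts 1980 and 2012 before 1979.
print(sorted(us_names))
```

ISO dates also contain no "/" or "\", so they are safe in filenames on either family of systems.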


There are other uses of the slash character. It's a poor choice.

_ or | get my vote, but it's a bit late now to fix the mistakes of the past. There are enough special chars needed anyway that collisions are inevitable.

At least they didn't pick 'E'.


The problem with those characters is that they require you to hold shift to type them on a standard keyboard. Also, they make you reach in a way that / doesn't (now I'm showing my pro-/ bias).


On which standard keyboard?

Each country has a different layout; are you aware of that?

In most European countries both types of slash require two key presses.


In addition to the other reasons listed here,

/ is easier to type than \


In which country?



I have reading comprehension issues, because if I count correctly almost all layouts require AltGr or Shift to obtain \ or /.

But hey that is just me.


Put those into the tie category.


The early DOS developers wanted to use "/", so it turns out the underlying OS will accept either / or \. The main trouble you have is that cmd.exe and PowerShell think / is the switch character, so as long as you're not writing programs that write shell scripts or shell out, you do OK.

Back when I was a Windows dev I drove the guys I worked with nuts because I used "/" instead of "\", figuring it would be more portable and I wouldn't have to write "\\" all the time. Of course, C# has @"" just to deal with that latter problem.


So I'm sort of nitpicking your comment here, but @"" for verbatim strings wasn't created just to deal with directory slashes; it's a handy way of writing strings without the need for escaping in general.

Also, if you really want to write portable C# code, you should use Path.DirectorySeparatorChar, which keeps the code from being platform specific.

http://msdn.microsoft.com/en-us/library/system.io.path.direc...
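The same idea, sketched in Python rather than C#: let the path library supply the separator instead of hardcoding one. `ntpath` and `posixpath` are the standard-library modules behind `os.path` on Windows and Unix respectively, so this runs on any platform; the paths are invented.

```python
import ntpath
import posixpath

# Windows-style path handling has "\" as the separator and "/" as an
# accepted alternative, matching the "either works" observation above.
print(repr(ntpath.sep), repr(ntpath.altsep))

# Joining through the library avoids writing any separator by hand.
win_path = ntpath.join("C:\\docs", "finance", "report.txt")
unix_path = posixpath.join("/docs", "finance", "report.txt")
print(win_path)
print(unix_path)

# Mixed separators are normalized to backslashes on the Windows side.
print(ntpath.normpath("C:/docs/finance\\report.txt"))
```

In ordinary code you would import `os.path` (or `pathlib`) and get whichever flavor matches the running platform.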


What bugs me is that they chose a character right next to "Return" on the keyboard. For some reason no PC keyboard has moved it.

I've had embarrassing MS-DOS moments in the past where I was typing out something like a recursive-delete on a path. If you accidentally hit RETURN instead of a backslash then you can do things like blow away entire parent directories unintentionally. This didn't exactly improve my already-critical view of PCs. :)


It's funny to think of how standardized keyboards are within the US today, and yet it wasn't so long ago that you couldn't even count on the arrow keys being in the same place (or, in fact, being there at all, which is why you have hjkl in Vim to this day).

On the other hand, the computers at Bell Labs had an arrow key, which is why R (which is based on S) uses '<-' for assignment, as the two-character combination was enabled for compatibility reasons.

And on a related note, I can't imagine how people use Vim without rebinding ESC to Caps_Lock. Given that ESC used to be where TAB currently is (and the latter is too useful to be removed), I don't really see a point in keeping the obsolescent Caps_Lock, especially when it's so irritating to keep moving the hand to the furthest corner of the keyboard.


I usually swap the left ctrl and the Caps lock keys, mostly out of habit from the first computer I used (An actual computer, with the Magnetic Tape spools and punch-card boot loader).

I have also dealt with crappy tiny keyboards that don't even have any keys beyond the standard chunk (So no function keys, 10-key, or arrows / browsing control)


As a frequent vim user I probably should have done the caps-as-esc remap but by now caps-as-control is so engrained in my muscle memory that I now have a significant friction to changing it.

My favorite keyboard happens to be the Apple Extended Keyboard II, but I can't use one these days because its caps key would physically toggle between on and off states (like a typewriter caps key) which means it does not make a very good candidate for remapping.


Ctrl-[, which turns out to be ESC, is one solution. Given that many people like to remap CapsLock to Ctrl, it might give you the best of both worlds.
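The reason Ctrl-[ produces ESC is that, on a traditional ASCII terminal, the Ctrl key keeps only the low five bits of the character's code. A tiny Python sketch (the helper name `control_code` is made up):

```python
def control_code(key: str) -> int:
    """ASCII code a traditional terminal sends for Ctrl+<key>."""
    return ord(key.upper()) & 0x1F  # keep only the low five bits


print(hex(control_code("[")))  # 0x1b, which is ESC
print(hex(control_code("h")))  # 0x8, backspace
print(hex(control_code("m")))  # 0xd, carriage return
```

"[" is 0x5B in ASCII, and masking it to five bits leaves 0x1B, the escape character.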


That is interesting, I have had the exact opposite experience!

I have never seen a PC keyboard with the backslash anywhere other than in the bottom LEFT corner next to your left smallest/pinky finger, with one exception: in Germany and France, the keyboards sometimes have backslash on the top-row number keys.


Germany has a really awkward position for this key. You can only reach it by hitting 'Alt Gr' (the right Alt key) and a _right_ key in the row of digits. One-handed and similar (albeit mirrored) to the US layout's access to ~.

I switch all my keyboards to the US layout because I feel that I kill my fingers otherwise to reach {[]}| and \


Are you saying they looked into the future and chose a character next to return on a keyboard that hadn't been designed yet?

http://www.pcguide.com/ref/kb/layout/stdXT83-c.html


I have always hated how the colon is used in Windows for drive letters. There have been way too many times where I typed a long path only to realize that there was a semicolon at the beginning.


In Microsoft tradition, this article completely ignores the fact that DOS was a CP/M clone bought wholesale from another company (originally called QDOS). The question is, did QDOS have / for command line switches? Or were all of these tools added (as the article implies) by Microsoft engineers later?

[I didn't use QDOS, but I did use CP/M and it had the most unholy command line syntax. I don't fondly remember PIP...]


No, QDOS (86-DOS) did not have switches. IIRC none of the commands had any options at all. It did, however, have nice CP/M compatibility because there was a command (RDCPM?) that allowed you to read from CP/M formatted media.


CP/M did not have directories either.


This article reminds me of my first days of C coding and the problems generated with escaped "c:\haracters in strings".

After years I still think that the use of forward slash is the worst decision ever. At least Apple chose ":" on Mac OS classic.


If you're referring to \, it's a backslash, not a forward slash.


A simple way to remember forward/backslash (which I probably read on HN) is the phrase "Backslash is by the backspace"

Edit: This is for the standard US keyboard. A more general way of thinking about it, though a bit more to remember: associate "positive" with "forward", and see that '/' has a positive slope. Or just go listen to a BBC podcast, and let the sound of the broadcaster saying "bbc dot co dot uk forward-slash newspod" resonate in your head.


"Forward slash" is a goofy name anyway. There are:

- "slash": the character you use in English and when typing fractions; and

- "backslash": the Microsoft-slash: backwards, like everything else they do.

Another way to remember is that (at least on US keyboards) the slash is always in the same place (on the same key as the question mark). The backslash, being the oddity that it is, is in different places, depending on the keyboard vendor.


Reminds me of many discussions I would have explaining how to invoke a particular DOS command to people whose MO was "dir" and then violently mashing <break> to allow themselves to page through a large directory.

"File slash" and "option slash" were far less confusing than decoding the secretary's concept of "forward" and "back."


It doesn't really make sense. Given that we write left-to-right and top-to-bottom, \ can be seen as slashing forward in the writing direction, starting from the top, and / as slashing backward.

Upslash / and downslash \ would make more sense, but this is what we have. My mnemonic is that forward and back are the opposite of what makes sense.


There's no one right way to remember something. I'm glad you found something that works for you!


Backslash is near L-Ctrl on my keyboard. Forward slash is far closer to backspace.

Perhaps a better way would be to think (for L-to-R language users) of the character toppling "forwards" / or "backwards" \ WRT the text direction.


Echoing parent, in different words:

English is left-to-right. (And gravity is down.)

Is the slash "leaning forward" or is it "leaning backward"?


Talking about the BBC: sometimes they say stroke instead of slash, which as an American, really confused me the first time I heard it.


But it isn't by the backspace... in fact it's almost the furthest from it! Probably depends on your keyboard layout though.


I think of them as pipes that are falling over. The backslash falls backward and the forward slash falls forward.


Alt-shift-7 on a Finnish Mac keyboard. Took me a while to find it the first time.


On a UK keyboard, it's by left shift, making slash closer to backspace.


In my time supporting naive endusers, it's always amused me that people at a level where they have trouble with left and right mouse buttons can remember slash and backslash like pros.


Fortunately today you can use string literals with most programming languages.


The comments have the story on that, which involved IBM.


Relevant to this, one of the most popular posts I’ve ever written: http://alanhogan.com/tips/php/directory-separator-not-necess...


Argh, second word of second paragraph is “answer’s”, an attempted pluralization of “answer”?

Argh!

Will continue reading anyway, and will lose karma for this, but shit, does no one even read their shit after publishing anymore?



