
[flagged]


As the article says: "I find this particularly interesting because this isn't fundamentally a problem of the software being written in C. These are logic errors that are possible in nearly all languages, the common factor being this is a vulnerability in the interprocess communication of the components (either between git and external processes, or within the components of git itself). It is possible to draw a parallel with CRLF injection as seen in HTTP (or even SMTP smuggling)."

You can write this in any language. None of them will stop you. I'm on the cutting edge of "stop using C", but this isn't C's fault.


You can, but in languages like python/java/go/rust/... you wouldn't, because you wouldn't write serialization/de-serialization code by hand but call out to a battle hardened library.

This vulnerability is the fault of the C ecosystem, where there is no reasonable project-level package manager, so everyone writes everything from scratch. It's exacerbated by the combination of a lack of generics (rust/java's solution), a lack of introspection (java/python's solution), and a poor preprocessor in C (go's solution), so it wouldn't even be easy to make an ergonomic general-purpose parser.


Python's pathlib wouldn't help you here, it can encode the necessary bits. Especially with configparser - it's a 20-year-old configuration reader. Java's story is worse.
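To make the configparser point concrete, here is a minimal sketch (not taken from the article or from git's source) that writes a value ending in a carriage return with Python's standard configparser and reads it back. The section name "submodule" and key "path" are made up for illustration.

```python
import configparser
import io


def round_trip(value: str) -> str:
    """Write one value with configparser, read it back, return the result."""
    writer = configparser.ConfigParser()
    # "submodule" and "path" are hypothetical names for this sketch.
    writer["submodule"] = {"path": value}
    buf = io.StringIO()
    writer.write(buf)  # the value is emitted as-is, without quoting

    reader = configparser.ConfigParser()
    reader.read_string(buf.getvalue())
    return reader["submodule"]["path"]


if __name__ == "__main__":
    # repr() makes a stray "\r" visible if it survives the round trip.
    print(repr(round_trip("plain")))
    print(repr(round_trip("sub\r")))
```

On CPython the second value should come back without its trailing carriage return, because the reader strips surrounding whitespace from values while the writer does not quote them. That is the same kind of write/read asymmetry, just inside a standard library.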

What part of this would be prevented by another language?

You'd need to switch your data format to something like json, toml, etc. to prevent this from the outset. But JSON was first standardised 25 years ago, and AJAX wasn't invented when this was written. JSON was a fledgling and not widely used yet.

I guess we had netrc - but that's not standardised and everyone implements it differently. Same story for INI.

There was XML - at a time when it was full of RCEs, and everyone was acknowledging that its parser would be 90% of your program. Would you have joined the people disparaging json at the time as reinventing xml?

This vulnerability is the fault of standard data formats not yet being common enough to be widely adopted.


> What part of this would be prevented by another language?

> You'd need to switch your data format to something like json, toml, etc.

The part where, if you wrote this in any modern language's ecosystem, you would do this.

Yes, modern languages and their ecosystems likely did not exist back then. The lesson going forwards is that we shouldn't keep doing new things like we did back then.

Saying smithing metal by using a pair of hand driven bellows is inefficient isn't to say the blacksmiths ages ago who had no better option were doing something wrong.


Ok... So you're not saying C is a problem.

You're saying every few years, we should torch our code and rewrite from scratch, using new tools.

... Enjoy your collapsing codebase. I'll stick with what works, thanks.


What an absurdly bad faith interpretation. I never said anything to even suggest abandoning old code.

As demonstrated by vulnerabilities like the one in the article, C (and its ecosystem) doesn't "work", so I'm glad to hear that you won't be sticking with that for new projects going forwards.


... Except you have already admitted that it has bupkis to do with C.

You said it was a lack of "modern tooling". Modern C toolchains vastly outstrip most, for modernity. C23 is three years old.

But no. I won't be breaking compatibility for everyone, just to chase a shiny nickel. That is burning the barnyard for fear of geese.


This is a total strawman. Blaming old tools for a problem is not a cry to never ever use old tools.

They want better choices going forward. You made up this constant rewrite crusade so you could have something to be mad at.

Also a few compiler updates don't make an ecosystem modern.


It's not a straw man. We were talking about git using a particular thing. They said particular thing was a dumb idea and git should change it. That's a rewrite.


They did not say git should replace this parser, though you can argue they implied it.

They did not say git should change language.

They did not say "every few years, we should torch our code and rewrite from scratch, using new tools." That's a fever dream that barely resembles their words in a way that makes you super right and them super unreasonable.

A key phrase they said was "we shouldn't keep doing new things like we did back then". New things. That's not saying to rewrite anything.


I have a feeling that this code was developed before any of those languages were widely popular and before their package managers or packages were mature.

This file was written like 20 years ago.


Sure, I'm not trying to assign blame to Linus for deciding to write git in C, I'm saying that modern tooling (not C) would prevent the bug with reasonably high probability and that that's a factor when deciding what to do going forwards.


We keep getting RCEs in C because tons of useful programs are written in C. If someone writes a useful program in Rust, we'll get RCEs in Rust.


There are a lot of useful programs written in Rust nowadays. This comment might have made more sense like 5 years ago.


I mean Photoshop, Excel, Figma, etc -- programs I can show someone and say "Look, here's a cool thing you couldn't do with a computer before, but now you can!" Nothing I've seen in Rust meets that bar for me.


materialize.com (disclosure: I worked there for five years) is entirely written in Rust and as far as I know the first system to support incremental view maintenance over the full range of SQL semantics (including e.g. fully precise non-windowed joins, recursive queries, etc.) with a SQL interface (Postgres dialect).


It's not that only C programs are useful. It's that subtle mistakes in C result in more catastrophic vulnerabilities.

Make a mistake in application code in a language like, say Java, and you'll end up with an exception.


The article refutes that somewhat:

> I find this particularly interesting because this isn't fundamentally a problem of the software being written in C. These are logic errors that are possible in nearly all languages, the common factor being this is a vulnerability in the interprocess communication of the components (either between git and external processes, or within the components of git itself).


C programmers don't see C problems. They see logic errors that are possible in any language.


Running with scissors isn't a problem. The problem is stabbing yourself with them. Isn't it obvious?


As mentioned in the article, this is a logic error that has nothing to do with C strings.


Whilst true, there’s a swathe of modern tooling that will aid in marshalling data for IPC. Would you not agree that if protobuf, json or yaml were used, it’d be far less likely for this bug to have slipped in?


The OC was about language choice. You can use protobuf, json or yaml in C as well.

In general, though, all these can be wildly overkill for many tasks. At some point you just need to write good code and actually test it.


No I would not agree that YAML or JSON parsers in any language are far less likely to have logic errors, and I'm not sure why protobuf (a binary format) would be a good choice for a human readable file.

INI is not a particularly complex format (less complex than YAML for example), and there are existing open source parsers written in C that could have been used.

You can dig in all you want, but this is not an issue with C strings or the INI format.


This isn't even a parser error at all - the INI format comes from DOS/Windows where a trailing carriage return would not be considered part of the value either.


In isolation, for any one particular bug, yes, but if you start applying this logic to everything, even problems as simple as reading some bytes from a file, you end up with a heap of dependencies for the most mundane things. We've tried that, it's bad.


I don't believe we must apply any guideline ad absurdum. Using a battle tested marshalling/serialization library is clearly the way to go most often. Of course, one can still construct difficult to parse XML and JSON or any other blob for any given format, but the chances that bad input will result in an RCE are lower.


On the contrary, we've tried it and it works great.


No, I think in general you should trust other people with experience in an area more than your own naive self. Division of labor and all that.

There are exceptions, as always, but using dependencies is good as a first approximation.


Having "safe" yaml parsing is a whole topic of head scratching in whatever language of your choice if you want a rabbit hole to look into...


If only it were just the c code that was causing people to be owned lol


Using other languages would likely fix the issue but as a side-effect. Most people would expect a C-vs-Rust comparison so I’ll take Go as an example.

Nobody would write the configuration parsing code by hand; they would just use whatever TOML library is available for Go. No INI shenanigans, and people would just use any available stricter format (strings must be quoted in TOML).

So yeah, Rust and Go and Python and Java and Node and Ruby and whatnot would not have the bug just by virtue of having a package manager. The actual language is irrelevant.

However, whatever the language, the same hand implementation would have had the exact same bug.
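A minimal sketch of that last point, in Python rather than C. This is a toy hand-rolled reader/writer pair, not git's actual code; the function names and the `key = value` line format are assumptions made for illustration. The writer emits values verbatim, the reader drops a trailing CR/LF as line-ending noise, and a value that ends in a carriage return silently changes on the way back in.

```python
from typing import Tuple


def write_entry(key: str, value: str) -> str:
    # Naive writer: emits the value verbatim, with no quoting or escaping.
    return f"{key} = {value}\n"


def read_entry(line: str) -> Tuple[str, str]:
    # Naive reader: treats trailing CR/LF as line-ending noise and drops it,
    # then splits on the first "=".
    key, _, value = line.rstrip("\r\n").partition("=")
    return key.strip(), value.strip(" \t")


def round_trips(value: str) -> bool:
    # True when the value read back equals the value that was written.
    return read_entry(write_entry("path", value))[1] == value


if __name__ == "__main__":
    print(round_trips("sub"))    # ordinary value
    print(round_trips("sub\r"))  # value ending in a carriage return
```

Nothing here depends on memory safety: the mismatch comes from the writer and the reader disagreeing about what a trailing carriage return means. Quoting such values on write, or rejecting them outright, would close the gap in this toy version.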



