Hacker News

Audio, video, and image codecs written in Rust seem like a fantastic early opportunity to investigate Rust's utility for closing down related security vulnerabilities.

Is anyone aware of work on fuzz testing an interesting Rust-based "attack surface"[1]? I'm very interested to see what kinds of issues are/aren't turned up in Rust vs. the usual C/C++ code for these libraries.

[1] By "attack surface", I'm thinking of traditional bits of code with high exposure to untrusted data: codecs, HTTP parsers, etc.



I know we've run the Rust URL parser through AFL. IIRC, it found one panic (brackets in URLs) and nothing else. Nothing security-sensitive was found.

(Don't take the statement about security too strongly, however; AFL will not detect logic problems that could result in security issues, such as interactions with TLS hostname validation or whatnot.)


I don't think that production-quality codecs will be written in Rust. As far as I'm aware, codecs are usually very complex pieces of software that employ a lot of hand-written assembly and very low-level C. Rust just doesn't offer anything valuable in this area.


A very complex piece of software is exactly the sort of thing you want to avoid writing in C or assembly. The last person I know who wrote a codec implementation did it by machine-translating the spec into a functional program that output assembler, C, Python, Verilog, etc. as targets. That did require handcrafting "leaf nodes" for things like matrix multiplication, but they were small enough to verify.


Yes, I think of Rust as a promising target for parser generators. Thinking about text parsing, if Bison generated Rust, then you'd have a memory-safe parser that should be about as efficient as C. Something like this probably already exists or is being worked on. Ideally existing code generators could target Rust also to get memory safety.
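As a toy illustration of the shape such generated parser code could take (the function and its grammar are invented here, not from any real generator): every input access goes through safe slice operations, so even a buggy grammar can't read out of bounds.

```rust
// Hypothetical sketch of a generated parser fragment: consume a run of
// ASCII digits from the front of the input, returning the parsed value
// and the remaining input, or None on no match.
fn parse_unsigned(input: &str) -> Option<(u64, &str)> {
    // Index of the first non-digit character (or end of input).
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None; // no digits at the front: the rule fails cleanly
    }
    // Slicing is bounds-checked; `parse` rejects overflow instead of
    // silently wrapping.
    input[..end].parse().ok().map(|n| (n, &input[end..]))
}
```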

There are some SIMD optimizations in parsers though that I don't know how easily you could express in Rust. The quintessential example of this for text parsing is Clang's optimization that uses SSE to skip over C++ comments 16 bytes at a time:

https://github.com/llvm-mirror/clang/blob/61f6bf2c8a8e94c4fa...


I am far from an expert in this area, but you can do SIMD with Rust: https://github.com/huonw/simd


This looks like a good start (though experimental). If it supported something like AltiVec's vec_any_eq() intrinsic on its u8x16 type, that would do the trick. vec_any_eq() takes a vector and a value and returns true if any element of the vector equals the value.

On x86 with SSE, this could generate a sequence of two instructions: pcmpeqb (do 16 byte-wise compares) followed by pmovmskb (collect the 16 comparison results into a single byte). Then you'd get the same efficiency as what Clang does (Clang searches 8 bytes at a time for a '/' character when skipping over comments).
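The pcmpeqb + pmovmskb sequence described above can be sketched in today's Rust with the stable std::arch intrinsics (which postdate the experimental crate linked above); this is an illustrative sketch with a scalar fallback for other architectures, not production code:

```rust
// Find the first occurrence of `needle`, scanning 16 bytes at a time
// with SSE2 on x86_64 (pcmpeqb + pmovmskb), then a scalar tail.
#[cfg(target_arch = "x86_64")]
fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    use std::arch::x86_64::*;
    let mut i = 0;
    unsafe {
        let needles = _mm_set1_epi8(needle as i8);
        while i + 16 <= haystack.len() {
            let chunk = _mm_loadu_si128(haystack.as_ptr().add(i) as *const __m128i);
            let eq = _mm_cmpeq_epi8(chunk, needles); // 16 byte-wise compares
            let mask = _mm_movemask_epi8(eq);        // collect results into 16 bits
            if mask != 0 {
                return Some(i + mask.trailing_zeros() as usize);
            }
            i += 16;
        }
    }
    // Fewer than 16 bytes remain: plain safe scan.
    haystack[i..].iter().position(|&b| b == needle).map(|p| i + p)
}

// Portable fallback so the sketch builds everywhere.
#[cfg(not(target_arch = "x86_64"))]
fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}
```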


Huon is working on SIMD full time at Mozilla this summer as far as I know. We should see more mature support materialize in the next couple of months for sure.


I have been working with https://github.com/kevinmehall/rust-peg for a while. Also there is https://github.com/Geal/nom


A while back there was a blog post (I think by Dark Shikari) about the use of assembly in x264. I couldn't find it by searching, but basically the very best C versions of essential image processing algorithms were orders of magnitude slower than the hand-crafted assembly versions, especially with SIMD instructions.

To create a viable codec, a functional spec compiler would have to close that performance gap. Edit: it might be possible to create a set of SIMD algorithmic building blocks (e.g. like liboil[0]) that could be verified for correctness then incorporated into the compiler.

[0] https://wiki.freedesktop.org/liboil/


There has been work on typed assembly that could also apply: http://research.microsoft.com/en-us/projects/talproj/


Conveniently computers are getting orders of magnitude faster and cheaper, while the software on them - by virtue of complexity - has an increasing attack surface.

I will happily take a video player that uses 4% of my CPU power and has fewer potential vulnerabilities over one that uses 0.4%.


When it comes to video encoding, the tradeoff is often between encoding live video and not being able to encode live video at all.

I think there's also an important tradeoff to be made in how much of a carbon footprint we dedicate to software. I really don't think we should be trying to make software expand to consume all available resources. We have to be both efficient and secure.


Even when you are on a mobile device constrained by battery?


Very interesting. How can we get more info on this? Is there any public code or product out there?


The resulting commercial product is not the codec itself, but a set of test signals for codec verification for use by hardware vendors.

http://www.argondesign.com/products/argon-streams-hevc/


Thanks! I'm contributing to an open-source project where we do a lot of standardization (including MP4). We're trying to improve the parser generation by using a model from the specification. We even have some funding for this. I'd be happy if we could discuss this (contact@gpac.io). Better standards means a better world for everyone :)


Can you imagine writing a Bluetooth spec (and its profiles) in a high-level language and letting Rust (or something else) produce the native driver code?


Yes, that's exactly what it is about! Unfortunately, our current results show that it still takes a lot of time to model the spec correctly. Any help is appreciated; my contact is available in the message above :)


We'll see. For inline assembly, sure, it doesn't, but you can do some pretty intense stuff with Rust's type system to guarantee certain static properties. For example, earlier today this comment appeared on Reddit, talking about how to use the type system to guarantee that you're not doing out-of-bound array accesses: http://www.reddit.com/r/rust/comments/3aahl1/outside_of_clos...

It is true that the lower you go, the less we currently offer, but over time I expect library authors will use this and other future features to check even more properties without runtime overhead.
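For a small library-level flavor of this (a hypothetical sketch, not the approach from the linked Reddit comment): chunks_exact encodes the "every chunk has exactly two elements" invariant in the API, so the per-chunk indexing below can never go out of bounds.

```rust
// Sum adjacent pairs of a slice. `chunks_exact(2)` yields only complete
// 2-element chunks, so `p[0]` and `p[1]` are always in bounds; a
// trailing odd element is simply not visited.
fn sum_pairs(xs: &[u32]) -> u32 {
    xs.chunks_exact(2).map(|p| p[0] + p[1]).sum()
}
```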


A more modern language (modules, closures, pattern matching, package management, type inference) and memory safety are nice. If I were writing a new codec, I'd seriously consider writing the lion's share of it in Rust, to save time and to avoid security problems.


I think the sentiment of the parent was that we should avoid (re)writing things in pet language of the week.

Sure, Rust has a great following, but so does Go. Which is the "right" choice for a new codec? I don't think there is a clear answer.


Rust does not require a runtime / garbage collector and offers a programming model that interacts very precisely with external threading requirements. It supports being called from the native platform ABI without initialization, and it supports writing libraries to match an existing, native ("C") ABI and function as a drop-in replacement. Go does not do these things.
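As a minimal sketch of that last point (the function name and the YUV example are invented for illustration): a Rust function exported with the platform's C ABI, callable from C with no runtime setup.

```rust
// Hypothetical sketch: exporting a C-callable symbol from Rust.
// `extern "C"` fixes the calling convention; `#[no_mangle]` keeps the
// symbol name so a C header can declare it directly.
#[no_mangle]
pub extern "C" fn yuv420_frame_bytes(width: u32, height: u32) -> u64 {
    // Size in bytes of an 8-bit YUV 4:2:0 frame: width * height * 3 / 2,
    // widened to u64 first so the multiply cannot overflow u32.
    (width as u64) * (height as u64) * 3 / 2
}
```

The matching C declaration would be `uint64_t yuv420_frame_bytes(uint32_t, uint32_t);`.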

The suggestion to use Rust here is because this is specifically a thing Rust was designed to do, not because it has a huge following. Go is a great language, but it is designed for different things (and has a following of people who want it to do those different things). If those two languages are the two choices for replacing a codec that functions as a library, there is an objectively correct answer.


For a codec there is probably a clearer answer than for some other applications- codecs are very performance-sensitive (probably don't want a GC) and also very security-sensitive (you're decoding completely untrusted data). Previously they've always been written in C or C++ because of the performance requirements, and Rust keeps the same performance while adding some more safety checks. Go doesn't.


> Sure, Rust has a great following, but so does Go. Which is the "right" choice for a new codec? I don't think there is a clear answer.

Sure there is. Go would be an absolutely horrible choice for writing a video codec -- and wasn't even designed for that kind of work in the first place.


> Sure, Rust has a great following, but so does Go. Which is the "right" choice for a new codec? I don't think there is a clear answer.

Actually, there is quite a clear right answer for a hardware codec.

You have 3 choices: C, C++, or Rust.

While C or C++ is probably better for the core of the codec, that's not where most of your security and concurrency issues are likely to be.

Most of your issues are in unpacking headers, validating packets, getting decode parameters correct, etc. Rust is WAY better for this task than C/C++ from a security or concurrency point of view.
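A hypothetical sketch of what that header-unpacking code looks like in Rust (the packet format here is invented): every access is bounds-checked, so a truncated packet, or one whose length field lies, yields None instead of a C-style out-of-bounds read.

```rust
// An invented 3-byte header: 1-byte version, 2-byte big-endian payload
// length, followed by the payload itself.
struct Header {
    version: u8,
    payload_len: u16,
}

fn parse_header(buf: &[u8]) -> Option<Header> {
    // `get` returns None past the end of the slice; `?` propagates it.
    let version = *buf.get(0)?;
    if version != 1 {
        return None; // unsupported version
    }
    let payload_len = u16::from_be_bytes([*buf.get(1)?, *buf.get(2)?]);
    // Reject packets that claim more payload than the buffer holds.
    if buf.len() < 3 + payload_len as usize {
        return None;
    }
    Some(Header { version, payload_len })
}
```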


There is also Ada. :)


How easy is it to interlink C and Ada?



As codec operations move into dedicated hardware blocks on silicon, their code will be more about pointing registers at blocks of data and less about performing the mathematics on CPU.


Rust code can incorporate assembly just like C can, and low-level Rust should be just as fast as low-level C, so it sounds like Rust will be a great language for writing production-quality codecs.
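Today's Rust does have stable inline assembly via the asm! macro; a trivial x86_64 sketch (with a plain-Rust fallback so it builds on other targets):

```rust
// Hypothetical sketch: a safe wrapper around a single inline-assembly
// instruction. On x86_64, compute a + b with one `lea`.
#[cfg(target_arch = "x86_64")]
fn add(a: u64, b: u64) -> u64 {
    let out: u64;
    unsafe {
        std::arch::asm!(
            "lea {out}, [{a} + {b}]",
            a = in(reg) a,
            b = in(reg) b,
            out = lateout(reg) out,
        );
    }
    out
}

// Fallback for non-x86_64 architectures.
#[cfg(not(target_arch = "x86_64"))]
fn add(a: u64, b: u64) -> u64 {
    a + b
}
```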


I use downvotes pretty sparingly on HN, but I'm afraid you're completely wrong. This is the kind of thing that Rust excels at.


What about (de)muxers?


Container formats are much, much, much simpler and demand much less CPU than sample-data formats. One could (and I have) write a perfectly usable ISO parser in Emacs Lisp, of all things. The reason they tend to be written in C is that a container parser is not super useful outside the context of handling the sample data; the overwhelming majority of usages of code that understands containers is to pump out pointers to each individual sample. Doing this across process boundaries would be ... unpleasant.
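For instance, the fixed part of an ISO base-media (MP4) box header is just a 4-byte big-endian size plus a 4-byte type code, and a safe Rust sketch of reading it is short (the special size values 0 and 1, and everything past the header, are skipped here):

```rust
// Read one MP4 box header: returns (declared size, 4-byte type code),
// or None if the buffer is truncated or the size is nonsense.
fn parse_box_header(buf: &[u8]) -> Option<(u32, [u8; 4])> {
    if buf.len() < 8 {
        return None; // truncated header
    }
    let size = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let kind = [buf[4], buf[5], buf[6], buf[7]];
    // A box must be at least big enough to hold its own 8-byte header.
    // (size == 0 and size == 1 have special meanings, ignored here.)
    if size < 8 {
        return None;
    }
    Some((size, kind))
}
```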



