
> If you're tired of hearing about memory safety, this article is for you.

Tell me more about memory safety, any time; just hold the Rust.

Rust skeptics are not memory safety skeptics. Hopefully, there are no memory safety skeptics, other than rhetorical strawmen.



I've spoken with quite a few C++ developers who swear that they don't need memory safety tooling because they're good enough to achieve it on their own. More than a few also expect that the worst that can happen from breaking memory safety is a SEGFAULT, which also suggests that they don't understand memory safety.

So, I'd say that there is still some outreach to do on the topic.

On the other hand, you're absolutely right that Rust is only one of the many ways to get there.


OK, so those are skeptics about tooling, not about memory safety per se.

And not even about tooling per se, since achieving safety on their own doesn't literally mean on their own; they are relying on tooling.

True Scotsman's "on your own" means working in assembly language, in which you have to carefully manage even just calling a function and returning from it: not leaving stray arguments on the stack, saving and restoring all callee-saved registers that are used in the function, and so on.

Someone who thinks that their job is not to have segfaults is pretty green, obviously.


I'm not entirely certain what you mean.

These C++ developers I'm mentioning are all at least senior (some of them staff+), which makes their remarks on segfaults scary, because clearly, they haven't realized that a segfault is the best case scenario, not the worst one. This means that they very much need a refresher course on memory safety and why it matters.

The fact that they assume they're good enough to avoid memory errors without tooling, even though most of these errors are invisible and may remain invisible for years before being discovered, is a second red flag, because it strongly suggests that they misunderstand the difficulty.

Of course, the conversation between you and me is complicated by the fact that the same words "memory safety" could apply to the programming language/tooling or to the compiled binary.


The whole discussion sucks, because Rust is not memory safe, and can easily be worse than C and C++ regarding memory safety. Memory unsafety is entirely possible in Rust[0].

[0]: https://materialize.com/blog/rust-concurrency-bug-unbounded-...


Looks like unsafe code doing the actual freeing. Unsafe Rust has never been claimed to be memory safe.
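
For anyone wondering what that looks like in practice, here's a minimal sketch (illustrative only, not the linked bug) of how an unsafe block opts out of the guarantees: this compiles, and it's a use-after-free.

    // Illustrative only: a use-after-free that safe Rust would reject,
    // but that a raw pointer behind `unsafe` lets through.
    fn main() {
        let p: *const i32;
        {
            let x = Box::new(42);
            p = &*x as *const i32; // casting to a raw pointer erases the lifetime
        } // x is dropped here; p now dangles
        let v = unsafe { *p }; // undefined behavior: read of freed memory
        println!("{v}");
    }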


Not sure where that comes from. We're not even discussing Rust in this thread.


> True Scotsman's "on your own" means working in assembly language, in which you have to carefully manage even just calling a function and returning from it: not leaving stray arguments on the stack, saving and restoring all callee-saved register that are used in the function and so on.

I've actually been trying to figure out if it's practical to write assembly but then use a proof assistant to guarantee that it maintains safety. Weirdly, it feels easier than C insofar as you don't have (C's) undefined behavior to contend with. But formal methods are a whole thing and I'm very very new to them, so it remains to be seen whether this is actually a good idea, or awful and stupid :) (Edit: ...or merely impractical, for that matter)


You should take a look at TAL (the Typed Assembly Language) and DTAL (the Dependently Typed Assembly Language). The latter was designed specifically to support co-writing assembly and formal proofs.


It would be impractical. It would be monumental to even specify what 'safe' means with that degree of freedom, let alone allow/reject arbitrary assembly code according to the spec.

E.g. trying to reject string operations which write beyond the trailing \0. At the assembly level, \0 is only one of many possible conventions for bounding a string; e.g. maybe free() is allowed to write past 0s. So you'd need to decide whether an operation is safe depending on context.


Well yeah, I assumed that if I was doing formal verification, it would necessarily involve fully specifying/tracking data sizes, which would mean that all strings would be Pascal strings (at least internally; obviously it might be necessary to tack on a \0 for compatibility with external code).
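
For what it's worth, that design is easy to sketch. Here's a minimal, purely illustrative PString type in Rust (a hypothetical name) where the stored length, not the trailing \0, is the authoritative bound:

    // A length-prefixed ("Pascal") string that also carries a trailing
    // NUL purely for compatibility with C-style consumers.
    struct PString {
        len: usize,
        bytes: Vec<u8>, // always ends with a single 0 byte
    }

    impl PString {
        fn new(s: &str) -> Self {
            let mut bytes = s.as_bytes().to_vec();
            bytes.push(0); // tacked on only for external code
            PString { len: s.len(), bytes }
        }
        // Safe view, bounded by the stored length rather than the NUL.
        fn as_slice(&self) -> &[u8] {
            &self.bytes[..self.len]
        }
    }

    fn main() {
        let s = PString::new("hello");
        assert_eq!(s.as_slice(), b"hello");
        println!("len = {}", s.len);
    }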


You could do that and some very expensive aerospace tooling exists in that direction (for verifying compiler outputs), but an easier way is to use sanitizers on a best effort basis. You can either use something like AFL++ with QAsan or use retrowrite to insert whatever you want into the binary as if it was a normal compiled program.


That doesn’t sound so different from using a programming language that guarantees safety. The compiler proves that its translations are safe.


Sounds like a straw man. I know developers who are good enough to achieve it on their own, but they use the tooling anyway, because one can't always write perfect code: feature requests might be coming in too fast, team members have different skill levels, dev turnover happens, etc.

Furthermore, memory bugs can still be considered by teams as just another bug, so they might not get prioritised.

The only significant difference is that there’s lots of criminal energy targeting them, otherwise nobody would care much.


I wish memory safety skepticism was nothing more than a rhetorical strawman. It's not hard to find prominent people who think differently though. Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety instead to spend the remaining effort on other types of safety.

I can also point to more extreme skeptics like Dan O'Dowd, who argue that memory safety is just about getting gud and you don't actually need language affordances.

Discussions about this topic would be a lot less heated if everyone was on the same page to start. They're not. It's taken advocates years of effort to get to the point where we can start talking about memory safety without immediate negative reactions and that process is still ongoing.


> Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety instead to spend the remaining effort on other types of safety.

One thing I've noticed when people make these arguments is that they tend to ignore the fact that most (all?) of these other safeties they're talking about depend on being able to reason about the behaviour of the program. But when you violate memory safety a common outcome is undefined behaviour, which has unpredictable effects on program behaviour.

These other safeties have a hard dependency on memory safety. If you don't have memory safety, you cannot guarantee these other safeties because you can no longer reason about the behaviour of the program.


Herb Sutter's article on this.[1]

For C/C++, memory safety is a retrofit to a language never designed for it. Many people, including me, have tried to improve the safety of C/C++ without breaking existing code. It's a painful thing to attempt. It doesn't seem to be possible to do it perfectly. Sutter is taking yet another crack at that problem, hoping to save C/C++ from becoming obsolete, or at least disfavored. Read his own words to see where he's coming from and where he is trying to go.

Any new language should be memory safe. Most of them since Java have been.

The trouble with thinking about this in terms of "95% safe" is that attackers are not random. They can aim at the 5%.

[1] https://herbsutter.com/2024/03/11/safety-in-context/


> Most of them since Java have been

The most popular ones have not necessarily been. Notably Go, Zig, and Swift are not fully memory safe (I've heard this may have changed recently for Swift).


Could you expand on how Go isn’t memory safe?


Go's memory safety blows up under concurrency. Non-trivial data races are Undefined Behaviour in Go, violating all safety considerations including memory safety.


While that keeps being given as an example, Go does not offer C, C++, or Objective-C levels of memory corruption opportunities.

Let's not let the perfect be the enemy of the good.

Even with all my criticism of many of Go's design decisions, I'd rather have more infrastructure code written in Go than in C and derived languages.

Or in any of the supposedly better C alternatives, with manual memory management and use-after-free.


Go maps don't have enough locking to be thread safe, as I understand it. That was true at one time. Was it fixed?


I would not expect it to make sense to provide this as the default for Go's hash table type. My understanding is that modern Go has at least a best-effort "fail immediately" detector for this particular case, so when you've screwed this up, your code will exit in production, reporting the bug. I guess you can curse "stupid" Go for not allowing you to write nonsense if you like, or you could use the right tool for the job.


Isn't it strange that they created a new language where one of its main selling points is concurrency, and punted on that design issue?

(Yes I'm aware that Go literature 'encourages' the use of Channels and certain patterns)


Similar to how Rust clearly is not memory safe, Go is also not memory safe.


Not sure what this is saying, but you can create a trivial concurrent program that violates the memory safety guarantees Go is supposed to provide [1]: https://biggo.com/news/202507251923_Go_Memory_Safety_Debate

That is not true of Rust.
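
To make that concrete, here's roughly what the concurrent-map case is forced to look like in Rust (a minimal sketch): sharing a bare HashMap across threads doesn't compile, so you end up with something like Arc<Mutex<...>>.

    use std::collections::HashMap;
    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // The compiler rejects mutating a plain HashMap from several
        // threads; the lock is the price of getting this to compile.
        let map = Arc::new(Mutex::new(HashMap::new()));
        let handles: Vec<_> = (0..4)
            .map(|i| {
                let map = Arc::clone(&map);
                thread::spawn(move || {
                    map.lock().unwrap().insert(i, i * i);
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        println!("{:?}", map.lock().unwrap());
    }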


> [You can] create a trivial concurrent [Go] program that violates the memory safety guarantees ...

> That is not true of Rust.

It's not supposed to be, but Rust does have plenty of outstanding soundness bugs: https://github.com/rust-lang/rust/issues?q=state%3Aopen%20la...

Rust as intended is safe and sound unless you write u-n-s-a-f-e, but as implemented has a ways to go.


It's hard to imagine that, if a memory problem were reported to Sutter about one of his own programs, he would not prioritize fixing it over most other work.

However, I imagine he would probably take into consideration the context. Who and what is the program for? And does the issue only reproduce if the program is misused? Does the program handle untrusted inputs? Or are there conceivable situations in which a user of the program could be duped by a bad actor into feeding the program a malicious input?

Imagine Sutter wrote a C compiler, and someone found a way to crash it. But the only way to reproduce that crash is via code that invokes undefined behavior. Why would Herb prioritize fixing that over other work?

Suppose the user insists that he's running the compiler as a CGI script, allowing unauthenticated visitors to their site to compile programs, making it a security issue.

How should Herb reasonably reply to that?


The problem in this conversation is that you are equivocating between "fixing memory safety bugs" and "preventing memory safety bugs statically." When this blog post refers to "memory safety skeptics," it refers to people who think the second is not a good way to expend engineering resources, not your imagined flagrantly irresponsible engineer who is satisfied to deliver a known nonfunctional product.


It's worth differentiating the case of a specific program from the more general case of memory safety as a language feature. A specific program might take additional measures appropriate to the problem domain like static analysis or using a restricted subset of the language. Memory safety at the language level has to work for most or all code written using that language.

Herb is usually talking about the latter because of the nature of his role, like he does here [0]. I'm willing to give him the benefit of the doubt on his opinions about specific programs, because I disagree with his language opinions.

[0] https://herbsutter.com/2024/03/11/safety-in-context/


Yeah, what kind of Crazy Person would make a web site where unauthenticated visitors can write programs and it just compiles them?

What would you even call such a thing? "Compiler Explorer"?

I guess maybe if Herb had helped the guy who owned that web site, say, Matt Godbolt, to enable his "Syntax 2 for C++" compiler cppfront on that site, it would feel like Herb ought to take some responsibility, right?

Or maybe I am being unreasonable?


> Imagine Sutter wrote a C compiler, and someone found a way to crash it. But the only way to reproduce that crash is via code that invokes undefined behavior. Why would Herb prioritize fixing that over other work?

Because submitting code that invokes undefined behavior to one's compiler is a very normal thing that most working C developers do dozens of times per day, and something that a decent C compiler should behave reasonably in response to. (One could argue that crashing is acceptable but erasing the developer's hard drive is not; by definition, though, that means undefined behaviour in this situation is not acceptable.)

> Suppose the user insists that he's running the compiler as a CGI script, allowing unauthenticated visitors to their site to compile programs, making it a security issue. How should Herb reasonably reply to that?

By fixing the bug?


> Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety

I wonder how you figure out when your codebase has reached 95% safety? Or is it OK to stop looking for memory unsafety when you hit, say, 92% safe?


Anything above 90% safety is acceptable because attackers look at that and say “look they’ve tried hard. We shouldn’t attack them, it’ll only discourage further efforts from them.” When it comes to software security, it’s the thought that counts.


I would not describe Herb as a memory safety skeptic. He's a skeptic of what is practically achievable w/r/t memory safety within the C++ language and community. All 100% memory safe evolutions of the language are guaranteed to break so much that they receive near zero adoption. In that context I think it makes sense to talk about what version of the language can we create to catch the most errors while actually getting people to use it.


> Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety instead to spend the remaining effort on other types of safety.

I don't really see a) how that's skepticism of memory safety or b) why it's not seen as a reasonable position. Just because someone doesn't think X is the most important thing ever doesn't mean they are skeptical of it, but rather that the person holding the 100% viewpoint is probably the one with the extreme position.


Look at the definition quoted in the article:

    [A] program execution is memory safe so long as a particular list of bad things, called memory-access errors, never occur
"95% memory safety" is not a meaningful concept under this definition! That's very much skepticism of memory safety as defined in this article, to highlight the key phrase in the comment you're quoting.

It's also not a meaningful concept within the C++ language standard written by the committee Herb Sutter chairs. Memory unsafety is undefined behavior (UB). C++ code containing UB has no defined semantics and is inherently incorrect, whether that's 1 violation or 1000.

Now, we can certainly discuss the practical ramifications of 95% vs 100%, but even here Herb's arguments have fallen notoriously flat. I'll link Sean Baxter's piece on why Herb's actual proposals fail to achieve even these more modest goals as an entry point [0]. No need to rehash the volumes of digital ink already spilled on this subject in this particular comment thread.

[0] https://www.circle-lang.org/draft-profiles.html


Skepticism of an absolutist binary take on memory safety is not the same as skepticism of memory safety in general and it's important to distinguish the two.

It's like saying that people skeptical of formal verification are actually skeptical of eliminating bugs. Most people are not skeptical of eliminating bugs, but they might be skeptical of extreme approaches to do so.


As I explained in a sibling comment, memory safety violations aren't comparable to logic bugs. Avoiding them isn't an absolutist take; it's just a basic requirement in common programming languages like C and C++. That's not debatable; it's written right into the language standards, core guidelines, and increasingly government standards too.

If you think that's impossibly difficult, you're starting to understand the basic problem. We already know from other languages that memory safety is possible. I've already linked one proposal to retrofit similar safety onto C++. The author of Fil-C is elsewhere in these comments arguing for another way.


Everything you say about memory safety issues applies to logic bugs too. And likewise in reverse: you can have a memory safety issue that doesn't result in a vulnerability or crash. So I don't buy that memory safety is so different from other types of bugs that it should be considered a binary issue and not on a spectrum like everything else!


> Everything you say about memory safety issues applies to logic bugs too.

It doesn't, because logic bugs generally have, or can be made to have, limited scope.

> And likewise in reverse - you can have a memory safety issue that doesn't result in a vulnerability or crash.

No you can't, not in standard C. Any case of memory unsafety is undefined behaviour, therefore a conforming implementation may implement it as a vulnerability and/or crash. (You can have a memory safety issue that happens to not result in a vulnerability or crash in the current version of gcc/clang, but that's a lot less reassuring)


This whole memory-bugs-are-magical thinking just comes from the Rust community and is not an axiomatic truth.

It’s also trivial to discount, since the classical evaluation of bugs is based on actual impact, not some nebulous notions of scope or what-may-happen.

In practice, the program will crash most of the time. Maybe it will corrupt or erase some files. Maybe it will crash the Windows kernel and cause 10 billion in damages; just like a Rust panic would, by the way.


We simply don't treat "gcc segfaults on my example.c" the same way as "libssl has an exploitable buffer overflow". That's a synopsis of the nuance.

Materials to be consumed by engineers are often unsafe when misused. Not just programs like toolchains with undefined behaviors, but in general. Steel beams buckle if overloaded. Transistors overheat and explode outside of their SOA (safe operating area).

When engineers make something for the public, their job is to combine the unsafe bits, but make something which is safe, even against casual misuse.

When engineers make something for other engineers, that is less so; engineers are expected to read the data sheet.


> engineers are expected to read the data sheet

Even if you know what the data sheet says, it's easier said than done, especially when the tool gives you basically no help. You're just praying people will magically git gud.


I prefer to treat testing like insurance. You purchase enough insurance to get the coverage you need, and not a penny more. Anything beyond that could be invested better.

Same thing with tests: get the coverage you need to build confidence in your codebase, but don't tie yourself in knots trying to get that last 10%. It's not worth it. Create some manual and integration tests and move on.

I feel like type safety, memory safety, thread safety, etc. are all similar. Building a physics core to simulate the stability of your nuclear stockpile? The typing should be second to none. Building yet another CSV exporter? Who gives a damn.

Context is so damn important.


This is a perfectly reasonable argument if memory safety issues are essentially similar to logic bugs, but memory unsafety isn't like a logic bug.

A logic bug in a library doesn't break unrelated code. It's meaningful to talk about the continued execution of a program in the presence of logic bugs. Logic bugs don't time travel. There are ways to exhaustively prove the absence of logic bugs, e.g. MC/DC or state space exploration, even if they're expensive.

None of these properties are necessarily true of memory safety. A single memory safety violation in a library can smash your stack, or allow your code to be exploited. You can't exhaustively defend against this with error handling either. In C and C++, it's not meaningful to even talk about continued execution in the presence of memory safety violations. In C++, memory safety violations can time travel. You typically can't prove the absence of memory safety violations, except in languages designed to allow that.
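
For contrast, in a memory-safe language the same category of mistake is a defined, bounded failure. A minimal Rust sketch:

    fn main() {
        let v = vec![1, 2, 3];
        let i = 10;
        // An out-of-bounds access has a defined outcome: an explicit
        // None here, or a deterministic panic with v[i]. It never
        // scribbles over unrelated memory.
        match v.get(i) {
            Some(x) => println!("v[{i}] = {x}"),
            None => println!("index {i} is out of bounds, handled explicitly"),
        }
    }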

With appropriate caveats noted (Fil-C, etc), we don't have good ways to retrofit memory safety onto languages and programs built without it or good ways to exhaustively diagnose violations. All we can do is structurally eliminate the possibility of memory unsafety in any code that might ever be used in a context where it's an important property. That's most code.


All of that stuff doesn't matter, though. If you look closely enough, everything is different from everything else, but in real life we only take significant differences into consideration; otherwise we'd go nuts.

Memory bugs have a high risk of exploitability. That’s it; the threat model will tell the team what they need to focus on.

Nothing in software or engineering is absolute. Some projects have decided they need compile-time guarantees about memory safety, others are experimenting with it, many still use C or C++ and the Earth keeps spinning.


If your attacker controls the data you're exporting to a CSV file, they can take advantage of a memory safety issue in your CSV exporter to execute arbitrary code on your machine.

https://georgemauer.net/2017/10/07/csv-injection.html
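
For reference, the classic mitigation for that linked attack is to neutralize any cell a spreadsheet would interpret as a formula. A minimal sketch in Rust (hypothetical sanitize_cell helper):

    // Prefix formula-triggering cells with a quote so spreadsheet
    // apps treat them as text rather than executing them.
    fn sanitize_cell(cell: &str) -> String {
        match cell.chars().next() {
            Some('=') | Some('+') | Some('-') | Some('@') => format!("'{cell}"),
            _ => cell.to_string(),
        }
    }

    fn main() {
        assert_eq!(sanitize_cell("=1+2"), "'=1+2");
        assert_eq!(sanitize_cell("hello"), "hello");
        println!("ok");
    }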


> Building yet another CSV exporter? Who gives a damn.

The problem with memory unsafe code is that it can have unexpected and unpredictable side effects, such as subtly altering the critical data you're exporting, or letting an attacker take control of your CSV exporter.

In other words, you need quite a lot of context to figure out that a memory bug in your CSV exporter won't be used for escalation. Figuring out that context, documenting it, and making sure that the context never changes for the lifetime of your code? That sounds like a much more complex proposition than using memory-safe tools in the first place.


I’m curious, what memory safe alternative is there for a C/C++ codebase that doesn’t give up performance?

Also, for what it's worth, Rust ports tend to perform faster according to Russinovich. Part of that may be second-system syndrome, although the more likely explanation is that the default std library is just better optimized (e.g. hash tables in Rust are significantly better than unordered_map).


Ada has been around for years. Its approach to memory safety isn't as strong as Rust's, but it is a lot stronger than C's or C++'s. C++ is also adding a lot of memory safety; it is a lot easier to bypass than it is in Rust (though I've seen Rust code where everything is marked unsafe), but you still get some memory safety if you try.

All benchmarks between Ada, C, C++, and Rust (and others) should come down to a wash. A skilled programmer can find a difference but it won't be significant. A skilled C++ programmer wouldn't be using unordered_map so it is unfair to point out you can use something bad.


It has, but you need SPARK too to avoid the runtime overhead. And I haven't seen adoption of Ada in the broader industry, so I wouldn't pick it based on that. I would need to understand why it remains limited to industries that mandate government certification.

> A skilled C++ programmer wouldn't be using unordered_map so it is unfair to point out you can use something bad.

Pretending defaults don't matter is naive, especially in a language that is so hostile to adding third-party dependencies (and even without that, defaults matter).


Rust has plenty of runtime overhead; or do you think Vec is fully statically checked?

There are other examples for other data structures.


> A skilled C++ programmer wouldn't be using unordered_map so it is unfair to point out you can use something bad.

C++ isn't my primary language. Pray tell - what's wrong with unordered_map, and what's the alternative?


std::unordered_map basically specifies a bucket-based hashtable implementation (read: lots of extra pointer chasing). Most high-performance hashtables are based on probing.
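
To illustrate the contrast, here's a toy linear-probing table in Rust (a sketch only: fixed capacity, no resizing or deletion). Entries live in one flat array, so lookups scan contiguous memory instead of chasing per-bucket pointers:

    struct ProbingMap {
        slots: Vec<Option<(u64, u64)>>, // one flat array of (key, value)
    }

    impl ProbingMap {
        fn with_capacity(n: usize) -> Self {
            ProbingMap { slots: vec![None; n] }
        }
        fn insert(&mut self, k: u64, v: u64) {
            let n = self.slots.len();
            let mut i = (k as usize) % n;
            // Probe forward until we find our key or an empty slot.
            // (A real table would resize long before it fills up.)
            while let Some((k2, _)) = self.slots[i] {
                if k2 == k { break; }
                i = (i + 1) % n;
            }
            self.slots[i] = Some((k, v));
        }
        fn get(&self, k: u64) -> Option<u64> {
            let n = self.slots.len();
            let mut i = (k as usize) % n;
            while let Some((k2, v)) = self.slots[i] {
                if k2 == k { return Some(v); }
                i = (i + 1) % n;
            }
            None
        }
    }

    fn main() {
        let mut m = ProbingMap::with_capacity(8);
        m.insert(3, 30);
        m.insert(11, 110); // 11 % 8 == 3: collides with key 3, probes to slot 4
        assert_eq!(m.get(11), Some(110));
        println!("ok");
    }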


Bluntly: exactly why does Ada matter, at all? The code footprint of software (1) written in Ada and (2) of concern when we talk about memory safety has measure zero. Is Ada better than C++? It turns out, I don't have to care: to go from C++ to Ada, one needs to rewrite, and if one is going to rewrite for memory safety, they're not going to rewrite to Ada.


If I'm going to rewrite, I'm going to look at whether formal proofs offer me anything, something Ada can give. Ada is tiny, I'll grant, but it has always been there.


A lot of Rust versus C or C++ comparisons be like: "Yo, check this Rust rewrite of Foo, which runs 2.5× faster¹ than the C original²".

---

1. Using 8 cores.

2. Single-threaded


1. Amdahl's law

2. That's a language feature too. Writing non-trivial multi-core programs in C or C++ takes a lot of effort and diligence. It's risky, and subtle mistakes can make programs chronically unstable, so we've had decades of programmers finding excuses for why a single thread is just fine, and people can find other uses for the remaining cores. OTOH Rust has enough safety guarantees and high-level abstractions that people can slap .par_iter() on their weekend project, and it will work.
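
For anyone who hasn't seen it, the par_iter() change really is the whole diff. A minimal sketch using the third-party rayon crate (rayon = "1" in Cargo.toml):

    use rayon::prelude::*;

    fn main() {
        let xs: Vec<u64> = (0..1_000_000).collect();
        // Swapping iter() for par_iter() is the entire change; rayon
        // splits the work across cores, and the compiler's Send/Sync
        // checks reject closures that would introduce a data race.
        let sum_sq: u64 = xs.par_iter().map(|x| x * x).sum();
        println!("{sum_sq}");
    }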


Given most machines have cores to spare, and people want answers faster, is that a bad thing?


I think the complaint is that the C version isn't multithreaded, ignoring that Rust makes it much easier to have a correct multithreaded implementation. OP is also conveniently ignoring that the Rust ports I referenced Russinovich talking about are MS-internal code bases where it's a 1:1 port, not a rearchitecture or an attempt to improve performance. The defaults being better, the no-alias guarantees the compiler takes advantage of, and automatic struct layout optimization all largely explain why it ends up being 5-20% faster having done nothing other than rewrite it.

But critics often seem to never engage with the actual data and just get knee-jerk defensive.


> Part of that may be second system syndrome

It may be that they've implemented it differently in a way that is more performant but has fewer features. A "rust port" is not automatically or apparently a 1:1 comparison.


It could be, but it's often just that the things you got in the box were higher quality and so your results are higher quality by default.

Better types like VecDeque<T>, better implementations of common ideas like sorting, even better fundamental concepts like providing the destructive move, or the owning Mutex by default.

Even the unassuming growable array type, Rust's Vec<T>, is just plain better than C++ std::vector<T>. It's not a huge difference and for many applications it won't matter, but that's the sort of pervasive quality difference I'm talking about and so I can well believe that in practice this ends up showing through.
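
The owning Mutex mentioned above is a good example of that pervasive quality: the lock owns the data, so unsynchronized access isn't even expressible. A minimal sketch:

    use std::sync::Mutex;

    fn main() {
        // Mutex<T> owns the value; the only way to reach it is through
        // the guard returned by lock().
        let counter = Mutex::new(0u32);
        {
            let mut guard = counter.lock().unwrap();
            *guard += 1;
        } // the lock is released when the guard goes out of scope
        println!("{}", *counter.lock().unwrap());
    }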


There are also compiler optimizations that aren't available in C++: noalias applied automatically everywhere and the compiler laying out structs optimally by default are probably a non-trivial perf win as well.
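
The layout point is easy to see directly. A minimal sketch (the 16-byte figure is what rustc typically produces; the default layout is deliberately unspecified):

    use std::mem::size_of;

    // With the default repr, rustc may reorder fields to shrink
    // padding; #[repr(C)] keeps declaration order, as C must.
    #[allow(dead_code)]
    struct Reordered {
        a: u8,
        b: u64,
        c: u8,
    }

    #[allow(dead_code)]
    #[repr(C)]
    struct DeclOrder {
        a: u8,
        b: u64,
        c: u8,
    }

    fn main() {
        println!("default repr: {} bytes", size_of::<Reordered>()); // typically 16
        println!("repr(C):      {} bytes", size_of::<DeclOrder>()); // 24
    }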


> Hopefully, there are no memory safety skeptics, other than rhetorical strawmen.

There are plenty of such skeptics. It's why Google, Microsoft, etc all needed to publish things like "70% of our vulnerabilities are memory-safety linked".

Even today, the increasing popularity of Zig indicates that memory-safety is not taken as a baseline.


Good point. There are even two posts about Zig on the front page alongside this post.


Indeed, Zig would have been interesting 30 years ago, when Modula-2 and Object Pascal were all over the place and writing full games in Assembly was still seen as pretty normal.


> Tell me more about memory safety

What is it you want to hear about memory safety? If you’re willing to accept the tradeoffs of an automatic garbage collector, memory safety has been a solved problem for decades, and there’s not a whole lot to tell you about it other than learning about widespread, mature technology.

But if you have some good reason to avoid that technology, then your options are far more limited, and there are good reasons that Rust dominates that conversation.

So the question stands - what is it you want to hear more about?


For example, everyone that is already more than happy using some form of automatically resource-managed programming language.

Many have no use for a borrow checker, and there is already enough choice with ML-inspired type systems.


There was an article about Zig on the front page just a few hours ago that attracted many "Why do I need memory safety?" comments. The fact that new languages like Zig aren't taking memory safety as a foundational design goal should be evidence enough that many people are still skeptical about its value.


Zig's approach to memory safety is an open question. I don't like it (obviously, a very subjective statement), but as more software is written in it, we'll get empirical data about whether Zig's bet pays off. It very well might.


The post in question had early empirical data comparing bug reports from Node, Bun and Deno. It wasn't the main focus of the article, and I would love to see a deeper dive, but it showed that Bun had almost 8x the amount of "crash" or "segfault" bug reports on Github as Deno, despite having almost the same amount of total issues created. (4% of bug reports for Deno are crashes, 26% of bug reports for Bun are crashes).

This matches my experience with the runtimes in question—I tried Bun out for a very small project and ran into 3 distinct crashes, often in very simple parts of the code like command line processing. Obviously not every crash / null-pointer dereference is a memory safety issue, but I think most people would agree that Zig does not seem to make bug-free software easier to write.


Go programs "segmentation fault" all the time. They're still memory-safe.


Indeed segfaults aren't necessarily a symptom of memory unsafety. But also, Go is not memory safe due to the possibility of races in multithreaded code.


I don't think so...


You're wrong.


Nope, since the presence of a segfault doesn't imply memory safety.

Probably segfaults imply the absence of memory safety.


Once again, no, that's incorrect.


No, it isn't.


> Rust skeptics are not memory safety skeptics

Definitely not all of them, yes.

> Hopefully, there are no memory safety skeptics, other than rhetorical strawmen.

You'll find the reality disappointing then…



