
> just taking a lock around some shared data

Can’t you do this in Rust with std::sync::Mutex and similar?



Absolutely, and IMHO the Mutex/MutexGuard API is one of the best showcases of what Rust is capable of. (Fun fact: RwLock<T> is Sync only if T is Sync, but Mutex<T> is Sync even if T is not Sync!)


Certainly it's interesting that Mutex&lt;T&gt; is a thing in Rust but the equivalent (a mutex in the form of a wrapper type) doesn't exist in C++. One rationale is that Mutex&lt;T&gt; in Rust is actually safe, whereas the C++ equivalent would be an attractive nuisance: it looks safe but would be easily abused, and if any of the mutex's users abused it, you're screwed.


> Mutex<T> is a thing in Rust but the equivalent (a mutex in the form of a wrapper type) doesn't exist in C++

This is the synchronized value pattern [1]. I'm pretty sure my 3rd edition of "The C++ Programming Language" by Bjarne Stroustrup had a description of it, and it predates Rust by at least a decade.

[1] https://www.boost.org/doc/libs/1_80_0/doc/html/thread/sds.ht...


The Fourth Edition does not appear to mention this pattern under that name; indeed, it gives as an example burying the mutex inside the type to be protected, which has the same downside (the resulting object is bigger†) but not the upside (with Stroustrup's approach we can still forget to take the lock).

That Boost link says it is "experimental and subject to change in future versions" but I don't know whether Boost just says that about everything or whether this would particularly mark out this feature.

† In Rust Mutex<T> is 8 bytes bigger than T, typically. In C++ std::mutex is often 40 bytes.


I don't have my copy of the book with me, so I don't know what Stroustrup called it. It was part of a discussion of overloading operator->.

The Boost warning doesn't mean much: Boost libraries don't even guarantee API stability across versions, and boost.synchronized has been available for a few years with no changes.

> In Rust Mutex<T> is 8 bytes bigger than T, typically. In C++ std::mutex is often 40 bytes.

That's because on libstdc++ std::mutex embeds a pthread_mutex_t, which is 40 bytes for ABI reasons. It is a bad early ABI decision that unfortunately can't be changed. std::mutex on MSVC is worse. std::shared_mutex is much smaller on MSVC, but on libstdc++ it is even bigger than std::mutex.

A portable Mutex<T> of minimal size can be built on top of std::atomic::wait though.


Maybe the standard should then take the opportunity to define such a thing, since it would be smaller and more useful than what they have today in practice.


Well, yes. Then again the committee took 10 years to standardize std::mutex, 14 years for std::shared_mutex. 17 for std::optional. We still don't have a good hash map.

We have to be realistic: the standard library will never be complete, and you'll always have to get basic components from third parties or write them yourself.


One thing that's nice about C++ is that stuff without a decade of battle testing behind it gets reimplemented outside the standard first.

That way, such ‘bleeding edge’ features evolve and improve a lot before being set in stone.


Unfortunately, they did standardize the bad hash map.


To be fair they standardized the hash map you'd have probably been taught 30 and maybe even 20 years ago in CS class. It's possible that if your professor is rather slow to catch on they are still teaching new kids bucketed hash tables like the one std::unordered_map requires.

I'd guess that while a modern class is probably taught some sort of open-addressed hash map, they aren't being taught anything as exotic as Swiss Tables or F14 (Google Abseil's and Facebook Folly's maps), but that's OK because standardising all the fine details of those maps would be a bad idea too.

On the other hand, the document does not tell you to use a halfway decent hash function, and many standard implementations don't provide one, so in practice many programs don't use one. The "bad hash map" performs OK with a terrible hash function, whereas the modern ones require decent hashes or their performance is miserable.


I think one of the reasons this isn't standard is that it's too easy to make mistakes with it. For example, if `std::string readValue3()` was changed to `std::string& readValue3()`, that reference would outlive the temporary guard, and any code that retained that reference would be broken. That's not so different from regular C++ mutex issues, but the downside here is that the convenience of synchronized_value also makes it harder to spot the mistake.


Indeed. It is relatively easy to leak out a reference from a synchronized wrapper. I see it more as an aid to highlight which data is shared (and which mutex protects it) than a strong safety helper.


Not in the stdlib, but it exists elsewhere, such as folly::Synchronized. There are some gotchas, but it's a LOT better than a separate mutex and data. The main gotcha is that instead of

  for (auto& foo : *bar.wlock()) {
      baz(foo);
  }
(where the temporary guard is destroyed before the loop body runs, so the loop iterates without the lock held) you need to do

  bar.withWLock([](auto &lockedBar) {
      for (auto foo : lockedBar) {
          baz(foo);
      }
  });
In Rust, the lifetime checking prevents that.

The other big gotcha is accidentally blocking while holding a lock. E.g. instead of

  auto g = bar.wlock();
  baz(*g);
  co_await quux();
You should do

  {
      auto g = bar.wlock();
      baz(*g);
  }
  co_await quux();
Or use withWLock. If you co_await with the lock held you can deadlock if the executor switches to a different coroutine that tries to acquire the same lock. If you actually need to hold the lock across the blocking call, you need coroutine-aware locks, which turns it into

  auto g = co_await bar.wlock();
  baz(*g);
  co_await quux();
No idea if Rust prevents this problem. I suspect not, but I haven't used async Rust.


It is possible to fall asleep in Rust while holding a lock, but it's possible to statically detect this mistake/infelicitous choice and diagnose it. Clippy calls this await_holding_lock; unfortunately the current Clippy diagnosis sometimes gives false positives, so that needs improving.

Tokio provides a Mutex, like your final example, intended for async code that will hold locks while waiting, because Tokio will know you are holding the lock. It is accordingly more expensive, and so should only be used if "don't hold locks while asleep" is not a practical solution to your problem.


To use the article's example: there are typically multiple bank accounts you want to update atomically, so guarding one account with a mutex doesn't help you prevent deadlocks; the lock needs both accounts. The Mutex&lt;T&gt; example just doesn't work with interacting objects.


In C++ you'd want to still also offer std::mutex because C++ doesn't have Zero Size Types, so a C++ Mutex<T> equivalent would always need space to store something. Mutex<()> is the same size as a hypothetical "mutex only" type and so Rust has no reason to offer a separate type representing a mutex which doesn't protect anything in particular.

In fact even without actually putting anything of substance in the mutex, you can get value from type system judo using this mechanism, which C++ doesn't appear to do either.


> C++ doesn't have Zero Size Types

[[no_unique_address]] since C++20. Before that there was the empty base class optimization.


Neither the Empty Base Class nor [[no_unique_address]] give C++ Zero Size Types. The [[no_unique_address]] attribute is a way to achieve something empty base classes were useful for without the accompanying problems, so that's nice, but it's not ZSTs.

Can you say whether you genuinely thought C++ had ZSTs? And if so, how you came to that conclusion?


I'm not saying that C++ has zero size types. I'm saying that no_unique_address and EBO are a way to store a stateless object without it occupying any space, which is all you need to implement a zero-space-overhead Mutex&lt;T&gt; for stateless types.


I think the complexity to deliver an equivalent of Mutex<T> which also works via no_unique_address to deliver no-space-overhead for deliberately stateless types that would otherwise add 1 byte to the type size is probably a bit much to ask.

Thanks for pointing me to Boost's synchronized_value&lt;T&gt; showing that this does exist, at least as an experimental library feature.


It is not exactly rocket science: https://gcc.godbolt.org/z/6Kz53bs7x. Bonus: it supports visiting multiple synchronized values at the same time, deadlock free.


Huh. I was expecting that providing access to the no_unique_address value despite it not having an address would be much trickier than that.


> Fun fact: RwLock<T> is Sync only if T is Sync, but Mutex<T> is Sync even if T is not Sync!

What’s more interesting is figuring out why that is. Also why Arc<T> is Send only if T is Send.


Why is that?


My guess is that one way it could break, if it were otherwise, would be if T relied on thread-local state.


The alternative would be to actively undermine Rust's type system (and guarantees):

- RwLock hands out multiple references (that's the point). Sync means a type can be used from multiple threads concurrently; if RwLock&lt;T: !Sync&gt; were Sync, it would allow multiple outstanding references to the same !Sync object, which is not legal.

- Mutex, however, only hands out a single reference at a time (which is also why it can always hand out a mutable reference), meaning it semantically acts as if it moves the T to the target thread and then borrows it; there is nothing to sync, which is why Mutex&lt;T&gt; is Sync if T is Send.

- For Arc, if it were Send without the wrapped object being Send it would allow Send-ing !Send objects: create an Arc<T>, clone it, move the clone to a second thread, drop the source, now you can try_unwrap() or drop() the clone and it'll work off of the second thread.

This is a problem with thread-local state, but also with resource affinity (e.g. on Windows a lock's owner is recorded during locking and only the owner can release the lock, and there are lots of APIs which only work from the main thread) or thread-safety (Rc, for instance, would be completely broken if you could Send it, as its entire point is to use an unsynchronised refcount).



