Some Background
Originally, Rust switched from readdir(3) to readdir_r(3) for thread safety. But readdir_r(3) has problems of its own, so they changed it back:
- Linux and Android: fs: Use readdir() instead of readdir_r() on Linux and Android
- Fuchsia: Switch Fuchsia to readdir (instead of readdir_r)
- ...
So, in the current implementation, they use readdir(3) on most POSIX platforms:
#[cfg(any(
    target_os = "android",
    target_os = "linux",
    target_os = "solaris",
    target_os = "fuchsia",
    target_os = "redox",
    target_os = "illumos"
))]
fn next(&mut self) -> Option<io::Result<DirEntry>> {
    unsafe {
        loop {
            // As of POSIX.1-2017, readdir() is not required to be thread safe; only
            // readdir_r() is. However, readdir_r() cannot correctly handle platforms
            // with unlimited or variable NAME_MAX. Many modern platforms guarantee
            // thread safety for readdir() as long an individual DIR* is not accessed
            // concurrently, which is sufficient for Rust.
            super::os::set_errno(0);
            let entry_ptr = readdir64(self.inner.dirp.0);
Thread-safety issue of readdir(3)
The problem with readdir(3) is that its return value (struct dirent *) points into the internal buffer of the directory stream (DIR), so it can be overwritten by a subsequent readdir(3) call on the same stream. If a DIR stream is shared among multiple threads and all of them call readdir(3) on it, a race condition may occur.
To handle this safely, external synchronization is needed.
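The overwrite hazard can be modeled in a few lines of Rust. This is only a toy sketch, not libc or std code: FakeDir and read_entry are hypothetical names, and a String stands in for the stream's internal struct dirent. Because read_entry takes &mut self and returns a view into the internal buffer, the compiler rejects any attempt to hold that view across the next read, so the caller has to copy the entry out first.

```rust
/// Toy stand-in for a C `DIR` stream: one internal buffer, reused on every read.
/// (`FakeDir` and `read_entry` are hypothetical names, not a libc or std API.)
struct FakeDir {
    names: Vec<&'static str>,
    pos: usize,
    buf: String, // plays the role of the stream's internal `struct dirent`
}

impl FakeDir {
    fn new(names: Vec<&'static str>) -> Self {
        FakeDir { names, pos: 0, buf: String::new() }
    }

    /// Like readdir(3): returns a view into the internal buffer, which the
    /// next call overwrites.
    fn read_entry(&mut self) -> Option<&str> {
        let name: &'static str = *self.names.get(self.pos)?;
        self.pos += 1;
        self.buf.clear();
        self.buf.push_str(name);
        Some(self.buf.as_str())
    }
}

/// Copies each entry out of the shared buffer before reading the next one.
fn collect_names(dir: &mut FakeDir) -> Vec<String> {
    let mut out = Vec::new();
    while let Some(name) = dir.read_entry() {
        out.push(name.to_owned()); // the owned copy survives the next read
    }
    out
}
```

Calling collect_names on a FakeDir built from a list of names returns those names as owned Strings, in order. In C nothing enforces the copy-before-next-read rule; in this model the borrow on the stream does.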
My question
So I was curious about what Rust does to avoid such issues. As far as I can tell, it just calls readdir(3), copies the returned entry into its own caller-allocated buffer with memcpy, and returns. But this function is not marked as unsafe, which confuses me.
So my question is: why is it safe to call fs::read_dir() in multi-threaded programs?
There is a comment stating that it is safe to use in Rust without extra external synchronization, but I don't understand it:
It requires external synchronization if a particular directory stream may be shared among threads, but I believe we avoid that naturally from the lack of &mut aliasing. Dir is Sync, but only ReadDir accesses it, and only from its mutable Iterator implementation.
Answer
readdir is not safe when called from multiple threads with the same DIR* dirp argument (i.e. with the same self.inner.dirp.0 in the Rust case), but it may be called safely with different dirp values. Since calling ReadDir::next requires a &mut self, it is guaranteed that nobody else can call it from another thread at the same time on the same ReadDir instance, so it is safe.
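A minimal sketch of what this means in practice, using only std: to drain one ReadDir from several threads, the iterator has to sit behind a lock, because that is the only way each thread can obtain the &mut access that next requires. The helper name count_entries_shared is hypothetical, and the sketch assumes ReadDir is Send on the target platform (it is on the common ones).

```rust
use std::fs::{self, ReadDir};
use std::io;
use std::path::Path;
use std::sync::{Arc, Mutex};
use std::thread;

/// Drains one shared `ReadDir` from `workers` threads and returns how many
/// entries were read in total. (`count_entries_shared` is a hypothetical helper.)
fn count_entries_shared(path: &Path, workers: usize) -> io::Result<usize> {
    // `Iterator::next` takes `&mut self`, so a shared iterator needs a lock;
    // the Mutex is the "external synchronization" made explicit.
    let shared: Arc<Mutex<ReadDir>> = Arc::new(Mutex::new(fs::read_dir(path)?));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut seen = 0usize;
                loop {
                    // The guard is a temporary, so the lock is held only
                    // for the duration of this single `next` call.
                    let entry = shared.lock().unwrap().next();
                    match entry {
                        Some(Ok(_)) => seen += 1,
                        // Stop this worker on an I/O error or end of stream.
                        Some(Err(_)) | None => break,
                    }
                }
                seen
            })
        })
        .collect();

    Ok(handles.into_iter().map(|h| h.join().unwrap()).sum())
}
```

For example, count_entries_shared(Path::new("."), 4) counts the entries of the current directory with four workers. If each thread instead calls fs::read_dir itself, every thread gets its own directory stream and no lock is needed at all.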
