I have (had) some Ruby code which, for historical reasons, is (was) essentially:
Dir.mktmpdir do |dir|
  path_list = something_which_creates_files_in(dir)
  path_list.each(&:delete)
end
Occasionally I get (got) exceptions from this code:
Errno::ENOENT: No such file or directory @ dir_s_rmdir - /tmp/d2..4w/file.csv
:
/path/to/source.rb:124:in `unlink'
/path/to/source.rb:124:in `delete'
/path/to/source.rb:124:in `each'
:
/usr/lib/ruby/2.5.0/tmpdir.rb:93:in `mktmpdir'
:
So it appears to me that my "cleanup" of the list of paths at the end of the block is not entirely synchronous: (some of) the files still exist after it completes, then mktmpdir removes the temporary directory, and the asynchronous (?) unlink fails because its target has gone away. Is this a reasonable interpretation?
This is more an academic question than anything else; the behaviour puzzles me. Just removing the cleanup (the path_list.each(&:delete)) and leaving the deletion to Dir.mktmpdir seems to stop these exceptions.
If it makes a difference, this is Ruby 2.5 (MRI) running on Linux.
CodePudding user response:
Your assumption seems to be correct.
If you check the source of File.unlink, you can see the following:
static VALUE
rb_file_s_unlink(int argc, VALUE *argv, VALUE klass)
{
    return apply2files(unlink_internal, argc, argv, 0);
}
Here unlink_internal is trivial (just a thin wrapper around a system call?), but the implementation of apply2files is more interesting. There you can see the following call:
...
rb_thread_call_without_gvl(no_gvl_apply2files, aa, RUBY_UBF_IO, 0);
...
where aa is some fancy struct that contains, among other things, a pointer to the thing that we want to apply - unlink in our case.
The name of this function is quite self-descriptive, but the source contains some documentation too, so we can just refer to it:
/*
* rb_thread_call_without_gvl - permit concurrent/parallel execution.
* rb_thread_call_without_gvl2 - permit concurrent/parallel execution
* without interrupt process.
...
So from what I see (disclaimer: without a really careful analysis of the source code :)), the deletions within the block in question 1) can run concurrently with other Ruby threads and 2) run without GIL "protection", so "surprises" are more than possible if one tries to delete temporary files twice (the first time explicitly and the second time implicitly when the mktmpdir block exits).
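If you would rather keep an explicit cleanup step, here is a minimal sketch of one that tolerates files that are already gone. It assumes plain string paths and uses a hypothetical helper, create_files_in, as a stand-in for something_which_creates_files_in from the question. It does not reproduce the original race; it only shows that a strict delete of a missing path raises Errno::ENOENT while FileUtils.rm_f skips it.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for something_which_creates_files_in from the question.
def create_files_in(dir)
  %w[a.csv b.csv].map do |name|
    path = File.join(dir, name)
    File.write(path, "x,y\n")
    path
  end
end

tmp_root = nil
strict_error = nil

Dir.mktmpdir do |dir|
  tmp_root = dir
  paths = create_files_in(dir)

  # Simulate one file having been removed already (by whatever means).
  File.delete(paths.first)

  # A strict delete of the same path raises Errno::ENOENT ...
  begin
    File.delete(paths.first)
  rescue Errno::ENOENT => e
    strict_error = e
  end

  # ... whereas rm_f silently skips paths that no longer exist.
  FileUtils.rm_f(paths)
end
# Dir.mktmpdir removes the directory itself when the block exits.
```

Since Dir.mktmpdir in block form removes the directory and its contents anyway, dropping the explicit cleanup (as you already did) is probably the simpler option.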
