How can I safely create a nested directory in Python?
What is the most elegant way to check if the directory a file is going to be written to exists, and if not, create the directory using Python? Here is what I tried:
import os
file_path = "/my/directory/filename.txt"
directory = os.path.dirname(file_path)
try:
    os.stat(directory)
except:
    os.mkdir(directory)
f = file(filename)
Somehow, I missed os.path.exists (thanks kanja, Blair, and Douglas). This is what I have now:
def ensure_dir(file_path):
    directory = os.path.dirname(file_path)
    if not os.path.exists(directory):
        os.makedirs(directory)
Is there a flag for open() that makes this happen automatically?
Check if a directory exists and create it if necessary?
The direct answer to this is, assuming a simple situation where you don't expect other users or processes to be messing with your directory:
if not os.path.exists(d):
    os.makedirs(d)
Or, if creating the directory is subject to race conditions (i.e. something else may create it between the existence check and the makedirs call), do this:
import errno
try:
    os.makedirs(d)
except OSError as exception:
    if exception.errno != errno.EEXIST:
        raise
But perhaps an even better approach is to sidestep the resource contention issue by using temporary directories via tempfile:
import tempfile
d = tempfile.mkdtemp()
Here are the essentials from the online doc:
mkdtemp(suffix='', prefix='tmp', dir=None) User-callable function to create and return a unique temporary directory. The return value is the pathname of the directory. The directory is readable, writable, and searchable only by the creating user. Caller is responsible for deleting the directory when done with it.
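Since the caller has to delete the directory, here is a minimal sketch of the cleanup, assuming Python 3. The prefix "example_" and the file name report.txt are made up for illustration; tempfile.TemporaryDirectory is shown as an alternative that cleans up on its own.

```python
import os
import shutil
import tempfile

# mkdtemp() creates a uniquely named directory; the caller must clean it up.
scratch_dir = tempfile.mkdtemp(prefix="example_")
try:
    report_path = os.path.join(scratch_dir, "report.txt")
    with open(report_path, "w") as report:
        report.write("hello\n")
    created = os.path.isdir(scratch_dir)
finally:
    # Remove the directory and everything inside it.
    shutil.rmtree(scratch_dir)

# TemporaryDirectory (Python 3.2+) removes the directory when the block exits.
with tempfile.TemporaryDirectory() as auto_dir:
    existed_inside = os.path.isdir(auto_dir)
removed_after = not os.path.exists(auto_dir)
```

The context-manager form is usually less error-prone, because the cleanup cannot be forgotten.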
New in Python 3.5: pathlib.Path with exist_ok
There's a new Path object (as of 3.4) with lots of methods one would want to use with paths - one of which is mkdir.
(For context, I'm tracking my weekly rep with a script. Here are the relevant parts of the script, which let me avoid fetching the same data more than once a day.)
First the relevant imports:
from pathlib import Path
import tempfile
We don't have to deal with os.path.join now - just join path parts with a /:
directory = Path(tempfile.gettempdir()) / 'sodata'
Then I idempotently ensure the directory exists - the exist_ok argument shows up in Python 3.5:
directory.mkdir(exist_ok=True)
Here's the relevant part of the documentation:
If exist_ok is true, FileExistsError exceptions will be ignored (same behavior as the POSIX mkdir -p command), but only if the last path component is not an existing non-directory file.
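To make the last clause of that quote concrete, here is a small sketch, assuming Python 3.5+. The names sodata and not_a_dir are just illustrative, and everything happens inside a throwaway temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    target = Path(base) / "sodata"

    # Calling mkdir twice is harmless when exist_ok=True.
    target.mkdir(exist_ok=True)
    target.mkdir(exist_ok=True)
    is_directory = target.is_dir()

    # A regular file at the target path still raises FileExistsError.
    blocker = Path(base) / "not_a_dir"
    blocker.write_text("I am a file")
    try:
        blocker.mkdir(exist_ok=True)
        raised = False
    except FileExistsError:
        raised = True
```

So exist_ok=True only forgives an existing directory, not an existing file with the same name.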
Here's a little more of the script - in my case, I'm not subject to a race condition, I only have one process that expects the directory (or contained files) to be there, and I don't have anything trying to remove the directory.
todays_file = directory / str(datetime.datetime.utcnow().date())
if todays_file.exists():
    logger.info("todays_file exists: " + str(todays_file))
    df = pd.read_json(str(todays_file))
Path objects have to be coerced to str before other APIs that expect str paths can use them.
Perhaps Pandas should be updated to accept instances of the abstract base class, os.PathLike.
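For what it's worth, here is a hedged sketch of how os.PathLike plays out in the standard library, assuming Python 3.6 or later. The file name data.txt is made up.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    data_file = Path(base) / "data.txt"

    # Built-in open() accepts Path objects directly (Python 3.6+).
    with open(data_file, "w") as handle:
        handle.write("42")

    # os.fspath() returns the str form for APIs that insist on plain strings.
    as_text = os.fspath(data_file)
    is_pathlike = isinstance(data_file, os.PathLike)
    contents = Path(as_text).read_text()
```

os.fspath() is a slightly more explicit alternative to str() here, because it raises TypeError for objects that are not paths at all.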
Insights on the specifics of this situation
You give a particular file at a certain path and you pull the directory from the file path. Then after making sure you have the directory, you attempt to open a file for reading. To comment on this code:
filename = "/my/directory/filename.txt"
dir = os.path.dirname(filename)
We want to avoid overwriting the builtin function, dir. Also, filepath or perhaps fullfilepath is probably a better semantic name than filename, so this would be better written:
import os
filepath = '/my/directory/filename.txt'
directory = os.path.dirname(filepath)
Your end goal is to open this file, you initially state, for writing, but you're essentially approaching this goal (based on your code) like this, which opens the file for reading:
if not os.path.exists(directory):
    os.makedirs(directory)
f = file(filename)
Assuming opening for reading
Why would you make a directory for a file that you expect to be there and be able to read?
Just attempt to open the file.
with open(filepath) as my_file:
    do_stuff(my_file)
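If the directory or file isn't there, you'll get an IOError with an associated error number: errno.ENOENT will point to the correct error number regardless of your platform. (In Python 3, IOError is an alias of OSError, and this case raises its subclass FileNotFoundError, so the same except clause still catches it.) You can catch it if you want, for example: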
import errno
try:
    with open(filepath) as my_file:
        do_stuff(my_file)
except IOError as error:
    if error.errno == errno.ENOENT:
        print('ignoring error because directory or file is not there')
    else:
        raise
Assuming we're opening for writing
This is probably what you're wanting.
In this case, we probably aren't facing any race conditions. So just do as you were, but note that for writing, you need to open with the w mode (or a to append). It's also a Python best practice to use the context manager for opening files.
import os
if not os.path.exists(directory):
    os.makedirs(directory)
with open(filepath, 'w') as my_file:
    do_stuff(my_file)
However, say we have several Python processes that attempt to put all their data into the same directory. Then we may have contention over creation of the directory. In that case it's best to wrap the makedirs call in a try-except block.
import os
import errno
if not os.path.exists(directory):
    try:
        os.makedirs(directory)
    except OSError as error:
        if error.errno != errno.EEXIST:
            raise
with open(filepath, 'w') as my_file:
    do_stuff(my_file)
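The built-in open() has no flag that creates missing directories, so the two steps can be wrapped in a small function. This is only a sketch, assuming Python 3.2+ for exist_ok; open_for_writing is a hypothetical helper name, not something from the standard library.

```python
import os
import tempfile


def open_for_writing(filepath, mode="w"):
    """Create the parent directory if needed, then open filepath for writing.

    open_for_writing is a hypothetical helper, not a standard-library function.
    """
    directory = os.path.dirname(filepath)
    if directory:  # dirname is '' for a bare filename; makedirs('') would fail
        os.makedirs(directory, exist_ok=True)  # exist_ok needs Python 3.2+
    return open(filepath, mode)


with tempfile.TemporaryDirectory() as base:
    nested_path = os.path.join(base, "my", "directory", "filename.txt")
    with open_for_writing(nested_path) as my_file:
        my_file.write("data")
    with open(nested_path) as my_file:
        written = my_file.read()
```

Because exist_ok=True tolerates a directory created by another process in the meantime, no separate existence check is needed.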
Call the function create_dir() at the entry point of your program/project.
import os
def create_dir(directory):
    if not os.path.exists(directory):
        print('Creating Directory '+directory)
        os.makedirs(directory)
create_dir('Project directory')
Check os.makedirs: (It makes sure the complete path exists.)
To handle the fact the directory might exist, catch OSError.
(If exist_ok is False (the default), an OSError is raised if the target directory already exists.)
import os
try:
    os.makedirs('./path/to/somewhere')
except OSError:
    pass
I found this Q/A and I was initially puzzled by some of the failures and errors I was getting. I am working in Python 3 (v.3.5 in an Anaconda virtual environment on an Arch Linux x86_64 system).
Consider this directory structure:
└── output/          ## dir
    ├── corpus       ## file
    ├── corpus2/     ## dir
    └── subdir/      ## dir
Here are my experiments/notes, which clarify things:
# ----------------------------------------------------------------------------
# [1] https://.com/questions/273192/how-can-i-create-a-directory-if-it-does-not-exist
import pathlib
""" Notes:
1. Include a trailing slash at the end of the directory path
("Method 1," below).
2. If a subdirectory in your intended path matches an existing file
with same name, you will get the following error:
"NotADirectoryError: [Errno 20] Not a directory:" ...
"""
# Uncomment and try each of these "out_dir" paths, singly:
# ----------------------------------------------------------------------------
# METHOD 1:
# Re-running does not overwrite existing directories and files; no errors.
# out_dir = 'output/corpus3' ## no error but no dir created (missing trailing /)
# out_dir = 'output/corpus3/' ## works
# out_dir = 'output/corpus3/doc1' ## no error but no dir created (missing trailing /)
# out_dir = 'output/corpus3/doc1/' ## works
# out_dir = 'output/corpus3/doc1/doc.txt' ## no error but no file created (os.makedirs creates dir, not files! ;-)
# out_dir = 'output/corpus2/tfidf/' ## fails with "Errno 20" (existing file named "corpus2")
# out_dir = 'output/corpus3/tfidf/' ## works
# out_dir = 'output/corpus3/a/b/c/d/' ## works
# [2] https://docs.python.org/3/library/os.html#os.makedirs
# Uncomment these to run "Method 1":
#import os
#directory = os.path.dirname(out_dir)
#os.makedirs(directory, mode=0o777, exist_ok=True)
# ----------------------------------------------------------------------------
# METHOD 2:
# Re-running does not overwrite existing directories and files; no errors.
# out_dir = 'output/corpus3' ## works
# out_dir = 'output/corpus3/' ## works
# out_dir = 'output/corpus3/doc1' ## works
# out_dir = 'output/corpus3/doc1/' ## works
# out_dir = 'output/corpus3/doc1/doc.txt' ## no error but creates a .../doc.txt./ dir
# out_dir = 'output/corpus2/tfidf/' ## fails with "Errno 20" (existing file named "corpus2")
# out_dir = 'output/corpus3/tfidf/' ## works
# out_dir = 'output/corpus3/a/b/c/d/' ## works
# Uncomment these to run "Method 2":
#import os, errno
#try:
#    os.makedirs(out_dir)
#except OSError as e:
#    if e.errno != errno.EEXIST:
#        raise
# ----------------------------------------------------------------------------
Conclusion: in my opinion, "Method 2" is more robust.
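The "missing trailing slash" results under Method 1 come from os.path.dirname rather than from os.makedirs. A quick sketch that reuses the paths from the notes above:

```python
import os

# dirname() drops everything after the last slash, so without a trailing
# slash the final component is treated as a file name and discarded.
without_slash = os.path.dirname("output/corpus3")
with_slash = os.path.dirname("output/corpus3/")
nested_file = os.path.dirname("output/corpus3/doc1/doc.txt")

print(without_slash)  # output
print(with_slash)     # output/corpus3
print(nested_file)    # output/corpus3/doc1
```

So Method 1 only makes sense when out_dir is really a file path whose parent directory should be created.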
I have put the following down. It's not totally foolproof though.
import os
dirname = 'create/me'
try:
    os.makedirs(dirname)
except OSError:
    if os.path.exists(dirname):
        # We are nearly safe
        pass
    else:
        # There was an error on creation, so make sure we know about it
        raise
Now as I say, this is not really foolproof, because there is the possibility that we fail to create the directory and another process creates it during that period.
I see two answers with good qualities, each with a small flaw, so I will give my take on it:
Try os.path.exists, and consider os.makedirs for the creation.
import os
if not os.path.exists(directory):
    os.makedirs(directory)
As noted in comments and elsewhere, there's a race condition - if the directory is created between the os.path.exists and the os.makedirs calls, the os.makedirs will fail with an OSError. Unfortunately, blanket-catching OSError and continuing is not foolproof, as it will ignore a failure to create the directory due to other factors, such as insufficient permissions, full disk, etc.
One option would be to trap the OSError and examine the embedded error code (see Is there a cross-platform way of getting information from Python’s OSError):
import os, errno
try:
    os.makedirs(directory)
except OSError as e:
    if e.errno != errno.EEXIST:
        raise
Alternatively, there could be a second os.path.exists, but suppose another process created the directory after the first check, then removed it before the second one - we could still be fooled.
Depending on the application, the danger of concurrent operations may be more or less than the danger posed by other factors such as file permissions. The developer would have to know more about the particular application being developed and its expected environment before choosing an implementation.
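One more wrinkle worth a sketch: EEXIST is also reported when the path exists but is a regular file. The variant below, with the hypothetical helper name ensure_directory, ignores the error only when the path really is a directory. It is one possible compromise under those assumptions, not the only reasonable one.

```python
import errno
import os
import tempfile


def ensure_directory(path):
    """Create path if missing; hypothetical helper, not a library function."""
    try:
        os.makedirs(path)
    except OSError as error:
        # Swallow only "already exists", and only when it really is a directory.
        if error.errno != errno.EEXIST or not os.path.isdir(path):
            raise


with tempfile.TemporaryDirectory() as base:
    wanted = os.path.join(base, "a", "b")
    ensure_directory(wanted)
    ensure_directory(wanted)  # second call is a no-op
    created = os.path.isdir(wanted)

    blocker = os.path.join(base, "plain_file")
    with open(blocker, "w") as handle:
        handle.write("x")
    try:
        ensure_directory(blocker)
        file_rejected = False
    except OSError as error:
        file_rejected = error.errno == errno.EEXIST
```

Permission problems, a full disk and similar failures still propagate, since their errno is not EEXIST.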
I use os.path.exists(); here is a Python 3 script that can be used to check if a directory exists, create one if it does not exist, and delete it if it does exist (if desired).
It prompts users for input of the directory and can be easily modified.
Consider the following:
os.path.isdir('/tmp/dirname')
This returns True only if the path exists AND is a directory. So for me this does what I need: I can make sure the path exists and is a folder (not a file).
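A small sketch of the difference between the two checks, run inside a temporary directory so it does not touch /tmp/dirname itself (the file name is reused only for illustration):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Create a regular file whose name looks like a directory name.
    file_path = os.path.join(base, "dirname")
    with open(file_path, "w") as handle:
        handle.write("not a directory")

    path_exists = os.path.exists(file_path)   # True: something is there
    path_is_dir = os.path.isdir(file_path)    # False: it is a file
    base_is_dir = os.path.isdir(base)         # True: a real directory
```

os.path.exists alone would happily accept the file, which is why isdir is the safer check before writing into the path.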
In Python 3 (3.2 and later), os.makedirs supports setting exist_ok. The default setting is False, which means an OSError will be raised if the target directory already exists. By setting exist_ok to True, the OSError (directory exists) will be ignored and the directory will not be created again.
os.makedirs(path, exist_ok=True)
In Python 2, os.makedirs doesn't support setting exist_ok. You can use the approach in heikki-toivonen's answer:
import os
import errno
def make_sure_path_exists(path):
    try:
        os.makedirs(path)
    except OSError as exception:
        if exception.errno != errno.EEXIST:
            raise
Starting from Python 3.5, pathlib.Path.mkdir has an exist_ok flag:
from pathlib import Path
path = Path('/my/directory/filename.txt')
path.parent.mkdir(parents=True, exist_ok=True)
# path.parent ~ os.path.dirname(path)
This recursively creates the directory and does not raise an exception if the directory already exists (just as os.makedirs got an exist_ok flag starting from Python 3.2).
The relevant Python documentation suggests the use of the EAFP coding style (Easier to Ask for Forgiveness than Permission). This means that the code
try:
    os.makedirs(path)
except OSError as exception:
    if exception.errno != errno.EEXIST:
        raise
    else:
        print("\nBE CAREFUL! Directory %s already exists." % path)
is better than the alternative
if not os.path.exists(path):
    os.makedirs(path)
else:
    print("\nBE CAREFUL! Directory %s already exists." % path)
The documentation suggests this precisely because of the race condition discussed in this question. In addition, as others mention here, there is a performance advantage in querying the OS once instead of twice. Finally, the argument that may be put forward in favour of the second version (that the developer knows the environment the application runs in) holds only in the special case where the program has set up a private environment for itself (and for other instances of the same program).
Even in that case, this is bad practice and can lead to long, useless debugging. For example, the fact that we set the permissions for a directory should not leave us with the impression that the permissions are set appropriately for our purposes. A parent directory could be mounted with other permissions. In general, a program should always work correctly, and the programmer should not expect one specific environment.
Use this to check for the directory and create it if it is missing:
if not os.path.isdir(test_img_dir):
    os.mkdir(test_img_dir)
Using try/except and the right error code from the errno module gets rid of the race condition and is cross-platform:
import os
import errno
def make_sure_path_exists(path):
    try:
        os.makedirs(path)
    except OSError as exception:
        if exception.errno != errno.EEXIST:
            raise
In other words, we try to create the directories, but if they already exist we ignore the error. On the other hand, any other error gets reported. For example, if you create dir 'a' beforehand and remove all permissions from it, you will get an OSError raised with errno.EACCES (Permission denied, error 13).
Why not use the subprocess module if you are running on a machine that provides a POSIX-style mkdir command? It works on Python 2.7 and Python 3.6:
from subprocess import call
call(['mkdir', '-p', 'path1/path2/path3'])
Should do the trick on most systems.
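A slightly more defensive sketch of the same idea, assuming Python 3.5+ for subprocess.run. The fallback to os.makedirs when no mkdir executable is found on PATH is my own addition, not part of the answer above.

```python
import os
import shutil
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "path1", "path2", "path3")
    if shutil.which("mkdir") is not None:
        # check=True raises CalledProcessError if mkdir exits with non-zero status.
        subprocess.run(["mkdir", "-p", target], check=True)
    else:
        # Fallback for systems without a mkdir executable on PATH.
        os.makedirs(target, exist_ok=True)
    created = os.path.isdir(target)
```

Unlike call(), which only returns the exit status, check=True turns a failed mkdir into an exception, so the failure cannot go unnoticed.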
You can use mkpath
# Create a directory and any missing ancestor directories.
# If the directory already exists, do nothing.
from distutils.dir_util import mkpath
mkpath("test")
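Note that it will create the ancestor directories as well.
It works for Python 2 and for Python 3 up to 3.11; distutils was deprecated in Python 3.10 and removed in Python 3.12, so on newer versions use os.makedirs with exist_ok=True instead.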
import os
if os.path.isfile(filename):
    print("file exists")
else:
    pass  # your code here
In place of "your code here", use the touch command to create the file.
This checks whether the file is there and, if it is not, creates it.
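If shelling out to touch is not an option, pathlib has an equivalent. A minimal sketch, assuming Python 3.4+ and a made-up file name:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    marker = Path(base) / "filename.txt"

    existed_before = marker.is_file()
    # touch() creates an empty file, or just updates the timestamp if it exists.
    marker.touch(exist_ok=True)
    exists_after = marker.is_file()
    size_after = marker.stat().st_size
```

Note that touch() does not create missing parent directories; those still need mkdir(parents=True, exist_ok=True) first.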