How do you create a daemon in Python?


Answers

There are many fiddly things to take care of when becoming a well-behaved daemon process:

  • prevent core dumps (many daemons run as root, and core dumps can contain sensitive information)

  • behave correctly inside a chroot gaol

  • set UID, GID, working directory, umask, and other process parameters appropriately for the use case

  • relinquish elevated suid, sgid privileges

  • close all open file descriptors, with exclusions depending on the use case

  • behave correctly if started inside an already-detached context, such as init, inetd, etc.

  • set up signal handlers for sensible daemon behaviour, but also with specific handlers determined by the use case

  • redirect the standard streams stdin, stdout, stderr since a daemon process no longer has a controlling terminal

  • handle a PID file as a cooperative advisory lock, which is a whole can of worms in itself with many contradictory but valid ways to behave

  • allow proper cleanup when the process is terminated

  • actually become a daemon process without leading to zombies
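A few of the items above can be sketched with the standard library alone. This is a minimal, Unix-only illustration, not a complete daemon: the function names, the "/" working directory and the 0o022 umask are assumptions made for the example, and it leaves out signal handling, closing inherited file descriptors, privilege dropping and PID files.

```python
import os
import resource


def disable_core_dumps():
    """Set the core file size limit to zero so no core dump is written."""
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))


def detach():
    """Double-fork; only the detached grandchild returns from this call."""
    if os.fork() > 0:
        os._exit(0)   # launching process exits, so the shell gets its prompt back
    os.setsid()       # start a new session with no controlling terminal
    if os.fork() > 0:
        os._exit(0)   # session leader exits; the survivor cannot reacquire a terminal
    os.chdir("/")     # assumed working directory; avoids pinning a mount point
    os.umask(0o022)   # assumed mask; choose one that suits the use case


def redirect_standard_streams(path=os.devnull):
    """Point stdin, stdout and stderr at `path` (by default the null device)."""
    fd = os.open(path, os.O_RDWR)
    for target in (0, 1, 2):
        os.dup2(fd, target)
    if fd > 2:
        os.close(fd)
```

A daemon's startup code would typically call disable_core_dumps(), then detach(), then redirect_standard_streams(), before entering its main loop.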

Some of these are standard, as described in canonical Unix literature (Advanced Programming in the UNIX Environment, by the late W. Richard Stevens, Addison-Wesley, 1992). Others, such as stream redirection and PID file handling, are conventional behaviour most daemon users would expect but that are less standardised.
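To give a feel for the PID file convention, here is one possible sketch. The helper names are hypothetical, and the policy shown (treat the file as stale when its PID is not alive) is only one of the several valid behaviours alluded to above. It is advisory and not fully race-free.

```python
import os


def read_pid(path):
    """Return the positive PID stored in `path`, or None if missing or unparsable."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None
    return pid if pid > 0 else None


def is_running(pid):
    """Probe `pid` with signal 0, which checks for existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True


def acquire_pid_file(path):
    """Claim `path` for this process; return False if a live process holds it."""
    pid = read_pid(path)
    if pid is not None and is_running(pid):
        return False
    try:
        os.remove(path)   # discard a stale or unreadable file, if any
    except FileNotFoundError:
        pass
    try:
        # O_EXCL makes the creation fail if another process got there first.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write("%d\n" % os.getpid())
    return True
```

A "stop" command can then be built from read_pid() plus os.kill(pid, signal.SIGTERM), and the daemon should remove the file during its own cleanup.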

All of these are covered by the PEP 3143 “Standard daemon process library” specification. The python-daemon reference implementation works on Python 2.7 or later, and Python 3.2 or later.
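With python-daemon installed, basic usage looks roughly like the sketch below. DaemonContext and its working_directory and umask options come from PEP 3143; main_loop and run_as_daemon are hypothetical names for this example, and the values passed are assumptions rather than recommendations.

```python
import time


def main_loop(iterations=None, interval=1.0):
    """Hypothetical daemon body; runs forever when iterations is None."""
    count = 0
    while iterations is None or count < iterations:
        time.sleep(interval)   # stand-in for real work
        count += 1
    return count


def run_as_daemon():
    # Third-party package (pip install python-daemon); imported here so that
    # this module still loads on machines where it is not installed.
    import daemon

    # Entering the context detaches the process; leaving it cleans up.
    with daemon.DaemonContext(working_directory="/", umask=0o022):
        main_loop()
```

Calling run_as_daemon() from a script's entry point returns control to the shell while main_loop keeps running in the background.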

Question

Searching on Google turns up two code snippets. The first result is this code recipe, which has a lot of documentation and explanation, along with some useful discussion underneath.

However, another code sample, whilst not containing so much documentation, includes sample code for passing commands such as start, stop and restart. It also creates a PID file which can be handy for checking if the daemon is already running etc.

These samples both explain how to create the daemon. Are there any additional things that need to be considered? Is one sample better than the other, and why?




How to use Daemon that has a while loop?

There are already a number of questions on creating a daemon in Python, like this one, which answer that part nicely.

So, how do you have your daemon do background work?

As you suspected, threads are an obvious answer. But there are three possible complexities.


First, there's shutdown. If you're lucky, your crunchData function can be summarily killed at any time with no corrupted data or (too-significant) lost work. In that case:

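import threading

def worker():
    # Loop forever; the thread dies when the main process exits.
    while True:
        crunchData()

# ... somewhere in the daemon startup code ...
t = threading.Thread(target=worker)
t.daemon = True  # daemon thread: does not keep the interpreter alive
t.start()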

Notice that t.daemon. A "daemon thread" has nothing to do with your program being a daemon; it means that you can just quit the main process, and it will be summarily killed.

But what if crunchData can't be killed? Then you'll need to do something like this:

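import threading

quitflag = False
quitlock = threading.Lock()

def worker():
    while True:
        # Check the flag between iterations so crunchData is never interrupted.
        with quitlock:
            if quitflag:
                return
        crunchData()

# ... somewhere in the daemon startup code ...
t = threading.Thread(target=worker)
t.start()

# ... somewhere in the daemon shutdown code ...
# (if this runs inside a function, declare "global quitflag" first)
with quitlock:
    quitflag = True
t.join()  # wait for the current crunchData call to finish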

I'm assuming each iteration of crunchData doesn't take that long. If it does, you may need to check quitflag periodically within the function itself.
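Checking inside the function might look like the sketch below. Note that it swaps the flag-plus-lock pair for a threading.Event, which does the same job with less code. crunch_data here is a hypothetical stand-in that splits its work into small slices.

```python
import threading
import time


def crunch_data(stop, steps=1000):
    """Hypothetical long job; returns False if it was asked to stop early."""
    for _ in range(steps):
        if stop.is_set():
            return False      # bail out at a safe point between slices
        time.sleep(0.001)     # stand-in for one slice of real work
    return True


def worker(stop):
    # Keep crunching until shutdown is requested.
    while not stop.is_set():
        crunch_data(stop)
```

The shutdown code then calls stop.set() followed by t.join(), where stop is the Event passed to the thread via args=(stop,).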


Meanwhile, you want your request handler to access some data that the background thread is producing. You'll need some kind of synchronization there as well.

The obvious thing is to just use another Lock. But there's a good chance that crunchData is writing to its data frequently. If it holds the lock for 10 seconds at a time, the request handler may block for 10 seconds. But if it grabs and releases the lock a million times, that could take longer than the actual work.

One alternative is to double-buffer your data: Have crunchData write into a new copy, then, when it's done, briefly grab the lock and set currentData = newData.
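A minimal sketch of that idea, with hypothetical names (DoubleBuffer, publish, snapshot) and a toy crunch_data that just fills a dict:

```python
import threading


class DoubleBuffer:
    """Holds a reference to the latest finished result."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._current = initial

    def publish(self, new_data):
        # The lock is held only for the reference swap, not for the work.
        with self._lock:
            self._current = new_data

    def snapshot(self):
        # Readers get the last published object and must not mutate it.
        with self._lock:
            return self._current


def crunch_data(buffer, n):
    new_data = {}                 # build the new copy privately
    for i in range(n):
        new_data[i] = i * i
    buffer.publish(new_data)      # then swap it in
```

The request handler calls snapshot() and never waits longer than the swap itself, however long crunch_data takes to build the next copy.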

Depending on your use case, a Queue, a file, or something else might be even simpler.
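For the Queue option, a sketch along these lines may be enough. queue.Queue does its own locking, so no explicit Lock is needed. The worker and drain names, and the toy per-batch computation, are made up for the example.

```python
import queue
import threading


def worker(out, batches):
    """Background thread: push one result per batch onto the queue."""
    for batch in range(batches):
        out.put(sum(range(batch + 1)))   # stand-in for real work


def drain(q):
    """Request handler side: collect whatever is queued without blocking."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items
```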


Finally, crunchData is presumably doing a lot of CPU work. You need to make sure that the request handler does very little CPU work, or each request will slow things down quite a bit as the two threads fight over the GIL. Usually this is no problem. If it is, use a multiprocessing.Process instead of a Thread (which makes sharing or passing the data between the two processes slightly more complicated, but still not too bad).
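As a rough illustration of the multiprocessing variant, the sketch below runs the CPU-heavy part in a child process and sends the result back over a pipe. The function names are hypothetical, and the "fork" start method is an assumption that limits it to Unix, which is where daemons live anyway.

```python
import multiprocessing


def crunch_data(n):
    """Stand-in for CPU-heavy work."""
    return sum(i * i for i in range(n))


def worker(conn, n):
    # Runs in the child process, so it does not compete for the parent's GIL.
    conn.send(crunch_data(n))
    conn.close()


def run_in_process(n):
    ctx = multiprocessing.get_context("fork")             # assumed start method
    parent_conn, child_conn = ctx.Pipe(duplex=False)      # (receive end, send end)
    proc = ctx.Process(target=worker, args=(child_conn, n))
    proc.start()
    child_conn.close()            # the parent only reads
    result = parent_conn.recv()   # blocks until the child sends its result
    proc.join()
    return result
```

In a real daemon the parent would poll the pipe or use a multiprocessing.Queue rather than block in recv(), so that the request handler stays responsive.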