By re-raising any exceptions that `transcode` encounters while running
the transcode, it ensures that `set_generated` will also see the
exception, preventing it from caching an incomplete transcode.
When a connection that is consuming a generated response is closed,
Flask closes the generator making it raise the special `GeneratorExit`
exception when the program tries to yield from it again. Because the
`transcode` function was called (returning a generator) before being
passed into `set_generated`, the exception was being handled in the
wrong order.
By passing the `transcode` function to `set_generated` and letting
`set_transcode` call it to return a generator while generating the
response for the client, the exception properly bubbles up through
`transcode` into `set_generated`. This allows both of them to handle it
properly by stopping the subproceses and not caching the incomplete
response data respectively.
- `os.scandir` (provided by a 3rd party package in 2.7)
- `os.replace` (doesn't exist in 2.7 - have to use `os.rename` instead)
- `os.utime` (the times param is required in 2.7)
Quick summary
-------------
- Adds a Cache class (plus tests for it) that provides an API for
managing a cache of files on disk
- Adds two new settings to the configuration file: `cache_size` (default
512MB) and `transcode_cache_size` (default 1GB).
- Creates two cache managers using the settings above: one for general
stuff (currently album art) and one for transcodes
- Adds the caching of transcoded files to disk for future use
- Modifies the existing image caching to use the cache manager
Longer explanations and justifications
--------------------------------------
The reason I separated out transcodes into an entirely separate cache is
that I could imagine a single transcode pushing out a ton of smaller
images or other cached content. By separating them it should reduce the
number of deletes caused by adding something to the cache.
The cache manager allows for caching a value from a generator via
passthrough. This means that a generator can be transparently wrapped to
save its output in the cache. The bytes from the generator will be
written to a temp file in the cache and yielded back. When it completes,
the temp file will be renamed according to the provided cache key. This
is how caching transcoded music is implemented.
If multiple generators for the same key are started, they will all write
to individual temp files until they complete and race to overwrite each
other. Since the key should uniquely represent the content it indexes
the files will be identical so overwriting them is harmless.
The cache will store everything for a minimum amount of time
(configurable, default is set at 5 minutes). After this time has
elapsed, the data can be deleted to free up space. This minimum is so
that when you cache a file to the disk you can expect it to be there
after, even if another large file is added to the cache and requests
that some files are deleted to make space.
To ensure that a file will not be paged out of the cache regardless of
the minimum time, there is a `protect` context manager that will refuse
the delete the key from the cache as long as it's active.
The cache has a maximum size, however this is more of a recommendation
as opposed to a hard limit. The actual size will frequently exceed the
limit temporarily until something can be paged out.
Asking nicely with a SIGTERM doesn't cause the transcoding process(es)
to exit. Using SIGKILL gets the job done.
This was verified by manually sending SIGTERM and SIGKILL signals to
hung transcoding processes, as well as getting a client to abort stream
requests before they had completed.
Fixes#55
Defined in a dedicated 'pony database', allowing to check only this table
to determine if we need to create the tables, and so existing tables getting
a new attribute won't trigger a table creation