Caching in Large Image

In addition to caching done at the operating system level, large_image has two main caches:

Tile Cache: The tile cache stores individual images and/or numpy arrays for each tile processed.
- May be stored in Python process memory or in memcached. Other cache backends can also be added.
- The cache key is a hash that includes the tile source, tile location within the source, format and compression, and style.
- If memcached is used, cached tiles can be shared across multiple processes.
- Tiles are often bigger than what memcached was optimized for, so memcached must be configured with a larger maximum item size (its -I option; the default limit is 1 MB).
- Cached tiles can include original as-read data as well as styled or transformed data. Tiles can be synthesized for sources that are missing specific resolutions; these are also cached.
- If using memcached, memcached determines how much memory is used (and which machine it is stored on). If using the Python process, memory is limited to a fraction of total memory as reported by psutil.
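
As a rough illustration of the tile cache points above, the following sketch builds a key by hashing the fields the list names and computes a byte budget for an in-process cache. It is not large_image's actual implementation: the function names, the use of SHA-256 over a JSON payload, and the 0.25 fraction are assumptions made for this example, and total memory is passed in as a parameter rather than read from psutil.

```python
import hashlib
import json


def tile_cache_key(source_id, x, y, z, encoding, compression, style):
    """Hash the fields that identify a tile (hypothetical helper).

    The fields mirror the list above: tile source, tile location,
    format and compression, and style.
    """
    payload = json.dumps(
        [source_id, x, y, z, encoding, compression, style],
        sort_keys=True,  # make dict-valued styles hash consistently
    )
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


def python_cache_limit(total_bytes, fraction=0.25):
    """Byte budget for an in-process cache.

    total_bytes would typically be the total memory reported by psutil;
    the default fraction is an illustrative assumption.
    """
    return int(total_bytes * fraction)
```

Two requests that differ only in style hash to different keys, so styled and unstyled versions of the same tile are cached separately.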

Tile Source Cache: The source cache stores file handles, parsed metadata, and other values to optimize reading a specific large image.
- Always stored in Python process memory (not shared between processes).
- Memory use varies widely between tile sources. A per-source estimate is derived from sample files; the maximum number of sources kept in the cache is then computed from this estimate and a fraction of total memory as reported by psutil.
- The cache key includes the tile source, default tile format, and style.
- File handles and other metadata are shared if sources only differ in style (for example if ICC color correction is applied in one and not in another).
- Because file handles are shared across sources that only differ in style, a source that implements a custom __del__ method needs to check whether it is the unstyled source.
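
The source cache behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: source_cache_key, max_cached_sources, FakeHandle, SketchSource, and the 0.1 default fraction are all hypothetical names and values invented for the example.

```python
import json


def source_cache_key(path, default_format, style):
    """Hypothetical key: tile source, default tile format, and style."""
    return (path, default_format, json.dumps(style, sort_keys=True))


def max_cached_sources(total_bytes, per_source_estimate, fraction=0.1):
    """Sources to keep, from a memory fraction and a per-source estimate."""
    return max(1, int(total_bytes * fraction) // per_source_estimate)


class FakeHandle:
    """Stand-in for an open file handle."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


class SketchSource:
    """Toy source; styled instances share the unstyled instance's handle."""

    def __init__(self, path, style=None, unstyled=None):
        self.path = path
        self.style = style
        if unstyled is None:
            # The unstyled source opens and owns the handle.
            self.handle = FakeHandle(path)
        else:
            # A styled source reuses the handle instead of reopening the file.
            self.handle = unstyled.handle
        # Holding this reference keeps the unstyled source (and its handle)
        # alive for as long as the styled source exists.
        self._unstyled = unstyled

    def __del__(self):
        # Only the unstyled source closes the shared handle. The getattr
        # default covers instances whose __init__ did not complete.
        if getattr(self, '_unstyled', self) is None:
            self.handle.close()
```

In this sketch, destroying a styled source leaves the shared handle open; the handle is closed only when the unstyled source itself is destroyed.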