PK }C_m˜R R graphite-0.9.11/whisper.html
Whisper is a fixed-size database, similar in design and purpose to RRD (round-robin-database). It provides fast, reliable storage of numeric data over time. Whisper allows for higher resolution (seconds per point) of recent data to degrade into lower resolutions for long-term retention of historical data.
Data points in Whisper are stored on-disk as big-endian double-precision floats. Each value is paired with a timestamp in seconds since the UNIX Epoch (01-01-1970). The data value is parsed by the Python float() function and as such behaves in the same way for special strings such as 'inf'. Maximum and minimum values are determined by the Python interpreter’s allowable range for float values which can be found by executing:
python -c 'import sys; print sys.float_info'
Whisper databases contain one or more archives, each with a specific data resolution and retention (defined in number of points or max timestamp age). Archives are ordered from the highest-resolution and shortest retention archive to the lowest-resolution and longest retention period archive.
To support accurate aggregation from higher to lower resolution archives, the precision of a longer retention archive must be divisible by precision of next lower retention archive. For example, an archive with 1 data point every 60 seconds can have a lower-resolution archive following it with a resolution of 1 data point every 300 seconds because 60 cleanly divides 300. In contrast, a 180 second precision (3 minutes) could not be followed by a 600 second precision (10 minutes) because the ratio of points to be propagated from the first archive to the next would be 3 1/3 and Whisper will not do partial point interpolation.
The total retention time of the database is determined by the archive with the highest retention as the time period covered by each archive is overlapping (see Multi-Archive Storage and Retrieval Behavior). That is, a pair of archives with retentions of 1 month and 1 year will not provide 13 months of data storage as may be guessed. Instead, it will provide 1 year of storage - the length of it’s longest archive.
Whisper databases with more than a single archive need a strategy to collapse multiple data points for when the data rolls up a lower precision archive. By default, an average function is used. Available aggregation methods are: * average * sum * last * max * min
When Whisper writes to a database with multiple archives, the incoming data point is written to all archives at once. The data point will be written to the lowest resolution archive as-is, and will be aggregated by the configured aggregation method (see Rollup Aggregation) and placed into each of the higher-retention archives.
When data is retrieved (scoped by a time range), the first archive which can satisfy the entire time period is used. If the time period overlaps an archive boundary, the lower-resolution archive will be used. This allows for a simpler behavior while retrieving data as the data’s resolution is consistent through an entire returned series.
Whisper is somewhat inefficient in its usage of disk space because of certain design choices:
Whisper is fast enough for most purposes. It is slower than RRDtool primarily as a consequence of Whisper being written in Python, while RRDtool is written in C. The speed difference between the two in practice is quite small as much effort was spent to optimize Whisper to be as close to RRDtool’s speed as possible. Testing has shown that update operations take anywhere from 2 to 3 times as long as RRDtool, and fetch operations take anywhere from 2 to 5 times as long. In practice the actual difference is measured in hundreds of microseconds (10^-4) which means less than a millisecond difference for simple cases.
WhisperFile | Header,Data | |||
Header | Metadata,ArchiveInfo+ | |||
Metadata | aggregationType,maxRetention,xFilesFactor,archiveCount | |||
ArchiveInfo | Offset,SecondsPerPoint,Points | |||
Data | Archive+ | |||
Archive | Point+ | |||
Point | timestamp,value |
Data types in Python’s struct format:
Metadata | !2LfL |
ArchiveInfo | !3L |
Point | !Ld |
...
When we talk about “Carbon” we mean one or more of various daemons that make up the storage backend of a Graphite installation. In simple installations, there is typically only one daemon, carbon-cache.py. This document gives a brief overview of what each daemon does and how you can use them to build a more sophisticated storage backend.
All of the carbon daemons listen for time-series data and can accept it over a common set of protocols. However, they differ in what they do with the data once they receive it.
carbon-cache.py accepts metrics over various protocols and writes them to disk as efficiently as possible. This requires caching metric values in RAM as they are received, and flushing them to disk on an interval using the underlying whisper library.
carbon-cache.py requires some basic configuration files to run:
As the number of incoming metrics increases, one carbon-cache.py instance may not be enough to handle the I/O load. To scale out, simply run multiple carbon-cache.py instances (on one or more machines) behind a carbon-aggregator.py or carbon-relay.py.
Warning
If clients connecting to the carbon-cache.py are experiencing errors such as connection refused by the daemon, a common reason is a shortage of file descriptors.
In the console.log file, if you find presence of:
Could not accept new connection (EMFILE)
or
exceptions.IOError: [Errno 24] Too many open files: '/var/lib/graphite/whisper/systems/somehost/something.wsp'
the number of files carbon-cache.py can open will need to be increased. Many systems default to a max of 1024 file descriptors. A value of 8192 or more may be necessary depending on how many clients are simultaneously connecting to the carbon-cache.py daemon.
In Linux, the system-global file descriptor max can be set via sysctl. Per-process limits are set via ulimit. See documentation for your operating system distribution for details on how to set these values.
carbon-relay.py serves two distinct purposes: replication and sharding.
When running with RELAY_METHOD = rules, a carbon-relay.py instance can run in place of a carbon-cache.py server and relay all incoming metrics to multiple backend carbon-cache.py‘s running on different ports or hosts.
In RELAY_METHOD = consistent-hashing mode, a DESTINATIONS setting defines a sharding strategy across multiple carbon-cache.py backends. The same consistent hashing list can be provided to the graphite webapp via CARBONLINK_HOSTS to spread reads across the multiple backends.
carbon-relay.py is configured via:
carbon-aggregator.py can be run in front of carbon-cache.py to buffer metrics over time before reporting them into whisper. This is useful when granular reporting is not required, and can help reduce I/O load and whisper file sizes due to lower retention policies.
carbon-aggregator.py is configured via: