Graphite Tag Support
From the release of the 1.1.x series, Graphite supports storing data using tags to identify each series. This allows for much more flexibility than the traditional hierarchical layout. When using tag support, each series is uniquely identified by its name and set of tag/value pairs.
To enter tagged series into Graphite, pass them to Carbon with the tags appended to the series name, separated by semicolons:
my.series;tag1=value1;tag2=value2
Carbon will automatically decode the tags, normalize the tag order, and register the series in the tag database.
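As an illustration, a tagged series name can be assembled and written to Carbon along the following lines. This is a minimal sketch, not an official client: the helper names tagged_path, plaintext_line and send_metric are hypothetical, and it assumes Carbon's plaintext listener is reachable on localhost port 2003.

```python
import socket
import time

def tagged_path(name, tags):
    """Append tags to a series name as name;tag1=value1;tag2=value2."""
    return ";".join([name] + [f"{tag}={value}" for tag, value in tags.items()])

def plaintext_line(path, value, timestamp):
    """Format one plaintext-protocol line: path, value, timestamp."""
    return f"{path} {value} {int(timestamp)}\n"

def send_metric(path, value, host="localhost", port=2003):
    # Assumed host and port; adjust to match your Carbon plaintext listener.
    line = plaintext_line(path, value, time.time())
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

# Tag order does not matter here, since Carbon normalizes it on receipt.
path = tagged_path("disk.used", {"rack": "a1", "datacenter": "dc1", "server": "web01"})
print(path)  # disk.used;rack=a1;datacenter=dc1;server=web01
```

Calling send_metric(path, 42) would then submit a single data point for that series.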
When querying tagged series, we start with the seriesByTag function:
# find all series that have tag1 set to value1
seriesByTag('tag1=value1')
This function returns a seriesList that can then be used by any other Graphite functions:
# find all series that have tag1 set to value1, sorted by total
seriesByTag('tag1=value1') | sortByTotal()
The seriesByTag function supports specifying any number of tag expressions to refine the list of matches. When multiple tag expressions are specified, only series that match all the expressions will be returned.
Tag expressions are strings, and may have the following formats:
tag=spec      tag value exactly matches spec
tag!=spec     tag value does not exactly match spec
tag=~spec     tag value matches the regular expression spec
tag!=~spec    tag value does not match the regular expression spec
Any tag spec that matches an empty value is considered to match series that don’t have that tag, and at least one tag spec must require a non-empty value.
Regular expression conditions are treated as being anchored at the start of the value.
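The matching rules above can be mimicked in a few lines of Python. This is only an illustrative sketch, not Graphite's actual implementation: the function names are hypothetical, a series is modelled as a plain dict of tags (with the series name stored under name), and the rule that at least one expression must require a non-empty value is not enforced.

```python
import re

# Operators are listed longest-first so "!=~" is not mistaken for "!=".
_EXPR = re.compile(r"^([^=!]+)(!=~|=~|!=|=)(.*)$")

def matches(expression, tags):
    """Return True if one tag expression matches a series' tags."""
    parsed = _EXPR.match(expression)
    if parsed is None:
        raise ValueError(f"invalid tag expression: {expression!r}")
    tag, op, spec = parsed.groups()
    value = tags.get(tag, "")  # a missing tag behaves like an empty value
    if op == "=":
        return value == spec
    if op == "!=":
        return value != spec
    # re.match anchors the pattern at the start of the value only.
    found = re.match(spec, value) is not None
    return found if op == "=~" else not found

def series_matches(expressions, tags):
    """A series is returned only if every expression matches."""
    return all(matches(expression, tags) for expression in expressions)

tags = {"name": "cpu.load", "tag1": "value2"}
print(series_matches([r"name=~cpu\..*", "tag1!=value1"], tags))  # True
print(matches("name=~load", tags))  # False: "load" is not at the start
print(matches("rack=", tags))       # True: the series has no rack tag
```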
A more complex example:
# find all series where name matches the regular expression cpu\..*, AND tag1 is not value1
seriesByTag('name=~cpu\..*', 'tag1!=value1')
Once you have selected a seriesList, it is possible to group series together using the groupByTags function, which operates on tags in the same way that groupByNodes works on nodes within a traditional naming hierarchy.
# get a list of disk space used per datacenter for all webheads
seriesByTag('name=disk.used', 'server=~web.*') | groupByTags('sumSeries', 'datacenter')

# given series like:
# disk.used;datacenter=dc1;rack=a1;server=web01
# disk.used;datacenter=dc1;rack=b2;server=web02
# disk.used;datacenter=dc2;rack=c3;server=web01
# disk.used;datacenter=dc2;rack=d4;server=web02

# will return the following new series, each containing the sum of the values for that datacenter:
# disk.used;datacenter=dc1
# disk.used;datacenter=dc2
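To make the grouping step concrete, the following sketch reproduces the effect of grouping by the datacenter tag and summing, in plain Python. It is not how Graphite implements groupByTags; the function names are hypothetical and the data points are made-up sample values.

```python
from collections import defaultdict

def parse_path(path):
    """Split 'name;tag=value;...' into (name, {tag: value})."""
    name, *pairs = path.split(";")
    return name, dict(pair.split("=", 1) for pair in pairs)

def group_sum_by_tags(series, *group_tags):
    """Sum series point by point, grouped by the values of the given tags."""
    groups = defaultdict(list)
    for path, values in series.items():
        name, tags = parse_path(path)
        # The new series name keeps only the tags that were grouped on.
        key = ";".join([name] + [f"{tag}={tags[tag]}" for tag in group_tags])
        groups[key].append(values)
    return {key: [sum(points) for points in zip(*members)]
            for key, members in groups.items()}

# Made-up sample values, for illustration only.
series = {
    "disk.used;datacenter=dc1;rack=a1;server=web01": [10, 20],
    "disk.used;datacenter=dc1;rack=b2;server=web02": [1, 2],
    "disk.used;datacenter=dc2;rack=c3;server=web01": [5, 5],
    "disk.used;datacenter=dc2;rack=d4;server=web02": [7, 8],
}
result = group_sum_by_tags(series, "datacenter")
print(result)
# {'disk.used;datacenter=dc1': [11, 22], 'disk.used;datacenter=dc2': [12, 13]}
```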
Series names can be formatted for display with the aliasByTags function:
# given series like:
# disk.used;datacenter=dc1;rack=a1;server=web01
# disk.used;datacenter=dc1;rack=b2;server=web02

# format series names using the server tag and the name:
seriesByTag('name=disk.used','datacenter=dc1') | aliasByTags('server', 'name')

# will return
# web01.disk.used
# web02.disk.used
As Whisper and other storage backends are designed to hold simple time-series data (metric key, value, and timestamp), Graphite stores tag information in a separate tag database (TagDB). The TagDB is a pluggable store. By default it uses the Graphite SQLite, MySQL or PostgreSQL database, but it can also be configured to use an external Redis server or a custom plugin.
Tag support requires Graphite webapp & carbon version 1.1.0 or newer.
Local Database TagDB
The Local TagDB stores tag information in tables inside the graphite-web database. It supports SQLite, MySQL and Postgres, and is enabled by default.
Redis TagDB
The Redis TagDB stores the tag information on a Redis server, and is selected by setting TAGDB='graphite.tags.redis.RedisTagDB' in local_settings.py. There are 3 additional config settings for the Redis TagDB:
TAGDB_REDIS_HOST = 'localhost'
TAGDB_REDIS_PORT = 6379
TAGDB_REDIS_DB = 0
The default settings (above) will connect to a local Redis server on the default port, and use the default database.
HTTP(S) TagDB
The HTTP(S) TagDB is used to delegate all tag operations to an external server that implements the Graphite tagging HTTP API. It can be used in clustered Graphite scenarios, or with custom data stores. It is selected by setting TAGDB='graphite.tags.http.HttpTagDB' in local_settings.py. There are 3 additional config settings for the HTTP(S) TagDB:
TAGDB_HTTP_URL = 'https://another.server'
TAGDB_HTTP_USER = ''
TAGDB_HTTP_PASSWORD = ''
TAGDB_HTTP_URL is required. TAGDB_HTTP_USER and TAGDB_HTTP_PASSWORD are optional and, if specified, will be used to send a Basic Authorization header in all requests.
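For reference, an HTTP Basic Authorization header carries the user and password joined by a colon and base64-encoded. The sketch below shows the general construction only; it is not taken from graphite-web, and the function name and credentials are hypothetical.

```python
import base64

def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"

# Hypothetical credentials, for illustration only.
header = basic_auth_header("graphite", "secret")
print(header)
```

Because base64 is an encoding rather than encryption, an https:// URL is the safer choice whenever a user and password are configured.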
Adding Series to the TagDB
Normally carbon takes care of this: it submits all new series to the TagDB, and periodically re-submits all series to ensure that the TagDB is kept up to date. There are 2 carbon configuration settings related to tagging: the GRAPHITE_URL setting specifies the URL of your graphite-web installation (default http://127.0.0.1:8000), and the TAG_UPDATE_INTERVAL setting specifies how often each series should be re-submitted to the TagDB (default is every 100th update).
Series can be submitted via HTTP POST using command-line tools such as curl or with a variety of HTTP programming libraries.
$ curl -X POST "http://graphite/tags/tagSeries" \
  --data-urlencode 'path=disk.used;rack=a1;datacenter=dc1;server=web01'

"disk.used;datacenter=dc1;rack=a1;server=web01"
This endpoint returns the canonicalized version of the path, with the tags sorted in alphabetical order.
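The same request can be built with Python's standard library. This is a minimal sketch under a few assumptions: the helper names are hypothetical, http://graphite stands in for your graphite-web address, and canonical_path is only a local approximation of the tag ordering described above, not the server's code.

```python
import urllib.parse
import urllib.request

def canonical_path(path):
    """Approximate the canonical form: tags sorted alphabetically by tag name."""
    name, *tags = path.split(";")
    return ";".join([name] + sorted(tags, key=lambda tag: tag.split("=", 1)[0]))

def tag_series_request(base_url, path):
    """Build (but do not send) a POST request to the tagSeries endpoint."""
    data = urllib.parse.urlencode({"path": path}).encode("ascii")
    return urllib.request.Request(f"{base_url}/tags/tagSeries", data=data, method="POST")

path = "disk.used;rack=a1;datacenter=dc1;server=web01"
request = tag_series_request("http://graphite", path)
print(canonical_path(path))  # disk.used;datacenter=dc1;rack=a1;server=web01

# To actually submit the series and read the response:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```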
Removing Series from the TagDB
When a series is deleted from the data store (for example, by deleting .wsp files from the whisper storage folders), it should also be removed from the tag database. Having series in the tag database that don’t exist in the data store won’t cause any problems with graphing, but will cause the system to do work that isn’t needed during the graph rendering, so it is recommended that the tag database be cleaned up when series are removed from the data store.
Series can be deleted via HTTP POST to the /tags/delSeries endpoint:
$ curl -X POST "http://graphite/tags/delSeries" \
  --data-urlencode 'path=disk.used;datacenter=dc1;rack=a1;server=web01'

true