Cartography operations guide

This document contains tips for running Cartography in production.

Maintaining an up-to-date picture of your infrastructure

Running cartography ensures that your Neo4j instance contains the most recent snapshot of your infrastructure. Here’s how that process works.

Update tags

Each sync run has an update_tag associated with it, which is the Unix timestamp of when the sync started. See our docs for more details.

Cleanup jobs

Each node and relationship created or updated during a sync has its lastupdated field set to that run's update_tag. At the end of the sync run, a cleanup job deletes the nodes and relationships whose lastupdated value does not match the current update_tag, because they are considered stale.
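The interaction between update tags and cleanup can be sketched with a toy in-memory "graph". This is only an illustration of the idea described above, not Cartography's implementation: the dictionary, the `sync` and `cleanup` helpers, the instance ids, and the timestamps are all hypothetical stand-ins for Neo4j nodes and real sync runs.

```python
def sync(graph, discovered_ids, update_tag):
    """Create or update every discovered item and stamp it with this run's tag."""
    for item_id in discovered_ids:
        graph.setdefault(item_id, {})["lastupdated"] = update_tag


def cleanup(graph, update_tag):
    """Delete items the current run did not touch and return their ids."""
    stale = [i for i, props in graph.items() if props["lastupdated"] != update_tag]
    for item_id in stale:
        del graph[item_id]
    return stale


# Toy "graph" keyed by hypothetical instance ids.
graph = {}

# First run: both instances exist, so nothing is stale afterwards.
first_tag = 1700000000  # hypothetical Unix timestamp of the first run
sync(graph, ["i-aaa", "i-bbb"], first_tag)
cleanup(graph, first_tag)

# Second run: i-bbb no longer exists, so it keeps the old tag and is removed.
second_tag = 1700003600  # hypothetical Unix timestamp of the second run
sync(graph, ["i-aaa"], second_tag)
removed = cleanup(graph, second_tag)
print(removed)  # ['i-bbb']
```

Because staleness is decided by comparing against the current run's tag, anything a run fails to touch is removed at the end of that run. A sync that only partially completes can therefore look like a deletion of whatever it did not reach.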

Sync frequency

To keep data current, run cartography on a schedule, for example as a cron job on Linux or a scheduled task on Windows. Choose the interval based on how fresh your data needs to be.
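Where a system scheduler is not available (for example inside a single long-running container), a small wrapper loop is one possible alternative. The sketch below is an assumption-laden illustration, not a recommended deployment: it assumes `cartography` is on the PATH, that Neo4j is reachable at the placeholder URI shown, and that a six-hour interval suits your freshness needs. The helper names are hypothetical.

```python
import subprocess
import time

# Assumption: cartography is installed and Neo4j listens at this placeholder URI.
SYNC_COMMAND = ["cartography", "--neo4j-uri", "bolt://localhost:7687"]


def seconds_until_next_run(interval_seconds, elapsed_seconds):
    """Return how long to sleep so runs start roughly every interval_seconds.

    If the sync took longer than the interval, start the next one immediately.
    """
    return max(0.0, interval_seconds - elapsed_seconds)


def run_forever(interval_seconds, command=SYNC_COMMAND):
    """Run the sync command repeatedly, one run at a time, never overlapping."""
    while True:
        started = time.monotonic()
        result = subprocess.run(command, check=False)
        if result.returncode != 0:
            print(f"sync exited with status {result.returncode}")
        elapsed = time.monotonic() - started
        time.sleep(seconds_until_next_run(interval_seconds, elapsed))


# Example (not executed here): sync roughly every six hours.
# run_forever(6 * 60 * 60)
```

Running syncs sequentially in one loop means a slow sync delays the next one instead of overlapping with it, which is a property a plain cron entry does not give you by default.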

Observability

statsd

Cartography can send metrics to a statsd server. Pass the --statsd-enabled flag when running cartography to record sync execution times and send them to a statsd server, which is 127.0.0.1:8125 by default. Use --statsd-host and --statsd-port to change the destination, and --statsd-prefix to make these metrics easier to find in your own environment.
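To check that metrics are actually arriving before pointing Cartography at a real statsd server, a throwaway listener on the default address can help. The sketch below assumes metrics arrive as plain-text statsd datagrams over UDP in the usual `name:value|type` form. The function names and the metric names in the comments are hypothetical, and the real metric names depend on your --statsd-prefix and on Cartography itself.

```python
import socket


def parse_statsd_line(line):
    """Split 'name:value|type' into (name, value, type).

    Anything after the first '|' (such as a sample rate) stays in the type field.
    """
    name, _, rest = line.partition(":")
    value, _, metric_type = rest.partition("|")
    return name, value, metric_type


def parse_datagram(data):
    """Parse one UDP payload, which may carry several newline-separated metrics."""
    lines = data.decode("utf-8").splitlines()
    return [parse_statsd_line(line) for line in lines if line]


def open_listener(host="127.0.0.1", port=8125):
    """Bind a UDP socket on the address cartography sends to by default."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_metrics(sock, bufsize=4096):
    """Block until one datagram arrives and return its parsed metrics."""
    data, _addr = sock.recvfrom(bufsize)
    return parse_datagram(data)


# Example (not executed here): print metrics as they arrive.
# sock = open_listener()
# while True:
#     for name, value, metric_type in receive_metrics(sock):
#         print(name, value, metric_type)
```

Run the listener in one terminal and a sync with --statsd-enabled in another. If nothing prints, check that the host and port match what the sync was given.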

Docker image

A production-ready Docker image is available in GitHub Container Registry. We recommend that you avoid the :latest tag and instead pin the tag or digest of the release version you want, for example:

docker pull ghcr.io/lyft/cartography:0.61.0

This image can then be run with any of your desired command line flags:

docker run --rm ghcr.io/lyft/cartography:0.61.0 --help