chronicle

Read and query chronicle data files
Experimental - Work in progress

This experimental package makes it easy to read, filter and manipulate Chronicle Parquet files.

Install

The package is not yet available on PyPI. Once it is published, you will be able to install it with:

pip install py_chronicle

You can install from GitHub:

pip install git+https://github.com/andrie/py-chronicle

How Chronicle stores data

Chronicle collects logs and metrics and stores them in a series of Parquet files.

Use read_chronicle() to read either logs or metrics by specifying the path to the set of Parquet files you need.

The file tree looks like this, with logs and metrics in separate folders inside v1.

.
└── v1/
    ├── logs/
    └── metrics/

Inside both logs and metrics the data is stored by date, separated by year, month and day.

.
└── v1/
    ├── logs/
    │   └── 2023/
    │       ├── 02/
    │       │   ├── 01
    │       │   ├── 02
    │       │   ├── 03
    │       │   ├── 04
    │       │   ├── 05
    │       │   └── ...
    │       ├── 03
    │       ├── 04
    │       └── ...
    └── metrics/
        └── 2023/
            ├── 02/
            │   ├── 01
            │   ├── 02
            │   ├── 03
            │   ├── 04
            │   ├── 05
            │   └── ...
            ├── 03
            ├── 04
            └── ...
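
Because the layout is regular, the location of one day's files follows from a base directory, a data type and a date. The sketch below builds that path with the standard library only. Both `partition_path` and `available_days` are hypothetical helpers written for illustration, not part of the package, and the zero-padded directory names are an assumption taken from the tree shown above.

```python
from datetime import date
from pathlib import Path


def partition_path(base, kind, day, version="v1"):
    """Return <base>/<version>/<kind>/YYYY/MM/DD for a given date.

    Hypothetical helper: it only mirrors the directory tree shown above.
    """
    if kind not in ("logs", "metrics"):
        raise ValueError("kind must be 'logs' or 'metrics'")
    return Path(base) / version / kind / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


def available_days(base, kind, version="v1"):
    """List the YYYY/MM/DD partitions that exist on disk, in sorted order."""
    root = Path(base) / version / kind
    days = []
    # Three directory levels below <kind>: year, month, day.
    for p in root.glob("*/*/*"):
        if p.is_dir():
            days.append(p.relative_to(root).as_posix())
    return sorted(days)
```

The date strings produced here have the same year/month/day shape as the "2023/04/03" argument used in the examples that follow.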

Working with metrics

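The examples below scan the metrics for a single day, describe the metrics that are available, and filter on a single metric by name.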

scan_chronicle_metrics("./data", "2023/04/03").head().collect()
shape: (5, 13)
service | host | os | attributes | name | description | unit | type | timestamp | value_float | value_int | value_uint | value_column
str | str | str | list[struct[2]] | str | str | str | str | datetime[ms] | f64 | i64 | u64 | str
"workbench-metr… | "rstudio-workbe… | "linux" | [] | "scrape_samples… | "The number of … | "" | "gauge" | 2023-04-03 16:02:20.574 | 69.0 | 0 | 0 | "value_float"
"workbench-metr… | "rstudio-workbe… | "linux" | [{"version","go1.14.6"}] | "go_info" | "Information ab… | "" | "gauge" | 2023-04-03 16:02:20.574 | 1.0 | 0 | 0 | "value_float"
"workbench-metr… | "rstudio-workbe… | "linux" | [] | "go_memstats_mc… | "Number of byte… | "" | "gauge" | 2023-04-03 16:02:20.574 | 16384.0 | 0 | 0 | "value_float"
"workbench-metr… | "rstudio-workbe… | "linux" | [{"host","rstudio-workbench-6b9658c77f-mn8hj"}] | "rstudio_system… | "Graphite metri… | "" | "gauge" | 2023-04-03 16:02:20.574 | 0.0 | 0 | 0 | "value_float"
"workbench-metr… | "rstudio-workbe… | "linux" | [] | "go_memstats_ms… | "Number of byte… | "" | "gauge" | 2023-04-03 16:02:20.574 | 65536.0 | 0 | 0 | "value_float"
scan_chronicle_metrics("./data", "2023/04/03").metrics.describe()
 | service | name | description | value_column
0 |  | system.cpu.time | Total CPU seconds broken down by different sta... | value_float
1 |  | system.memory.usage | Bytes of memory in use. | value_int
2 | connect-metrics | go_goroutines | Number of goroutines that currently exist. | value_float
3 | connect-metrics | go_info | Information about the Go environment. | value_float
4 | connect-metrics | go_memstats_alloc_bytes | Number of bytes allocated and still in use. | value_float
... | ... | ... | ... | ...
176 | workbench-metrics | scrape_series_added | The approximate number of new series in this s... | value_float
177 | workbench-metrics | statsd_metric_mapper_cache_gets_total | The count of total metric cache gets. | value_float
178 | workbench-metrics | statsd_metric_mapper_cache_hits_total | The count of total metric cache hits. | value_float
179 | workbench-metrics | statsd_metric_mapper_cache_length | The count of unique metrics currently cached. | value_float
180 | workbench-metrics | up | The scraping was successful | value_float

181 rows × 4 columns
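
Each metrics row carries three typed value columns (value_float, value_int, value_uint) plus a value_column field. A plausible reading, and an assumption here rather than documented behaviour, is that value_column names the column holding the actual measurement. The sketch below applies that reading to a plain Python dictionary; `metric_value` is a hypothetical helper, not part of the package.

```python
def metric_value(row):
    """Return the measurement from the column that row['value_column'] names.

    Assumption: value_column is one of value_float, value_int or value_uint.
    """
    column = row["value_column"]
    if column not in ("value_float", "value_int", "value_uint"):
        raise ValueError(f"unexpected value_column: {column!r}")
    return row[column]


# Value-related fields copied from the first row of the head() output above.
row = {
    "value_float": 69.0,
    "value_int": 0,
    "value_uint": 0,
    "value_column": "value_float",
}
print(metric_value(row))
```

If the assumption holds, the same lookup would return the integer column for a metric such as system.memory.usage, which the describe() output lists with value_int.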

scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used", "memory").head()
host timestamp rsconnect_system_memory_used

Plotting metrics

from chronicle.plot import *
scan_chronicle_metrics("./data", "2023/04/03").metrics.plot("rsconnect_system_memory_used", alias = "memory")

Working with logs

The examples below scan the logs for a single day and then filter the log entries with filter_type().

scan_chronicle_logs("./data", "2023/04/03").head().collect()
shape: (5, 6)
service | host | os | attributes | body | timestamp
str | str | str | list[struct[2]] | str | datetime[ms]
"workbench" | "rstudio-workbe… | "linux" | [{"data","120"}, {"pid","2.36E+02"}, … {"type","session_suspend"}] | "{"pid":236,"us… | 2023-04-03 18:01:26.665
"workbench" | "rstudio-workbe… | "linux" | [{"data",""}, {"pid","2.36E+02"}, … {"type","session_exit"}] | "{"pid":236,"us… | 2023-04-03 18:01:26.761
"connect" | "rstudio-connec… | "linux" | [{"user_role","publisher"}, {"user_guid","085ba4be-01b5-478b-877c-321368924c89"}, … {"type","audit"}] | "{"action":"add… | 2023-04-03 19:30:35.698
"connect" | "rstudio-connec… | "linux" | [{"log.file.name","audit.json"}, {"actor_description","Auth Provider"}, … {"entry_id","3.032E+03"}] | "{"action":"add… | 2023-04-03 19:30:35.698
"connect" | "rstudio-connec… | "linux" | [{"action","add_group_member"}, {"actor_id","0E+00"}, … {"log.file.name","audit.json"}] | "{"action":"add… | 2023-04-03 19:30:35.698
scan_chronicle_logs("./data", "2023/04/03").logs.filter_type("username")
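
The attributes column in the log output above is a list of two-field structs that read as key/value pairs, and some numeric values arrive as strings in scientific notation (for example "2.36E+02"). The sketch below shows one way to work with such pairs once they are in plain Python. It assumes each pair has already been converted to a (key, value) tuple; `attributes_to_dict` and `as_number` are hypothetical helpers, not part of the package.

```python
def attributes_to_dict(attributes):
    """Turn a list of (key, value) pairs into a dict; later keys win."""
    return {key: value for key, value in attributes}


def as_number(text):
    """Parse a numeric string such as '2.36E+02'; return None if not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


# Pairs copied from the first log row above (the elided middle is omitted).
attrs = attributes_to_dict(
    [("data", "120"), ("pid", "2.36E+02"), ("type", "session_suspend")]
)
print(attrs["type"])            # session_suspend
print(as_number(attrs["pid"]))  # 236.0
```

Converting to a dictionary makes it straightforward to select entries by their "type" attribute without relying on the order of the pairs.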