scan_chronicle("./data", "metrics", "2023/04/03")core
Scan chronicle parquet files
Chronicle collects and stores logs and metrics in a series of parquet files.
Use scan_chronicle_logs() to read logs and scan_chronicle_metrics() to read metrics, specifying the path to the parquet set you need.
The file tree looks like this, with logs and metrics in separate folders inside v1.
.
└── v1/
    ├── logs/
    └── metrics/
Inside both logs and metrics, the data is stored by date, separated by year, month and day.
.
└── v1/
    ├── logs/
    │   └── 2023/
    │       ├── 02/
    │       │   ├── 01
    │       │   ├── 02
    │       │   ├── 03
    │       │   ├── 04
    │       │   ├── 05
    │       │   └── ...
    │       ├── 03
    │       ├── 04
    │       └── ...
    └── metrics/
        └── 2023/
            ├── 02/
            │   ├── 01
            │   ├── 02
            │   ├── 03
            │   ├── 04
            │   ├── 05
            │   └── ...
            ├── 03
            ├── 04
            └── ...
Using the scan interface
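Conceptually, the scan functions just resolve a leaf directory of the form <path>/<version>/<logs|metrics>/<YYYY>/<MM>/<DD> from the layout above and hand the parquet files in it to Polars. The helper below is only an illustration of that idea (scan_day is not part of the package):
from pathlib import Path
import polars as pl

def scan_day(path, type, date, version="v1"):
    # e.g. ./data/v1/metrics/2023/04/03/*.parquet
    leaf = Path(path) / version / type / date
    return pl.scan_parquet(str(leaf / "*.parquet"))

scan_day("./data", "metrics", "2023/04/03")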
scan_chronicle
scan_chronicle (path:str, type:str='', date:str=None, filename:str=None, version:str='v1')
Read a chronicle parquet file into a polars LazyFrame.
|  | Type | Default | Details |
|---|---|---|---|
| path | str |  | Path to dataset |
| type | str | '' | must be metrics or logs |
| date | str | None | date in format YYYY/MM/DD |
| filename | str | None | name of parquet file; if empty, it will be inferred |
| version | str | v1 | currently must be v1 |
| Returns | LazyFrame |  |  |
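Because scan_chronicle returns a LazyFrame, you can chain filters and projections and defer all work until collect(). A small sketch, assuming a chronicle dataset under ./data and using the metric columns shown further down this page (whether value_float or another value_* column holds a given metric's values is reported by .metrics.describe()):
import polars as pl

lf = scan_chronicle("./data", "metrics", "2023/04/03")
memory = (
    lf.filter(pl.col("name") == "rsconnect_system_memory_used")  # metric name used in the examples below
      .select(["host", "timestamp", "value_float"])              # value_float assumed to be this metric's value column
      .collect()                                                 # nothing is read until here
)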
scan_chronicle_logs
scan_chronicle_logs (path:str, date:str=None, version:str='v1')
Read a chronicle logs parquet file into a polars LazyFrame.
|  | Type | Default | Details |
|---|---|---|---|
| path | str |  | Path to dataset |
| date | str | None | date in format YYYY/MM/DD |
| version | str | v1 | currently must be v1 |
| Returns | LazyFrame |  |  |
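The logs variant behaves the same way, so you can aggregate lazily before collecting. A sketch, assuming a recent Polars version and the log columns shown in the example below:
import polars as pl

logs = scan_chronicle_logs("./data", "2023/04/03")
per_service = (
    logs.group_by("service")            # 'service' is one of the log columns shown below
        .agg(pl.len().alias("records"))
        .collect()
)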
scan_chronicle_metrics
scan_chronicle_metrics (path:str, date:str=None, version:str='v1')
Read a chronicle metrics parquet file into a polars LazyFrame.
|  | Type | Default | Details |
|---|---|---|---|
| path | str |  | Path to dataset |
| date | str | None | date in format YYYY/MM/DD |
| version | str | v1 | currently must be v1 |
| Returns | LazyFrame |  |  |
z = scan_chronicle_metrics("./data", "2023/04/03")
assert type(z) == pl.LazyFrame
assert z.collect().columns == [
'service',
'host',
'os',
'attributes',
'name',
'description',
'unit',
'type',
'timestamp',
'value_float',
'value_int',
'value_uint',
'value_column'
]
z = scan_chronicle_logs("./data", "2023/04/03")
assert type(z) == pl.LazyFrame
assert z.collect().columns == [
'service',
'host',
'os',
'attributes',
'body',
'timestamp'
]
Analyse metrics
ChronicleMetrics
ChronicleMetrics (ldf:polars.lazyframe.frame.LazyFrame)
Initialise a chronicle metrics class
|  | Type | Details |
|---|---|---|
| ldf | LazyFrame | A polars LazyFrame |
| Returns | LazyFrame |  |
Use .metrics.describe() to get a DataFrame of the unique metrics in the metrics data, containing the service, name and description of each metric.
ChronicleMetrics.describe
ChronicleMetrics.describe ()
Reads the metrics LazyFrame and returns a pandas DataFrame summarising the service, name and description of all metrics.
m = scan_chronicle_metrics("./data", "2023/04/03").metrics.describe()
assert list(m) == ['service', 'name', 'description', 'value_column']
m
Use .metrics.filter() to filter the DataFrame on the name column.
ChronicleMetrics.filter
ChronicleMetrics.filter (name:str, service:str=None, alias:str=None)
Extract a single metric from a metrics dataframe
|  | Type | Default | Details |
|---|---|---|---|
| name | str |  | name of metric to extract |
| service | str | None | service to extract metric from |
| alias | str | None | alias to use for new column |
| Returns | DataFrame |  |  |
m = scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used")
assert type(m) == pd.DataFrame
assert list(m) == ['host', 'timestamp', 'rsconnect_system_memory_used']
m = scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used", alias="memory")
assert type(m) == pd.DataFrame
assert list(m) == ['host', 'timestamp', 'memory']
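Since .metrics.filter() hands back an ordinary pandas DataFrame, the result plugs straight into standard pandas analysis; for instance, continuing from the aliased result above (a sketch):
m.groupby("host")["memory"].mean()  # average memory reading per host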
m = scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used", service="connect-metrics", alias="memory")
assert type(m) == pd.DataFrame
assert list(m) == ['host', 'timestamp', 'memory']
Analyse logs
ChronicleLogs
ChronicleLogs (df:polars.dataframe.frame.DataFrame)
Initialise a chronicle logs DataFrame
|  | Type | Details |
|---|---|---|
| df | DataFrame | A polars DataFrame |
| Returns | DataFrame |  |
Filter logs on type
You can use .logs.filter_type() to filter logs on the type column.
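# a quick look at the nested attributes column of a single log record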
scan_chronicle_logs("./data").head(1).explode("attributes").collect()ChronicleLogs.filter_type
ChronicleLogs.filter_type (value:str)
Extract all logs where type == value
|  | Type | Details |
|---|---|---|
| value | str | Value to extract |
| Returns | DataFrame |  |
# scan_chronicle_logs("./data").logs.filter_type("auth_login")
logs = scan_chronicle_logs("./data").logs.filter_type("username")
assert type(logs) == pl.DataFrame
Unique Connect actions
ChronicleLogs.unique_connect_actions
ChronicleLogs.unique_connect_actions ()
Extract a sample of unique connect actions
scan_chronicle_logs("./data").logs.unique_connect_actions()
# assert type(logs) == pl.DataFrame
Connect logins
ChronicleLogs.connect_logins
ChronicleLogs.connect_logins ()
Extract Connect login logs
path = "./data"
scan_chronicle_logs(path).logs.connect_logins()
Extract Connect audit logs
ChronicleLogs.extract_connect_audit_logs
ChronicleLogs.extract_connect_audit_logs (type:str)
Extract Connect audit logs
path = "./data"
scan_chronicle_logs(path).logs.extract_connect_audit_logs("user_login")
Unique Workbench types
ChronicleLogs.unique_workbench_types
ChronicleLogs.unique_workbench_types ()
Extract a sample of unique workbench types
scan_chronicle_logs("./data").logs.unique_workbench_types()
# assert type(logs) == pl.DataFrame
Workbench logins
ChronicleLogs.workbench_logins
ChronicleLogs.workbench_logins ()
Extract Workbench login logs
path = "./data"
scan_chronicle_logs(path).logs.workbench_logins()
Extract Workbench audit logs
ChronicleLogs.extract_workbench_audit_logs
ChronicleLogs.extract_workbench_audit_logs (type:str)
Extract Workbench audit logs
path = "./data"
scan_chronicle_logs(path).logs.extract_workbench_audit_logs("session_start")
ChronicleLogs.extract_workbench_audit_cols
ChronicleLogs.extract_workbench_audit_cols (type:str)
Extract Workbench audit columns
scan_chronicle_logs("./data").logs.extract_workbench_audit_cols("session_quit")