core

Scan and query chronicle parquet files.

Scan chronicle parquet files

Chronicle collects and stores logs and metrics in a series of parquet files.

Use scan_chronicle_logs() to read logs and scan_chronicle_metrics() to read metrics, specifying the path to the parquet set you need.

The file tree looks like this, with logs and metrics in separate folders inside v1.

.
└── v1/
    ├── logs/
    └── metrics/

Inside both logs and metrics, the data is partitioned by date: first by year, then by month, then by day.

.
└── v1/
    ├── logs/
    │   └── 2023/
    │       ├── 02/
    │       │   ├── 01
    │       │   ├── 02
    │       │   ├── 03
    │       │   ├── 04
    │       │   ├── 05
    │       │   └── ...
    │       ├── 03
    │       ├── 04
    │       └── ...
    └── metrics/
        └── 2023/
            ├── 02/
            │   ├── 01
            │   ├── 02
            │   ├── 03
            │   ├── 04
            │   ├── 05
            │   └── ...
            ├── 03
            ├── 04
            └── ...
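Put together, a date in YYYY/MM/DD format maps directly onto this directory layout. A minimal sketch of building such a partition path with the standard library (partition_path is a hypothetical helper, not part of the package):

```python
from pathlib import Path

def partition_path(root: str, kind: str, date: str, version: str = "v1") -> Path:
    """Build the directory path for one day of chronicle data.

    kind is "logs" or "metrics"; date is a "YYYY/MM/DD" string.
    """
    if kind not in ("logs", "metrics"):
        raise ValueError("kind must be 'logs' or 'metrics'")
    return Path(root) / version / kind / date

print(partition_path("./data", "logs", "2023/02/01").as_posix())
# → data/v1/logs/2023/02/01
```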

Using the scan interface


source

scan_chronicle

 scan_chronicle (path:str, type:str='', date:str=None, filename:str=None,
                 version:str='v1')

Read a chronicle parquet file into a polars LazyFrame.

Type Default Details
path str Path to dataset
type str must be metrics or logs
date str None date in format YYYY/MM/DD
filename str None name of parquet file. If empty, will be inferred.
version str v1 currently must be v1
Returns LazyFrame
scan_chronicle("./data", "metrics", "2023/04/03")
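The date argument must be a YYYY/MM/DD string; a malformed date simply points at a partition that does not exist. One way to fail fast, sketched with the standard library (validate_date is a hypothetical helper, not part of the package):

```python
from datetime import datetime

def validate_date(date: str) -> str:
    # Raises ValueError if date is not a real date in YYYY/MM/DD form.
    datetime.strptime(date, "%Y/%m/%d")
    return date

print(validate_date("2023/04/03"))  # → 2023/04/03
```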

source

scan_chronicle_logs

 scan_chronicle_logs (path:str, date:str=None, version:str='v1')

Read a chronicle logs parquet file into a polars LazyFrame.

Type Default Details
path str Path to dataset
date str None date in format YYYY/MM/DD
version str v1 currently must be v1
Returns LazyFrame

source

scan_chronicle_metrics

 scan_chronicle_metrics (path:str, date:str=None, version:str='v1')

Read a chronicle metrics parquet file into a polars LazyFrame.

Type Default Details
path str Path to dataset
date str None date in format YYYY/MM/DD
version str v1 currently must be v1
Returns LazyFrame
z = scan_chronicle_metrics("./data", "2023/04/03")
assert type(z) == pl.LazyFrame
assert z.collect().columns == [
    'service',
    'host',
    'os',
    'attributes',
    'name',
    'description',
    'unit',
    'type',
    'timestamp',
    'value_float',
    'value_int',
    'value_uint',
    'value_column'
]
z = scan_chronicle_logs("./data", "2023/04/03")
assert type(z) == pl.LazyFrame
assert z.collect().columns == [
    'service', 
    'host', 
    'os', 
    'attributes', 
    'body', 
    'timestamp'
]

Analyse metrics


source

ChronicleMetrics

 ChronicleMetrics (ldf:polars.lazyframe.frame.LazyFrame)

Initialise a chronicle metrics class

Type Details
ldf LazyFrame A polars LazyFrame
Returns LazyFrame

Use .metrics.describe() to get a DataFrame of the unique metrics in the metrics data, containing the service, name, description and value column of each metric.
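Conceptually, this boils down to deduplicating the identity columns of the metrics rows. A plain-Python sketch of that deduplication (not the package implementation):

```python
# Sketch: unique (service, name, description) combinations from metric rows.
rows = [
    {"service": "connect", "name": "mem_used", "description": "Memory used"},
    {"service": "connect", "name": "mem_used", "description": "Memory used"},
    {"service": "connect", "name": "cpu_used", "description": "CPU used"},
]

unique = []
for r in rows:
    if r not in unique:  # keep first occurrence of each combination
        unique.append(r)

print(len(unique))  # → 2
```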


source

ChronicleMetrics.describe

 ChronicleMetrics.describe ()

Reads the metrics data and returns a pandas DataFrame summarising the service, name and description of every metric

m = scan_chronicle_metrics("./data", "2023/04/03").metrics.describe()
assert list(m) == ['service', 'name', 'description', 'value_column']
m

Use .metrics.filter() to filter the DataFrame on the name column.
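Conceptually, the filter step keeps only the rows for one metric, then exposes its value under the metric name (or an alias). A plain-Python sketch of that logic, not the package implementation:

```python
# Sketch of filter-and-alias over rows shaped like the metrics data.
rows = [
    {"host": "a", "timestamp": 1, "name": "mem_used", "value_int": 42},
    {"host": "a", "timestamp": 1, "name": "cpu_used", "value_int": 7},
]

def filter_metric(rows, name, alias=None):
    out = alias or name  # column name used in the result
    return [
        {"host": r["host"], "timestamp": r["timestamp"], out: r["value_int"]}
        for r in rows
        if r["name"] == name
    ]

print(filter_metric(rows, "mem_used", alias="memory"))
# → [{'host': 'a', 'timestamp': 1, 'memory': 42}]
```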


source

ChronicleMetrics.filter

 ChronicleMetrics.filter (name:str, service:str=None, alias:str=None)

Extract a single metric from a metrics dataframe

Type Default Details
name str name of metric to extract
service str None service to extract metric from
alias str None alias to use for new column
Returns DataFrame
m = scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used")
assert type(m) == pd.DataFrame
assert list(m) == ['host', 'timestamp', 'rsconnect_system_memory_used']

m = scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used", alias="memory")
assert type(m) == pd.DataFrame
assert list(m) == ['host', 'timestamp', 'memory']

m = scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used", service = "connect-metrics", alias = "memory")
assert type(m) == pd.DataFrame
assert list(m) == ['host', 'timestamp', 'memory']

Analyse logs


source

ChronicleLogs

 ChronicleLogs (df:polars.dataframe.frame.DataFrame)

Initialise a chronicle logs DataFrame

Type Details
df DataFrame A polars DataFrame
Returns DataFrame

Filter logs on type

You can use .logs.filter_type() to filter logs on the type column.

scan_chronicle_logs("./data").head(1).explode("attributes").collect()
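Since log attributes are stored as key/value pairs, filtering on type amounts to keeping the rows whose attributes carry a matching entry. A plain-Python sketch of that idea (not the package implementation):

```python
# Sketch: keep log rows whose attributes contain a matching "type" entry.
logs = [
    {"body": "login ok", "attributes": {"type": "auth_login", "username": "jo"}},
    {"body": "tick", "attributes": {"type": "heartbeat"}},
]

def filter_type(logs, value):
    return [r for r in logs if r["attributes"].get("type") == value]

print(len(filter_type(logs, "auth_login")))  # → 1
```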

source

ChronicleLogs.filter_type

 ChronicleLogs.filter_type (value:str)

Extract all logs where type == value

Type Details
value str Value to extract
Returns DataFrame
# scan_chronicle_logs("./data").logs.filter_type("auth_login")
logs = scan_chronicle_logs("./data").logs.filter_type("username")
assert type(logs) == pl.DataFrame

Unique Connect actions


source

ChronicleLogs.unique_connect_actions

 ChronicleLogs.unique_connect_actions ()

Extract a sample of unique Connect actions

scan_chronicle_logs("./data").logs.unique_connect_actions()
# assert type(logs) == pl.DataFrame

Connect logins


source

ChronicleLogs.connect_logins

 ChronicleLogs.connect_logins ()

Extract Connect login logs

path = "./data"
scan_chronicle_logs(path).logs.connect_logins()

Extract Connect audit logs


source

ChronicleLogs.extract_connect_audit_logs

 ChronicleLogs.extract_connect_audit_logs (type:str)

Extract Connect audit logs

path = "./data"
scan_chronicle_logs(path).logs.extract_connect_audit_logs("user_login")

Unique workbench types


source

ChronicleLogs.unique_workbench_types

 ChronicleLogs.unique_workbench_types ()

Extract a sample of unique workbench types

scan_chronicle_logs("./data").logs.unique_workbench_types()
# assert type(logs) == pl.DataFrame

Workbench logins


source

ChronicleLogs.workbench_logins

 ChronicleLogs.workbench_logins ()

Extract Workbench login logs

path = "./data"
scan_chronicle_logs(path).logs.workbench_logins()

Extract Workbench audit logs


source

ChronicleLogs.extract_workbench_audit_logs

 ChronicleLogs.extract_workbench_audit_logs (type:str)

Extract Workbench audit logs

path = "./data"
scan_chronicle_logs(path).logs.extract_workbench_audit_logs("session_start")

source

ChronicleLogs.extract_workbench_audit_cols

 ChronicleLogs.extract_workbench_audit_cols (type:str)

Extract Workbench audit columns

scan_chronicle_logs("./data").logs.extract_workbench_audit_cols("session_quit")