Labels and Classification in Sophie
This topic describes the different Labels and Classifications in Loom-Systems’ AI platform Sophie.
What are Labels?
Sophie reads log messages and automatically breaks them down into several properties. The most common properties, which represent the fundamental building blocks of a source type, are labeled Host, Message, Timestamp, External ID, and Severity. Not every source type contains all of them, but they are essential for proper data analysis.
Other information in the log message is broken down, wherever possible, into properties such as IP address, Process ID, and so on.
The Labels are:
- Timestamp: this label tags the property that holds the timestamp of the event. Please note that the timestamp might be absent from the Source Type and extracted from the transport header.
- Severity: this label tags the property representing the log’s severity.
- Message: this label tags the log’s message. Sophie uses this property to learn the different textual patterns that appear within the data.
- Host: this label tags the property representing the host from which the event was sent. Please note that the host might be absent from the Source Type and extracted from the transport header.
- External ID: this label tags a property that serves as a unique identifier for this event type (e.g. Event ID in Windows Event Log).
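To make the five labels concrete, here is a minimal Python sketch that splits one log line into labeled properties. It is an illustration only, not Sophie's parser: the log line, its field order, the regular expression, and the function name `label_properties` are all assumptions made for this example.

```python
import re

# Hypothetical log line; the format is an assumption for illustration only.
LINE = "2021-03-04T10:15:30Z web-01 ERROR 4625 Login failed for user admin from 10.0.0.7"

# One named group per label described above.
PATTERN = re.compile(
    r"(?P<timestamp>\S+) (?P<host>\S+) (?P<severity>[A-Z]+) "
    r"(?P<external_id>\d+) (?P<message>.*)"
)


def label_properties(line):
    """Return a dict of the five labeled properties, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


labels = label_properties(LINE)
print(labels)
```

In a real source type the Timestamp or Host may be missing from the line itself and taken from the transport header instead, as noted above; this sketch does not model that case.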
What are Classifications?
Sophie uses different sets of algorithms to analyze different properties, based on how they are classified. Each field can be given one of the following classifications:
For Meter-type properties, Sophie counts the number of times each value of the property appears in the log messages. Sophie presents anomalies based on the number of occurrences of a value (e.g. the number of GET requests that received status 404 is abnormally high).
Gauge measures the contents of the property rather than the number of times it appeared (e.g. the time taken to complete an action was abnormally long). Gauges are relevant only for numeric values. Sophie assumes that the value of this property is reported at regular intervals.
Timeless-Gauge measures the contents of the property rather than the number of times it appeared. Timeless-Gauges are relevant only for numeric values. Sophie assumes that the value of this property is not reported at regular intervals.
ARC: the property is not reported as an anomaly by itself but is only a part of the Root Cause Analysis of another anomaly (i.e., contextual data such as a host, a thread ID, or a request ID).
Shows a histogram representation of the property.
This property is neither analyzed nor shown in the ARC.
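The difference between a Meter and a Gauge can be sketched in a few lines of Python. This is a simplified illustration, not Sophie's algorithm: the sample events, the field names `status` and `time_taken_ms`, and the helper functions `meter` and `gauge` are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical parsed events; field names are assumptions for illustration.
events = [
    {"status": "200", "time_taken_ms": 12},
    {"status": "404", "time_taken_ms": 15},
    {"status": "404", "time_taken_ms": 900},
    {"status": "200", "time_taken_ms": 11},
]


def meter(events, prop):
    """Meter view: how many times each value of the property occurred."""
    return Counter(e[prop] for e in events if prop in e)


def gauge(events, prop):
    """Gauge view: the numeric values themselves, summarized by min, mean, and max."""
    values = [e[prop] for e in events if prop in e]
    return {"min": min(values), "mean": mean(values), "max": max(values)}


status_counts = meter(events, "status")   # counts per status code
latency = gauge(events, "time_taken_ms")  # summary of the measured values
print(status_counts, latency)
```

A Meter anomaly would come from an unusual count (many 404 responses), while a Gauge anomaly would come from an unusual value (the 900 ms request), regardless of how often the property appeared.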
How to Set Property Classifications?
Classifying the properties is a one-time activity that is done at the Source Type level. Go to "Source Types" and click "Structure."
- Review the classification of each field and make sure that the classification matches the property.
- Please review the following example:
- Make sure that contextual fields and unique identifiers are classified as ARC. Do not classify such fields as "Meter"!
- Browse through 10 examples to review the classifications.
- Go back to the Source Types screen and click "Properties". This screen gives you an overview of all properties in the source type.
Review the classifications one last time.
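As an aid for this final review, the following Python sketch flags fields that are classified as Meter but whose sample values are almost all distinct, which suggests a unique identifier that belongs under ARC. It is not a Sophie feature: the function names, the 0.9 distinctness threshold, and the sample data are assumptions chosen for illustration.

```python
def looks_like_identifier(values, threshold=0.9):
    """Assumed heuristic: True when nearly every sampled value is distinct."""
    values = list(values)
    if not values:
        return False
    return len(set(values)) / len(values) >= threshold


def flag_suspect_meters(classifications, samples):
    """Return names of Meter fields whose samples look like unique identifiers."""
    return sorted(
        name
        for name, kind in classifications.items()
        if kind == "Meter" and looks_like_identifier(samples.get(name, []))
    )


# Hypothetical classifications and sample values for one source type.
classifications = {"status": "Meter", "request_id": "Meter", "host": "ARC"}
samples = {
    "status": ["200", "404", "200", "200"],
    "request_id": ["a1", "b2", "c3", "d4"],
    "host": ["web-01", "web-01", "web-02", "web-01"],
}

suspects = flag_suspect_meters(classifications, samples)
print(suspects)
```

With only a handful of sample values the ratio is a rough signal, so treat a flagged field as a prompt to look at it again rather than as a definitive verdict.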