Data Leak Detection
Goal
Detects data exfiltration attempts by monitoring and analysing network traffic patterns.
Description
The system identifies unusual spikes or anomalies in data transfer that may indicate unauthorised data leakage or theft. It uses machine learning algorithms to flag such activity early, helping to safeguard sensitive information and maintain data integrity.
Characteristics
Data involved
Network outbound data
Alert Generation
The model raises an alert when outbound traffic from a source component exceeds that component's normal behaviour.
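As a rough illustration of what "normal behaviour of the source component" could mean, the sketch below summarises each source's outbound volume per time interval as a mean and a standard deviation. This is an assumption, not the model's documented method: the record layout, the function name `build_baselines` and the choice of population standard deviation are all hypothetical. The Deviation parameter described under Parametrization only suggests that some standard-deviation-based baseline is used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baselines(records):
    """Summarise outbound traffic per source as (mean, standard deviation).

    `records` is a hypothetical iterable of
    (source_ip, interval_index, outbound_mb) tuples.
    """
    # Sum the outbound volume of every record that falls into the same
    # (source, interval) bucket.
    per_source = defaultdict(lambda: defaultdict(float))
    for source_ip, interval, outbound_mb in records:
        per_source[source_ip][interval] += outbound_mb

    # Reduce each source's per-interval volumes to a simple baseline.
    baselines = {}
    for source_ip, intervals in per_source.items():
        volumes = list(intervals.values())
        baselines[source_ip] = (mean(volumes), pstdev(volumes))
    return baselines
```

A source that has been seen in only one interval gets a standard deviation of 0, so a real implementation would likely require a minimum amount of history before trusting the baseline.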
Parametrization
The model has the following parameters:
Min_traffic (Minimum Traffic):
Description: Sets the minimum volume of traffic an activity must reach before it can be considered anomalous.
Default Value: 100MB.
Purpose: Helps to filter out background noise and focuses on significant data volumes that could indicate a data breach.
Max_average_traffic (Maximum Average Traffic):
Description: Defines the average traffic threshold above which an endpoint is classified as a high data sender and excluded from analysis.
Default Value: 10MB.
Purpose: Reduces false positives by excluding endpoints that regularly handle large volumes of data.
Deviation (Standard Deviation Threshold):
Description: The number of standard deviations required to qualify an event as anomalous.
Default Value: 8.
Purpose: Sets a stringent criterion for anomaly detection, reducing the likelihood of irrelevant alerts.
Min_amount_known_ip (Minimum Activity Intervals for Known IPs):
Description: The minimum number of activity intervals an address must have before it is included in the model.
Default Value: 20.
Purpose: Ensures that the model is only applied to IPs with a sufficient history of activity, improving accuracy in detection.
Only_alert_unknown_destination_ip (Alerts for Unknown Destination IPs Only):
Description: This flag determines whether to generate alerts only if the destination IP address is unknown to the source IP address.
Default Value: True.
Purpose: Focuses on detecting potential data exfiltration to unrecognised destinations, often a key indicator of malicious activity.
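The sketch below shows one way the five parameters could combine into a single decision. It is an illustrative assumption rather than the model's actual implementation: the class `DataLeakParams`, the function `is_anomalous`, the use of megabytes per interval, and the test "current volume greater than mean plus Deviation times the standard deviation" are all hypothetical. Only the parameter names and default values come from the list above.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class DataLeakParams:
    """Defaults taken from the parameter list; field names are illustrative."""
    min_traffic_mb: float = 100.0            # Min_traffic
    max_average_traffic_mb: float = 10.0     # Max_average_traffic
    deviation: float = 8.0                   # Deviation
    min_amount_known_ip: int = 20            # Min_amount_known_ip
    only_alert_unknown_destination_ip: bool = True


def is_anomalous(history_mb, current_mb, destination_known,
                 params=DataLeakParams()):
    """Return True if the current interval looks like a data leak.

    history_mb: past outbound volumes (MB per interval) for one source.
    current_mb: outbound volume in the interval being evaluated.
    destination_known: whether the source has contacted this destination before.
    """
    # Not enough history to model this address (assumed reading of
    # Min_amount_known_ip).
    if len(history_mb) < params.min_amount_known_ip:
        return False
    # High data senders are excluded from analysis.
    average = mean(history_mb)
    if average > params.max_average_traffic_mb:
        return False
    # Ignore volumes too small to matter.
    if current_mb < params.min_traffic_mb:
        return False
    # Optionally alert only on destinations the source has not seen before.
    if params.only_alert_unknown_destination_ip and destination_known:
        return False
    # Assumed anomaly test: volume exceeds mean + deviation * std.
    threshold = average + params.deviation * pstdev(history_mb)
    return current_mb > threshold
```

For example, with 20 intervals alternating between 1 MB and 3 MB, the mean is 2 MB and the standard deviation is 1 MB, so the assumed threshold is 10 MB. A 150 MB transfer to an unknown destination is then flagged, while the same transfer to a known destination is not under the default settings.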
Raw outputs of the model
The model outputs a boolean value indicating whether an anomaly is detected in network traffic:
True: An anomaly is detected, suggesting potential data exfiltration or suspicious activity, based on parameters like traffic volume and pattern deviations.
False: No anomaly is detected, indicating normal traffic behaviour within the defined thresholds.
This binary output aids in quick decision-making for potential security threats.