Introducing FlowMeter

Generate an ML model with ease

Siddharth Satpathy
Better Programming


FlowMeter is an open-source tool that analyzes and classifies packet captures using machine learning techniques.

It is an experimental project; we’re using it at Deepfence to evaluate how effectively we can train an ML model to discriminate between different types of traffic flows.

You can build and install FlowMeter from GitHub: https://github.com/deepfence/FlowMeter

Why Network Observability?

Network observability is an essential part of security observability at runtime. It is necessary in order to understand the broader context of security events, and provides vital input to attack modeling using the MITRE ATT&CK methodology.

On-host (and on-container) observability provides indicators of compromise: anomalies that may indicate an attacker has gained a degree of influence over the target. These are key signals for modern security observability solutions, but indicators of compromise alone cannot tell the full story of attack behavior, and may come too late to respond proactively to an attack.

Effective network observability provides indicators of attack — including reconnaissance, targeted weaponization, lateral spread and exfiltration activities. These provide essential context to attack behavior, describing both the techniques that precede an attempted exploit, and those which follow (discovery, lateral movement, command and control, exfiltration).

For example, in a Log4j exploit, almost all of the initial signals are network-based. The initial JNDI recon against multiple workloads, the JNDI request that then triggers an outgoing request (beacon) to an attacker’s listener, and the subsequent request that retrieves the Java class to be run are all network-based and cannot be identified by on-host methods alone. The first signal you get from on-host observability may be the filesystem installation of the exploit kit (a crypto-miner, for example).

How Might FlowMeter Play a Part?

In Deepfence ThreatStryker, we use a range of advanced techniques to capture traffic over a wide area, reassemble transactions, and classify layer-7 payloads against a large knowledge base of attack and exfiltration signals. When combined with knowledge of the application threat map and additional observability of indicators of compromise, this provides an effective and accurate stream of alerts for ThreatStryker’s attack modeling tools.

Naturally, as infrastructure scales, the resources dedicated to traffic analysis must scale accordingly. Unless, that is, we can pre-classify network traffic efficiently and prioritize the traffic that has a higher risk of being anomalous before we incur the cost of reassembly and layer-7 classification.

This is one of the goals of the experimental FlowMeter project: to better understand how we can rapidly filter traffic based on lightweight metadata such as arrival time, packet size, and flow length.

Above is an example of FlowMeter's output. FlowMeter extracts a rich set of flow features from packet data and classifies packets as benign or malicious.
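To make the idea of lightweight flow metadata more concrete, here is a minimal Python sketch that groups packets into flows and derives a few features from arrival time and packet size alone. It is an illustration under stated assumptions, not FlowMeter's actual implementation: the `Packet` record, the `flow_key` and `flow_features` helpers, and the feature names are all hypothetical, and the real feature set is documented in the project README.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet record holding header-level metadata only (no payload)."""
    timestamp: float  # arrival time, in seconds
    src: str
    dst: str
    sport: int
    dport: int
    proto: str
    size: int         # packet length, in bytes


def flow_key(p: Packet) -> tuple:
    """Direction-independent key, so both directions of a conversation share one flow."""
    endpoints = sorted([(p.src, p.sport), (p.dst, p.dport)])
    return (p.proto, *endpoints)


def flow_features(packets):
    """Group packets into flows and compute simple per-flow statistics."""
    flows = defaultdict(list)
    for p in sorted(packets, key=lambda pkt: pkt.timestamp):
        flows[flow_key(p)].append(p)

    features = {}
    for key, pkts in flows.items():
        times = [p.timestamp for p in pkts]
        sizes = [p.size for p in pkts]
        # Gaps between consecutive packet arrivals within the flow.
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        features[key] = {
            "packet_count": len(pkts),        # flow length in packets
            "total_bytes": sum(sizes),
            "mean_packet_size": mean(sizes),
            "duration": times[-1] - times[0],
            "mean_inter_arrival": mean(gaps) if gaps else 0.0,
        }
    return features
```

None of these features require payload inspection or stream reassembly, which is what makes this kind of metadata attractive as a cheap first-pass filter.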

Early Indications

Early indications from FlowMeter are promising. We have trained FlowMeter’s ML engine in a supervised manner, using several public datasets to discriminate between benign and malicious traffic captures. The early results are shared in the project README, and illustrated below:

FlowMeter Training Identifies Distinct Profiles for Benign and Malicious Traffic
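For readers new to supervised classification, the sketch below shows the general shape of training a binary classifier on labeled flow features. It uses a small logistic regression written in plain Python as a stand-in; this is an assumption made for illustration and is not claimed to be FlowMeter's model. The function names, the two features, and the eight training rows are all made up for the example and do not come from any real dataset or from FlowMeter's results.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x) -> int:
    """Return 1 (malicious) or 0 (benign) for one feature vector."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical toy data: [mean packet size (KB), mean inter-arrival time (s)].
# Label 0 = benign, 1 = malicious. Invented values, for illustration only.
X_train = [
    [1.20, 0.80], [1.00, 0.60], [1.40, 0.90], [0.90, 0.70],
    [0.10, 0.05], [0.20, 0.10], [0.15, 0.02], [0.30, 0.08],
]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic(X_train, y_train)
```

Calling `predict(weights, bias, features)` on a new flow's feature vector then returns a benign or malicious label. Real traffic is far less cleanly separated than this toy data, which is why the results on public datasets in the README are the meaningful measure.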

You can experiment with FlowMeter yourself, using both the public datasets referenced in the README, and your own packet captures obtained using PacketStreamer or other pcap tools.

We’d welcome any feedback, contributions, and suggestions. Please start with the FlowMeter GitHub repository.
