Securing Cloud Infrastructure with Falco, Prometheus, Grafana & Docker

Cloud adoption is accelerating rapidly. Gartner forecast worldwide public cloud end-user spending to grow 20.7% in 2023 to $591.8 billion, up from $490.3 billion in 2022. And a recent report from Palo Alto Networks found 70% of organizations now host more than half of their workloads in the cloud.

But as more sensitive data and services move to cloud platforms, securing that infrastructure becomes critical. According to IBM, the average cost of a data breach reached $4.35 million in 2022, the highest in the 17-year history of their reporting. And nearly 20% of organizations were breached through their cloud environments.

Traditional network and endpoint security tools aren't well suited to the dynamic, ephemeral nature of cloud workloads. Perimeter-based controls become less relevant when infrastructure is software-defined and microservices-based.

Teams need new tools that provide deep visibility and protection at the application layer. That's where an open source project like Falco comes in.

What is Falco?

Falco is a Cloud Native Computing Foundation graduated project (it graduated in February 2024) originally created by Sysdig. It acts as a behavioral activity monitor, detecting unexpected application behavior and policy violations.

Unlike legacy tools that rely on signature matching or network monitoring, Falco takes a novel approach. It taps directly into the stream of system calls being made by applications and containers on a Linux host.

System calls are how applications interface with the operating system kernel. They're used to perform privileged actions like opening files, executing processes, and making network connections. By analyzing system calls, Falco can detect malicious activity regardless of how it's disguised at a higher layer.

Falco runs as a daemon that consumes a stream of system call events. It then checks those events against a powerful rules engine. If an event matches a rule, Falco emits an alert to its configured outputs (for example stdout, syslog, a program, or an HTTP endpoint), where it can be logged or used to trigger a response.
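The event-to-rule matching loop described above can be sketched in a few lines of Python. This is only a toy model to make the flow concrete, not how Falco is implemented: the Event dictionaries, the Rule class, and the evaluate function are all hypothetical names, and real Falco conditions are written in its rule language rather than as Python lambdas.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

# Hypothetical flattened event: field names mimic Falco's, values are made up.
Event = dict


@dataclass
class Rule:
    name: str
    priority: str
    condition: Callable[[Event], bool]


def evaluate(events: Iterable[Event], rules: list) -> Iterator[dict]:
    """Yield one alert for every (event, rule) pair whose condition matches."""
    for event in events:
        for rule in rules:
            if rule.condition(event):
                yield {"rule": rule.name, "priority": rule.priority, "event": event}


shell_rule = Rule(
    name="Terminal shell in container",
    priority="WARNING",
    condition=lambda e: (
        e.get("evt.type") == "execve"
        and e.get("container.id", "host") != "host"
        and e.get("proc.name") == "bash"
    ),
)

sample_events = [
    {"evt.type": "execve", "container.id": "3f2a9c", "proc.name": "bash"},
    {"evt.type": "execve", "container.id": "host", "proc.name": "bash"},
    {"evt.type": "openat", "container.id": "3f2a9c", "proc.name": "nginx"},
]

alerts = list(evaluate(sample_events, [shell_rule]))
for alert in alerts:
    print(f'{alert["priority"]}: {alert["rule"]}')
```

Only the first sample event matches: the second runs on the host rather than in a container, and the third is neither an execve nor a bash process.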

Rules are written in a domain-specific language that lets you express complex logic. For example, you can create a rule that alerts if a shell is spawned inside a container, or if a process tries to change a sensitive file.

Here's what the architecture looks like:

[Figure: Falco architecture diagram. Source: Falco.org]

Falco uses either a kernel module or an eBPF probe to tap into system calls. The kernel module was the original implementation while eBPF support was added more recently. eBPF has some advantages – it avoids the need to install a kernel module and allows more selective filtering of events to reduce overhead.

Falco also collects other system state information like container metadata, Kubernetes API audit events, and application metrics. This provides valuable context to enrich alerts.

Detecting unexpected application behavior

Let's look at some examples of the threats Falco can detect using its system call analysis.

Shells in containers

One common pattern is for attackers to exploit a vulnerability and spawn an interactive shell inside a container. They then use that shell for recon and lateral movement. But a container should rarely need an interactive shell, so we can create a Falco rule to detect that:

- rule: Terminal shell in container
  desc: A shell was spawned inside a container with an attached terminal.
  condition: evt.type = execve and evt.dir = < and container and proc.name = bash and proc.tty != 0
  output: "Shell spawned in container (user=%user.name %container.info shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline terminal=%proc.tty)"
  priority: WARNING

This rule looks for a bash process with an attached terminal (TTY) being spawned inside a container. The container macro ensures this only applies to containerized workloads. If triggered, it will generate a warning alert with details on the container and command executed.

Fileless malware

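Fileless malware is a growing threat that avoids detection by traditional antivirus tools. Rather than installing malicious binaries on disk, it hides in memory and abuses trusted tools. On Windows that often means PowerShell or WMI; on the Linux hosts where Falco runs, it typically means script interpreters and shells already present in the image.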

We can use Falco to detect some common fileless techniques. For example, this rule detects a script interpreter like Python spawning a new process:

- list: trusted_interpreters
  items: [python, python2, python3, perl, ruby, php, node]

- rule: Interpreter Spawning Process
  desc: Scripting interpreter spawned a new process other than another interpreter.
  condition: spawned_process and proc.pname in (trusted_interpreters) and not proc.name in (trusted_interpreters)
  output: "Interpreter spawned process other than itself (user=%user.name parent=%proc.pname cmdline=%proc.cmdline)"
  priority: NOTICE

The trusted_interpreters list names known interpreters. The rule fires when a process whose parent (proc.pname) is one of those interpreters is itself something else, such as a shell or a downloaded payload. The spawned_process macro comes from Falco's default ruleset.
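The parent/child relationship this rule is meant to capture is easy to get backwards, so here is a minimal Python sketch of the intended logic. The function name interpreter_spawned_other is hypothetical; the point is that the interpreter check applies to the parent process while the exclusion applies to the child.

```python
# Same interpreter names as in the rule above.
INTERPRETERS = {"python", "python2", "python3", "perl", "ruby", "php", "node"}


def interpreter_spawned_other(parent_name: str, child_name: str) -> bool:
    """True when an interpreter (parent) starts a non-interpreter (child)."""
    return parent_name in INTERPRETERS and child_name not in INTERPRETERS


# (parent, child) pairs as they might appear on process-spawn events.
samples = [("python3", "sh"), ("python", "python3"), ("bash", "ls")]
flagged = [pair for pair in samples if interpreter_spawned_other(*pair)]
print(flagged)
```

Testing both names against the child alone can never match, because a single process name cannot be both in the list and not in it.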

Cryptocurrency mining

Cryptocurrency mining is another common attack. Malware or rogue insiders install miners to steal compute resources for profit. We can detect some mining activity by looking for suspicious network connections:

- list: miner_ports
  items: [
    3333, 3334, 3335, 3336, 3357, 4444,
    5555, 5556, 5588, 5730, 6099, 6666,
    7777, 7778, 8000, 8001, 8008, 8080,
    8118, 8333, 8888, 8899, 9332, 9999,
    14433, 14444, 45560, 45700
    ]

- rule: Detect outbound connections to common miner pool ports
  desc: Detect outbound connections to ports commonly used by cryptocurrency miners.
  condition: evt.type = connect and evt.dir = < and fd.l4proto = tcp and fd.rport in (miner_ports) and container
  output: "Outbound connection to miner port (connection=%fd.name port=%fd.rport user=%user.name %container.info)"
  priority: CRITICAL

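This rule uses a list of common mining pool ports. It then detects any outbound TCP connections to those ports coming from a container. This could indicate a miner trying to connect to a pool to fetch work or submit hashes. Note that the list includes general-purpose ports such as 8000, 8080, and 8888, so expect false positives until you tune it.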

You could further tune this by looking at data transfer volumes or limiting to unexpected container images. The key is looking for behavior that deviates from your expected application baseline.
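One way to picture the "unexpected container images" tuning is as a per-image baseline of ports that are known to be legitimate. The Python sketch below is illustrative only: the image name, the EXPECTED_PORTS_BY_IMAGE mapping, and the function name are hypothetical, and in Falco itself you would express the same idea as an exception or an extra condition on the container image.

```python
# Subset of the miner_ports list from the rule above, for brevity.
MINER_PORTS = {3333, 4444, 5555, 8080, 8333, 9999, 14444}

# Hypothetical baseline: ports each image is expected to connect to.
EXPECTED_PORTS_BY_IMAGE = {
    "registry.example.com/web-proxy": {8080},
}


def is_suspicious_connection(image: str, remote_port: int) -> bool:
    """Flag miner-port connections that are not part of the image's baseline."""
    if remote_port not in MINER_PORTS:
        return False
    return remote_port not in EXPECTED_PORTS_BY_IMAGE.get(image, set())


print(is_suspicious_connection("registry.example.com/web-proxy", 8080))
print(is_suspicious_connection("registry.example.com/web-proxy", 3333))
```

The proxy image may talk to port 8080 without raising an alert, but a connection from the same image to port 3333 is still flagged.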

Data exfiltration

Data exfiltration is one of the most damaging types of breaches. Attackers look to steal sensitive information like customer PII, financial records, or intellectual property. We can write some Falco rules to detect potential data exfiltration attempts.

First, let's detect large outbound data transfers from a container:

- macro: large_outbound_transfer
  condition: evt.type = sendto and evt.dir = < and evt.rawres > 1000000

- rule: Detect large data egress from container
  desc: Detect a single outbound send larger than 1 MB from a container
  condition: large_outbound_transfer and container
  output: "Large outbound data transfer (user=%user.name %container.info bytes=%evt.rawres)"
  priority: WARNING

The large_outbound_transfer macro looks for sendto system calls whose return value (evt.rawres, the number of bytes sent) exceeds 1 MB. The rule triggers if that happens inside a container. Because it inspects one call at a time, a transfer split into many smaller sends will not match, so treat it as a starting point and tune the threshold based on your workload's typical traffic patterns.
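Since a per-call threshold only sees one send at a time, a complementary approach is to total the bytes each container sends over a time window in whatever system consumes the events. The Python sketch below shows that idea under stated assumptions: EgressTracker is a hypothetical helper, timestamps are assumed to arrive in non-decreasing order, and the thresholds are arbitrary.

```python
from collections import defaultdict, deque


class EgressTracker:
    """Track bytes sent per container over a sliding time window."""

    def __init__(self, threshold_bytes: int = 1_000_000, window_seconds: float = 60.0):
        self.threshold = threshold_bytes
        self.window = window_seconds
        self._sends = defaultdict(deque)  # container id -> deque of (ts, nbytes)
        self._totals = defaultdict(int)   # container id -> bytes inside window

    def record(self, container_id: str, timestamp: float, nbytes: int) -> bool:
        """Record one send; return True if the windowed total exceeds the threshold."""
        sends = self._sends[container_id]
        sends.append((timestamp, nbytes))
        self._totals[container_id] += nbytes
        # Drop sends that have fallen out of the window.
        while sends and sends[0][0] <= timestamp - self.window:
            _, old_bytes = sends.popleft()
            self._totals[container_id] -= old_bytes
        return self._totals[container_id] > self.threshold


tracker = EgressTracker(threshold_bytes=1000, window_seconds=10)
print(tracker.record("c1", 0.0, 600))   # below threshold
print(tracker.record("c1", 5.0, 600))   # two sends together exceed it
```

Neither send is large on its own, but together they cross the threshold within the window, which is the case the single-call rule misses.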

We can also detect access or reads of sensitive files often targeted for exfiltration:

- macro: sensitive_files
  condition: fd.name startswith /etc/shadow or fd.directory in (/var/lib/mongodb, /var/lib/mysql)

- rule: Detect reads of sensitive files
  desc: Detect reads of sensitive files like /etc/shadow
  condition: sensitive_files and open_read and container
  output: "Read of sensitive file (user=%user.name command=%proc.cmdline file=%fd.name %container.info)"
  priority: WARNING

The sensitive_files macro defines a set of important files and directories. The rule triggers if any of those are opened for reading inside a container. This could indicate an attacker trying to access hashed password files or database files.
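The two operators in the macro behave differently, which matters when you extend the list. As a rough Python analogy (not Falco code; is_sensitive is a hypothetical helper), startswith is a prefix match on the full path, while a check on the file's directory is an exact match on the containing directory:

```python
import posixpath

# Mirrors the macro: one prefix match and two exact directory matches.
SENSITIVE_PREFIXES = ("/etc/shadow",)
SENSITIVE_DIRECTORIES = {"/var/lib/mongodb", "/var/lib/mysql"}


def is_sensitive(path: str) -> bool:
    """Approximate the sensitive_files condition for a file path."""
    return (
        path.startswith(SENSITIVE_PREFIXES)
        or posixpath.dirname(path) in SENSITIVE_DIRECTORIES
    )


for candidate in ("/etc/shadow-", "/var/lib/mysql/ibdata1", "/var/lib/mysql/shop/orders.ibd"):
    print(candidate, is_sensitive(candidate))
```

Under this reading the prefix match also covers the backup file /etc/shadow-, while the directory match misses files in subdirectories such as /var/lib/mysql/shop/. If per-database subdirectories matter to you, a prefix match on the directory path is the safer choice.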

Integrating Falco with Prometheus, Grafana & Fluentd

Writing effective Falco rules is only part of the equation. To get the most value from Falco alerts, you need to route them to your incident response and visualization tools. Let's see how to integrate Falco with some other popular open-source tools.

Prometheus

Prometheus is a popular monitoring and alerting system. It excels at collecting and querying time series data. Falco can export its alerts and metrics to Prometheus using the falco-exporter.

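First install the falco-exporter package. Then update Falco's configuration in /etc/falco/falco.yaml to enable the gRPC server and gRPC output, which the exporter reads from over a Unix socket:

json_output: true
json_include_output_property: true

grpc:
  enabled: true
  bind_address: "unix:///run/falco/falco.sock"
  threadiness: 0

grpc_output:
  enabled: true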

Restart Falco to load the new configuration, then start the falco-exporter service.

Finally, configure Prometheus to scrape the exporter by adding a new job to prometheus.yml:

- job_name: falco
  scrape_interval: 10s
  static_configs:
  - targets:
    - localhost:9376

Restart Prometheus and you should see Falco metrics populated.
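To check what the exporter exposes before building dashboards, you can read its /metrics page, which uses the Prometheus text exposition format. The Python sketch below parses a small sample of that format and totals a counter by one label. The sample text is made up for illustration, and the metric name falco_events and its label values are assumptions to verify against your exporter's actual output; the parser is deliberately simple and ignores lines that carry timestamps.

```python
import re
from collections import defaultdict

# Hypothetical sample in Prometheus text exposition format.
SAMPLE_METRICS = """\
# HELP falco_events Number of Falco events (hypothetical sample)
# TYPE falco_events counter
falco_events{priority="4",rule="Terminal shell in container"} 3
falco_events{priority="4",rule="Detect reads of sensitive files"} 2
falco_events{priority="2",rule="Detect outbound connections to common miner pool ports"} 1
"""

LINE_RE = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def sum_by_label(text: str, metric: str, label: str) -> dict:
    """Total the samples of one metric, grouped by the value of one label."""
    totals = defaultdict(float)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        totals[labels.get(label, "")] += float(match.group("value"))
    return dict(totals)


print(sum_by_label(SAMPLE_METRICS, "falco_events", "priority"))
```

This is the same grouping a PromQL sum by (priority) performs, which makes it a handy sanity check when a dashboard panel comes up empty.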

Grafana

Grafana is a powerful open-source dashboarding tool. You can create rich visualizations of data from Prometheus and other sources. Let's create a dashboard showing an overview of Falco alerts.

In Grafana, create a new dashboard and add a panel with the query:

sum(falco_events) by (priority, rule)

This will graph the count of Falco alerts grouped by priority and rule name. You can customize the visualization and add additional queries to show metrics like:

  • Top 10 event-generating rules
  • Alerts by host or application
  • File & network activity counters
  • Policy violation trends

Here's an example of what the final dashboard might look like:

[Figure: Falco Grafana dashboard. Source: Falco.org]

Fluentd

Fluentd is a popular open-source log collector. You can use it to route Falco alerts to a central location for retention and analysis.

Fluentd's built-in http input plugin can receive Falco's HTTP output directly, so no extra plugin is needed. Configure Fluentd to listen for Falco alerts over HTTP:

<source>
  @type http
  port 5140
  bind 0.0.0.0
</source>

<match falco.**>
  @type file
  path /fluentd/log/falco
</match>

This tells Fluentd to listen on port 5140 and write Falco alerts to a log file. The http input takes the event tag from the request path, so a POST to /falco.alerts is tagged falco.alerts and matches falco.**. You could also route the alerts to Elasticsearch, S3, or any of the other Fluentd output plugins.

Finally, configure Falco to send alerts to Fluentd by adding these outputs to falco.yaml (json_output must be true so the request body is JSON):

program_output:
  enabled: true
  keep_alive: false
  program: "logger -t falco"

http_output:
  enabled: true
  url: http://localhost:5140/falco.alerts

The program_output section pipes each alert to the logger utility, which writes it to syslog with the tag falco. The http_output section sends alerts to the Fluentd listener.

With this integration in place, you can centrally collect Falco alerts from multiple hosts. You can then search, analyze, and trigger notifications based on that data.
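As one example of triggering notifications from the collected data, the Python sketch below filters JSON alerts by priority. It assumes one JSON object per line with the rule and priority keys that Falco's JSON output includes; the sample lines are made up, and the helper names parse_alerts and at_least are hypothetical. Fluentd's file output may prepend a time and tag to each line depending on its format settings, so adjust the parsing to match your files.

```python
import json

# Falco priorities, from most to least severe.
PRIORITIES = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]


def parse_alerts(lines):
    """Yield alert dictionaries, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            alert = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(alert, dict) and "rule" in alert and "priority" in alert:
            yield alert


def at_least(alerts, minimum: str) -> list:
    """Keep alerts whose priority is at least as severe as `minimum`."""
    cutoff = PRIORITIES.index(minimum)
    return [
        a for a in alerts
        if a["priority"] in PRIORITIES and PRIORITIES.index(a["priority"]) <= cutoff
    ]


# Made-up sample lines for illustration.
sample_lines = [
    '{"rule": "Terminal shell in container", "priority": "Warning", "output": "..."}',
    '{"rule": "Interpreter Spawning Process", "priority": "Notice", "output": "..."}',
    '{"rule": "Detect outbound connections to common miner pool ports", "priority": "Critical", "output": "..."}',
    'not json',
]

urgent = at_least(parse_alerts(sample_lines), "Warning")
for alert in urgent:
    print(alert["priority"], alert["rule"])
```

A filter like this is a reasonable place to decide which alerts page someone and which are only retained for later analysis.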

Conclusion

As organizations rapidly adopt cloud platforms, securing workloads becomes paramount. But traditional tools fall short in dynamic, containerized environments.

Falco's approach of analyzing system calls provides deep visibility into application behavior. By defining rules that represent expected versus malicious behavior, Falco can detect threats other tools miss.

Integrating Falco with monitoring and logging tools like Prometheus, Grafana, and Fluentd lets you build a robust security observability pipeline. You can route alerts to your incident response systems, visualize trends over time, and quickly investigate issues.

The examples we've covered today only scratch the surface of Falco's capabilities. Its flexible rules engine lets you express your security policies as code. And its growing community is constantly adding new rules and plugins.

Falco also integrates well with other Cloud Native Computing Foundation projects. You can use Falco with Kubernetes audit logging to detect threats at the orchestration layer. And Falco events can trigger serverless responses via platforms like Knative and OpenFaaS.

If you are a full-stack developer working with cloud infrastructure, Falco should be a key part of your security toolkit alongside policy management and vulnerability scanning. By catching malicious behavior as it happens, Falco provides a crucial last line of defense.

To learn more about Falco, I recommend starting with the official documentation. The Falco blog is also a great resource for seeing real-world examples of Falco in action from the community.

You can find my example code and configurations from this article on GitHub. I encourage you to try out Falco in a development environment and see what unexpected behaviors you can detect. With some tuning and integration work, Falco can help you build more secure applications.

Cloud security is a constantly evolving space. By leveraging powerful open source tools like Falco, Prometheus, and Grafana, you can adapt quickly and keep your infrastructure secure.
