Overview
Capsule8’s recommended deployment for non-cloud environments uses HDFS (Hadoop Distributed File System) for storage and Presto as the query engine.
Configuration
Investigations is configured in the /etc/capsule8/capsule8-sensor.yaml
file. By default, the Process Events, Sensor, and Container Events tables are enabled. This means that if no tables
key is provided, those tables are turned on automatically with a default row size, unique for each MetaEvent type. To configure additional tables, specify them explicitly.
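For example, a minimal sketch of a configuration that relies on those defaults, with placeholder values for the name node address, directory, and user:

investigations:
  reporting_interval: 5m
  sinks:
    - name: "[name node address]:9000/[directory on hdfs to store data, absolute path]"
      backend: hdfs
      automated: true
      credentials:
        blob_storage_hdfs_user: "[hadoop username that has write access]"
        blob_storage_create_buckets_enabled: true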
This is a complete example with every MetaEvent type configured to write to HDFS:
investigations:
  reporting_interval: 5m
  sinks:
    - name: "[The address of the name node here]:9000/[directory on hdfs to store data, absolute path]"
      backend: hdfs
      automated: true
      credentials:
        blob_storage_hdfs_user: "[hadoop username that has write access]"
        blob_storage_create_buckets_enabled: true
  flight_recorder:
    enabled: true
    tables:
      - name: shell_commands
        enabled: true
      - name: tty_data
        enabled: true
      - name: file_events
        enabled: false
      - name: connections
        enabled: true
      - name: sensor_metadata
        enabled: true
      - name: alerts
        enabled: true
      - name: sensors
        enabled: true
      - name: process_events
        enabled: true
      - name: container_events
        enabled: true
Storage Solutions
This section provides guides to aid in the installation/setup of storage solutions for Capsule8’s investigations data.
HDFS
Credentials
Currently, only insecure HDFS is supported. The only credential required is the username of a user with write access to the directory that will store Investigations data. In addition, every sensor writing to HDFS must be able to reach the namenodes on ports 8020/9000 and all of the datanodes on ports 50010 and 50020.
Sensor Configuration
To write MetaEvents to HDFS, configure a sink with the address and port of the name node and a user:
investigations:
  reporting_interval: 5m
  sinks:
    - name: "[The address of the name node here]:9000/[directory on hdfs to store data, absolute path]"
      backend: hdfs
      automated: true
      credentials:
        blob_storage_hdfs_user: "[hadoop username that has write access]"
        blob_storage_create_buckets_enabled: true
Create Bucket
It is highly recommended to set blob_storage_create_buckets_enabled: true
for HDFS. Because HDFS is hierarchical rather than flat like blob storage, a write will fail if the table subdirectory or partition folder it targets does not already exist.
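For illustration only, a hypothetical layout of the sink directory; the table subdirectories and partition folders below are assumed names, not the sensor's actual scheme:

/investigations/                   (sink directory, from the sink name)
    shell_commands/                (table subdirectory)
        dt=2021-06-01/             (partition folder)
    connections/
        dt=2021-06-01/

With blob_storage_create_buckets_enabled: true, the sensor creates each of these folders on demand instead of failing the write.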
Automatic
The following settings ensure that folders are created in HDFS if they do not exist. In /etc/capsule8/capsule8-sensor.yaml, enable the blob_storage_create_buckets_enabled
field. See the example configuration below.
blob_storage_create_buckets_enabled: true
blob_storage_hdfs_user: <hdfs user>
Query Solutions
This section provides guides to aid in the installation/setup of query solutions with Capsule8’s investigations.
Presto: Manual
Create and Configure Bucket
HDFS Configuration
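A manual Presto setup depends on your cluster, but as a minimal sketch, a Hive connector catalog file (for example etc/catalog/hive.properties on each Presto node) pointing Presto at the Hive metastore and the Hadoop configuration might look like the following; the metastore address is a placeholder, and the exact connector.name varies by Presto version:

connector.name=hive-hadoop2
hive.metastore.uri=thrift://[metastore host]:9083
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml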
Example Queries
Queries are run using SQL syntax. This section provides a few example queries that might be of use during an investigation. For a complete reference of all the available fields that can be queried, see the MetaEvents section at the end of this guide.
Who Has Run a Command Through Sudo?
SELECT from_unixtime(process_events.unix_nano_timestamp / 1000000000) AS timestamp,
       pid, path, username, login_username
FROM process_events
WHERE event_type = 0
  AND username != login_username;
Which Programs and their Users Connected to a Given IP?
SELECT DISTINCT from_unixtime(connections.unix_nano_timestamp / 1000000000) AS timestamp,
       sensors.hostname, process_events.path, container_events.container_name,
       container_events.image_name, connections.dst_addr, connections.dst_port
FROM connections, sensors, container_events, process_events
WHERE connections.process_c8id = process_events.process_c8id
  AND container_events.process_c8id = process_events.process_c8id
  AND connections.sensor_id = sensors.sensor_id
  AND connections.dst_addr = '$DESTINATION_IP';
What Containers or Images Ran on my Cluster and where?
SELECT sensors.hostname, container_events.image_name,
       from_unixtime(container_events.unix_nano_timestamp / 1000000000) AS timestamp
FROM sensors, container_events
WHERE container_events.sensor_id = sensors.sensor_id;
Get All Alerts that Are Part of an Incident
SELECT * FROM alerts where incident_id = '$INCIDENT_ID';
Get All Shell Commands That are Part of an Incident
SELECT from_unixtime(shell_commands.unix_nano_timestamp / 1000000000) AS timestamp,
       sensors.hostname,
       array_join(shell_commands.program_arguments, ' ') AS args,
       shell_commands.username
FROM shell_commands
JOIN sensors ON sensors.sensor_id = shell_commands.sensor_id
WHERE shell_commands.incident_id = '$INCIDENT_ID';
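Beyond these examples, the same tables support ad hoc aggregation. As a sketch using only fields that appear in the queries above, the following lists the destinations that sensors connected to most often:

SELECT dst_addr, dst_port, count(*) AS connection_count
FROM connections
GROUP BY dst_addr, dst_port
ORDER BY connection_count DESC
LIMIT 10;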