TRAM

Home Page: https://ctid.mitre-engenuity.org/our-work/tram/

License: Apache License 2.0

Threat Report ATT&CK Mapping (TRAM) is an open-source platform designed to advance research into automating the mapping of cyber threat intelligence reports to MITRE ATT&CK®.

TRAM enables researchers to test and refine Machine Learning (ML) models for identifying ATT&CK techniques in prose-based cyber threat intel reports and allows threat intel analysts to train ML models and validate ML results.

Through research into automating the mapping of cyber threat intel reports to ATT&CK, TRAM aims to reduce the cost and increase the effectiveness of integrating ATT&CK into cyber threat intelligence across the community. Threat intel providers, threat intel platforms, and analysts should be able to use TRAM to integrate ATT&CK more easily and consistently into their products.

Table of contents

  • Installation
  • Air Gap Installation
  • Installation Troubleshooting
  • Report Troubleshooting
  • Requirements
  • For Developers
  • How do I contribute?
  • Notice

Installation

  1. Install Docker and Docker Compose (see the Docker documentation for your platform).

  2. Download docker-compose.yml for TRAM, either directly from the repository or using curl:

    $ curl -O https://raw.githubusercontent.com/center-for-threat-informed-defense/tram/787143e4f41f40e4aeb72d811a9d4297c03364d9/docker/docker-compose.yml
  3. If desired, edit the settings in docker-compose.yml. See docker/README.md for more information.

  4. Use Docker Compose to start the TRAM containers.

    • Run this command from the same directory where you downloaded docker-compose.yml.
      $ docker-compose up
    • The first time you run this command, it downloads about 1 GB of Docker images, which requires an internet connection. If your environment does not have internet access, refer to Air Gap Installation.
    • Once the images are downloaded, TRAM will do a bit of initialization. The following output lines indicate that TRAM is ready to use:
      tram_1   | [2022-03-30 16:18:44 +0000] [29] [INFO] Starting gunicorn 20.1.0
      tram_1   | [2022-03-30 16:18:44 +0000] [29] [INFO] Listening at: http://0.0.0.0:8000 (29)
      
    • Note: the log shows the IP address 0.0.0.0, but TRAM requires connections to use one of the hostnames defined in the ALLOWED_HOSTS environment variable.
  5. Navigate to http://localhost:8000/ and log in using the username and password specified in docker-compose.yml.

  6. To shut down TRAM, type Ctrl+C in the shell where docker-compose up is running.

Air Gap Installation

If you are unable to pull images from the container registry (e.g. due to a corporate firewall or an air-gapped network), you can download the images on another machine and move them onto the Docker host manually:

  1. Pull the images onto a machine that is able to access the registry:

    $ docker pull ghcr.io/center-for-threat-informed-defense/tram:latest
    $ docker pull ghcr.io/center-for-threat-informed-defense/tram-nginx:latest
  2. Export the Docker images to compressed archive (.tgz) format:

    $ docker save ghcr.io/center-for-threat-informed-defense/tram:latest \
        | gzip > tram-latest.tgz
    $ docker save ghcr.io/center-for-threat-informed-defense/tram-nginx:latest \
        | gzip > tram-nginx-latest.tgz
  3. Confirm that the images were exported correctly.

    $ ls -lah tram*.tgz
    -rw-r--r--  1 johndoe  wheel   345M Feb 24 12:56 tram-latest.tgz
    -rw-r--r--  1 johndoe  wheel   9.4M Feb 24 12:57 tram-nginx-latest.tgz
  4. Copy the images across the airgap.

    • How you do this depends on your deployment environment.
  5. Import the Docker images on the Docker host.

    $ docker load < tram-latest.tgz
    $ docker load < tram-nginx-latest.tgz
  6. Confirm that the images were loaded on the Docker host.

    $ docker images | grep tram
    ghcr.io/center-for-threat-informed-defense/tram-nginx   latest    8fa8fb7801b9   2 weeks ago    23.5MB
    ghcr.io/center-for-threat-informed-defense/tram         latest    d19b35523098   2 weeks ago    938MB
  7. From this point, you can follow the main installation instructions above.

Installation Troubleshooting

[97438] Failed to execute script docker-compose

If you see this stack trace:

Traceback (most recent call last):
  File "docker-compose", line 3, in <module>
  File "compose/cli/main.py", line 81, in main
  File "compose/cli/main.py", line 200, in perform_command
  File "compose/cli/command.py", line 60, in project_from_options
  File "compose/cli/command.py", line 152, in get_project
  File "compose/cli/docker_client.py", line 41, in get_client
  File "compose/cli/docker_client.py", line 170, in docker_client
  File "docker/api/client.py", line 197, in __init__
  File "docker/api/client.py", line 221, in _retrieve_server_version
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', ConnectionRefusedError(61, 'Connection refused'))
[97438] Failed to execute script docker-compose

Then the most likely reason is that Docker is not running; start Docker and try again.

Report Troubleshooting

How long until my queued report is complete?

A queued report should only take about a minute to complete.

Why is my report stuck in queued?

This is likely a problem with the processing pipeline. If the pipeline is not working when you are running TRAM via Docker, this might be a TRAM-level bug. If you think this is the case, please file an issue and we can tell you how to get logs off the system to troubleshoot.

Do I have to manually accept all of the parsed sentences in the report?

Yes. The workflow of TRAM is that the AI/ML process will propose mappings, but a human analyst needs to validate/accept the proposed mappings.

Requirements

  • python3 (3.7+)
  • Google Chrome is our only supported/tested browser

For Developers

Developer Setup

The following steps are only required for local development and testing. The containerized version is recommended for non-developers.

  1. Install the following packages using your OS package manager (apt, yum, homebrew, etc.):

    1. make
    2. shellcheck
    3. shfmt
  2. Start by cloning this repository.

    git clone git@github.com:center-for-threat-informed-defense/tram.git
  3. Change to the TRAM directory.

    cd tram/
  4. Create a virtual environment and activate the new virtual environment.

    1. Mac and Linux

      python3 -m venv venv
      source venv/bin/activate
    2. Windows

      python -m venv venv
      venv\Scripts\activate.bat
  5. Install Python application requirements.

    pip install -r requirements/requirements.txt
    pip install -r requirements/test-requirements.txt
  6. Install pre-commit hooks

    pre-commit install
  7. Set up the application database.

    tram makemigrations tram
    tram migrate
  8. Run the machine learning training.

    tram attackdata load
    tram pipeline load-training-data
    tram pipeline train --model nb
    tram pipeline train --model logreg
    tram pipeline train --model nn_cls
  9. Create a superuser (web login)

    tram createsuperuser
  10. Run the application server

    DJANGO_DEBUG=1 tram runserver
  11. Open the application in your web browser.

    1. Navigate to http://localhost:8000 and use the superuser to log in
  12. In a separate terminal window, run the ML pipeline

    cd tram/
    source venv/bin/activate
    tram pipeline run

Makefile Targets

  • Run TRAM application
    • make start-container
  • Stop TRAM application
    • make stop-container
  • View TRAM logs
    • make container-logs
  • Build Python virtualenv
    • make venv
  • Install production Python dependencies
    • make install
  • Install prod and dev Python dependencies
    • make install-dev
  • Manually run pre-commit hooks without performing a commit
    • make pre-commit-run
  • Build container image (by default, the container is tagged with a timestamp and the git hash of the codebase; see the note below about custom CA certificates in the Docker build)
    • make build-container
  • Run linting locally
    • make lint
  • Run unit tests, safety, and bandit locally
    • make test

The automated test suite runs inside tox, which guarantees a consistent testing environment, but also has considerable overhead. When writing code, it may be useful to run pytest directly, which is considerably faster and can also be used to run a specific test. Here are some useful pytest commands:

# Run the entire test suite:
$ pytest tests/

# Run tests in a specific file:
$ pytest tests/tram/test_models.py

# Run a test by name:
$ pytest tests/ -k test_mapping_repr_is_correct

# Run tests with code coverage tracking, and show which lines are missing coverage:
$ pytest --cov=tram --cov-report=term-missing tests/

Custom CA Certificate

If you are building the container in an environment that intercepts SSL connections, you can specify a root CA certificate to inject into the container at build time. (This is only necessary for the TRAM application container. The TRAM Nginx container does not make outbound connections.)

Export the following two variables in your environment.

$ export TRAM_CA_URL="http://your.domain.com/root.crt"
$ export TRAM_CA_THUMBPRINT="C7:E0:F9:69:09:A4:A3:E7:A9:76:32:5F:68:79:9A:85:FD:F9:B3:BD"

The first variable is a URL to a PEM certificate containing a root certificate that you want to inject into the container. (If you use an https URL, then certificate checking is disabled.) The second variable is a SHA-1 certificate thumbprint that is used to verify that the correct certificate was downloaded. You can obtain the thumbprint with the following OpenSSL command:

$ openssl x509 -in <your-cert.crt> -fingerprint -noout
SHA1 Fingerprint=C7:E0:F9:69:09:A4:A3:E7:A9:76:32:5F:68:79:9A:85:FD:F9:B3:BD
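
If openssl is not available, the same SHA-1 fingerprint can be computed in Python. The sketch below is illustrative only; it assumes the third-party cryptography package is installed, and the certificate file name is a placeholder:

# Illustrative sketch: compute the SHA-1 thumbprint of a PEM certificate,
# equivalent to the openssl command above. Assumes `pip install cryptography`;
# "your-cert.crt" is a placeholder file name.
from cryptography import x509
from cryptography.hazmat.primitives import hashes

with open("your-cert.crt", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# Certificate.fingerprint() hashes the DER-encoded certificate.
fingerprint = cert.fingerprint(hashes.SHA1())
print(":".join(f"{b:02X}" for b in fingerprint))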

After exporting these two variables, you can run make build-container as usual and the TRAM container will contain your specified root certificate.

Making API Calls

To make API calls, you first need to use a valid username and password to obtain an API token by calling the /api/token/ endpoint.

$ curl -X POST -H "Content-type: application/json" \
       -d '{"username": "admin", "password": "(your password goes here)"}' \
       http://localhost:8000/api/token/
{
    "refresh":"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoicmVmcmVzaCIsImV4cCI6MTY1MTg0MjAxMCwiaWF0IjoxNjUxNzU1NjEwLCJqdGkiOiI4NjYyNmJhZDZhM2U0ZjRmYjY5MWIwOTY5ZjIxYTliYiIsInVzZXJfaWQiOjR9.(REDACTED)",
    "access":"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjUxNzU1OTEwLCJpYXQiOjE2NTE3NTU2MTAsImp0aSI6IjUwMzAzYWI1MDliNTRmY2RiMThhYzMyNWM0NTU2Yjg5IiwidXNlcl9pZCI6NH0.(REDACTED)"
}

This call returns two tokens. The access token can be used to make authenticated API calls (no session cookie or CSRF token is needed). This token is valid for 1 hour from its issuance.

$ curl -X POST \
       -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjUxNzYzNDIxLCJpYXQiOjE2NTE3NTU2MTAsImp0aSI6ImMxNDVhZWVlNmNkZTRkNTA5ZjNmOGVhMjJjMjI1NDZlIiwidXNlcl9pZCI6NH0.(REDACTED)" \
       http://localhost:8000/api/train-model/logreg
{
    "message": "Model successfully trained.",
    "elapsed_sec": 6.13424277305603
}

The other token is the refresh token. When your access token expires, you can use the refresh token (which is valid for 24 hours) to obtain a new access token without needing to provide the username and password again.

$ curl -X POST -H "Content-type: application/json" \
     -d '{"refresh": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoicmVmcmVzaCIsImV4cCI6MTY1MTg0MjAxMCwiaWF0IjoxNjUxNzU1NjEwLCJqdGkiOiI4NjYyNmJhZDZhM2U0ZjRmYjY5MWIwOTY5ZjIxYTliYiIsInVzZXJfaWQiOjR9.YlkrG6AbL8YEoTtg4yU-N-4AGN0KgCBKXq64nY9msWI"}' \
     http://localhost:8000/api/token/refresh/
{
    "access":"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjUxNzU5OTYwLCJpYXQiOjE2NTE3NTU2MTAsImp0aSI6ImY5NzZmNzg1ZGFmYzRhZjk4Yzc3MTQ4M2VjYTI0MzJhIiwidXNlcl9pZCI6NH0.hH4qsVMIgZx8A0aVVMuUf-H0aYz9bKULo6XPufrRcZw"
}
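
The same token flow can also be scripted. The sketch below is illustrative only and assumes the Python requests package (not a TRAM requirement); it uses the /api/token/, /api/train-model/, and /api/token/refresh/ endpoints shown above:

# Illustrative sketch of the JWT token flow against a local TRAM instance,
# using the third-party `requests` package (an assumption, not a TRAM dependency).
import requests

BASE_URL = "http://localhost:8000"

# 1. Obtain an access/refresh token pair with a valid username and password.
resp = requests.post(
    f"{BASE_URL}/api/token/",
    json={"username": "admin", "password": "(your password goes here)"},
)
resp.raise_for_status()
tokens = resp.json()  # {"refresh": "...", "access": "..."}

# 2. Call an authenticated endpoint with the access token (valid for 1 hour).
resp = requests.post(
    f"{BASE_URL}/api/train-model/logreg",
    headers={"Authorization": f"Bearer {tokens['access']}"},
)
print(resp.json())  # e.g. {"message": "Model successfully trained.", ...}

# 3. When the access token expires, use the refresh token (valid for 24 hours)
#    to obtain a new access token without re-sending the credentials.
resp = requests.post(
    f"{BASE_URL}/api/token/refresh/",
    json={"refresh": tokens["refresh"]},
)
tokens["access"] = resp.json()["access"]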

Machine Learning Development

All source code related to machine learning is located in the TRAM repository under src/tram/ml.

Existing ML Models

TRAM has four machine learning models that can be used out-of-the-box:

  1. LogisticRegressionModel - Uses SKLearn's Logistic Regression.
  2. NaiveBayesModel - Uses SKLearn's Multinomial NB.
  3. Multilayer Perceptron - Uses SKLearn's MLPClassifier.
  4. DummyModel - Uses SKLearn's Dummy Classifier for testing purposes.

All ML models are implemented as an SKLearn Pipeline. Other types of models can be added in the future if there is a need.
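
As an illustration of what an SKLearn Pipeline-based model looks like, the toy example below maps prose sentences to ATT&CK technique IDs. It is not TRAM's actual training code (that lives in src/tram/ml), and the feature settings and sample data are invented for the example:

# Toy illustration only: an SKLearn Pipeline that classifies report sentences
# into ATT&CK technique IDs. TRAM's models return pipelines of this shape
# from get_model() (see the DummyModel example below).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ("features", CountVectorizer(lowercase=True, stop_words="english")),
    ("clf", MultinomialNB()),
])

# Training data: prose sentences labeled with ATT&CK technique IDs.
sentences = [
    "The malware established persistence via a new scheduled task.",
    "Credentials were dumped from LSASS memory.",
]
labels = ["T1053.005", "T1003.001"]

pipeline.fit(sentences, labels)
print(pipeline.predict(["The actor created a scheduled task on the host."]))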

Creating Your Own ML Model

In order to write your own model, take the following steps:

  1. Create a subclass of tram.ml.base.SKLearnModel that implements the get_model function. See existing ML Models for examples that can be copied.

    from sklearn.dummy import DummyClassifier
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.pipeline import Pipeline

    from tram.ml.base import SKLearnModel

    class DummyModel(SKLearnModel):
        def get_model(self):
            # Your model goes here
            return Pipeline([
                ("features", CountVectorizer(lowercase=True, stop_words='english', min_df=3)),
                ("clf", DummyClassifier(strategy='uniform'))
            ])
  2. Add your model to the ModelManager registry

    1. Note: This method can be improved. Pull requests welcome!
    class ModelManager(object):
        model_registry = {
            'dummy': DummyModel,
            'nb': NaiveBayesModel,
            'logreg': LogisticRegressionModel,
            # Your model on the line below
            'your-model': python.path.to.your.model
        }
  3. You can now train your model, and the model will appear in the application interface.

    tram pipeline train --model your-model
  4. If you are interested in sharing your model with the community, thank you! Please open a Pull Request with your model, and please include performance statistics in your Pull Request description.

How do I contribute?

We welcome your feedback and contributions to help advance TRAM. Please see the guidance for contributors if you are interested in contributing or simply reporting issues.

Please submit issues for any technical questions/concerns or contact [email protected] directly for more general inquiries.

Contribute Training Data

All training data is formatted as a report export. If you are contributing training data, please ensure that you have the right to publicly share the threat report. Do not contribute reports that are proprietary material of others.

To contribute training data, please:

  1. Use TRAM to perform the mapping, and ensure that all mappings are accepted
  2. Use the report export feature to export the report as JSON
  3. Open a pull request where the training data is added to data/training/contrib

Notice

Copyright 2021 MITRE Engenuity. Approved for public release. Document number CT0035.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This project makes use of MITRE ATT&CK®

ATT&CK Terms of Use
