honeycombio / agentless-integrations-for-aws Goto Github PK

View Code? Open in Web Editor NEW

48.0 48.0 24.0 6.33 MB

Lambda-based AWS integrations for Honeycomb

License: Apache License 2.0

Shell 4.88% Go 95.12%

agentless-integrations-for-aws's People

Contributors

Stargazers

Watchers

agentless-integrations-for-aws's Issues

ALB Integration regex bad parsing

Hi,

Creating an issue as per Slack discussion.

Our applications are running on EKS, using the ALB ingress controller and all ALBs are set to put their logs in s3://env-name/app-name. We installed the ALB Integration available from here and the domain column has wrong data: -" 0 2019-05-29T10:49:59.092000Z "fixed-response.

The trace_id column has 1-5ced0c7a-2cb9b8c052ba2620bbaf3480" "example.com inside tho.

Maybe an issue in the regex used for parsing?

Ingest non-json output in cloudwatch

Any logs that we have control over in Cloudwatch are in json, but some things inevitably get dumped into it that are not formatted. Stacktraces, etc.

Could functionality be added to ingest non-json into a specified field so that we can still search it?

"Event dropped due to sampling and response body"

Attempting to send ALB logs. I'm seeing this error message with HONEYCOMB_DEBUG=true

level=error msg="Error sending event to Honeycomb! had code 0, err event dropped due to sampling and response body "

New to honeycomb. Not sure what this means. Did I misconfigure something?

I used the CloudFormation template. The created Lambda function uses the default REGEX_PATTERN:

(?P<type>[^ ]+) (?P<timestamp>[^ ]+) (?P<elb>[^ ]+) (?P<client>[^ ]+) (?P<target>[^ ]+) (?P<request_processing_time>[^ ]+) (?P<target_processing_time>[^ ]+) (?P<response_processing_time>[^ ]+) (?P<elb_status_code>[^ ]+) (?P<target_status_code>[^ ]+) (?P<received_bytes>[^ ]+) (?P<sent_bytes>[^ ]+) "(?P<request>[^"]+)" "(?P<user_agent>[^"]+)" (?P<ssl_cipher>[^ ]+) (?P<ssl_protocol>[^ ]+) (?P<target_group_arn>[^ ]+) "Root=(?P<trace_id>[^"]+)" "(?P<domain_name>[^"]+)" "(?P<chosen_cert_arn>[^"]+)" (?P<matched_rule_priority>[^ ]+) (?P<request_creation_time>[^ ]+) "(?P<actions_executed>[^"]+)" "(?P<redirect_url>[^"]+)" "(?P<error_reason>[^"]+)" "(?P<target_list>[^"]+)" "(?P<target_status_code_list>[^"]+)"

MATCH_PATTERNS: AWSLogs/.*
SAMPLE_RATE: 100

s3-handler: support SNS as an alternative to SQS

Allow the s3-handler to read messages from SNS instead of just SQS.

Requested via Pollinators (link)

Add support for CloudTrail logs

Is your feature request related to a problem? Please describe.

CloudTrail logs can be sent to S3 or CloudWatch, both of which are supported by agentless. However, that fact is not clear without proper documentation and support. Instead, it appears that honeyaws is the only option to pull CloudTrail logs into Honeycomb.

Describe the solution you'd like

Adding a template and documentation for using agentless to ingest CloudTrail logs to Honeycomb.

Describe alternatives you've considered

Additional context

Lambda function can't read S3 logs from Kinesis Firehose

I've set up a WAF with logging per these instructions, with logs being written to an S3 bucket. I'm trying to pipe those logs into Honeycomb.

Kinesis gives you the option to GZIP compress logs, but both settings do not work:

Uncompressed: Logs are stored with Content-Type: application/octet-stream
Compressed: Logs are stored Content-Type: application/octet-stream and Content-Encoding: gzip.

The S3 log parser uses the AWS SDK to download the log file. The SDK will download uncompressed logs in either case, because the Content-Encoding header means it transparently gets unzipped.

The problematic code is here: https://github.com/honeycombio/agentless-integrations-for-aws/blob/master/s3-handler/main.go#L64

This always crashes with "unable to create gzip reader for object". It's buggy for both settings:

Uncompressed: The application/octet-stream header is assumed to mean its a GZIP file. I don't understand why Kinesis Firehose doesn't use a better header, but I think it's wrong to assume it's a GZIP based on this. Crashes on line 72.
Compressed: The Content-Encoding header is transparently stripped. We should go to line 65 here and everything should work but the client doesn't get to see the Content-Encoding header and will crash on line 72.

Any help you can give will be very gratefully received.

Dynamic Sampling

The honeyaws package supports dynamic sampling for ELB and ALB logs. This allows us to see all errors which we return to clients which are rare relative to successes. Porting that feature over to this would be really beneficial.

Verify gzip handling

Fallout from investigating #96 which was fixed with #97

The s3-handler assumes compressed objects received from S3 are uncompressed, because the aws-sdk does that automatically when the object has correct content-encoding (gzip). With ALB access logs, the files were compressed but were missing the content-encoding, which resulted in the ALB integration not parsing anything from the log files.

Verify and potentially apply the same fix to other integrations using the s3-handler:

ELB logs
S3 bucket logs
S3 logs JSON

s3-handler: Add NLB logs support

Add NLB logs support to the s3-handler lambda.

Requested via Pollinators (link)

Build ARM binary as part of publish workflow

It would be good to offer an alternative binary that targets ARM64 in addition to X86 for anyone who prefers to run with that arch. As per @lizthegrey's comment below, the performance is negligible between the two architectures is after sampling with with network I/O being the biggest cost.

This issue is to track building and publishing an ARM64 binary & AWS CloudFormation template.

RDS Integration for Cloudwatch Logs MySQL Handler fails silently

Hey 👋

I noticed this when I invalidated my API key by accident, the logs stopped coming in, but the Lambda function was succeeding. There wasn't even any error logs.

Looking at the code, the problem does not seem limited to authentication errors; here is the API call to send events on the MySQL Handler:

https://github.com/honeycombio/agentless-integrations-for-aws/blob/master/mysql-handler/main.go#L84

This function return an error, but it is ignored:

agentless-integrations-for-aws/vendor/github.com/honeycombio/libhoney-go/libhoney.go

Line 750 in 1744fea

func (e *Event) SendPresampled() (err error) {

The error should certainly be logged. I guess it's arguable if the lambda function should:

a. fail if any error happens
b. succeed independently of the number of errors (these are parallel goroutines)
c. fail if a certain % of goroutines fail?

Also, I think (but I didn't look at it) that the PostgreSQL handler has the same issue, because when testing it I remember that it didn't log anything and the calls were succeeding, although the parsing was failing (later I learned from the Honeycomb team that the Agentless integration for PostgreSQL is deprecated and the rdslogs binary should be used instead).

ALB support

It looks like this supports ingesting ELB access logs, but not ALB access logs.

Support provided.al2 runtime in place of deprecated Go1.x

Is your feature request related to a problem? Please describe.

deprecated Go1.x runtime

Describe the solution you'd like

Set Lambda runtime as provided.al2 and not get runtime errors.

relates to #142

Using the provided.al2 runtime offers several benefits over the go1.x runtime. First, it supports running Lambda functions on AWS Graviton2 processors, offering up to 34% better price-performance compared to functions running on x86_64 processors. Second, it offers a streamlined implementation with a smaller deployment package and faster function invoke path.

https://aws.amazon.com/blogs/compute/migrating-aws-lambda-functions-from-the-go1-x-runtime-to-the-custom-runtime-on-amazon-linux-2/

Describe alternatives you've considered

n/a

Additional context

Errors encountered when attempting to use provided.al2 runtime in lieu of Go1.x

RequestId: 08093629-5cdb-4caa-bf83-cfab152ad2bd Error: Couldn't find valid bootstrap(s): [/var/task/bootstrap /opt/bootstrap]
Runtime.InvalidEntrypoint

Add support for mysql audit logs

Is your feature request related to a problem? Please describe.
Currently audit logs are not supported for mysql RDS databases because the log format is CSV.

Describe the solution you'd like
Add supported for parsing audit logs with the honeytail CSV parser.

Additional Context
Example of audit log support in rdslogs

cloudwatch-handler currently errors on START/END/REPORT lines; issue seems to be fixed in publisher. Is this intentional?

Hello! I've been working with the generic JSON CloudWatch integration, which I believe is in the agentless-integrations-for-aws/cloudwatch-handler repo. It currently errors when handling Lambda's START/END/REPORT lines.

I'm filing this as a question because I found a fix in the agentless-integrations-for-aws/publisher repo, so I wasn't sure if this was perhaps an intentional choice. If it's unwanted behavior, the lines below from publisher seem like they would fix the issue. There's also a publisher_test in the publisher folder that covers this behavior:

line 17

var lambdaReportLineRegex = regexp.MustCompile(`^(START|END|REPORT) RequestId:.+`)

lines 74-79

                // Skip default lambda START, END, REPORT log lines to avoid "unable to parse"
		// spam below, which can get expensive
		match := lambdaReportLineRegex.FindString(event.Message)
		if match != "" {
			continue
		}

I wanted to add:

This is being used in a personal/portfolio project. There's no urgency for this change on my end.
I am very grateful for this software and for all the efforts of those involved. It is not my intention for this request to be viewed as antagonism towards any developer responsible for the code. Thank you for all your hard work.
I added the suggested fix out of a desire to be helpful. I apologize in advance if it's unwelcome.

De-dupe docs

There are currently duplicate docs in the readme and the public docs. This makes it hard to keep them consistent. We have a precedent of moving the majority of usage documentation to the public docs, and referencing them in the readme. We should do this here as well.

Update to cimg/go image and go 1.16.x

Terraform example leads to sporadic errors

time="2018-12-17T00:27:43Z" level=warning msg="Couldn't find specified time field\n Please refer to https://honeycomb.io/docs/json#timestamp-parsing" time_field="start_time" time_value=<nil>

Seeing this in the function logs at times.

Enable Travis CI build/deploy

To make contributions easier without regressions, please enable Travis CI for build and deploy.

I assume the build part for Go is quite straightforward.

The deploy part is linking the following parts together:

deploy when a new Git tag is set. Example: https://github.com/terra-farm/terraform-provider-virtualbox/blob/040fd0fec3dcc7ac6a7771662135166003871c2e/.travis.yml#L48-L50
Use the AWS S3 deployer: https://docs.travis-ci.com/user/deployment/s3/
Set the correct AWS region: https://docs.travis-ci.com/user/deployment/s3/#s3-bucket-regions
Deploy to multiple buckets: https://docs.travis-ci.com/user/deployment/s3/#deploy-to-multiple-buckets

patch openSSL CVE

https://mta.openssl.org/pipermail/openssl-announce/2022-March/000219.html

build artifacts: ~~circleci/golang image is based on golang image which is based on alpine - waiting on the same update~~
- use newer circle images cimg/go:1.18.1

Allow masking values based on regex

Is your feature request related to a problem? Please describe.

In some environments, the request path in an ALB log may include personal information as a querystring or path element. The filter today throws the baby out with the bathwater if you get rid of the whole request field all the time.

This is important for complying with EU and possibly California privacy rules.

Describe the solution you'd like

Replace a sensitive value with "X" based on regex.

Describe alternatives you've considered

There are a few alternatives:

Drop the field if it has a suspicious value.
Output in OTEL format so a collector can be used to run the redaction or filter procressor.

Additional context

s3-handler doesn't provide enough configuration for ALB logs to be correctly imported

Is your feature request related to a problem? Please describe.
ALB logs don't show up properly in honeycomb when you follow the instructions here: https://docs.honeycomb.io/getting-data-in/integrations/aws/aws-application-load-balancer/

There are number of issues with the events that are submitted following the official documentation:

no valid duration field
parsed fields interfere with whatever existing schema you have
spans lack proper tracing fields

Describe the solution you'd like

I'd like to use environment variables like these to configure more reasonable behavior:

{
        "PARSER_TYPE": "regex",
        # https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md
        "FIELD_NAME_ALIAS_JSON": json.encode({
            "serviceName": "aws.alb.elb",
            "name": "aws.alb.request",
            "http.user_agent": "aws.alb.user_agent",
            "http.status_code": "aws.alb.elb_status_code",
            "http.method": "aws.alb.request_method",
            "http.url": "aws.alb.request_uri",
        }),
        "TIME_FIELD_NAME": "aws.alb.request_creation_time",
        "TIME_FIELD_FORMAT": "2006-01-02T15:04:05.9999Z",
        "REQUEST_FIELD_NAME": "aws.alb.request",
        "FORCE_GUNZIP": "true",
        "HONEYCOMB_DEBUG": "true",
        "DURATION_END_FIELD_NAME": "aws.alb.time",
        "DURATION_END_FORMAT": "2006-01-02T15:04:05.9999Z",
        "AMZN_TRACE_ID_FIELD_NAME": "aws.alb.amzn_trace_id",
        "PARSED_FIELD_NAME_PREFIX": "aws.alb.",
        "REGEX_PATTERN": " ".join([
            "(?P<type>[^ ]+)",
            "(?P<time>[^ ]+)",
            "(?P<elb>[^ ]+)",
            "(?P<client>[^ ]+)",
            "(?P<target>[^ ]+)",
            "(?P<request_processing_time>[^ ]+)",
            "(?P<target_processing_time>[^ ]+)",
            "(?P<response_processing_time>[^ ]+)",
            "(?P<elb_status_code>[^ ]+)",
            "(?P<target_status_code>[^ ]+)",
            "(?P<received_bytes>[^ ]+)",
            "(?P<sent_bytes>[^ ]+)",
            '"(?P<request>[^"]+)"',
            '"(?P<user_agent>[^"]+)"',
            "(?P<ssl_cipher>[^ ]+)",
            "(?P<ssl_protocol>[^ ]+)",
            "(?P<target_group_arn>[^ ]+)",
            '"(?P<amzn_trace_id>[^"]+)"',
            '"(?P<domain_name>[^"]+)"',
            '"(?P<chosen_cert_arn>[^"]+)"',
            "(?P<matched_rule_priority>[^ ]+)",
            "(?P<request_creation_time>[^ ]+)",
            '"(?P<actions_executed>[^"]+)"',
            '"(?P<redirect_url>[^"]+)"',
            '"(?P<error_reason>[^"]+)"',
            '"(?P<target_list>[^"]+)"',
            '"(?P<target_status_code_list>[^"]+)"',
        ]),
 }

Here's a patch that adds support for these environment variables to the existing main branch (8f7d958): https://gist.github.com/ajbouh/462b1126e28f2755aeaf085d9b61ed17

Allow API token to be pulled from SSM Secure String

Is your feature request related to a problem? Please describe.
If I elect to use the Honeycomb log forwarding lambda, not only can it be performed with standard AWS Terraform resources, there are even starter code samples available for doing this in this repository, which I am very grateful for. That said, part of those instructions include instructions for optionally encrypting the API key, with the express declaration that this is recommended. Doing this requires first creating a KMS key, and then using it to generate the encrypted payload, and providing that key ID in the environment variables for the lambda.

When designing my infrastructure, I aimed to use Terraform from start-to-finish, with no subsequent steps required for production readiness. That is to say, I want to run terraform apply once, and have everything taken care of when it's done. This precludes me from being able to follow these best practices, because there is no way in Terraform to generate a KMS key, and then use that key to encrypt a value. I have to encrypt the value in the command line, which inherently introduces more than one step in the terraform deploy.

Describe the solution you'd like
I'd like to use a SecureString SSM parameter instead. This way the key is still encrypted at rest, and we can use the SSM client to decrypt it upon request. I can create a KMS key to encrypt a secure string parameter AND I can set the encrypted value of that parameter with SSM SecureStrings.

Describe alternatives you've considered
The alternative is to abandon my ardent desire to have one stage to this deploy.

Additional context
I wanted to disclose that I am:

1.) I am more than happy to implement this feature myself, and have held off from doing so because the OSS guide suggested it.
2.) I am very grateful for this software and for all the efforts of those involved. It is not my intention for this request to be viewed as antagonism towards any developer responsible for the code. Thank you for all your hard work.
3.) Additionally, I am open to the idea that there is a better way to do this, and look forward to discussing the appropriate solution.

Document the regexps now that they're embedded in Go code

We just merged #150 which moved a bunch of complex regexps into the Go code, where they had previously been in human-editable strings.

We should

vet them for correctness (in particular, the ELB one is sketch)
document them in more detail
consider some level of making them more maintainable

Please see review comments on #150 for more.

Allow adding fields in cloudwatch-handler

Is your feature request related to a problem? Please describe.

We ingest cloudwatch-based EKS audit trails from multiple EKS instances, and would love to have a way to discern between the different clusters.

Describe the solution you'd like

The honeycomb k8s integration has processors that can add fields to events - having something similar in the cloudwatch handler would be really cool.

Describe alternatives you've considered

We could ingest these events into different datasets, I guess? Or start tagging everything with env, but that's not extremely expressive - especially as we need multiple labels to tell the clusters apart (purpose and region, for one).

Additional context

Add CloudFront Access Logs Support

Is your feature request related to a problem? Please describe.

CloudFront Access logs can be sent to S3 which is supported by agentless. However, that fact is not clear without proper documentation and support. Instead, it appears that honeyaws is the only option to pull CloudFront logs into Honeycomb.

Describe the solution you'd like

Adding a template and documentation for using agentless to ingest CloudFront Access Logs to Honeycomb.

Describe alternatives you've considered

Additional context

Cloudwatch Handler should filter out timestamp + invocation ID from logs

In NodeJS-based Lambda, AWS has patched console.log to include a timestamp and invocation ID. We should handle this by stripping it out of the message before handing it off to the parser.

Add support for configuring integrations using environment variables

Is your feature request related to a problem? Please describe.
To add flexibility when configuring integrations, it might be desirable to support configuration via environment variables.

see #95 for more details

Describe the solution you'd like
Support configuring integrations using environment variables.

NOTE: Other integrations may also benefit.

Describe alternatives you've considered

Additional context

Add DEBUG mode

When you have a bad API key, the integrations fail silently, because we don't currently have any response queue handling. Let's add a HONEYCOMB_DEBUG mode like we did in the Lambda extension

Open questions:

should this be an env var in the handler function?
should this be a toggle/option in the cloud formation template?

TODO:

add debug mode
add docs about debug mode

honeycombio / agentless-integrations-for-aws Goto Github PK

agentless-integrations-for-aws's People

Contributors

Stargazers

Watchers

Forkers

agentless-integrations-for-aws's Issues

Recommend Projects

Recommend Topics

Recommend Org