mozilla-services / foxsec-pipeline
Log analysis pipeline utilizing Apache Beam
License: Mozilla Public License 2.0
#305 adds timeouts to the iprepd writer IO transform, but they are hardcoded. These should be configurable with reasonable defaults.
#289 modifies the windowing strategy used in Customs, but makes use of a default trigger.
This should be further enhanced with early pane firings, potentially including suppression of the on-time pane where needed to prevent duplicates.
See also comment in #181
Some elements of the pipeline put a list of values under a metadata key; right now these are essentially free-form strings, and it is up to the producer to properly delimit the elements.
This can be improved so there is a consistent method for putting a list of values into metadata.
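One possible convention, sketched here with illustrative names (nothing below exists in the repo), is a single escape-aware join/split helper so producers and consumers always agree on delimiting:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a shared convention for storing a list of values
// under one metadata key. Values containing the delimiter are escaped so the
// round trip is lossless.
public class MetaListValue {
  private static final char DELIM = ',';
  private static final char ESCAPE = '\\';

  // Join elements with the delimiter, escaping delimiter and escape characters.
  public static String join(List<String> values) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < values.size(); i++) {
      if (i > 0) sb.append(DELIM);
      for (char c : values.get(i).toCharArray()) {
        if (c == DELIM || c == ESCAPE) sb.append(ESCAPE);
        sb.append(c);
      }
    }
    return sb.toString();
  }

  // Split a joined value back into its original elements.
  public static List<String> split(String value) {
    List<String> out = new ArrayList<>();
    StringBuilder cur = new StringBuilder();
    boolean esc = false;
    for (char c : value.toCharArray()) {
      if (esc) { cur.append(c); esc = false; }
      else if (c == ESCAPE) esc = true;
      else if (c == DELIM) { out.add(cur.toString()); cur.setLength(0); }
      else cur.append(c);
    }
    out.add(cur.toString());
    return out;
  }
}
```

The important part is that both operations live in one place, rather than each producer improvising its own delimiting.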
It would be nice to be able to view metrics on how our code coverage is doing across our unit tests. A tool that supports multiple languages (to also cover contrib) would be nice, but making sure we can see the % coverage of the Java code is the more important piece.
The existing identity class can look up, for example, a global user ID given aliases.
It would be good to expand this for other lookups (for example, AWS account IDs to a descriptive name) so these are contained in a single configuration.
https://github.com/mozilla-services/foxsec-pipeline-contrib contains a few cloud functions that it would be nice to include within the integration tests in this repo.
Since Cloud Functions written in Go are just libraries, we could create a simple mock service that simulates both data ingestion (Duopull/Auth0pull) and interaction with the pipeline (SlackbotBackground).
Would be especially useful for testing parsing and data model changes.
Rather than pass an encrypted string via pipeline options, pass a GCS path that points to the encrypted string.
In addition to the UTC timestamp, also include the localized timestamp based on the source address in the alert metadata and the alert payload (e.g., render it in the template/Slack notifications)
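As a sketch of the rendering step, assuming the GeoIP lookup used for alert enrichment can also return a time zone ID for the source address (MaxMind city records include one); the lookup itself is stubbed out here:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative sketch: render the alert's UTC timestamp in the time zone
// associated with the source address. The time zone string is assumed to come
// from the GeoIP lookup already performed during enrichment.
public class LocalizedTimestamp {
  public static String localize(Instant utc, String tzFromGeoIp) {
    ZonedDateTime local = utc.atZone(ZoneId.of(tzFromGeoIp));
    return local.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
  }
}
```

Both the UTC and localized values would then be written into the alert metadata and made available to the template.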
#174 adds a pipeline for monitoring output from ETD and GD.
In its current state, the pipeline will generate alerts, but it does not currently add any escalation metadata to the alerts that would result in special handling in AlertIO.
Example from the authentication pipeline:
foxsec-pipeline/src/main/java/com/mozilla/secops/authprofile/AuthProfile.java
Lines 255 to 263 in 7a27d94
We will want to add a similar notification pipeline option to the gatekeeper pipeline and include this metadata option when we generate an alert if the option is set in the configuration; if the option is not set, the pipeline will behave as it currently does.
We will also want to look at potentially hooking AlertSuppressor up to the end of the analysis transforms and suppressing repeated alerts, to avoid generating a large number of escalation notifications under certain circumstances.
It would be nice to include the AWS account name / the GCP project name in the Gatekeeper alerts for easier triaging.
For AWS, the IdentityManager can be used to translate the account ID to an account name.
For GCP, projects.get (or similar) can be used during alert creation to grab the project's metadata.
Add the EC2 instance ID, owner, VPC, etc. to GuardDuty alerts.
This data is already available within the Finding objects.
The current version of the AuthProfile pipeline only uses email templates for a new source, and does not use them for informational alerts.
To be consistent, it would be good to support this across the board in the pipeline, but it would require an additional template.
See also #23
As an example, have a link to https://docs.aws.amazon.com/guardduty/latest/ug/guardduty_stealth.html within the alert body (https://bugzilla.mozilla.org/show_bug.cgi?id=1626813#c35) since the finding type is Stealth:IAMUser/CloudTrailLoggingDisabled
Currently, we are only tagging gatekeeper alerts with the base metadata on guardduty findings. Some finding types contain a lot more relevant data.
For example, the ssh brute force finding type and recon types contain an "Actor" object, which contains the bad actor IP, location (latitude + longitude), the ISP, etc.
This issue encompasses adding metadata on a per-finding-type basis to alerts.
So far GuardDuty has been pretty good at classifying ALL network connection finding types (brute force attempts, port scans, etc.) as low severity.
However, it generates many high severity "DNS REQUEST" type findings for EC2 instances querying bad domains (crypto, phishing, malware, etc.).
These are NOT false positives, but in most cases we don't care about them because they are expected given the nature of the running host.
For example: we run a ton of web crawlers to assess the state of the web in general; these hit those bad domains all the time, and we don't care.
We should have a way of suppressing these types of alerts for these EC2 instances. That should not be hard given that the EC2 instance tags are contained in the finding object.
CC: @ameihm0912 @ajvb
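A minimal sketch of the suppression decision described above. The tag name is hypothetical (there is no existing convention), and the "!DNS" suffix check assumes GuardDuty's naming for DNS-based findings:

```java
import java.util.Map;

// Illustrative sketch: suppress a GuardDuty finding when it is a DNS-based
// finding type AND the EC2 instance has opted out via a tag carried in the
// finding. The tag name "gatekeeper-suppress-dns-findings" is hypothetical.
public class FindingSuppression {
  public static boolean suppress(String findingType, Map<String, String> instanceTags) {
    boolean dnsFinding = findingType.endsWith("!DNS");
    boolean optedOut = "true".equals(instanceTags.get("gatekeeper-suppress-dns-findings"));
    return dnsFinding && optedOut;
  }
}
```

Scoping the opt-out to DNS finding types keeps brute force and port scan findings unaffected for the same instance.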
We aren't persisting anything outside of the built-in persistence options of the data sources we read from.
Fulfilling this issue involves storing all generic events somewhere.
As of January 1 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:
If you have any questions about this file, or Code of Conduct policies and procedures, please reach out to [email protected].
(Message COC001)
The parser currently only supports processing nginx log data in the form of a Stackdriver jsonPayload entry. This should be expanded to also support raw nginx log lines (either in textPayload or on their own).
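A sketch of what parsing a raw line could look like, assuming the default "combined" log_format rather than a customized one; field names and the return shape here are illustrative:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of matching a raw nginx "combined" format line, e.g.:
// 10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326 "-" "UA"
public class RawNginxLine {
  private static final Pattern COMBINED = Pattern.compile(
      "^(\\S+) \\S+ (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)"
          + " \"([^\"]*)\" \"([^\"]*)\"");

  // Return [remote address, status] for a combined-format line, if it matches.
  public static Optional<String[]> parse(String line) {
    Matcher m = COMBINED.matcher(line);
    if (!m.matches()) return Optional.empty();
    return Optional.of(new String[] {m.group(1), m.group(5)});
  }
}
```

A real implementation would map all eight groups into the same normalized event the jsonPayload path produces, so downstream transforms don't care which form was ingested.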
If a particular transform in Customs has escalation disabled, include "(experimental)" at the end of the transform doc string.
Provide a link with the notification that results in a query to pull up the applicable alerts.
Several Customs transforms should be refined with a variance component to work around issues associated with large residential/mobile NAT pools, including:
CustomsPasswordResetAbuse
CustomsAccountCreation
SourceLoginFailure
Style Guide: https://google.github.io/styleguide/javaguide.html
CLI tool: https://github.com/google/google-java-format
Vim plugin: https://github.com/google/vim-codefmt
I have no strong feelings about this style guide, but it would be nice to be able to use the vim plugin to support automatic and consistent formatting.
Currently, the code that checks whether an entry has been whitelisted for iprepd only works with IP addresses; this should be expanded so it works with other types as well (e.g., email).
Depends on mozilla-services/foxsec-pipeline-contrib#21
The alerts are purely based on exceeding a particular distance over time right now.
The toggle would allow for an optional minimum distance, for example to avoid creating alerts if the velocity was exceeded but the distance was only 50 km.
Where addresses within alerts can be reasonably correlated as belonging to the same subnet, for example the same /24, add support to the alerting output to potentially generate a secondary alert indicating a probable bad subnet.
This would essentially be something like a reduction operation that outputs a subnet given a set of input elements.
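The reduction could look something like the following sketch, which collapses IPv4 source addresses into /24 keys and counts them (IPv6 and configurable prefix lengths are ignored here; names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the subnet reduction: map each alert source address
// to its /24 network and count alerts per subnet. A secondary alert would be
// emitted for subnets whose count crosses a threshold.
public class SubnetReduce {
  // Map an IPv4 address to its /24 network, e.g. 192.0.2.15 -> 192.0.2.0/24.
  public static String slash24(String ipv4) {
    int idx = ipv4.lastIndexOf('.');
    return ipv4.substring(0, idx) + ".0/24";
  }

  // Count alert source addresses per /24.
  public static Map<String, Integer> countBySubnet(List<String> addrs) {
    Map<String, Integer> counts = new HashMap<>();
    for (String a : addrs) counts.merge(slash24(a), 1, Integer::sum);
    return counts;
  }
}
```

In Beam terms this would likely be a GroupByKey/Combine over the subnet key rather than an in-memory map, but the keying function is the essential piece.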
AuthProfile currently requires being restarted to reload a new identity manager configuration; this could be improved by setting a per-thread timer and reloading within @ProcessElement after a certain expiration time.
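The reload-on-expiry pattern could be sketched like this, with the clock passed in for testability (ConfigLoader is a stand-in for the real identity manager load, not an existing class):

```java
// Illustrative sketch: track when the configuration was last loaded and
// refresh it at the top of processing once it has gone stale. In a DoFn this
// state would live in an instance field, giving the per-thread behavior
// described above.
public class ReloadOnExpiry {
  public interface ConfigLoader { String load(); }

  private final ConfigLoader loader;
  private final long ttlMillis;
  private long loadedAt = -1;
  private String config;

  public ReloadOnExpiry(ConfigLoader loader, long ttlMillis) {
    this.loader = loader;
    this.ttlMillis = ttlMillis;
  }

  // Called at the start of each @ProcessElement invocation.
  public String get(long nowMillis) {
    if (loadedAt < 0 || nowMillis - loadedAt >= ttlMillis) {
      config = loader.load(); // reload identity manager configuration
      loadedAt = nowMillis;
    }
    return config;
  }
}
```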
If known at the time, store the user agent (UA) in AuthStateModel.ModelEntry for state storage.
Currently cfgtick contains runtime options and transform documentation; in some cases it would be good to also support the addition of arbitrary text blocks (potentially via configuration options).
When pipelines scale up, new worker nodes are created and additional threads call setup in certain DoFns. In some cases when the number of new workers is large, this can result in exceptions being generated in the pipeline due to quota limits being hit.
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: com.google.api.gax.rpc.ResourceExhaustedException: io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: Quota exceeded for quota metric 'cloudkms.googleapis.com/crypto_requests' and limit 'CryptoRequestsPerMinutePerProject' of service 'cloudkms.googleapis.com' for consumer
Although it generally recovers eventually, this causes errors to be propagated by the pipeline and likely slows down scaling responsiveness.
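One mitigation sketch: wrap the quota-sensitive setup call (e.g., the KMS decrypt) in a bounded exponential backoff so a burst of new workers retries quietly instead of surfacing exceptions. This is a generic retry helper, not existing pipeline code, and jitter is omitted for brevity:

```java
import java.util.function.Supplier;

// Illustrative sketch: retry a quota-sensitive call with exponential backoff,
// rethrowing the last failure only after maxAttempts tries.
public class SetupRetry {
  public static <T> T withBackoff(Supplier<T> call, int maxAttempts, long baseDelayMs) {
    RuntimeException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return call.get();
      } catch (RuntimeException e) {
        last = e;
        try {
          Thread.sleep(baseDelayMs << attempt); // 1x, 2x, 4x, ... the base delay
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw e;
        }
      }
    }
    throw last;
  }
}
```

Adding randomized jitter would further spread out retries across the simultaneously starting workers.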
As we discussed, it would be helpful to persist all GuardDuty alerts in their raw JSON to either a separate BigQuery table or to GCS.
The goal is to be able to dig into alerts easily during an investigation.
Currently events can be normalized into a single category (e.g., AUTH), but in some cases a single event may represent more than one category type.
The ideal scenario would be for the normalized category field to be a bitmask or an array of values that can represent more than one type.
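The array-of-values approach could be sketched with an EnumSet (which is backed by a bitmask internally, so it covers both ideas); category names other than AUTH are illustrative, not the repo's actual set:

```java
import java.util.EnumSet;

// Illustrative sketch: a normalized event carries a set of categories instead
// of a single value. EnumSet gives bitmask-like storage with set semantics.
public class NormalizedCategories {
  public enum Category { AUTH, AUTH_SESSION, HTTP_REQUEST }

  // Example: an SSO login event is both an authentication and an HTTP request.
  public static EnumSet<Category> forSsoLogin() {
    return EnumSet.of(Category.AUTH, Category.HTTP_REQUEST);
  }
}
```

Consumers that only care about one category would then test membership rather than equality.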
New pipeline, intended to consume Alert output from other pipelines (as Alert JSON strings). Initially it would be ideal for this to consume a topic containing Alert JSON strings, parse them into Alert objects, and push these objects into an analysis step.
The analysis step will look for metadata fields that, for example, match a given regex; where a match occurs, a new alert will be generated. This alert will contain a metadata field containing the ID of the source alert which it matched on.
This new pipeline will also be useful for future correlation between pipeline alerting output.
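A minimal sketch of the analysis step, modeling alerts as plain maps rather than the Alert class (the metadata key names in the output are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Illustrative sketch: scan a parsed alert's metadata for a configured regex
// and, on a match, build a new alert recording the source alert's ID.
public class AlertMatcher {
  public static Optional<Map<String, String>> analyze(
      String sourceAlertId, Map<String, String> metadata, Pattern rule) {
    for (Map.Entry<String, String> e : metadata.entrySet()) {
      if (rule.matcher(e.getValue()).find()) {
        Map<String, String> escalation = new HashMap<>();
        escalation.put("source_alert_id", sourceAlertId);
        escalation.put("matched_key", e.getKey());
        return Optional.of(escalation);
      }
    }
    return Optional.empty();
  }
}
```

In the real pipeline this would be a DoFn sitting between the Alert JSON parsing step and alert output, with the rule set supplied via configuration.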
There is code in IprepdIO to support the transition to a new Datastore format for whitelisted objects, i.e.:
foxsec-pipeline/src/main/java/com/mozilla/secops/IprepdIO.java
Lines 584 to 588 in 56045c4
This can be removed now.
Make AlertMeta more strictly defined rather than the current free-form key strings everywhere.
Currently there are free-form "magic strings" everywhere for metadata key names. This is also present throughout the cloud functions (https://github.com/mozilla-services/foxsec-pipeline/blob/master/contrib/bugzilla-alert-manager/manager.go#L121).
We should make these keys more strictly defined, and do this in a way that can be tested and checked for both the Java and Go code.
Related to #182
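One possible shape, sketched with example key names (not a complete or authoritative list): an enum that owns the wire-format strings, so the Java code has a single definition and the Go side can be checked against the same list in a shared-fixture test.

```java
// Illustrative sketch: metadata key names defined once, with the wire-format
// string attached to each constant. Key names here are examples only.
public class AlertMetaKeys {
  public enum Key {
    SOURCE_ADDRESS("sourceaddress"),
    SOURCE_ALERT("source_alert"),
    NOTIFY_EMAIL("notify_email");

    private final String wire;

    Key(String wire) { this.wire = wire; }

    // The exact string written into alert metadata.
    public String key() { return wire; }
  }
}
```

Callers would then write `alert.addMetadata(Key.SOURCE_ADDRESS.key(), value)` instead of repeating the literal, and a test can dump all `Key` values to a fixture consumed by the Go tests.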
If the function is updating a closed bug with new information, the bug should be reopened as part of the update.
The output path of the pipelines is currently limited to ingestion of Alert objects.
This makes it difficult to persist other types of data from the pipeline that are not necessarily alerts but that we may want to write to BigQuery. This is discussed a bit in #320.
The intent would be to modify the output path to consume a new container event, which could contain an alert or some other type of object (like a source summary, etc.). Alert objects would be handled the way they currently are once the encapsulation is stripped, but this would provide a means to produce other types of pipeline output.
IprepdIO currently only supports submission to a single instance.
This should be modified so multiple iprepd instances can be configured, and the transforms will publish to all configured instances.
API: https://help.papertrailapp.com/kb/how-it-works/search-api/
Some past work in this area (Lua): https://github.com/mozilla-services/lua_sandbox_extensions/tree/master/papertrail/sandboxes/heka/input
The exception filtering currently filters any request of a given method and path; extend this to only filter if the response status is either 200 or >= 500, but do not filter 4xx errors.
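The refined status predicate is small enough to sketch standalone (names here are illustrative, not the existing filter code):

```java
// Illustrative sketch: for a request already matching an exception-list
// method/path, decide from the status whether to filter it out. 200s and
// server errors are dropped; 4xx client errors stay visible.
public class ExceptionStatusFilter {
  public static boolean filter(int status) {
    return status == 200 || status >= 500;
  }
}
```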
The maven docker image version was pinned in 9c50679 to work around a recently introduced openjdk bug related to relative classpath loading that causes surefire to crash.
See also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911925
A workaround is available, but rather than work around it for now, this pin should be reverted once a fixed image is available.
We should write tooling to support integration tests between the cloud functions in contrib/ and the pipeline code.
Some examples:
There is currently no emulator tool for Cloud Functions written in Go, but these functions are written as libraries, so wiring up our own runner for them should be easy enough (this is also how some of the unit tests for the cloud functions work).
The alerts currently only store the new location in the metadata. Modify to also store the previous location.
Within our parsing logic, we make use of the Pub/Sub timestamp rather than the parsed event's timestamp (Parser.stripStackdriverEncapsulation). This creates problems when old messages get backfilled into Pub/Sub for whatever reason, and causes a disconnect between our tests and production.
Instead, we should make use of the parsed-out event timestamp.
Geolocation metadata could be useful for running analysis of http_request alerts.
Currently only a description of the type of event is provided. It would be good if this contained some information specific to the event itself as well (e.g., username if applicable, etc).