
datagov-brokerpak-eks's People

Contributors

adborden, bengerman13, fuhuxia, mogul, nickumia-reisys, srinirei


datagov-brokerpak-eks's Issues

Security Policy violation Repository Administrators

This issue was automatically created by Allstar.

Security Policy Violation
Users are not allowed to be administrators of this repository.
Instead a team should be added as administrator.

To add a team as administrator: from the main page of the repository, go to Settings -> Manage Access.
(For more information, see https://docs.github.com/en/organizations/managing-access-to-your-organizations-repositories)


Issue created by GSA-TTS Allstar

This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.

Migrate from PodSecurityPolicy (PSP) to Pod Security Standards (PSS)

Key points:

  • Kubernetes version 1.21 to 1.25
  • PodSecurityPolicy (PSP) to built-in Kubernetes Pod Security Standards (PSS)

This might be a no-op, but I'm creating an issue since I haven't worked with EKS recently enough to remember the setup. The following is an email from AWS (a sketch of the namespace-level PSS labels follows the email below):

  • What is changing?
    PodSecurityPolicy (PSP) was deprecated [1] in Kubernetes version 1.21 and has been removed in Kubernetes version 1.25 [2]. If you are using PSPs in your cluster, then you must migrate from PSP to the built-in Kubernetes Pod Security Standards (PSS) or to a policy as code solution before upgrading your cluster to version 1.25 to avoid interruption to your workloads.
  • What actions can customers take?
    PSP resources were used to specify a set of requirements that pods had to meet before they could be created. Since PSPs have been removed in Kubernetes version 1.25, you must replace those security controls. Two solutions can fill this need:
  1. Kubernetes Pod Security Standards (PSS)
  2. Policy-as-code solutions from the Kubernetes ecosystem

In response to the PSP deprecation and the ongoing need to control pod security out-of-the-box, the Kubernetes community created a built-in solution with PSS [3] and Pod Security Admission (PSA) [4]. The PSA webhook implements the controls defined in the PSS. To review best practices for migrating PSPs to the built-in Pod Security Standards, see references [5] and [6].

Policy-as-code solutions provide guardrails to guide cluster users, and prevent unwanted behaviors, through prescribed and automated controls. Policy-as-code solutions typically use Kubernetes Dynamic Admission Controllers to intercept the Kubernetes API server request flow, via a webhook call, and mutate and validate request payloads, based on policies written and stored as code. There are several open source policy-as-code solutions available for Kubernetes. To review best practices for migrating PSPs to a policy-as-code solution, see reference [7].

You can run the following command to view the PSPs in your cluster: kubectl get psp. If you see the eks.privileged PSP in your cluster, it will be automatically migrated to PSS by Amazon EKS. No action is needed on your part.

To summarize, if you are using PSP in your cluster, then you must migrate from PSP to the built-in Kubernetes PSS or to a policy as code solution before upgrading your cluster to version 1.25 to avoid interruptions to your workloads. EKS offers best practices for pod security and guidance for implementing pod security standards [8]. You can find details on PSP Migration in EKS documentation [1].

If you have any questions or concerns, please reach out to AWS Support [9].

[1] https://docs.aws.amazon.com/eks/latest/userguide/pod-security-policy-removal-faq.html
[2] https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#kubernetes-release-calendar
[3] https://kubernetes.io/docs/concepts/security/pod-security-standards/
[4] https://kubernetes.io/docs/concepts/security/pod-security-admission/
[5] https://aws.github.io/aws-eks-best-practices/security/docs/pods/#pod-security-standards-pss-and-pod-security-admission-psa
[6] https://kubernetes.io/docs/tasks/configure-pod-container/migrate-from-psp/
[7] https://aws.github.io/aws-eks-best-practices/security/docs/pods/#policy-as-code-pac
[8] https://aws.amazon.com/blogs/containers/implementing-pod-security-standards-in-amazon-eks/
[9] https://aws.amazon.com/support

Sincerely,
Amazon Web Services
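
If the brokerpak does need to opt its namespaces into PSS explicitly, a minimal sketch of the namespace-level Pod Security Admission labels follows, assuming the namespaces are (or could be) managed through the Terraform kubernetes provider; the namespace name and chosen levels are illustrative, not decided:

```hcl
# Hypothetical sketch: enforce the "baseline" Pod Security Standard on a
# namespace via Pod Security Admission labels, while warning/auditing against
# "restricted". Adjust the namespace and levels to whatever we actually provision.
resource "kubernetes_namespace" "workloads" {
  metadata {
    name = "workloads" # placeholder namespace name

    labels = {
      "pod-security.kubernetes.io/enforce" = "baseline"
      "pod-security.kubernetes.io/warn"    = "restricted"
      "pod-security.kubernetes.io/audit"   = "restricted"
    }
  }
}
```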

Limit EKS log retention to 180 days

User Story

In order to avoid ballooning storage requirements for our logs, the data.gov team wants to cap EKS instance log retention at 180 days.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN [a contextual precondition]
    [AND optionally another precondition]
    WHEN [a triggering event] happens
    THEN [a verifiable outcome]
    [AND optionally another verifiable outcome]

Background

[Any helpful contextual notes or links to artifacts/evidence, if needed]

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]

Sketch

[Notes or a checklist reflecting our understanding of the selected approach]
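
A possible sketch, assuming the brokerpak creates (or imports) the CloudWatch log group that EKS control-plane logging writes to, so retention can be set explicitly; the variable name is illustrative:

```hcl
variable "cluster_name" { type = string }

# Sketch: cap retention of the EKS control-plane log group at 180 days.
# EKS writes control-plane logs to /aws/eks/<cluster-name>/cluster; if the
# group already exists it would need to be imported rather than created here.
resource "aws_cloudwatch_log_group" "eks_control_plane" {
  name              = "/aws/eks/${var.cluster_name}/cluster"
  retention_in_days = 180
}
```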

Route logs for EKS control plane and pods into Cloudtrail

User Story

In order to aggregate, analyze, and alert on logs from EKS instances, we want to configure EKS instances to send control plane and pod logs to Cloudtrail when we provision them.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN I have provisioned an instance of the EKS service
    AND I am authenticated with the AWS Console for the SSB account
    WHEN I look at Cloudtrail
    THEN I see logs corresponding to the EKS instance control plane
    AND I see logs corresponding to the EKS data plane (workloads in Fargate)

Background

Necessary for meeting compliance controls. See the AU family of controls in particular.

Security Considerations (required)

Once this story is complete, we will be able to demonstrate full visibility of logging from provisioned EKS clusters via Cloudtrail. This meets some of the NIST compliance requirements for auditing.

Sketch

Here's how to do it for EKS and how to do it for pods. There are Terraform docs for this too.
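
A rough sketch of both halves in Terraform, assuming the brokerpak's existing cluster resource and kubernetes provider; the Fargate log-router ConfigMap follows the standard aws-observability mechanism, and the output settings shown are illustrative:

```hcl
# Control-plane side: on the brokerpak's existing aws_eks_cluster resource, set
#   enabled_cluster_log_types = ["api", "audit", "authenticator",
#                                "controllerManager", "scheduler"]
# so API server, audit, authenticator, controller manager, and scheduler logs
# flow to CloudWatch Logs.

variable "cluster_name" { type = string }
variable "region"       { type = string }

# Data-plane side: Fargate pods log through the built-in Fluent Bit log router,
# configured by an aws-logging ConfigMap in the aws-observability namespace.
resource "kubernetes_namespace" "aws_observability" {
  metadata {
    name   = "aws-observability"
    labels = { "aws-observability" = "enabled" }
  }
}

resource "kubernetes_config_map" "aws_logging" {
  metadata {
    name      = "aws-logging"
    namespace = kubernetes_namespace.aws_observability.metadata[0].name
  }

  data = {
    "output.conf" = <<-EOT
      [OUTPUT]
          Name cloudwatch_logs
          Match *
          region ${var.region}
          log_group_name /aws/eks/${var.cluster_name}/pods
          auto_create_group true
    EOT
  }
}
```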

Security Policy violation SECURITY.md

This issue was automatically created by Allstar.

Security Policy Violation
Security policy not enabled.
A SECURITY.md file can give users information about what constitutes a vulnerability and how to report one securely so that information about a bug is not publicly visible. Examples of secure reporting methods include using an issue tracker with private issue support, or encrypted email with a published key.

To fix this, add a SECURITY.md file that explains how to handle vulnerabilities found in your repository. Go to https://github.com/GSA-TTS/datagov-brokerpak-eks/security/policy to enable.

For more information, see https://docs.github.com/en/code-security/getting-started/adding-a-security-policy-to-your-repository.


Issue created by GSA-TTS Allstar

This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.

Verify provisioned clusters pass the EKS CIS benchmark

User Story

In order to give auditors confidence that provisioned EKS clusters are following best-practices, we should be able to demonstrate that a provisioned cluster can pass the CIS EKS benchmark.

Acceptance Criteria

  • GIVEN I have installed the tree kubectl plugin
    WHEN I use kubectl tree on pods and nodes
    THEN I see that resources containing scanning results are present
  • WHEN I run kubectl get CISKubeBenchReport <nodename> -o wide
    THEN I see a report indicating no tests failed
  • WHEN I run kubectl get CISKubeBenchReport <nodename> -o yaml
    THEN I see a detailed report
  • WHEN I run make clean build up demo-up demo-test
    THEN I see that there is a test for a CISKubeBenchReport with zero FAIL results

Background

[Any helpful contextual notes or links to artifacts/evidence, if needed]

See also this GSA ISE hardening guide for EKS

Security Considerations (required)

This change will ensure that any new deployment of the eks-brokerpak will only deploy CIS-compliant instances of AWS EKS. This will bolster confidence in the configuration of the EKS instances we create.

Sketch

  1. Install the Aquasec starboard-operator
  2. Add lines at the end of the tests that check that the AWS EKS CIS benchmark had zero FAIL results
  3. Document how someone can check these reports on any existing instance

Note that AWS Security Hub can ingest kube-bench results. We may want to set this up if it turns out that we need to continuously report on existing instances, but it's probably out of scope for this story. Let's wait to see if it's required, and write that separate story when it's time.
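
A sketch of step 1 using the Terraform helm provider; the repository URL and chart name are from memory of the Aqua Security charts and should be verified against their current layout:

```hcl
# Sketch: install the Starboard operator, which runs kube-bench against each
# node and publishes CISKubeBenchReport resources that the demo tests can
# then query for FAIL counts.
resource "helm_release" "starboard_operator" {
  name             = "starboard-operator"
  repository       = "https://aquasecurity.github.io/helm-charts/"
  chart            = "starboard-operator"
  namespace        = "starboard-system"
  create_namespace = true
}
```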

Optimize ingress to use just one ALB per cluster

User Story

In order to reduce the cost of operating EKS cluster instances, and reduce dependence on a single k8s provider, the team would like to provision just a single ALB per AWS EKS cluster, rather than one per individual ingress.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN I have provisioned an EKS instance
    AND I have deployed two ingresses
    AND I am authenticated with the AWS console
    WHEN I look at the AWS EC2 "Load Balancer" list
    THEN I see just one LB associated with the EKS cluster

Background

People using cloud.gov to provision and bind the k8s service should, as much as possible, not need to care about or refer to the underlying implementation when they use the service. This leaves the provider of the k8s service the flexibility to use a different implementation (e.g. GCP or Azure instead of AWS) without customers of the service blocking the migration.

Normally, using AWS EKS requires customers to know about and use the AWS-specific ingress annotations in order to make their deployments accessible to the outside world. By using a second-level ingress controller based on the widely used and well-documented nginx-ingress, we can offer a cross-provider way of specifying ingress, using labels and tags that will not need to change.

Using a secondary controller has the added benefit of requiring just a single AWS Load Balancer instance for all the workloads in the cluster, no matter how many. This in turn cuts down on the government's cost to run the service.

Security Considerations (required)

The secondary nginx-ingress controller is only accessible from within Fargate, and for traffic to reach that service, it must traverse the AWS ALB first. So no additional network exposure is implied here.

Because customers of the service cannot specify tags or labels that the ALB controller will act on, there is no way for customers to introduce a separate ingress to the cluster.

Sketch

There is precedent for setting up this architecture with AWS EKS and Fargate.
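
To make the shape concrete, here is a sketch using the helm and kubernetes Terraform providers; chart values, names, and annotations are illustrative rather than the final configuration:

```hcl
# Sketch: run ingress-nginx behind a single ALB. The nginx controller is
# exposed as a ClusterIP service, and exactly one Ingress carrying the ALB
# annotations points at it; workload Ingresses then use the "nginx" class
# and never trigger creation of additional load balancers.
resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  namespace        = "ingress-nginx"
  create_namespace = true

  set {
    name  = "controller.service.type"
    value = "ClusterIP"
  }
}

resource "kubernetes_ingress_v1" "alb_to_nginx" {
  metadata {
    name      = "alb-to-nginx" # placeholder name
    namespace = "ingress-nginx"
    annotations = {
      "alb.ingress.kubernetes.io/scheme"      = "internet-facing"
      "alb.ingress.kubernetes.io/target-type" = "ip" # required for Fargate pods
    }
  }

  spec {
    ingress_class_name = "alb"

    default_backend {
      service {
        name = "ingress-nginx-controller"
        port {
          number = 80
        }
      }
    }
  }
}
```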

[Cost Improvements] Consolidate NAT Gateways per AZ

Email from AWS Support:

Hello,

We have observed that your Amazon VPC resources are using a shared NAT Gateway across multiple Availability Zones (AZ). To ensure high availability and minimize inter-AZ data transfer costs, we recommend utilizing separate NAT Gateways in each AZ and routing traffic locally within the same AZ.

Each NAT Gateway operates within a designated AZ and is built with redundancy in that zone only. As a result, if the NAT Gateway or AZ experiences failure, resources utilizing that NAT Gateway in other AZ(s) also get impacted. Additionally, routing traffic from one AZ to a NAT Gateway in a different AZ incurs additional inter-AZ data transfer charges. We recommend choosing a maintenance window for architecture changes in your Amazon VPC.
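
A sketch of the per-AZ arrangement in Terraform; the subnet wiring assumes one public and one private subnet per AZ, keyed by AZ name, which may not match the brokerpak's actual VPC module:

```hcl
# Sketch: one NAT gateway per AZ, with each private subnet's default route
# pointing at the NAT gateway in its own AZ, so egress never crosses AZs.
variable "vpc_id"                   { type = string }
variable "public_subnet_ids_by_az"  { type = map(string) } # e.g. { "us-east-1a" = "subnet-..." }
variable "private_subnet_ids_by_az" { type = map(string) }

resource "aws_eip" "nat" {
  for_each = var.public_subnet_ids_by_az
  domain   = "vpc"
}

resource "aws_nat_gateway" "per_az" {
  for_each      = var.public_subnet_ids_by_az
  subnet_id     = each.value
  allocation_id = aws_eip.nat[each.key].id
}

resource "aws_route_table" "private" {
  for_each = var.private_subnet_ids_by_az
  vpc_id   = var.vpc_id
}

resource "aws_route" "private_default" {
  for_each               = var.private_subnet_ids_by_az
  route_table_id         = aws_route_table.private[each.key].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.per_az[each.key].id
}

resource "aws_route_table_association" "private" {
  for_each       = var.private_subnet_ids_by_az
  subnet_id      = each.value
  route_table_id = aws_route_table.private[each.key].id
}
```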

Security Policy violation SECURITY.md

This issue was automatically created by Allstar.

Security Policy Violation
Security policy not enabled.
A SECURITY.md file can give users information about what constitutes a vulnerability and how to report one securely so that information about a bug is not publicly visible. Examples of secure reporting methods include using an issue tracker with private issue support, or encrypted email with a published key.

To fix this, add a SECURITY.md file that explains how to handle vulnerabilities found in your repository. Go to https://github.com/GSA-TTS/datagov-brokerpak-eks/security/policy to enable.

For more information, see https://docs.github.com/en/code-security/getting-started/adding-a-security-policy-to-your-repository.


This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.

2048.yml vulnerabilities

Date of report: 12/06/2022
Severity: Moderate and Low (not active in production)

Due dates are based on severity as described in RA-5: 15 days for Critical, 30 days for High, and 90 days for Moderate and lower. A consolidated sketch of the remediations follows the findings below.

  • Container is running without root user control (Moderate)
    • Detailed paths
    • This issue is...
      • Container is running without root user control
    • The impact of this is...
      • Container could be running with full administrative privileges
    • You can resolve it by...
      • Set securityContext.runAsNonRoot to true
  • Container does not drop all default capabilities (Moderate)
    • Detailed paths
      • Introduced through: [DocId: 0] › input › spec › template › spec › containers[app-2048] › securityContext › capabilities › drop
    • This issue is...
      • All default capabilities are not explicitly dropped
    • The impact of this is...
      • Containers are running with potentially unnecessary privileges
    • You can resolve it by...
      • Add ALL to securityContext.capabilities.drop list, and add only required capabilities in securityContext.capabilities.add
  • Container is running without liveness probe (Low)
    • Detailed paths
      • Introduced through: [DocId: 0] › spec › template › spec › containers[app-2048] › livenessProbe
    • This issue is...
      • Liveness probe is not defined
    • The impact of this is...
      • Kubernetes will not be able to detect if application is able to service requests, and will not restart unhealthy pods
    • You can resolve it by...
      • Add livenessProbe attribute
  • Container is running with writable root filesystem (Low)
    • Detailed paths
      • Introduced through: [DocId: 0] › input › spec › template › spec › containers[app-2048] › securityContext › readOnlyRootFilesystem
    • This issue is...
      • readOnlyRootFilesystem attribute is not set to true
    • The impact of this is...
      • Compromised process could abuse writable root filesystem to elevate privileges
    • You can resolve it by...
      • Set securityContext.readOnlyRootFilesystem to true
  • Container has no CPU limit (Low)
    • Detailed paths
      • Introduced through: [DocId: 0] › input › spec › template › spec › containers[app-2048] › resources › limits › cpu
    • This issue is...
      • Container has no CPU limit
    • The impact of this is...
      • Without a CPU limit, a container can consume compute time with no benefit (e.g. inefficient code), which might lead to unnecessary costs. It is advisable to also configure CPU requests to ensure application stability.
    • You can resolve it by...
      • Add resources.limits.cpu field with required CPU limit value
  • Container is running without memory limit (Low)
    • Detailed paths
      • Introduced through: [DocId: 0] › input › spec › template › spec › containers[app-2048] › resources › limits › memory
    • This issue is...
      • Memory limit is not defined
    • The impact of this is...
      • Containers without memory limits are more likely to be terminated when the node runs out of memory
    • You can resolve it by...
      • Set resources.limits.memory value
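
The fixture itself is plain Kubernetes YAML, but to consolidate the remediations in one place, here is the hardened container shape expressed through the Terraform kubernetes provider; the image, probe, and limit values are illustrative (and the image must actually support running as non-root), so only the highlighted fields are the point:

```hcl
# Sketch: all of the findings above addressed on the 2048 container. In the
# YAML fixture the same fields are securityContext.runAsNonRoot,
# securityContext.capabilities.drop, securityContext.readOnlyRootFilesystem,
# livenessProbe, and resources.limits.
resource "kubernetes_deployment_v1" "app_2048" {
  metadata {
    name      = "deployment-2048" # placeholder
    namespace = "default"
  }

  spec {
    replicas = 1

    selector {
      match_labels = { "app.kubernetes.io/name" = "app-2048" }
    }

    template {
      metadata {
        labels = { "app.kubernetes.io/name" = "app-2048" }
      }

      spec {
        container {
          name  = "app-2048"
          image = "public.ecr.aws/l6m2t8p7/docker-2048:latest" # illustrative image

          security_context {
            run_as_non_root           = true  # fix: root user control
            read_only_root_filesystem = true  # fix: writable root filesystem
            capabilities {
              drop = ["ALL"]                  # fix: default capabilities not dropped
            }
          }

          liveness_probe {                    # fix: missing liveness probe
            http_get {
              path = "/"
              port = 80
            }
          }

          resources {                         # fix: missing CPU/memory limits
            limits = {
              cpu    = "100m"
              memory = "128Mi"
            }
          }
        }
      }
    }
  }
}
```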

Funnel all app ingress through TLS

User Story

In order to ensure security from the outside world to our brokered cluster, we want to provision TLS certificates with ACM and have the ingress ALB configured to use them.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN I have provisioned an EKS instance
    AND I have deployed a sample workload (eg the 2048 game)
    WHEN I visit the URL listed in the kubernetes ingress for the sample workload
    THEN I see that I am redirected from http:// to https://
    AND I see that there is a valid certificate in place for the TLS connection.

Background

Federal compliance requires that we use TLS for any connection over the internet.

Security Considerations (required)

Implementing this story helps us comply with the SC family of NIST controls.

Sketch

Here are the docs on setting up cert auto-discovery and redirecting HTTP to HTTPS.
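
A sketch of the moving parts, assuming the AWS Load Balancer Controller fronts the cluster and the ingress domain already lives in Route53; the domain is a placeholder and the annotations would be merged into the single ALB-facing Ingress:

```hcl
variable "domain_name" { type = string } # e.g. "*.k8s.example.gov" (placeholder)

# Sketch: request a DNS-validated ACM certificate for the ingress domain.
resource "aws_acm_certificate" "ingress" {
  domain_name       = var.domain_name
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# ALB annotations: listen on 80 and 443, redirect HTTP to HTTPS, and use the
# ACM certificate. If the Ingress host matches the certificate's domain, the
# controller can also auto-discover the certificate instead of taking the ARN.
locals {
  alb_tls_annotations = {
    "alb.ingress.kubernetes.io/listen-ports"    = jsonencode([{ HTTP = 80 }, { HTTPS = 443 }])
    "alb.ingress.kubernetes.io/ssl-redirect"    = "443"
    "alb.ingress.kubernetes.io/certificate-arn" = aws_acm_certificate.ingress.arn
  }
}
```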

Create a DNS entry in Route53 for each ingress

User Story

In order to make deployments addressable by the outside world, the EKS brokerpak should manage DNS entries in Route53 pointing to each ingress.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN a provisioned EKS service instance
    AND a valid kubeconfig.yml for using the service instance
    AND the domain_name for the service instance
    AND I run kubectl --kubeconfig kubeconfig.yml apply -f terraform/provision/2048_fixture.yml
    AND I wait two minutes
    WHEN I visit https://ingress-2048.<k8sdomain>
    THEN I see the 2048 game.

Background

[Any helpful contextual notes or links to artifacts/evidence, if needed]

Security Considerations (required)

  • We must limit the ServiceAccount that external-dns uses to a role that can only manage records in the Route53 zone that corresponds to the specific cluster, and only for the domains expected of that cluster.
  • We also need to ensure these endpoints end up in a list exported for scanning by NetSparker (by dumping the zone).

Sketch

[Notes or a checklist reflecting our understanding of the selected approach]
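
A sketch of the scoping described under Security Considerations: the IAM policy for the external-dns ServiceAccount's role allows record changes only in this cluster's hosted zone (the IRSA/role wiring itself is omitted):

```hcl
variable "hosted_zone_id" { type = string }

# Sketch: least-privilege policy for external-dns. The list calls are
# read-only and broad by necessity, but ChangeResourceRecordSets is limited
# to the one zone that belongs to this cluster.
data "aws_iam_policy_document" "external_dns" {
  statement {
    actions   = ["route53:ChangeResourceRecordSets"]
    resources = ["arn:aws:route53:::hostedzone/${var.hosted_zone_id}"]
  }

  statement {
    actions = [
      "route53:ListHostedZones",
      "route53:ListResourceRecordSets",
    ]
    resources = ["*"]
  }
}

resource "aws_iam_policy" "external_dns" {
  name   = "external-dns-${var.hosted_zone_id}" # illustrative name
  policy = data.aws_iam_policy_document.external_dns.json
}
```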

Ensure all inter-pod traffic uses TLS

User Story

In order to have TLS on every network hop between the outside world and individual pods, we want EKS clusters configured to use AWS App Mesh and cert-manager.

Acceptance Criteria

  • GIVEN I have provisioned an EKS instance
    AND I have deployed the 2048 fixture
    AND I have accessed the 2048 application using my browser
    WHEN I run kubectl -n default exec -it ${2048_POD_NAME} -c envoy -- curl -s localhost:9901/stats | grep ssl.handshake
    THEN I see a non-zero count of ssl_handshake entries between the 2048 pod and the nginx-ingress pod.

Background

[Any helpful contextual notes or links to artifacts/evidence, if needed]

Security Considerations (required)

This work will help us meet our compliance requirements. See section 10.9.6.

Sketch

For this story, we only need to work up through step 4.1 of the referenced blog post... That is, we want to demonstrate mTLS between the nginx-ingress pod and the 2048 pod.

We can work up through step 5 (TLS between the ALB controller and nginx-ingress controller) in a separate/future story.

We're now considering 4 options going forward:

  1. Remove nginx-ingress to get as close to the AWS-supported configuration as possible (adds ALB costs)
  2. Try the new solr-operator support for inter-node TLS (solves for Solr, further work needed in future for other k8s services)
  3. Try the AWS+Kong documented method that uses Kong as the ingress controller (keeps single ALB)
  4. Keep trying to debug existing path

See also https://docs.aws.amazon.com/app-mesh/latest/userguide/getting-started-kubernetes.html
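
For orientation only, a sketch of where strict TLS is declared on the mesh side if we stay on the App Mesh path; the cert-manager/SDS certificate wiring and Envoy sidecar injection from the referenced post are not shown, and the names and ACM-based certificate are placeholders:

```hcl
variable "certificate_arn" { type = string } # placeholder; cert-manager would supply certs differently

# Sketch: a mesh plus a virtual node whose listener requires TLS, i.e. every
# inbound connection to the 2048 pod's Envoy must complete a TLS handshake.
resource "aws_appmesh_mesh" "main" {
  name = "eks-brokerpak-mesh" # placeholder name
}

resource "aws_appmesh_virtual_node" "app_2048" {
  name      = "app-2048"
  mesh_name = aws_appmesh_mesh.main.name

  spec {
    listener {
      port_mapping {
        port     = 80
        protocol = "http"
      }

      tls {
        mode = "STRICT" # refuse plaintext connections

        certificate {
          acm {
            certificate_arn = var.certificate_arn
          }
        }
      }
    }

    service_discovery {
      dns {
        hostname = "app-2048.default.svc.cluster.local"
      }
    }
  }
}
```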

Security Policy violation Repository Administrators

This issue was automatically created by Allstar.

Security Policy Violation
Users are not allowed to be administrators of this repository.
Instead a team should be added as administrator.

To add a team as administrator: from the main page of the repository, go to Settings -> Manage Access.
(For more information, see https://docs.github.com/en/organizations/managing-access-to-your-organizations-repositories)


This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.
