malirezai / advanced-security-getting-started

Guide on Getting Started with Advanced Security

advanced-security-getting-started's Introduction

Hello! 👋

My name is Mahdi and I'm a Solutions Engineer at GitHub. Previously, I worked at Microsoft focused on Microservices, Serverless, and Blockchain. Before that, I worked at Xamarin.

Ask me about:

🇨🇦 Toronto, Canada - where I live!
Cars - dreaming of a 997.2 Targa these days!
Scuba Diving
Blockchain tech, smart contracts and crypto

advanced-security-getting-started's Issues

Task Fourteen: Capture discussion about secure code development decisions

One of the later tasks will be collecting feedback from the developers involved as part of the PoC. The prior tasks have enabled you to gather time and number metrics, but nothing so far paints a picture of how people find using the tool. A big part of deciding what tool should be used is how much the people using it ... enjoy using it. We recommend collecting feedback from the developers involved by some form of questionnaire/feedback form.

On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your overall experience with GitHub Code Scanning?
- In 100 words or less, why did you choose the above rating?
On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your overall experience with GitHub Secret Scanning?
- In 100 words or less, why did you choose the above rating?
On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your overall experience with current SAST tooling?
- In 100 words or less, why did you choose the above rating?
On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your experience specifically with setting up & configuring GitHub Code Scanning?
- Helper text: Setting up and configuring is defined as going from not enabled on the repo to the first scan run on a branch or pull request.
- In 100 words or less, why did you choose the above rating?
On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your experience specifically with finding the results from a Code Scanning scan, and understanding the vulnerabilities that were found?
- Helper text: finding and understanding results is defined as follows: from the moment a scan ran, how easy was it to locate the results of that scan, look through them, and understand what was reported?
- In 100 words or less, why did you choose the above rating?
On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your experience specifically with the quality of the results from a GitHub Code Scanning scan?
- Helper text: quality is defined as how precise the results were. How many false positives did you find? Did the number go up as you added different packs?
- In 100 words or less, why did you choose the above rating?
On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your experience specifically with customising the codeql-analysis.yml file to suit your repository's needs?
- Helper text: customising is defined as making changes to the file to suit your needs. For example, custom build processes, ignoring folders/files, not running on certain changes, not running on branches, etc.
- In 100 words or less, why did you choose the above rating?
On a rating scale of one to ten (one being poor, ten being excellent), how would you rate your experience specifically with finding the current security posture of a repository?
- Helper text: current security posture means: in the default branch of a repository, how many vulnerabilities are there, and have any been fixed in development/feature/bug branches?
- In 100 words or less, why did you choose the above rating?
In 500 words or less, summarise what you found best about GitHub Code Scanning.
In 500 words or less, summarise what you found needed to be improved about GitHub Code Scanning.

Task Two: Run default code-scanning queries

Run a scan using the default queries. In GitHub Actions, this looks like:

-   name: Initialize CodeQL
    uses: github/codeql-action/init@v1

Review the findings from these queries. These queries only return high precision results, so there should be a low percentage of false positives (if any).

Think about how this could be used within your company. Applications with a low tolerance for false positives, or with lower risk, could benefit from this query pack, as only high-confidence vulnerabilities are brought to the attention of the developers. This lets developers focus on delivering business value while still having security monitoring, without distractions.

Task Thirteen: Test Custom Token Expressions

Every company has its own patterns when it comes to secrets. You can now define your own patterns describing what those secrets look like, and have our engine scan for them whenever they are pushed to a repository with Secret Scanning enabled.

To create a custom pattern look at the documentation here: Defining custom patterns for secret scanning.

An example may look like this:

RSA Key:`--BEGIN (?:[A-Z]+ )?PRIVATE KEY--+[a-zA-Z0-9+/=\s]+--+END (?:[A-Z]+ )?PRIVATE KEY--`

You can create as many custom patterns as you like.

Creating and Testing Custom Patterns

When creating & testing custom patterns you should:

Enable Secret Scanning on a test repository
- a fork / clone of a production repository works as well
- this helps minimize the impact of poorly written patterns being saved and run on production repositories
Define a pattern at the repository level
- syntax documentation
- Regex zero-length assertions and lookaheads are not supported
- use additional match requirements as a form of validating the matching pattern
Test the pattern
- use the built-in pattern testing feature
Proactively push one or more test cases that match the new custom pattern to a file inside the repository
- this confirms that the pattern finds the secret; and
- confirms that the pattern doesn't produce false positives
Iterate over the previous three steps until the pattern raises no false positives and all true findings are discovered
Observe how quickly the result appears within the Secrets tab of the Security header

Once a pattern is ready for production, it can be defined at the repository, organization, or enterprise level.

Other Examples

Other custom pattern examples can be found at octodemo/custom-pattern-secrets.

Note: These patterns are maintained by the GitHub Field team and might contain patterns with False Positives.

Task One: Enabling Code Scanning and Secret Scanning

If this POC is successful, one of the critical factors will be how quickly teams can onboard themselves in a self-service way. In a perfect world, teams would enable Code Scanning and get running without talking to anyone, especially if you have hundreds, maybe thousands, of teams.

First, head to the Settings part of the repository, open the Security and analysis tab, and enable GHAS and Code Scanning (alongside any other feature you would like to test).

Alternatively, you can send a PATCH request to the API to enable it; more information can be found here: Update a Repository.

Then, follow the steps here: Setting up Code Scanning to get set up using a GitHub Action and see how easy it is to configure a GitHub Workflow to run CodeQL within GitHub Code Scanning. When you adopt any new security tool at scale, you need a tool that can enable teams to scale autonomously with that demand. Compare how long it takes to get set up with your current tool, especially for non-build languages.

Task Three: Run additional code-scanning queries

After the first scan, which by default runs the standard pack, revisit the codeql-analysis configuration file and add either the security-extended or the security-and-quality queries to expand the scope of the scan. See: Running Additional Queries

Why would you do this? The value is that you can adjust which level of scan runs according to the "risk" acceptance of each application. For example, you may have an internal tool where the standard scan queries meet your risk level, as it isn't customer-facing. On the other hand, you may have a customer-facing, high-risk application where you are more cautious, and your tolerance of possible false positives is higher. In that case, you would use a different pack such as security-extended or security-and-quality.

This means that you no longer have to run the same scan for a proof of concept and a high-risk customer application!

Actions

An example of how you would do this is by adding the following:

# Option 1: run the security-extended suite
-   name: Initialize CodeQL
    uses: github/codeql-action/init@v1
    with:
      queries: security-extended

# Option 2: run the security-and-quality suite
-   name: Initialize CodeQL
    uses: github/codeql-action/init@v1
    with:
      queries: security-and-quality

Specifically, you are adding:

    with:
     queries: ********

This tells Code Scanning to run a different set of queries than the standard pack.

CodeQL CLI

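Here is an example of how to select another query suite using the CodeQL CLI directly (the output file names are illustrative):

codeql database analyze ${CODEQL_DATABASE} ${LANGUAGE}-security-extended.qls --format=sarif-latest --output=results-security-extended.sarif

codeql database analyze ${CODEQL_DATABASE} ${LANGUAGE}-security-and-quality.qls --format=sarif-latest --output=results-security-and-quality.sarif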

Task Six: Render results of other SARIF-based SAST tools directly within the GitHub UI

You may have other security/scanning/linting tools that support SARIF output.

For example, you may have Prisma for container scanning, Acunetix for DAST, and ESLint for linting your JavaScript applications. Code Scanning isn't a tool that does everything, so by default we aim to be extensible and easy to integrate with.

Check out information about the different ways to integrate with GitHub Advanced Security:

Create a task that integrates a third party SARIF file (even if it is a linter) into a repository security dashboard.

Hint: Take a look at @microsoft/eslint-formatter-sarif and Uploading a SARIF file to GitHub.
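One way this task could be approached is sketched below: a workflow that runs ESLint with the SARIF formatter named in the hint and uploads the result to Code Scanning. The workflow name, the `main` branch, the `eslint-results.sarif` file name, and the `@v3`/`@v4` action versions are assumptions for illustration (the snippets elsewhere in this guide use `@v1`), and the repository is assumed to already contain an ESLint configuration.

```yaml
name: ESLint SARIF upload            # hypothetical workflow name
on:
  push:
    branches: [main]                 # assumed default branch
jobs:
  eslint:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write         # required to upload SARIF results
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Install ESLint and the SARIF formatter
        run: npm install eslint @microsoft/eslint-formatter-sarif
      - name: Run ESLint and write SARIF output
        run: npx eslint . --format @microsoft/eslint-formatter-sarif --output-file eslint-results.sarif
        # Lint findings make ESLint exit non-zero; keep going so the upload still runs.
        continue-on-error: true
      - name: Upload SARIF to Code Scanning
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: eslint-results.sarif
```

Once the upload step succeeds, the ESLint findings should appear alongside CodeQL alerts in the repository's Security tab.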

Task Fifteen: View Security Overview Dashboard

We provide a Security Overview Dashboard where Organization Admins and Security Managers can get a rollup of the alerts across all their repositories.

With the recent release of the Organization Security Manager role, you can now give dedicated teams permission to manage security alerts and settings on all your repositories, without needing to grant organization ownership. The "security manager" role can be applied to any team and grants the team's members the following permissions:

Read access on all repositories in the organization
Write access on all security alerts in the organization
Access to the organization-level security tab
Write access on security settings at the organization level
Write access on security settings at the repository level

A reminder that all our scanning services are extensible, with API documentation. We understand that we may not have a tool, or store the data, in a way that suits you. The API allows you to customise what you do with the data to suit your own needs. We try to provide a service that meets most of your needs, but if there is something custom to your company that we may not know about, use the API to fill the gaps.
Some docs to check out are:

Some scripts you can take advantage of as starting points today:

GHAS POC Planning - Start Here! 👋

Welcome 👋🏼

Welcome to the GitHub Advanced Security Trial!

GitHub Advanced Security is a suite of capabilities for improving the security posture of your code.

Background and POV Flow

There are 2 general phases to a GHAS POV, outlined below. These phases correspond to the relevant issue labels. For example, filter by the "prep work" label to find issues that contain information on the necessary pre-work.

prep work: Necessary planning and configuration before the GHAS feature is enabled on your organization
GHAS POC: The process of evaluating GHAS against your documented Success Criteria.

Code you plan to analyze

Generally we want to use codebases in the POV that reflect as close to production conditions as possible. Thus, new or stub repos, very small repos, etc., are usually not suitable candidates because they are unlikely to generate a volume of alerts sufficient to evaluate GHAS Code Scanning. Selecting the proper repos is a somewhat subjective task, but bear in mind the following suggested guidelines:

Aim to have at least 500 KLOC (500,000 lines) of code with high security impact.
For compiled languages (everything except js/ts/python) we will need to know how to build the code from a fresh checkout.
Assess the complexity of each codebase. This is somewhat subjective, but find out the following information to make your assessment:
- language (e.g. compiled = slightly more complex, C++ = more complex)
- build system (less familiar = more complex, distributed = more complex)
- CI system and build dependencies (does every repo use a standard CI image? or does this project have their own special one?)
- approx. number of lines of code (larger = more complex)
- approx. project age (older = more complex)
While there is nothing in the POC to limit how many repositories you can analyze, we are somewhat limited in how many we can help you with during a trial. Please pick 4-7 repositories that best fit the criteria above as your primary targets for the POC.

People and Dev Teams

List all people from your side who will be involved with the PoC.
Please list not only those people that will be actively engaged with the PoC but also the ones that will be impacted by it.

GitHub Advanced Security Code Scanning was specifically designed to provide a superior developer workflow experience. Therefore, it is critical that the POC include at least one development team using GHAS Code Scanning in their production workflow for at least a week. Please ensure that one or more teams have been identified and are able to participate during the 3 week trial period. List teams below, and note team membership in the list of overall POV participants in the second list.

Dev Team(s)

Team Name	Leads	Dates available	Participation confirmed
------	----	------------------	--------

Participants

Name	GitHub Username	Role	Team(s)
------	----	------------------	--------

Task Eight: Bulk Enabling Code Scanning across multiple Repositories Quickly

When you enable Code Scanning in a production environment, you will probably want to turn it on across multiple repositories at once. For example, let's say you have a team that has 30 repositories that all use JavaScript. You don't want to go through 30 repositories by hand, enabling Code Scanning and dropping the same codeql-analysis.yml file into each one. Take a look at some tools already available to automate the enablement of Code Scanning across multiple repositories.

Bulk Setup Code Scanning

Task Ten: Core Language Support for your Organisation

As with any security tool, you would like it to cover as many of your company's languages as possible. But what is most important is that it covers the languages that matter most. There are two ways to look at this:

The number of repositories covered: Let's say you have 400 repositories. If 300 of them are JavaScript, 30 are Java, 60 are Python, and the rest are spread out between C++, C, Go, PHP, etc., you are going to want a tool that supports JavaScript well, and also Python and Java. It would be fantastic for a tool to support every language, but focusing on the most widely used languages will cover the most ground.
The repositories of high risk/importance: The other way to look at this is that you will have high-risk repositories, perhaps customer-facing or involved in the supply chain, where a breach would have a high impact. These are the repositories you want to be scanned. Look at the languages that are core to your critical applications, and make sure they are covered.

So, as part of the POC, ensure that the languages you are testing cover a high % of languages used across your GitHub Organisation and/or the most critical applications.

Task Eleven: Parallel scans

When you work with some security tools, you can be constrained by the number of scans you can run simultaneously, both within a repository and within an organisation. There may be scenarios where you only have six "executors" or "runners" available (as an example), and if you have 15 repositories that want to run a scan, six will run and the other nine will be put into a queue. This can be incredibly frustrating if you are running this as part of a CI/CD run and, halfway through, a scan is queued outside of that CI/CD run's control.

So, a great metric to consider as part of your POC is to see how many scans you can run simultaneously and then compare that to any current SAST tool you have. This is especially important as your security scales; you need a tool that can scale equivalently at the same time.

Task Seven: Compare Other SAST and CodeQL Results

CodeQL's default results are very precise.

We advocate comparing results to other SAST tools. When comparing, we recommend using security-extended as the minimum threshold, while security-and-quality will yield the maximum number of results.

When comparing results from other SAST tools, look at the quality of the results, not the number. Remember, if your current SAST tool returns 20 vulnerabilities, that doesn't mean that 20 need to be fixed. The higher the number of vulnerabilities, the longer it will take a developer to look through the data to separate false positives from correct matches.

Code Scanning is precise, which means the results from the default pack should be accurate and of high quality, meaning less time spent understanding false positives and quicker delivery time for your business while staying just as secure!

Additionally, a developer is more likely to look properly through the data from a tool that returns streamlined, high-quality results than from a tool that casts a wide net and may waste their time. That means, hopefully, you become more secure as adoption of security tooling increases.

Task Four: Configuring CodeQL Scans

There are multiple ways to configure the CodeQL configuration file, but we recommend checking out the following:

Configure Frequency

Try changing the frequency to run on push and pull requests to main and develop, but not feature branches. Why? You may not want to run code scanning on every commit and every pull request to every branch. Try and be specific, customise!
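As a minimal sketch of the trigger configuration described above, the `on:` block of the workflow could look like the following. The branch names `main` and `develop` come from the text; adjust them to whatever your repository actually uses.

```yaml
# Trigger block of the CodeQL workflow file (for example codeql-analysis.yml).
on:
  push:
    # Scan pushes to the long-lived branches only.
    branches: [main, develop]
  pull_request:
    # Scan pull requests that target those branches;
    # pull requests into feature branches are not scanned.
    branches: [main, develop]
```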

Ignore them Markdown files

Isn't it so annoying when you have to run a scan for every file change! Not anymore. Simply customise the ignore part of your workflow to skip scanning on pull requests that only change files with .md and .txt extensions.
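A sketch of what that could look like, assuming the workflow scans pull requests into a branch called `main`:

```yaml
on:
  pull_request:
    branches: [main]   # assumed target branch
    # Skip the workflow when every changed file matches one of these patterns.
    paths-ignore:
      - '**/*.md'
      - '**/*.txt'
```

Note that `paths-ignore` under `on:` controls whether the workflow runs at all; it does not change which files CodeQL analyzes when the workflow does run.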

Configure that multi language repository

One of the most significant values of Code Scanning is how easy it is to configure (add) new languages. Go ahead and add a few languages to a repository and update the matrix to include whatever languages you choose. (Note: Take a look at the speed of multi-language repositories! With Cloud becoming more predominant and multilingual repositories becoming mainstream, a considerable metric to look at is the time of scans for polyglot repositories).
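To illustrate the matrix approach, here is a hedged sketch of a job that scans two languages in parallel. The language list is hypothetical, and the `@v3`/`@v4` action versions are assumptions (the snippets elsewhere in this guide use `@v1`); pin whichever versions your organization has standardized on.

```yaml
jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write          # required to upload scan results
    strategy:
      fail-fast: false                # let the other languages finish if one fails
      matrix:
        # Hypothetical language mix; list the languages in your repository.
        language: [javascript, python]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
```

Each matrix entry runs as its own job, so the per-language timings are visible separately in the Actions tab. Compiled languages typically need a build step between the init and analyze steps.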

The whole purpose of this metric is to show how customisable Code Scanning is from one application to another. Nowadays, repositories differ, and you want your security scanning tool to accommodate.

Task Twelve: Detection of secret keys from known token formats committed to private repositories

Status Checks

Each Secret Scanning alert has a status attached to it, which can be set to one of several states:

Revoked
False positive
Used in tests
Won't Fix

The worst thing is when you get a security alert from a tool and it's from a test that uses a dummy auth123 bearer token within the Authorization header.

Create a test file with a dummy secret and see how simple it now is for developers to quickly dismiss secret scanning results that are not valid, such as ones used in tests.

Task Thirteen: Secret Scanning Integration

Integrations

Most companies have their own tool where secrets are stored and responded to. The best practice for consuming these events is to send them from a secret scanning webhook into your SIEM of choice. Additionally, you can consume the information from the API; here is a starter script to help you get started: API Script for consuming results on GHEC

Task Five: Establish Continuous Application Security Scanning

Even when you don't push any changes to your repository, you would still like a security scan to run. Just because you aren't making changes to the code doesn't mean that you do not want to check for newly disclosed CVEs which could affect your application. Taking this a step further, you don't want to have to create a reminder to do this manually every month, or write lots of code to build a script to automate it.

Take a look at the Scanning on a schedule article and see how this works for you. You can configure Code Scanning with one line to run on any cron schedule that suits your application. For high-risk applications, maybe once a week; for POCs, maybe once a month. A key metric to consider is how easy it is to maintain projects that are no longer under active development.
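As a small sketch of that one-line schedule, the trigger block could look like this. The specific times are arbitrary examples, not recommendations.

```yaml
on:
  schedule:
    # Fields: minute hour day-of-month month day-of-week (times are in UTC).
    - cron: '0 3 * * 1'      # every Monday at 03:00 UTC (weekly)
    # - cron: '0 3 1 * *'    # alternative: 03:00 UTC on the first day of each month
```

Scheduled runs use the workflow file on the repository's default branch, so the schedule needs to be merged there to take effect.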

Task Nine: Developer Experience Task

Developer Experience is where Code Scanning shines. Take a look at how extensive the service is around Code Scanning. A few things to try are:

API Docs

Use the API to try and pull the results of a Code Scan and store them in a JSON array. We understand if we may not have a tool or store the data in a way that suits you. The API allows you to customise what you do with the data to suit your own needs. We try and provide a service to meet most of your needs, but if there is something custom to your company that we may not know about, use the API to solve some of the gaps.

CLI Trigger

Trigger a scan from the GitHub CLI! When you want to do a one-off scan, you don't want to manually go through the GUI, find the repository and kick off the scan from there. Use the CLI to trigger the workflow that runs Code Scanning. You can even create a small helper bash script that wraps the command for you.
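For the CLI to be able to start the workflow, the workflow needs a manual trigger. A minimal sketch, assuming the workflow file is named `codeql-analysis.yml` and the branch to scan is `main`:

```yaml
on:
  workflow_dispatch:      # allows manual runs from the UI, the API, or the GitHub CLI
  push:
    branches: [main]      # assumed default branch

# With the trigger above in place, a one-off scan can be started with:
#   gh workflow run codeql-analysis.yml --ref main
```

The `gh workflow run` command queues the run and returns immediately, so results still show up in the Security tab once the workflow finishes.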

Ease of use

A developer will take the path of least resistance; they would like simplicity. You want your developers to be happy and run into fewer barriers because the fewer problems they run into, the more likely they are to fix security problems before they become a problem, meaning less hold-up time before a release (quicker value), and happy business partners as your delivery is faster. One of the most significant values of Code Scanning is the general ease of use. Create a metric around how long it takes a developer to set up, run a scan, find a result, and most importantly understand what that result means. Compare that to what you see with your previous SAST scan.