ocsf / ocsf-docs

107.0 18.0 19.0 3.49 MB

OCSF Documentation

License: Apache License 2.0

ocsf-docs's People

Stargazers

Watchers

Forkers

fasere75 sunatthegilddotcom sparrell katin jrandomsage pagbabian-splunk qngo-barracuda jp-harvey sumodgeorge adplotzk ankit-nassa mjschultz mikeradka robert-kvam alanisaac jetlime query-jeremy dwjohnson222 nrapendra786

ocsf-docs's Issues

Define Groups

Add to Docs information on Groups.

In the following example a group is part of the Event Class.

group → For each attribute, make sure you add a group value. Valid values are classification, context, occurrence, and primary.

example:

  "hw_bios_manufacturer": {
    "description": "The BIOS manufacturer.",
    "group": "primary",
    "requirement": "optional"
  },
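As a rough illustration of the rule above, the following sketch flags attributes whose group is missing or not one of the four listed values. The function name and the second attribute (`hw_bios_version`) are hypothetical and only serve the example; this is not how the OCSF tooling itself validates groups.

```python
# Group values taken from the issue text above.
VALID_GROUPS = {"classification", "context", "occurrence", "primary"}


def attributes_with_invalid_group(attributes):
    """Return {attribute name: offending group} for missing or unknown groups."""
    problems = {}
    for name, definition in attributes.items():
        group = definition.get("group")
        if group not in VALID_GROUPS:
            problems[name] = group
    return problems


attributes = {
    "hw_bios_manufacturer": {
        "description": "The BIOS manufacturer.",
        "group": "primary",
        "requirement": "optional",
    },
    # Hypothetical attribute with no group, added only to show a failure.
    "hw_bios_version": {
        "description": "The BIOS version.",
        "requirement": "optional",
    },
}

print(attributes_with_invalid_group(attributes))
```

An attribute with no group at all is reported with the value `None`, which keeps "missing" and "misspelled" cases distinguishable.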

Add a doc explaining available Data Types in OCSF

Currently, the only way to view all the supported data types in OCSF is via a local instance of the ocsf_server. Instead, we should provide easy access to them by creating a .md file in the repo.

A simple .md file should suffice.

Once created, link this file in the contribution guideline file - line# 66

Observable Datatype's relationship to Observable Objects

Originated from ocsf-schema PR ocsf/ocsf-schema#807

I believe there is an important relationship between the observable datatypes and how the observable objects are identified.

For instance, I believe the OCSF translator looks at the datatype, and when the datatype of a given object matches an observable type, it identifies that object as an observable.

Therefore, removal of an observable datatype from an object could be a breaking change.

We should find some way to work this into our documentation (and our process).
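The concern can be sketched as follows, assuming (as the issue suggests, without confirming) that observables are identified purely by an attribute's data type. The set of observable types, the attribute names, and the `find_observables` helper are all hypothetical and do not describe the actual OCSF translator.

```python
# Hypothetical set of data types treated as observable; an assumption for
# illustration, not a list taken from the schema.
OBSERVABLE_TYPES = {"ip_t", "hostname_t"}


def find_observables(attribute_types, event):
    """Return names of event attributes whose declared type is observable."""
    return sorted(
        name
        for name, type_name in attribute_types.items()
        if type_name in OBSERVABLE_TYPES and name in event
    )


# The same attribute before and after its observable data type is removed.
types_before = {"src_ip": "ip_t", "message": "string_t"}
types_after = {"src_ip": "string_t", "message": "string_t"}

event = {"src_ip": "192.0.2.10", "message": "login"}

print(find_observables(types_before, event))
print(find_observables(types_after, event))
```

Under this assumption the event data is unchanged, yet the attribute silently stops being reported as an observable once its type changes, which is why the change could be breaking for consumers.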

Introduction: schema vs. data

"Understanding OCSF" discusses personas at a high level, but in order to be "agnostic to storage format, data collection and ETL processes", an additional level of communication detail is required. Being agnostic requires defining what information is collected, stored and transformed without regard to the data formats used to represent that information.

Current:

The author persona is who creates or extends the schema.
The producer persona is who generates events natively into the schema.
The mapper persona is who translates or creates events from another source to the schema.
The analyst persona is the end user who searches the data, writes rules or analytics against the schema, or creates reports from the schema.

The author works directly with the schema as described. But producers, mappers and analysts work with data that corresponds to (can be validated by) the schema.

Proposed:

The producer persona is who generates event data that is defined by (is valid according to) the schema.
The mapper persona is who translates events from another source into data defined by the schema.
The analyst persona is the end user who searches, writes rules or analytics for, or creates reports from data defined by the schema.

A communication diagram would show personas exchanging documents or messages, or in NIEM terminology, "Information Exchange Packages (IEPs) are the actual messages that carry data and are exchanged between stakeholders." The schema itself is not a persona or communication endpoint; the personas use it when validating data or writing analytics.

FAQs

I recommend adding a FAQs directory to this (ocsf-docs) repository. It could refer to other existing docs (e.g. the white papers, the contributing page, the governance page). One useful section would be a series of questions on how OCSF relates to other activities (OCA, MITRE ATT&CK, STIX, CACAO, TAC, OpenC2, Kestrel, PACE, NIEM, ...).

timestamp_t value discrepancy

https://github.com/ocsf/ocsf-docs/blob/main/Understanding%20OCSF.md documents the timestamp_t format as "milliseconds since the unix epoch", however the example payload provided by https://schema.ocsf.io/sample/1.2.0/classes/base_event produces a value for the time field in the quadrillions (e.g. "time": 1718905365392963,) which only makes sense if interpreted as microseconds, not milliseconds.

Is the error in the documentation, or the sample event implementation?
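The magnitude argument can be checked with a little arithmetic. The sketch below interprets the sample value under both units; the helper name `approx_year` is hypothetical, and the code only shows which unit gives a plausible date, not which side (documentation or sample generator) is in error.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
SECONDS_PER_YEAR = 365.2425 * 86400  # mean Gregorian year

# The "time" value from the sample event quoted above.
sample_time = 1718905365392963


def approx_year(value, units_per_second):
    """Approximate calendar year if `value` counts units since the Unix epoch."""
    return 1970 + value / units_per_second / SECONDS_PER_YEAR


# Read as milliseconds, the value lands tens of thousands of years ahead.
year_if_ms = approx_year(sample_time, 1_000)

# Read as microseconds, it lands in 2024.
year_if_us = approx_year(sample_time, 1_000_000)
moment_if_us = EPOCH + timedelta(microseconds=sample_time)

print(round(year_if_ms), round(year_if_us, 2), moment_if_us.isoformat())
```

The millisecond reading is computed arithmetically rather than with `datetime`, because a year beyond 9999 is outside the range `datetime` can represent.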

Are profile #includes optional?

The documentation says:

Profiles overlay additional related attributes into event classes and objects allowing for cross-category event class augmentation and filtering. Event classes register for profiles which can be optionally applied, or mixed into event classes and objects, by a producer or mapper.

The system event is:

  "attributes": {
    "$include": [
      "profiles/host.json",
      "profiles/user.json",
      "profiles/malware.json"
    ],
    "device": {
      "group": "primary",
      "requirement": "required",
      "profile": null
    },
    "actor_process": {
      "requirement": "required",
      "profile": null
    }
  }

The host profile is:

  "attributes": {
    "device": {
      "requirement": "recommended"
    },
    "actor_process": {
      "requirement": "optional"
    }
  }

The $include directive seems to say that the host profile is always included in the system event, downgrading the device and actor_process attributes from required to not required. There are only two events that have properties modified by a profile: inventory and system, in both cases by the host profile. That raises the question of whether that is an error in the two events, or in the host profile. It seems strange that a system event would always require a device attribute except when operating under the host profile - wouldn't it make more sense to just make the device attribute always non-required regardless of profile?

Is there ever a circumstance in which the system event would not include the host (or user or malware) profile? If so, what controls whether $include directives are executed? The context is schema generation: it seems that a system event could include any attribute listed in the system event or in any included profile. The expected behavior of producers and consumers seems ill-defined if the set of attributes permitted in an event is variable.
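To make the ambiguity concrete, the sketch below merges the two attribute sets shown above in both possible orders. The `merge` helper and the idea that `$include` is a simple dictionary overlay are assumptions made for illustration; the issue does not establish how the schema compiler actually resolves the directive.

```python
# Attribute definitions copied from the system event and host profile above
# (JSON null is written as None).
event_attrs = {
    "device": {"group": "primary", "requirement": "required", "profile": None},
    "actor_process": {"requirement": "required", "profile": None},
}
host_profile_attrs = {
    "device": {"requirement": "recommended"},
    "actor_process": {"requirement": "optional"},
}


def merge(base, overlay):
    """Copy `base`, then let `overlay` override properties attribute by attribute."""
    merged = {name: dict(props) for name, props in base.items()}
    for name, props in overlay.items():
        merged.setdefault(name, {}).update(props)
    return merged


# Reading 1: the included profile overrides the event class.
profile_wins = merge(event_attrs, host_profile_attrs)

# Reading 2: the event class overrides the included profile.
class_wins = merge(host_profile_attrs, event_attrs)

print(profile_wins["device"]["requirement"], class_wins["device"]["requirement"])
```

The two readings disagree on whether `device` is required, which is the behavior the question asks the documentation to pin down.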

FAQ - OCSF relation to STIX

I am currently trying to understand how OCSF compares to STIX. I noticed in the present FAQ (https://github.com/ocsf/ocsf-docs/tree/main/FAQs) that you planned to add an explanation on how they are complementary.
As I cannot seem to find an answer to my question online, would it be possible to obtain one here?

Thanks.

FAQ Suggestion: Using Extensions vs Profiles

A topic of discussion that often comes up from OCSF adopters is "When should I use/create an Extension versus when should I use/create a Profile?" The Understanding OCSF document does a pretty good job of explaining the two, but I think adopters still run into problems where they are not sure of the correct approach to take under various scenarios. Perhaps we can add a few scenarios that outline when it is appropriate to choose a Profile and when it is appropriate to choose an Extension?

Write up for Supported Logs

Not sure how or if this should be merged with our formal docs. However, the other day we discussed the question of how to determine whether a log is supported.

Here is my attempt to describe a process to determine if a log is supported. I would call this 'ocsf data onboarding'. I think it could use additional details on navigating the schema, different attribute types, using profiles, etc.

+++++++++
How to determine if a Log or Event is supported by OCSF?

Target data is supported in OCSF when the sample matches an appropriate Category/Event Class and contains all required attributes. If the sample does not contain the required fields, and the required values cannot be constructed at parse time, the log is unsupported.

1. Obtain a JSON sample and any available documentation.
2. Review Category descriptions and select a match to the sample.
3. Review Event Class descriptions and select a match to the sample.
4. After selecting an appropriate Category and Event Class, identify "required" attributes.
5. Drill down into nested Objects within the Event Class and identify any additional "required" attributes.
6. If using the OCSF Server, leverage the Schema and Sample buttons (top right) to better understand the standard.
7. If the sample does not contain required attributes, determine if they can be generated or derived. If a required attribute is not contained in the raw data and can't be generated by "source", the log cannot be supported for the given Category/Event Class. Restart the analysis and find a better matching Category/Event Class.
8. Create a matrix of attribute types:
   a. required OCSF attributes.
   b. attributes within raw data that can be mapped to OCSF.
   c. attributes calculated/derived from raw data.
   d. attributes that cannot be mapped by the standard. See unmapped.
   e. additional attributes included with data that provide additional context or metadata. See enrichments.
9. Determine any conditional logic that may be required at the parsing level based on attribute values in the data sample, for example logic that selects one Object or another based on a sample data value.
10. Generate a converted sample, either manually or by creating a parsing script based on these steps.
11. Determine a validation mechanism. If using the OCSF Server, review the 'validate' API endpoint.
12. Optional: Automate the steps to generate sample raw data, generate OCSF output, and validate.
13. Optional: Based on these steps, design an implementation parsing process. Include the data format, raw and parsed events, and a transfer mechanism based on the target technology.
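The required-attribute check in the steps above (including the drill-down into nested Objects) can be sketched roughly as below. The schema dictionary layout, the requirement levels, and the `missing_required` helper are hypothetical simplifications for illustration; they are not the OCSF Server's data model or validation logic.

```python
def missing_required(sample, schema, prefix=""):
    """Return dotted paths of required attributes that the sample lacks."""
    missing = []
    for name, spec in schema.items():
        path = f"{prefix}{name}"
        value = sample.get(name)
        if value is None:
            # Absent attribute: only a problem when it is required.
            if spec.get("requirement") == "required":
                missing.append(path)
            continue
        # Present object: drill down into its own attribute definitions.
        nested = spec.get("attributes")
        if nested and isinstance(value, dict):
            missing.extend(missing_required(value, nested, prefix=f"{path}."))
    return missing


# Hypothetical, heavily trimmed event class definition.
schema = {
    "time": {"requirement": "required"},
    "device": {
        "requirement": "required",
        "attributes": {
            "hostname": {"requirement": "required"},
            "ip": {"requirement": "optional"},
        },
    },
    "message": {"requirement": "optional"},
}

# Hypothetical sample already mapped to OCSF attribute names.
sample = {"time": 1718905365392, "device": {"ip": "192.0.2.10"}}

print(missing_required(sample, schema))
```

An empty result means every required attribute was found; any path in the result is a candidate for derivation at parse time or a sign that another Category/Event Class fits better.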

FAQ

How exactly is "supported logs" determined?
A: Sample must contain required fields.
What can I do if the log is unsupported?
A: Work with source data provider and/or OCSF community to add support.
What are some example parsed/translated samples for known data?
A: Sysmon or other common event translated to OCSF <>.
What data format should I use for OCSF?
A: That is an implementation detail that depends on the target technology. Different vendors might have different formats or transfer mechanisms. Work with that vendor to determine which to use.

https://schema.ocsf.io/ is down

https://schema.ocsf.io/ is down; I am not able to view the OCSF schema.

OCSF Taxonomy-3.pdf is marked Confidential

Either the document should be removed from the repository or a new version should be posted that is non-confidential.

Ideally this should be posted as a Markdown file so it can be collaborated on.

Add a doc explaining the grammar & conventions in OCSF

Currently, the only way to view the guidelines for attribute grammar in OCSF is via a local instance of the ocsf_server. Instead, we should provide easy access to them by creating a .md file in the repo.

A simple .md file should suffice.

Once created, link this file in the contribution guideline file - line# 39

Add a FAQ - What constitutes a breaking change in OCSF schema?

The topic that triggered this: changing the caption of an enum attribute is also a breaking change, since captions are used as the values of the sibling string attributes.

We should create a FAQ to formalize what constitutes a breaking change. We can use this issue to discuss and arrive at a consensus on what a breaking change for OCSF is.
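A small sketch of why a caption change can break consumers, assuming an enum attribute `activity_id` whose caption is copied into a sibling string attribute `activity_name`. The caption values, the two "versions", and both helper functions are hypothetical and chosen only to illustrate the point.

```python
# Hypothetical enum captions before and after a caption is renamed.
CAPTIONS_V1 = {1: "Create", 2: "Delete"}
CAPTIONS_V2 = {1: "Created", 2: "Delete"}


def make_event(activity_id, captions):
    """Build an event whose sibling string attribute carries the enum caption."""
    return {"activity_id": activity_id, "activity_name": captions[activity_id]}


def rule_matches_by_name(event):
    """A consumer rule written against the original caption string."""
    return event["activity_name"] == "Create"


def rule_matches_by_id(event):
    """The same rule written against the enum integer value."""
    return event["activity_id"] == 1


old_event = make_event(1, CAPTIONS_V1)
new_event = make_event(1, CAPTIONS_V2)

print(rule_matches_by_name(old_event), rule_matches_by_name(new_event))
print(rule_matches_by_id(old_event), rule_matches_by_id(new_event))
```

The id-based rule keeps matching after the rename, while the caption-based rule stops matching even though the enum's integer values did not change.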

Accept PRs for White Paper

Thank you for writing such an excellent white paper that introduces the schema so thoroughly. It is incredibly well written and complete. There are a few places where I think that I can contribute (editorially, of course -- I would never presume to change the content) and would love to submit PRs.

I cannot seem to find the right place to do that. In fact, I am sure that using the Issues in this repo is not even the best place for this message -- I apologize.

I look forward to helping, if I can!
