
soma-b2b-saas's Introduction

SOMA: B2B SaaS

What is SOMA

SOMA is an open-source, community-driven approach towards Standard Operating Metrics & Analytics.

This project exists to make it easy for companies to define, create, and work with operating metrics.

SOMA does this by providing:

  • Specifications for naming, defining, and structuring metrics;
  • Specifications for the data models and telemetry that support those metrics;
  • Scripts and semantic layer definitions to support the caching and presentation of those metrics; and
  • Samples of standard analytics to be performed with those metrics.

Why

We believe:

  • Metrics are what matters. For most companies, most of the value of data will come from answering the 4 Fundamental Questions of Analytics:

    1. What happened or is happening in my business?
    2. Why did it happen?
    3. What’s going to happen?
    4. What should or could we do next?

    Answering those questions effectively and quickly requires a company to have enough of the right metrics. The right metrics are easy to understand, easy to benchmark, and effectively proxy the health of underlying business processes. Having enough of these means having coverage over all the critical processes within the business.

    Having the right metrics helps a company see reality clearly; having enough helps a company see reality completely.

  • For the most part, companies shouldn’t be innovating on metrics. Companies that share a business model (e.g. B2B SaaS) are largely the same in how they operate. And yet, the metrics, data models, visualizations, and analyses that support these companies are largely bespoke.

    Sometimes, there are good reasons for this wheel-reinventing. Most of the time, though, we see this work as arbitrary uniqueness and a misallocation of scarce data resources.

    Instead, we believe that businesses can address the lion's share of their analytics needs by coupling their metrics and analytical use cases to common, open, industry standards.

    The central hope of the SOMA project is that standardizing more of the undifferentiated heavy lifting that companies take on with respect to data can help those companies instead focus on the differentiated work that builds real competitive advantage.

Domains

SOMA standards are organized under Domains. This repo is concerned with the B2B SaaS domain. The other Domains currently under development are B2C SaaS, ECommerce, Marketplaces, and Logistics.

Approach

Metrics are the heart of SOMA. The list of metrics, their definitions, and metadata are our starting point.

In developing SOMA, we spend most of our energy getting the metrics right. From there, we work backwards to define the upstream telemetry and data models, and forwards to define the downstream reporting and analyses – always in relation to the metrics.

Likewise, in using SOMA, users are advised to start by selecting an anchor set of metrics and working backwards and forwards to support the development and use of that metric set.

We think many companies get this wrong, and their resulting data ecosystems are volatile, difficult to manage, or slow to develop. Metrics are rightfully the core primitive of the entire data ecosystem. Metrics must come first – and all other primitives, pipelines, and artifacts derive from those metrics.

Footprint

The full footprint of SOMA is distributed across four Scopes.

[Image: the four SOMA Scopes]

Definitions

The Definitions Scope expresses the logical definitions, semantics, and metadata for Metrics and key business Terms and concepts. Definitions are the heart of SOMA.

Models

The Models Scope specifies the foundational data models and telemetry that support the generation of the SOMA Metrics in the Definitions Scope.

Unlike other approaches to automating the generation of metrics, SOMA does not work directly with “raw” data from, or as represented in, source systems. While we believe that there is much arbitrary uniqueness in how companies define metrics, we also recognize that source systems are set up and used in highly unique ways and that this heterogeneity requires that raw data first be “mapped” to standard abstractions.

SOMA specifies two types of abstractions: Activities and Entities.

Activities

Activities are business events represented as data.

SOMA identifies the set of business events that are relevant for the generation of Metrics and specifies the semantics, structure, metadata, and trigger conditions for Activities that should be “raised” to represent those events.

[Sample Activity Diagram]

These activities are “raised” into, and managed in, Activity Streams – which are simply tables in a data warehouse. Activity Streams are append-only, immutable ledgers.

Activities can be “raised” into Activity Streams in two ways:

  1. Directly from source systems as the relevant events transpire; or
  2. Indirectly, through data pipelines (one pipeline per Activity) that increment synthetic events into Activity Streams using source system data already available within the data warehouse.
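
As a rough illustration of the second route, the sketch below models an Activity Stream as an append-only ledger and a pipeline that increments only events not yet raised. The names here (`Activity`, `ActivityStream`, `increment_from_source`, `subscription_started`, `account_id`, `started_at`) are hypothetical and are not taken from the SOMA specifications, which target warehouse tables rather than in-memory objects.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # frozen: a raised Activity is never mutated
class Activity:
    activity: str          # hypothetical activity name
    entity_id: str         # hypothetical key of the account involved
    occurred_at: datetime  # when the business event happened


class ActivityStream:
    """Append-only ledger: rows can be added and read, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def raise_activity(self, activity):
        self._rows.append(activity)

    def rows(self):
        return tuple(self._rows)  # read-only snapshot


def increment_from_source(stream, source_rows, activity_name):
    """Pipeline sketch (one per Activity): raise only events not yet in the stream."""
    seen = {(a.activity, a.entity_id, a.occurred_at) for a in stream.rows()}
    added = 0
    for row in source_rows:
        key = (activity_name, row["account_id"], row["started_at"])
        if key not in seen:
            stream.raise_activity(Activity(*key))
            seen.add(key)
            added += 1
    return added


# Hypothetical source-system rows already landed in the warehouse.
source = [
    {"account_id": "acct_1", "started_at": datetime(2023, 1, 2)},
    {"account_id": "acct_2", "started_at": datetime(2023, 1, 3)},
]

stream = ActivityStream()
first_run = increment_from_source(stream, source, "subscription_started")
second_run = increment_from_source(stream, source, "subscription_started")
```

Running the pipeline a second time over the same source rows adds nothing, which is the property that lets an incremental job be re-run safely against an immutable ledger.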


This pattern has considerable overlap with Narrator’s Activity Schema, though it takes a more prescriptive and portable (that is, usable broadly outside of Narrator) approach to specifying the content of Activity Streams.

Activities are SOMA’s recommended modeling abstraction. Relative to alternatives, we believe that using Activities as a primary abstraction:

  1. Provides a significantly more intuitive interface for reasoning about and understanding a business;
  2. Affords greater auditability and composability;
  3. Produces data pipelines that are easier to understand and manage;
  4. Lowers the friction for interleaving business events with important telemetry from marketing analytics and product analytics.

Entities

Entities are specifications for wide fact and dimension tables. Entities are structurally similar to dimension tables in classical dimensional modeling, though with greater denormalization. These Entities are meant to augment, not replace, existing constellations of models in company data warehouses.

SOMA provides specifications for the structure and semantics of these Entities that users can use to author pipelines. For users that are using Activities, SOMA provides SQL scripts that can generate Entities from Activity Streams.
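
To make the idea of deriving an Entity from an Activity Stream concrete, here is a minimal sketch in Python rather than SQL. It is not one of SOMA's scripts: the row layout, the activity names, and the `build_account_entity` helper are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical Activity Stream rows: (activity, account_id, occurred_on, amount)
activity_stream = [
    ("subscription_started", "acct_1", date(2023, 1, 2), 100.0),
    ("subscription_expanded", "acct_1", date(2023, 2, 1), 50.0),
    ("subscription_started", "acct_2", date(2023, 1, 15), 80.0),
    ("subscription_churned", "acct_2", date(2023, 3, 1), -80.0),
]


def build_account_entity(stream):
    """Fold the ledger into one wide, denormalized row per account."""
    entities = {}
    # Replay events in chronological order so "last_*" fields end up correct.
    for activity, account_id, occurred_on, amount in sorted(stream, key=lambda r: r[2]):
        row = entities.setdefault(account_id, {
            "account_id": account_id,
            "first_activity_on": occurred_on,
            "last_activity": None,
            "last_activity_on": None,
            "recurring_revenue": 0.0,
            "activity_count": 0,
        })
        row["last_activity"] = activity
        row["last_activity_on"] = occurred_on
        row["recurring_revenue"] += amount
        row["activity_count"] += 1
    return entities


accounts = build_account_entity(activity_stream)
```

Because the stream is an immutable ledger, the Entity can be rebuilt from scratch at any time by replaying the same rows.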

Expression

The Expression Scope gives users a choice of two types of interfaces for expressing Metrics and making them available for consumption.

Semantic Layers

Semantic Layers such as LookML, Cube, MetricFlow, and AtScale provide a virtual abstraction layer for companies to unify metric definitions and semantics. Concretely, SOMA provides “view” files/scripts that express Metrics and Terms and that can be copied over into a company’s existing BI-native (e.g. LookML) or transformation-native (e.g. Cube) Semantic Layer.

Nets

Nets are tables with the cached measurements for metrics pre-computed across key dimensions.

They are essentially “flattened” OLAP cubes (“net” is a SOMA pun that is bad enough that it requires explanation: a cube unfolded flat into 2D space is called a “net”). Nets support faster retrieval and more portability relative to Semantic Layers.
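
One way to picture a “flattened” cube is a single table keyed by every combination of dimension values, including rolled-up totals. The sketch below assumes two hypothetical dimensions (`plan`, `region`) and an `ALL` marker for rollups; SOMA's actual Net layout may differ.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (plan, region, recurring_revenue)
rows = [
    ("basic", "EMEA", 100.0),
    ("basic", "AMER", 200.0),
    ("pro", "EMEA", 300.0),
]

ALL = "ALL"  # hypothetical marker meaning "rolled up over this dimension"


def flatten_cube(fact_rows):
    """Pre-compute the metric for every (plan, region) cell, including rollups."""
    cells = defaultdict(float)
    for plan, region, value in fact_rows:
        # Each fact contributes to its own cell and to every rollup above it.
        for p, r in product((plan, ALL), (region, ALL)):
            cells[(p, r)] += value
    return dict(cells)


net = flatten_cube(rows)
```

A consumer can then read `net[("basic", ALL)]` or `net[(ALL, "EMEA")]` directly, without aggregating at query time.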

image

For users that want to use Nets, SOMA provides SQL scripts that generate SOMA Metrics from the Entities and Activities in the Models Scope. These scripts can then be brought into an orchestration or transformation tool.

Note: those scripts must be configured to increment into Nets which, like Activity Streams, are immutable ledgers.

The ledger design allows Nets to support the bi-temporal nature of metrics. That is, on Jan 2nd, we may believe the churn rate for Jan 2nd is 20%, but on Jan 4th, we may come to believe (e.g. because of late-arriving facts, data quality issues, or changes in definitions) that the value is 12%. This would be reflected with new Nets entries that have new Measurement Dates for prior Metric Dates.
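
The churn-rate example above can be sketched as follows. The tuple layout, the function names, and the year 2023 are assumptions made for illustration; only the two dates and the 20% and 12% values come from the text.

```python
from datetime import date

# Hypothetical Net rows: (metric, metric_date, measurement_date, value)
net = []


def record_measurement(metric, metric_date, measurement_date, value):
    """Append a new measurement; earlier rows are never updated."""
    net.append((metric, metric_date, measurement_date, value))


def value_as_of(metric, metric_date, as_of):
    """Return the latest measurement made on or before `as_of`, or None."""
    known = [
        row for row in net
        if row[0] == metric and row[1] == metric_date and row[2] <= as_of
    ]
    if not known:
        return None
    return max(known, key=lambda row: row[2])[3]


# Jan 2nd: first belief about the Jan 2nd churn rate.
record_measurement("churn_rate", date(2023, 1, 2), date(2023, 1, 2), 0.20)
# Jan 4th: revised belief about the same Metric Date, added as a new row.
record_measurement("churn_rate", date(2023, 1, 2), date(2023, 1, 4), 0.12)

believed_on_jan_3 = value_as_of("churn_rate", date(2023, 1, 2), date(2023, 1, 3))
believed_on_jan_5 = value_as_of("churn_rate", date(2023, 1, 2), date(2023, 1, 5))
```

Both rows remain in the ledger, so a report can show either the current value or the value as it was believed on any earlier date.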

Consumption

The Consumption Scope pertains to the “last mile” where Metrics are consumed, shared, and analyzed. Exposures are the “A” in SOMA, and are represented by templates for standard dashboards, visualizations, and analyses.

Governance

Committees

SOMA is governed by committee, with each Domain having its own corresponding committee of SMEs, operators, and data and finance practitioners.

SOMA is intended to evolve in response to feedback from users and changes in operating norms for companies, which may lead to Metrics being added, modified, or archived.

Domain Committees meet monthly to review proposed changes and formally version the standard and its specifications if necessary.

Contribute

Users that would like to contribute back to SOMA may do so in the Domain GitHub repos, like this one.

Contributors are requested to raise Pull Requests for proposed edits to SQL scripts and Issues for suggested additions, edits, and deletions to other specifications.

Domain Statuses

[Image: Domain statuses]

soma-b2b-saas's People

Contributors

ergest, abhisivasailam


soma-b2b-saas's Issues

MRR Calculation

Hi,

I am looking for the table that represents the total MRR.
I assumed the metric will be in this table: ga_cube_total_revenue, but I see that it is taking only the cumulative sum of net_rr and not including the retained_rr.

Am I missing anything?

Thanks

A bit lost..

Hi @ergest, I love the idea for this framework! As analysts, we often have to repeat the same work, and a standard would help a lot!

That being said, I appreciate the hard work of creating all the models, but I am missing a guide/example to better understand the models. Taking ga_cube_revenue_growth_rate.sql for instance, I can see the SQL, and deduce that the idea is to track the revenue growth rate but I don't immediately understand the fields that are being selected, or when to use this.

Do you plan on describing the ideas behind the models/metrics, or creating a guide per business process, or what is your view on spreading the use of the framework?
