Awesome Infrastructure

A collection of awesome software infrastructure projects and companies.


AI/ML

Machine learning and artificial intelligence infrastructure.

Change Data Capture

Change data capture.

  • Arcion - Arcion is a change data capture platform that enables you to stream data from your database to your data warehouse in real-time.
  • Debezium - Debezium is an open source distributed platform for change data capture.

Graph Databases

Graph databases.

  • ArangoDB - Graph database that also works as a multi-model database supporting documents.
  • Dgraph - Dgraph is an open source, low latency, high throughput, native and distributed graph database.
  • Kuzu - Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
  • Neo4j - Neo4j is a native graph database, built from the ground up to leverage not only data but also data relationships.
  • TigerGraph - TigerGraph is a native parallel graph database platform for enterprise applications.
  • Neptune - Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets.

Key-Value Stores

Key-value stores.

  • Venice - Venice is a derived data platform providing high throughput ingestion from batch, streams, and lambda/kappa architectures, and low latency online reads, for ML feature storage, etc.

OLTP Databases

Online transaction processing databases.

  • Neon - Serverless Postgres. Neon separates storage and compute to offer autoscaling, branching, and bottomless storage.
  • TigerBeetle - TigerBeetle is a financial accounting database designed for mission critical safety and performance to power the future of financial services.

OLAP Databases

Online analytical processing databases.

  • ClickHouse - ClickHouse is a column-oriented database that lets users run analytical SQL queries in real time.
  • Materialize - Materialize is a data warehouse purpose-built for operational workloads where an analytical data warehouse would be too slow, and a stream processor would be too complicated.
  • Pinot - Apache Pinot is a distributed OLAP datastore, designed to answer OLAP queries with low latency.

Search Engines

Search engines.

Vector Stores

Vector stores.

  • LanceDB - LanceDB is an open-source database that uses the Lance file format for vector search.
  • Turbopuffer - A serverless database for low latency vector search.

Data Lakes

Data lakes.

  • Bauplan - A serverless lakehouse for complex data workloads.

Durable Execution

Durable execution systems.

File Formats

File formats.

  • GraphAr - An open source, standard data file format for graph data storage and retrieval.
  • Lance - Modern columnar data format for ML and LLMs.
  • ORC - Apache ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads.
  • Parquet - Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval.

Functions as a Service

Functions as a service.

  • Lambda - AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume.
  • Google Cloud Functions - Google Cloud Functions is a serverless execution environment for building and connecting cloud services. With Cloud Functions you write simple, single-purpose functions that are attached to events emitted from your cloud infrastructure and services.
  • Azure Functions - Azure Functions is a serverless compute service that lets you run event-triggered code without having to explicitly provision or manage infrastructure.
  • OpenFaaS - OpenFaaS makes it easy for developers to deploy event-driven functions and microservices to Kubernetes without repetitive, boiler-plate coding.
  • Knative - Kubernetes-based platform to build, deploy, and manage modern serverless workloads.
  • Fission - Fission is a framework for serverless functions on Kubernetes. It allows you to easily create HTTP services on Kubernetes from functions.
  • OpenLambda - OpenLambda is an Apache-licensed serverless computing project, written (mostly) in Go and based on Linux containers.
  • Wasmer Edge - Wasmer Edge allows running cloud apps easily at the Edge, scaling them like they are serverless.

Workflow

Workflow.

  • Airflow - Airflow is a platform to programmatically author, schedule and monitor workflows.
  • Flyte - Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
  • Kestra - Scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
  • Prefect - Prefect is a workflow management system, designed for modern infrastructure and powered by the open-source Prefect Core workflow engine.
  • Dagster - Dagster is a data orchestrator for machine learning, analytics, and ETL.

Query Engines

  • Calcite - Apache Calcite is a dynamic data management framework. It contains many of the pieces that comprise a typical database management system but omits the storage primitives.
  • DataFusion - DataFusion is a very fast, extensible query engine for building high-quality data-centric systems.
  • Substrait - A cross platform way to express data transformation, relational algebra, standardized record expression and plans.
  • Velox - A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.

Service Mesh

Service mesh.

  • Istio - Istio is an open platform for providing a uniform way to integrate microservices, manage traffic flow across microservices, enforce policies, and aggregate telemetry data.
  • Linkerd - Linkerd is an ultralight, security-first service mesh for Kubernetes. Linkerd adds critical security, observability, and reliability features to your Kubernetes stack with no code change required.

Message Brokers

Message brokers.

  • WarpStream - WarpStream is a Kafka compatible data streaming platform built directly on top of S3.

Stream Processing

Stream processing.

  • Apache Flink - Stateful computations over bounded and unbounded data streams.
  • Decodable - A managed platform for stream processing and real-time ETL, powered by Apache Flink and Debezium.
  • Kafka Streams - A stateful stream processing library for Kafka.
  • Responsive - Responsive is a platform for developers building stateful reactive applications on the modern cloud, focused on Kafka Streams.
  • RisingWave - RisingWave is a distributed SQL database for stream processing. It consumes streaming data, performs incremental computations when new data comes in, and updates results dynamically. As a database system, RisingWave maintains results in its own storage so that users can access data efficiently.

Miscellaneous

  • Bacalhau - Compute over Data framework for public, transparent, and optionally verifiable computation.

Contributing

Contributions welcome! Read the contribution guidelines first.

License

CC0

To the extent possible under law, criccomini has waived all copyright and related or neighboring rights to this work.
