
almanac's Introduction

almanac


A distributed log storage and serving system.

Design goals:

  • Easy to deploy on cloud infrastructure such as GCP or AWS.
  • Minimal operational burden, i.e., deployments should be easy to upgrade, restart, modify, etc.
  • System cost scales with usage rather than uptime, making the system viable for small and large deployments.
  • Does not require operating a resilient and fault-tolerant storage system.

Design

The design doc for the system can be found here. As parts of the design move from being under discussion to being consolidated, they will gradually be migrated into markdown files in this repo.

Building and running

If you have a working Go environment, you will need to run the following as one-time setup:

  • ./tools/fetch-deps.sh
  • dep ensure

Running the demo

Run the demo binary by executing:

go run ./cmd/almanac/almanac.go

This will start a single-process cluster and will print the locations of a few relevant web pages which can be used to play around manually. By default, the demo runs against an in-memory storage implementation. In order to use an actual GCS bucket, execute:

GOOGLE_APPLICATION_CREDENTIALS=<path> go run ./cmd/almanac/almanac.go --storage=gcs --storage.gcs.bucket=<bucket>

Running tests

To run all the tests, execute:

go test ./...

almanac's People

Contributors: dinowernli, joshpmcghee

Forkers: joshpmcghee

almanac's Issues

Add a discovery implementation which interfaces with kubernetes

In a kubernetes cluster, the expectation is that every almanac service has its own kubernetes service. As a result, service discovery can be done over DNS and it should be possible to teach the ServiceDiscovery object to find services using DNS queries.
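One possible shape for such a DNS-backed resolver is sketched below. The function names and the way targets are assembled are assumptions for illustration; the actual ServiceDiscovery interface in this repo may look different.

```go
package main

import (
	"fmt"
	"net"
)

// buildTargets turns resolved addresses into dialable "host:port" targets.
func buildTargets(addrs []string, port int) []string {
	targets := make([]string, 0, len(addrs))
	for _, a := range addrs {
		targets = append(targets, fmt.Sprintf("%s:%d", a, port))
	}
	return targets
}

// discover resolves a kubernetes service name (e.g. "appender.default.svc")
// to a list of targets via a plain DNS A-record lookup. This is a sketch;
// SRV lookups would also carry the port and may be the better fit.
func discover(service string, port int) ([]string, error) {
	addrs, err := net.LookupHost(service)
	if err != nil {
		return nil, err
	}
	return buildTargets(addrs, port), nil
}

func main() {
	fmt.Println(buildTargets([]string{"10.0.0.1", "10.0.0.2"}, 8080))
}
```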

Write a load generator binary

This should be a standalone subcommand of the almanac binary. It should be possible to point the binary at an ingester, and it should fire RPCs at the ingester to generate load. Various parameters should be controllable by flags, e.g., log size, log frequency, etc.

./almanac loadgen --log_size_bytes=1000 --logs_per_second=5 --ingesters=localhost:123,localhost:124
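The core loop of such a subcommand could look roughly like this. The `send` callback stands in for the actual ingester RPC, which is an assumption here, as is the idea of rate-limiting with a ticker.

```go
package main

import (
	"bytes"
	"fmt"
	"time"
)

// makePayload builds a synthetic log line of n bytes.
func makePayload(n int) []byte {
	return bytes.Repeat([]byte("x"), n)
}

// run fires one synthetic log per tick until total logs have been sent.
// send is a placeholder for the real ingester RPC call.
func run(logSizeBytes, logsPerSecond, total int, send func([]byte)) {
	ticker := time.NewTicker(time.Second / time.Duration(logsPerSecond))
	defer ticker.Stop()
	for i := 0; i < total; i++ {
		<-ticker.C
		send(makePayload(logSizeBytes))
	}
}

func main() {
	sent := 0
	run(1000, 100, 5, func(p []byte) { sent++ })
	fmt.Println("sent", sent, "logs")
}
```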

Setup breaks on resolution of project when cloned from Github

[josh:~/repos/go/src/github.com/dinowernli/almanac] master 3s 1 ± ./tools/fetch-deps.sh
+ go get -u github.com/golang/dep/cmd/dep
+ go get -u github.com/jteeuwen/go-bindata/...
⌂72% [josh:~/repos/go/src/github.com/dinowernli/almanac] master 25s ± dep ensure
The following errors occurred while deducing packages:
  * "dinowernli.me/almanac/pkg/index": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/index": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/proto": unable to deduce repository and source type for "dinowernli.me/almanac/proto": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/service/discovery": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/service/discovery": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/util": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/util": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/service/janitor": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/service/janitor": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/cluster": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/cluster": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/storage": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/storage": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/http": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/http": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/service/appender": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/service/appender": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/service/ingester": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/service/ingester": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/service/mixer": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/service/mixer": unable to read metadata: go-import metadata not found
  * "dinowernli.me/almanac/pkg/http/templates": unable to deduce repository and source type for "dinowernli.me/almanac/pkg/http/templates": unable to read metadata: go-import metadata not found

validateParams: could not deduce external imports' project roots

Sure-fire way to fix this seems to be to s/dinowernli.me/github.com\/dinowernli/g. Will open a PR.
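Concretely, the rewrite could be applied with something like the following (a sketch using GNU sed's in-place flag; on macOS, `sed -i ''` is needed instead):

```shell
# Rewrite the custom import prefix to the actual GitHub path in all Go files.
grep -rl 'dinowernli\.me/almanac' --include='*.go' . \
  | xargs sed -i 's|dinowernli\.me/almanac|github.com/dinowernli/almanac|g'
```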

Add a proof-of-concept minikube cluster

The current kubernetes configs (see kube directory) are proof-of-concept in the sense that they run a single almanac binary which runs all the services. This should be replaced/complemented with a kubernetes setup which runs each of the services separately. Rough outline:

  1. Add support in the almanac binary for specifying via flags which services the binary should run. It should still be possible to run all services, but the common case will be to run a single one per binary by invoking something like ./almanac serve --appender_ports=123,124,125 --mixer_ports=456,457
  2. Create a kubernetes service each for ingester, appender, mixer, and janitor.
  3. As a proof of concept, the services should share a kubernetes volume which they all talk to using disk storage. A second step would be to have them all point to an actual s3/gcs bucket.
  4. Teach service discovery to resolve services based on the kubernetes DNS names.
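For step (1), parsing the comma-separated port flags could be as simple as the sketch below. The flag names are taken from the outline above and may not match the final binary.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePorts splits a comma-separated flag value like "123,124,125"
// into a slice of ports. An empty value means the service is not run.
func parsePorts(flagValue string) ([]int, error) {
	if flagValue == "" {
		return nil, nil
	}
	parts := strings.Split(flagValue, ",")
	ports := make([]int, 0, len(parts))
	for _, p := range parts {
		port, err := strconv.Atoi(strings.TrimSpace(p))
		if err != nil {
			return nil, fmt.Errorf("invalid port %q: %v", p, err)
		}
		ports = append(ports, port)
	}
	return ports, nil
}

func main() {
	appenders, _ := parsePorts("123,124,125")
	mixers, _ := parsePorts("456,457")
	fmt.Println(appenders, mixers)
}
```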

Cache chunks on disk and/or in memory

This task is about figuring out how to make intelligent use of the available disk and memory in order to cache loaded chunks. The scope of this is quite large, but one way of approaching things could be:

  1. Implement a very basic caching layer in the form of a CachingBackend which caches and forwards calls to an actual Backend implementation. This is mostly plumbing in preparation for part (2) below.
  2. Add metrics which allow evaluation of cache hit rates and expose the backend request metrics.
  3. Implement caching strategies in CachingBackend which use disk and/or memory. Use the load generation binary (see other task) to validate that the strategies work.
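Part (1) could start out as small as the sketch below. The Backend interface here is a minimal stand-in for the repo's actual storage interface, which likely differs; the hit/miss counters gesture at the metrics from part (2).

```go
package main

import (
	"fmt"
	"sync"
)

// Backend is a minimal stand-in for the storage backend interface.
type Backend interface {
	Read(key string) ([]byte, error)
}

// CachingBackend serves reads from an in-memory map, falling back to the
// wrapped Backend on a miss and caching the result.
type CachingBackend struct {
	mu      sync.Mutex
	cache   map[string][]byte
	backend Backend

	// Hit/miss counters, a starting point for the metrics in part (2).
	Hits, Misses int
}

func NewCachingBackend(b Backend) *CachingBackend {
	return &CachingBackend{cache: map[string][]byte{}, backend: b}
}

func (c *CachingBackend) Read(key string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.cache[key]; ok {
		c.Hits++
		return v, nil
	}
	c.Misses++
	v, err := c.backend.Read(key)
	if err != nil {
		return nil, err
	}
	c.cache[key] = v
	return v, nil
}

// fakeBackend counts calls, standing in for a real gcs/disk backend.
type fakeBackend struct{ calls int }

func (f *fakeBackend) Read(key string) ([]byte, error) {
	f.calls++
	return []byte("chunk-" + key), nil
}

func main() {
	fake := &fakeBackend{}
	c := NewCachingBackend(fake)
	c.Read("a")
	c.Read("a") // second read is served from the cache
	fmt.Println(fake.calls, c.Hits, c.Misses) // prints "1 1 1"
}
```

An eviction policy (LRU, size-bounded, disk-spillover) would then slot into Read without changing the Backend interface, which is the point of doing the plumbing first.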
