googlearchive / pubsubbeat

An Elastic Beat to ingest data from Google Pub/Sub

License: Other

Makefile 3.86% Go 72.47% Python 8.58% Shell 13.65% Dockerfile 1.43%
elasticsearch beats elastic elasticbeats pubsub google-cloud-platform

pubsubbeat's Introduction

Status: Archived

This project is no longer actively maintained by Google.


Pubsubbeat

Pubsubbeat is an Elastic Beat for Google Cloud Pub/Sub. This Beat subscribes to a topic and ingests messages.

The main motivation behind the development of this plugin is to ingest Stackdriver Logs via the Exported Logs feature and send them directly to Elasticsearch ingest nodes.

This is not an officially supported Google product.

Getting Started with Pubsubbeat

Requirements

Build

To build the binary for Pubsubbeat run the command below. This will generate a binary in the same directory with the name pubsubbeat.

make

Run

To run Pubsubbeat with debugging output enabled, run:

./pubsubbeat -c pubsubbeat.yml -e -d "*"

Test

To test Pubsubbeat, run the following command:

make test

Cleanup

To clean the Pubsubbeat source code, run the following command:

make pre-commit

To clean up the build directory and generated artifacts, run:

make clean

Packaging

To build releases for available platforms:

make release

This will fetch dependencies and create binaries for Linux, Windows, and OSX.

pubsubbeat's People

Contributors

andrewteall, dlmather, ikbenale, josephlewis42, rosbo, superq

pubsubbeat's Issues

Some settings not working

We're seeing a very strange issue with the new release that seems to ignore our settings:

We set this in our config yml:

output.elasticsearch:
  index: pubsub-rails-inf-gstg

But the published index is the default:

2020-01-14_10:52:51.50355 2020-01-14T10:52:51.503Z      INFO    instance/beat.go:297    Setup Beat: pubsubbeat; Version: 7.5.1
2020-01-14_10:52:51.50386 2020-01-14T10:52:51.503Z      INFO    [index-management]      idxmgmt/std.go:182      Set output.elasticsearch.index to 'pubsubbeat-7.5.1' as ILM is enabled.

Feature request: Add monitoring endpoint

I would like to be able to monitor this software. It would be useful to have a Prometheus-compatible metrics interface on the http listener.

I'm willing to contribute this code.
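As a rough illustration of what such an endpoint could look like, the sketch below renders counters in the Prometheus text exposition format using only the Go standard library. The `counters` type and the metric name `pubsubbeat_messages_received_total` are assumptions made for this example; pubsubbeat does not expose them.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strings"
	"sync"
)

// counters is a hypothetical in-process registry. The metric names used
// with it are illustrative and are not emitted by pubsubbeat itself.
type counters struct {
	mu     sync.Mutex
	values map[string]uint64
}

func newCounters() *counters {
	return &counters{values: make(map[string]uint64)}
}

// Inc adds one to the named counter.
func (c *counters) Inc(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[name]++
}

// Render returns all counters in the Prometheus text exposition format,
// sorted by name so the output is stable between scrapes.
func (c *counters) Render() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	names := make([]string, 0, len(c.values))
	for name := range c.values {
		names = append(names, name)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "# TYPE %s counter\n%s %d\n", name, name, c.values[name])
	}
	return b.String()
}

// ServeHTTP lets the registry be mounted directly as a /metrics handler.
func (c *counters) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, c.Render())
}

func main() {
	c := newCounters()
	c.Inc("pubsubbeat_messages_received_total")
	// Registered on the default mux; no listener is started in this sketch.
	http.Handle("/metrics", c)
	fmt.Print(c.Render())
}
```

A real contribution would more likely use the official Prometheus Go client library; the hand-rolled renderer here only keeps the example dependency-free.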

Package does not compile

kibana.NewGenerator has changed - elastic/beats@01a9e43

vendor/github.com/elastic/beats/dev-tools/cmd/kibana_index_pattern/kibana_index_pattern.go:40:52: not enough arguments in call to kibana.NewGenerator
have (string, string, string, string, common.Version)
want (string, string, string, string, string, common.Version)
make: *** [update] Error 2

Support for typeless mapping in ES7

Background

In ES7, types got removed.

Issue

Currently, the pubsubbeat adds a mapping type regardless of the ES version. (To be specific: It adds a doc mapping type for ES6 or above and a _default_ for everything below). Users are not able to get the current pubsubbeat working with ES7 because ES7 no longer allows mapping types. A sample error is:

ERROR pipeline/output.go:74 Failed to connect: Connection marked as failed because the onConnect callback failed: Error loading Elasticsearch template: could not load template: couldn't load template: couldn't load json. Error: 400 Bad Request: {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Root mapping definition has unsupported parameters: [doc : {_meta={version=6.2.2}, dynamic_templates=[{fields={path_match=fields.*, mapping={type=keyword}, match_mapping_type=string}}, {docker.container.labels={path_match=docker.container.labels.*, mapping={type=keyword}, match_mapping_type=string}}, {fields={path_match=fields.*, mapping={type=keyword}, match_mapping_type=string}}, {docker.container.labels={path_match=docker.container.labels.*, mapping={type=keyword}, match_mapping_type=string}}, {strings_as_keyword={mapping={ignore_above=1024, type=keyword}, match_mapping_type=string}}], properties={kubernetes={properties={container={properties={image={ignore_above=1024, type=keyword}, name={ignore_above=1024, type=keyword}}}, node={properties={name={ignore_above=1024, type=keyword}}}, pod={properties={name={ignore_above=1024, type=keyword}}}, namespace={ignore_above=1024, type=keyword}, annotations={type=object}, labels={type=object}}}, message_id={norms=false, type=text}, error={properties={code={type=long}, type={ignore_above=1024, type=keyword}, message={norms=false, type=text}}}, message={norms=false, type=text}, tags={ignore_above=1024, type=keyword}, docker={properties={container={properties={image={ignore_above=1024, type=keyword}, name={ignore_above=1024, type=keyword}, id={ignore_above=1024, type=keyword}, labels={type=object}}}}}, @timestamp={type=date}, meta={properties={cloud={properties={machine_type={ignore_above=1024, type=keyword}, availability_zone={ignore_above=1024, type=keyword}, instance_id={ignore_above=1024, type=keyword}, instance_name={ignore_above=1024, type=keyword}, project_id={ignore_above=1024, type=keyword}, 
provider={ignore_above=1024, type=keyword}, region={ignore_above=1024, type=keyword}}}}}, publish_time={type=date}, beat={properties={hostname={ignore_above=1024, type=keyword}, timezone={ignore_above=1024, type=keyword}, name={ignore_above=1024, type=keyword}, version={ignore_above=1024, type=keyword}}}, json={type=object}, attributes={type=object}, fields={type=object}}, date_detection=false}]"}]

Proposal

Make a change so that no mapping type is provided if the ES version is >= 7, and make the corresponding changes that this requires (references, etc.). I plan to send a PR for it but wanted to capture the background and thought process here, following the contribution guideline. (As of writing this, GitHub is down... so once they recover...)
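The proposal can be sketched as a small version switch. This is a minimal illustration, not the code from the Beat or from libbeat: the function name `templateBody` and the way the mapping is passed in are assumptions, and only the type names (`doc`, `_default_`) come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// templateBody wraps the field mapping for an index template.
// Hypothetical helper: for ES 7 and later the mapping is placed directly
// under "mappings"; for ES 6 it is nested under the "doc" type; for older
// versions it is nested under "_default_".
func templateBody(esMajor int, mapping map[string]interface{}) map[string]interface{} {
	switch {
	case esMajor >= 7:
		return map[string]interface{}{"mappings": mapping}
	case esMajor == 6:
		return map[string]interface{}{
			"mappings": map[string]interface{}{"doc": mapping},
		}
	default:
		return map[string]interface{}{
			"mappings": map[string]interface{}{"_default_": mapping},
		}
	}
}

func main() {
	mapping := map[string]interface{}{"date_detection": false}
	for _, major := range []int{5, 6, 7} {
		body, err := json.Marshal(templateBody(major, mapping))
		if err != nil {
			fmt.Println("marshal failed:", err)
			return
		}
		fmt.Printf("ES %d: %s\n", major, body)
	}
}
```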

"no such file or directory" error with Docker

Hi!
I'm getting the following error when trying to run pubsubbeat inside a Docker container:

standard_init_linux.go:211: exec user process caused "no such file or directory"

My Dockerfile is as follows:

FROM golang:alpine

ENV PUBSUB_VERSION 1.2.0

RUN  mkdir /pubsubbeat && \
     wget https://github.com/GoogleCloudPlatform/pubsubbeat/releases/download/$PUBSUB_VERSION/pubsubbeat-linux-amd64.tar.gz && \
     tar -xzf pubsubbeat-linux-amd64.tar.gz --strip 1 -C /pubsubbeat && \
     rm pubsubbeat-linux-amd64.tar.gz

RUN chown -R 0:0 /pubsubbeat/pubsubbeat.yml

WORKDIR /pubsubbeat

ENTRYPOINT ["./pubsubbeat", "-c", "pubsubbeat.yml", "-e", "-d", "*"]

This happens with versions 1.2.0, 1.1.0 and 1.0.0. Version 1.0.1 works fine.

beat fails without admin or editor groups

The beat fails with the error "Exiting: fail to create subscription: rpc error: code = PermissionDenied desc = User not authorized to perform this action." if it doesn't have permission to create a subscription on the account, even if the subscription already exists.

As a user I expect to have the ability to configure the beat with a minimal set of permissions--ideally just subscribe--for security purposes and not have it create resources on my account.
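One way to express the requested behavior is to make every administrative call conditional on an explicit opt-in. The sketch below is an illustration under assumptions: `subscriptionAdmin`, `ensureSubscription`, and `recordingAdmin` are hypothetical names, and the interface stands in for whatever Pub/Sub client calls the Beat actually makes.

```go
package main

import "fmt"

// subscriptionAdmin is a hypothetical abstraction over the administrative
// Pub/Sub calls; it is not the real client API.
type subscriptionAdmin interface {
	Exists(name string) (bool, error)
	Create(name string) error
}

// ensureSubscription makes no administrative calls unless create is true,
// so a caller that may only consume messages never hits a permission error
// at startup. With create set, it creates the subscription only if missing.
func ensureSubscription(admin subscriptionAdmin, name string, create bool) error {
	if !create {
		return nil
	}
	ok, err := admin.Exists(name)
	if err != nil {
		return fmt.Errorf("check subscription %q: %w", name, err)
	}
	if ok {
		return nil
	}
	if err := admin.Create(name); err != nil {
		return fmt.Errorf("create subscription %q: %w", name, err)
	}
	return nil
}

// recordingAdmin is a test double that records which calls were made.
type recordingAdmin struct {
	exists bool
	calls  []string
}

func (r *recordingAdmin) Exists(string) (bool, error) {
	r.calls = append(r.calls, "Exists")
	return r.exists, nil
}

func (r *recordingAdmin) Create(string) error {
	r.calls = append(r.calls, "Create")
	return nil
}

func main() {
	admin := &recordingAdmin{exists: true}
	err := ensureSubscription(admin, "my-subscription", false)
	fmt.Println("error:", err, "admin calls:", len(admin.calls))
}
```

With this shape, a missing subscription surfaces later as a pull error rather than as a failed create at startup, which is the trade-off for needing fewer permissions.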

Exiting: 'xpack.monitoring.elasticsearch.hosts' and 'output.elasticsearch.hosts' are configured

Can you support sending events and monitoring metrics to separate Elasticsearch hosts? We use a separate instance for each task and get the error below.

output.elasticsearch:
  hosts: ["http://10.0.0.10:9200"]
xpack.monitoring.elasticsearch:
  hosts: ["http://10.0.0.2:9200"]

2019-06-06T18:44:29.673Z INFO instance/beat.go:468 Home path: [/ Config path: [/] Data path: [data/palo] Logs path: [/]
2019-06-06T18:44:29.673Z DEBUG [beat] instance/beat.go:495 Beat metadata path: data/palo/meta.json
2019-06-06T18:44:29.673Z INFO instance/beat.go:475 Beat UUID: 75f3406e-5ae2-4b6b-8ce4-426302782b8d
2019-06-06T18:44:29.673Z INFO instance/beat.go:213 Setup Beat: pubsubbeat; Version: 6.2.2
2019-06-06T18:44:29.673Z DEBUG [beat] instance/beat.go:230 Initializing output plugins
2019-06-06T18:44:29.673Z DEBUG [processors] processors/processor.go:49 Processors:
2019-06-06T18:44:29.673Z INFO elasticsearch/client.go:145 Elasticsearch url: http://10.0.0.10:9200
2019-06-06T18:44:29.674Z INFO pipeline/module.go:76 Beat name: ****
2019-06-06T18:44:29.674Z INFO [PubSub: ] beater/pubsubbeat.go:54 config retrieved: &{Project CredentialsFile:** Subscription:{Name:*** RetainAckedMessages:false RetentionDuration:168h0m0s Create:false} Json:{Enabled:false AddErrorKey:false FieldsUnderRoot:false FieldsUseTimestamp:false FieldsTimestampName:@timestamp FieldsTimestampFormat:}}
2019-06-06T18:44:29.674Z ERROR instance/beat.go:667 Exiting: 'xpack.monitoring.elasticsearch.hosts' and 'output.elasticsearch.hosts' are configured
Exiting: 'xpack.monitoring.elasticsearch.hosts' and 'output.elasticsearch.hosts' are configured

Pubsubbeat pull order

After configuring pubsubbeat to pull MySQL slow logs from GCP (Google Cloud Platform), which look as follows:
Screenshot 2020-05-04 at 11 58 32

I noticed that the log lines are pulled out of order rather than in their original sequence (filtered to only "textPayload" for better readability):
Screenshot 2020-05-04 at 12 05 04

I've checked the pubsubbeat configuration settings and the Google Pub/Sub subscription and topic options, but was unable to find a way to define the order in which log objects are pulled. Can you please clarify whether there is an option to avoid mixed-up messages? Thanks!

Output?

I would like a filebeat with pubsub output... Should this be a different beat? Or since this is called 'pubsubbeat' is it preferred that this handles both (input, as it does, as well as output)?

Make command no longer produces ./pubsubbeat binary

Running make on its own used to produce the pubsubbeat binary in the project directory. Now it can only be produced via make build which also changes the path to ./bin/. Happy to raise a PR on the README, but it was unclear to me if this was an intentional change.

Publish container image

It would be useful to publish an official container image in addition to the base binaries.
