cloudfoundry / cf-k8s-logging

License: Apache License 2.0

Shell 75.76% Go 24.24%

cf-k8s-logging

cf-k8s-logging contains the portions of cf-for-k8s which enable logging outcomes. See our public roadmap to find out about our current efforts and future plans.

Configuration via values.yml

Log Destinations

To send all app logs to a destination via syslog, you can set up app log destinations in your cf-values.yml file:

app_log_destinations:
#@overlay/append
- host: <hostname>
  port: <port_number>
  transport: <tls/tcp> #defaults to tls
  insecure_disable_tls_validation: <false/true> #defaults false
#@overlay/append
- host: <hostname>
  port: <port_number>
  transport: <tls/tcp> #defaults to tls
  insecure_disable_tls_validation: <false/true> #defaults false

Debug logging in cf-k8s-logging fluentd

To diagnose issues with Fluentd, you can increase the log level by setting the environment variable FLUENTD_FLAGS on Fluentd, like so:

env:
- name: "FLUENTD_FLAGS"
  value: "-vvv"

-vvv is the highest logging level

Another way to see what is being sent is by replacing the output with a stdout logger:

<match **>
    @type stdout
</match>

API

Application logs can enter the logging system through two different paths:

App Containers

Logs from app containers are automatically ingested and egressed from cf-k8s-logging. App containers are expected to contain cloudfoundry.org/ labels which contain important app information, namely app_guid and source_type.
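As a rough illustration of the label lookup described above, the sketch below pulls the cloudfoundry.org/ labels out of a pod's label map. The exact label keys (cloudfoundry.org/app_guid and cloudfoundry.org/source_type) are an assumption formed by joining the prefix and the field names mentioned in the text. The function name and the sample labels are hypothetical.

```python
LABEL_PREFIX = "cloudfoundry.org/"  # prefix named in the text above


def app_info_from_labels(labels: dict) -> dict:
    """Return the cloudfoundry.org/ labels with the prefix stripped."""
    return {
        key[len(LABEL_PREFIX):]: value
        for key, value in labels.items()
        if key.startswith(LABEL_PREFIX)
    }


# Hypothetical pod labels; the full keys are assumed, not taken from the repo.
pod_labels = {
    "cloudfoundry.org/app_guid": "11111111-1111-1111-1111-111111111111",
    "cloudfoundry.org/source_type": "APP",
    "app.kubernetes.io/name": "example",  # unrelated label, ignored
}

info = app_info_from_labels(pod_labels)
print(info)
```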

System Components/Injected Logs

Cloud Foundry components that wish to emit logs on behalf of apps may do so via the Fluentd forward input. This protocol consists of tagged log messages encoded in MessagePack over TCP. Injected logs will be sent to the same destinations as app container logs, as long as they contain the tags listed below under Log Format.

Logs should be sent to the Fluentd ingress service called fluentd-forwarder-ingress at port 24224 over the Fluent forwarding protocol. This protocol can be implemented using a Fluent logger client library, or by placing a Fluentd/Fluent Bit pod next to the component with a forward output plugin.

Examples are located in the examples folder.

NOTE: To communicate with the Forwarder API, Istio sidecar injection must be enabled with the istio-injection=enabled label in the component's namespace.
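To make the wire format less abstract, here is a minimal sketch of a single forward-protocol event in "Message" mode, which is a MessagePack array of tag, time, and record. It hand-encodes only the MessagePack types it needs (strings, non-negative integers, and string-to-string maps) so that it has no dependencies. In practice a client library would do this for you. The tag "example.component" is hypothetical; this README does not state which tag cf-k8s-logging expects.

```python
import socket
import struct


def pack_str(value: str) -> bytes:
    """Encode a string as MessagePack fixstr / str 8 / str 16 / str 32."""
    if not isinstance(value, str):
        raise TypeError("this sketch only encodes string keys and values")
    data = value.encode("utf-8")
    n = len(data)
    if n < 32:
        return bytes([0xA0 | n]) + data
    if n < 2**8:
        return b"\xd9" + struct.pack(">B", n) + data
    if n < 2**16:
        return b"\xda" + struct.pack(">H", n) + data
    return b"\xdb" + struct.pack(">I", n) + data


def pack_uint(n: int) -> bytes:
    """Encode a non-negative integer in the smallest MessagePack form."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n < 128:
        return bytes([n])
    if n < 2**8:
        return b"\xcc" + struct.pack(">B", n)
    if n < 2**16:
        return b"\xcd" + struct.pack(">H", n)
    if n < 2**32:
        return b"\xce" + struct.pack(">I", n)
    return b"\xcf" + struct.pack(">Q", n)


def pack_str_map(record: dict) -> bytes:
    """Encode a dict of string keys and string values as a MessagePack map."""
    n = len(record)
    if n < 16:
        head = bytes([0x80 | n])
    elif n < 2**16:
        head = b"\xde" + struct.pack(">H", n)
    else:
        raise ValueError("record has too many fields for this sketch")
    return head + b"".join(pack_str(k) + pack_str(v) for k, v in record.items())


def forward_message(tag: str, timestamp: int, record: dict) -> bytes:
    """Build one Message-mode event: a 3-element array [tag, time, record]."""
    return b"\x93" + pack_str(tag) + pack_uint(timestamp) + pack_str_map(record)


def send(payload: bytes, host: str = "fluentd-forwarder-ingress", port: int = 24224) -> None:
    """Write the payload to the forwarder service over TCP."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
```

From inside the cluster, a call such as send(forward_message("example.component", int(time.time()), record)) would deliver one event, assuming the service name resolves from the caller's namespace and time has been imported.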

Log Format

Logs emitted to cf-k8s-logging by system components must include the fields:

app_id: id of the app for which the logs are being emitted.
instance_id: id of the instance for which the logs are being emitted.
source_type: source type of the logs, e.g. STG
log: log message to emit

{"log":"This is a test log from a fluent log producer","app_id":"11111111-1111-1111-1111-111111111111","instance_id":"1", "source_type":"APP"}
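A producer could check a record against the field list above before emitting it. The sketch below is one possible presence check. The helper name missing_fields is hypothetical and is not part of cf-k8s-logging.

```python
import json

# Field names taken from the Log Format list above.
REQUIRED_FIELDS = ("app_id", "instance_id", "source_type", "log")


def missing_fields(record: dict) -> list:
    """Return the required field names that are absent from the record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


# The example record from this README.
example = json.loads(
    '{"log":"This is a test log from a fluent log producer",'
    '"app_id":"11111111-1111-1111-1111-111111111111",'
    '"instance_id":"1", "source_type":"APP"}'
)

print(missing_fields(example))
print(missing_fields({"log": "hello"}))
```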

Development flow

Make needed updates (update vendir, update k8s files, etc).
Make local commit (allows reverting of image tags).
Run ./scripts/build-images.sh, setting $REPOSITORY to a docker repository you can push to.
Run ./scripts/bump-cf-for-k8s.sh. This should add all the Kubernetes files needed to run integration tests.
Follow cf-for-k8s deployment steps.

Running Integration Tests

Run ./scripts/bump-cf-for-k8s.sh. This sets up the test dependencies to be installed.
Deploy cf-for-k8s according to the documentation.
Set the TEST_API_ENDPOINT, TEST_USERNAME, and TEST_PASSWORD environment variables.
Optionally, set the TEST_SKIP_SSL environment variable.
Run ./hack/run_integration_tests.sh.
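Before starting the test script, it can help to confirm that the required variables are set. The following is a small preflight sketch. The helper name missing_test_env is hypothetical, and treating an empty value as missing is an assumption.

```python
import os

# Variable names taken from the steps above; TEST_SKIP_SSL is optional.
REQUIRED_ENV = ("TEST_API_ENDPOINT", "TEST_USERNAME", "TEST_PASSWORD")


def missing_test_env(env=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


missing = missing_test_env()
if missing:
    print("Set these variables before running the tests:", ", ".join(missing))
else:
    print("All required test variables are set.")
```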

cf-k8s-logging's People

Contributors

alex-slynko gdankov xanderstrike ciriarte cf-routing paulcwarren birdrock isabella232 masslessparticle yu-jin-song

cf-k8s-logging's Issues

Logs are not visible in log cache for a docker image deployed to kind

Upstream bug: cloudfoundry/cf-for-k8s#157

Investigate alternatives to Log Cache for short-term log access

(Part of #59.)

We would want to understand the following characteristics of any system for short-term log access:

What ingestion/egress throughput do these systems have?
What is their resource consumption?
- CPU
- Memory
- Disk (IO and space)
How complex are they to deploy for "kick-the-tires" levels of performance?
How far can their throughput and retention scale?
What does the ecosystem around them look like?
- Contributions from a diverse set of companies?
- Breadth of knowledge on using/scaling?

Build logs are not displayed during cf-push

Expected Behavior

During app staging, I expect cf-push to stream build logs.

Actual Behavior

No build logs are shown to the user.

Steps to Reproduce the Problem

Run through steps in cf-for-k8s install steps
Run cf push app -p /path/to/java-app
Notice the lack of build logs

Additional information

See this slack thread for more information.

If instance index is not a string, conversion into syslog fails

The CAPI team recently used the Fluent Logging Ruby library to send their logs to the Forwarder API. The team was sending logs in the correct format, but the instance_id was the integer 0 rather than a string. This caused parsing errors in the output plugin that eventually showed up as parse errors in Log Cache.
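Until the output plugin handles this case, a producer can work around it by coercing the instance index to a string before sending. The sketch below shows one way to do that. Both helper names are hypothetical, and rejecting booleans is a choice made for this example.

```python
def normalize_instance_id(value) -> str:
    """Return the instance index as a string, e.g. 0 -> "0"."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("instance_id must be an integer or a string")
    if isinstance(value, (int, str)):
        return str(value)
    raise TypeError("instance_id must be an integer or a string")


def normalize_record(record: dict) -> dict:
    """Return a copy of the record whose instance_id is a string."""
    fixed = dict(record)
    fixed["instance_id"] = normalize_instance_id(record["instance_id"])
    return fixed


# A record shaped like the one in the report, with an integer index.
record = {"log": "hello", "app_id": "11111111-1111-1111-1111-111111111111",
          "instance_id": 0, "source_type": "APP"}
print(normalize_record(record)["instance_id"])
```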

Log delivery is unreliable in VMware TKGs

Smoke tests that use the CF CLI to get logs from Log Cache are unreliable in this Kubernetes provider specifically.

test issue

Migrate DaemonSets from extensions/v1beta1 to apps/v1

As of Kubernetes 1.16, DaemonSets are no longer served from the extensions/v1beta1 API group. https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/

config/manifests/500-fluentd-daemonset.yaml includes a resource definition from that API group.

Please migrate that resource definition to meet the apps/v1 specification.

Note that an apps/v1 DaemonSet requires spec.selector to be set, and its labels must be a subset of spec.template.metadata.labels.
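The selector requirement can be checked mechanically on a parsed manifest. The sketch below assumes the manifest has already been loaded into a dict (for example by a YAML parser) and only considers matchLabels. matchExpressions are ignored. The function name and the sample labels are hypothetical.

```python
def selector_matches_template(daemonset: dict) -> bool:
    """True if spec.selector.matchLabels is a non-empty subset of the pod template labels."""
    spec = daemonset.get("spec") or {}
    selector = (spec.get("selector") or {}).get("matchLabels") or {}
    template = spec.get("template") or {}
    labels = (template.get("metadata") or {}).get("labels") or {}
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


# Hypothetical manifest fragment; the label values are illustrative only.
manifest = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "spec": {
        "selector": {"matchLabels": {"app": "fluentd"}},
        "template": {"metadata": {"labels": {"app": "fluentd", "tier": "logging"}}},
    },
}
print(selector_matches_template(manifest))
```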

This came to CF for K8s / Release Integration's attention through: cloudfoundry/cf-for-k8s#22.

Enriching logs with Kubernetes labels/annotations via a Fluentd input plugin should be possible with less overhead

When enriching Fluentd logs via filtering, I want Kubernetes labels and annotations to be added without dropping performance or adding overhead, so that my log transport infrastructure doesn't overload my cluster.

AC

Given a Fluentd instance
When I add the new Fluentd filter plugin
And configure it with my desired labels, annotations, and other necessary information
Then I see my desired labels and annotations on outgoing logs

Log Cache should allow operators to set a fixed memory limit

We ran scalability tests on cf-for-k8s version 0.6.0 and succeeded in deploying 1600 apps. To simulate a realistic workload, we deployed apps that each generate 1 log/sec and 1 req/sec. During the tests we observed log-cache consuming more than 10Gi of memory. We tried scaling to two replicas, but memory still spiked.

The graph below shows the number of apps deployed versus log-cache memory.

Just a note: Fluentd was consuming an average of 2.5Gi.

Kubernetes metadata filter plugin adds undue overhead

Expected behavior: Enriching app logs with Kubernetes metadata should not add undue overhead or dramatically decrease throughput.

Actual behavior: Adding the Kubernetes filter plugin dramatically reduces throughput.

💡 Allow horizontal scaling of Log Cache

Enable higher availability and throughput of the Log Cache API, which serves cf logs. It will be possible to scale the storage layer and API layer separately.

Configuring GitBot is recommended

Pivotal provides the GitBot service to synchronize pull requests and/or issues made against public GitHub repos with Pivotal Tracker projects. This service does not track individual commits.

If you are a Pivotal employee, you can configure GitBot to sync your GitHub repo to your Pivotal Tracker project with a pull request. An ask+rd@ ticket is the fastest way to get write access if you get a 404 to the config repo.

If you do not want to have pull requests and/or issues copied from GitHub to Pivotal Tracker, you do not need to take any action.

If there are any questions, please reach out to [email protected].

remove imagePullSecrets configuration from DaemonSet

If users need an image pull secret, they can always add it via an overlay. /cc @ciriarte

Task logs don't come out if there's only one task

Push a logspinner to cf-for-k8s.
Run a cf task against the logspinner: cf run-task test "echo "LISTING" && pwd; done" --name TASK_NAME
No logs come out from cf logs.
Run many tasks: for i in $(seq 100); do cf run-task test "echo "LISTING" && pwd; done" --name TASK_NAME$i; done
Logs come out!

This might be because the task is very short-lived. How long does it take for the Fluentd tail input to recognize new containers?

All images should be based on Ubuntu Bionic

For coherence with the rest of the images on the platform (and closed-source OSL reasons), the Fluentd Docker image (and everything else) should be based on Ubuntu Bionic.

Spike: Fluent-Bit-in-sidecar proof of concept

Questions to answer:

What log throughput can this support?
- Per app?
- Per node?
What is the memory/CPU footprint of each sidecar?
- When idle?
- When busy?
Can we add Kubernetes metadata at the sidecar?
- How does this change the load on the Kubernetes API?

Improve Log Cache's disaster recovery, or use a different tool for short-term log storage

💡 Configurable log metadata

Possibilities:

extra fixed metadata (environment tags, etc.)
filtering out dynamic metadata

support for batching?
configurability of JSON format?