
ceph-helm's Introduction

Ceph - a scalable distributed storage system

See https://ceph.com/ for current information about Ceph.

Status

Issue Backporting

Contributing Code

Most of Ceph is dual-licensed under the LGPL version 2.1 or 3.0. Some miscellaneous code is either public domain or licensed under a BSD-style license.

The Ceph documentation is licensed under Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0).

Some headers included in the ceph/ceph repository are licensed under the GPL. See the file COPYING for a full inventory of licenses by file.

All code contributions must include a valid "Signed-off-by" line. See the file SubmittingPatches.rst for details on this and instructions on how to generate and submit patches.
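As a quick example (not a substitute for reading SubmittingPatches.rst), git can append the Signed-off-by trailer for you using your configured name and email:

# Append a "Signed-off-by" trailer automatically; the commit message here is just a placeholder
git commit -s -m "component: short description of the change"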

Assignment of copyright is not required to contribute code. Code is contributed under the terms of the applicable license.

Checking out the source

Clone the ceph/ceph repository from github by running the following command on a system that has git installed:

git clone git@github.com:ceph/ceph

Alternatively, if you are not a github user, you should run the following command on a system that has git installed:

git clone https://github.com/ceph/ceph.git

When the ceph/ceph repository has been cloned to your system, run the following commands to move into the cloned ceph/ceph repository and to check out the git submodules associated with it:

cd ceph
git submodule update --init --recursive --progress

Build Prerequisites

section last updated 27 Jul 2023

Make sure that curl is installed. The Debian and Ubuntu apt command is provided here, but if you use a system with a different package manager, then you must use whatever command is the proper counterpart of this one:

apt install curl

Install Debian or RPM package dependencies by running the following command:

./install-deps.sh

Install the python3-routes package:

apt install python3-routes

Building Ceph

These instructions are meant for developers who are compiling the code for development and testing. To build binaries that are suitable for installation we recommend that you build .deb or .rpm packages, or refer to ceph.spec.in or debian/rules to see which configuration options are specified for production builds.

To build Ceph, make sure that you are in the top-level ceph directory that contains do_cmake.sh and CONTRIBUTING.rst and run the following commands:

./do_cmake.sh
cd build
ninja

do_cmake.sh by default creates a "debug build" of Ceph, which can be up to five times slower than a non-debug build. Pass -DCMAKE_BUILD_TYPE=RelWithDebInfo to do_cmake.sh to create a non-debug build.
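For example, to request a non-debug build (the extra argument is passed through to cmake):

./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo
cd build
ninja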

Ninja is the buildsystem used by the Ceph project to build test builds. The number of jobs used by ninja is derived from the number of CPU cores of the building host if unspecified. Use the -j option to limit the job number if the build jobs are running out of memory. If you attempt to run ninja and receive a message that reads g++: fatal error: Killed signal terminated program cc1plus, then you have run out of memory. Using the -j option with an argument appropriate to the hardware on which the ninja command is run is expected to result in a successful build. For example, to limit the job number to 3, run the command ninja -j 3. On average, each ninja job run in parallel needs approximately 2.5 GiB of RAM.
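A rough sketch for picking a job count from available memory, assuming the ~2.5 GiB-per-job figure above (the awk field used here assumes a recent procps "free" that reports an "available" column):

# Derive a parallel job count from available memory (~2.5 GiB per job)
avail_mib=$(free -m | awk '/^Mem:/ {print $7}')
jobs=$(( avail_mib / 2560 ))
[ "$jobs" -lt 1 ] && jobs=1
ninja -j "$jobs"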

This documentation assumes that your build directory is a subdirectory of the ceph.git checkout. If the build directory is located elsewhere, point CEPH_GIT_DIR to the correct path of the checkout. Additional CMake args can be specified by setting ARGS before invoking do_cmake.sh. See cmake options for more details. For example:

ARGS="-DCMAKE_C_COMPILER=gcc-7" ./do_cmake.sh

To build only certain targets, run a command of the following form:

ninja [target name]

To install:

ninja install

CMake Options

The -D flag can be used with cmake to speed up the process of building Ceph and to customize the build.

Building without RADOS Gateway

The RADOS Gateway is built by default. To build Ceph without the RADOS Gateway, run a command of the following form:

cmake -DWITH_RADOSGW=OFF [path to top-level ceph directory]

Building with debugging and arbitrary dependency locations

Run a command of the following form to build Ceph with debugging and alternate locations for some external dependencies:

cmake -DCMAKE_INSTALL_PREFIX=/opt/ceph -DCMAKE_C_FLAGS="-Og -g3 -gdwarf-4" \
..

Ceph has several bundled dependencies such as Boost, RocksDB and Arrow. By default, cmake builds these bundled dependencies from source instead of using libraries that are already installed on the system. You can opt to use these system libraries, as long as they meet Ceph's version requirements. To use system libraries, use cmake options like WITH_SYSTEM_BOOST, as in the following example:

cmake -DWITH_SYSTEM_BOOST=ON [...]

To view an exhaustive list of -D options, invoke cmake -LH:

cmake -LH

Preserving diagnostic colors

If you pipe ninja to less and would like to preserve the diagnostic colors in the output in order to make errors and warnings more legible, run the following command:

cmake -DDIAGNOSTICS_COLOR=always ...

The above command works only with supported compilers.

The diagnostic colors will be visible when the following command is run:

ninja | less -R

Other available values for DIAGNOSTICS_COLOR are auto (default) and never.

Building a source tarball

To build a complete source tarball with everything needed to build from source and/or build a (deb or rpm) package, run

./make-dist

This will create a tarball like ceph-$version.tar.bz2 from git. (Ensure that any changes you want to include in your working directory are committed to git.)

Running a test cluster

From the ceph/ directory, run the following commands to launch a test Ceph cluster:

cd build
ninja vstart        # builds just enough to run vstart
../src/vstart.sh --debug --new -x --localhost --bluestore
./bin/ceph -s

Most Ceph commands are available in the bin/ directory. For example:

./bin/rbd create foo --size 1000
./bin/rados -p foo bench 30 write

To shut down the test cluster, run the following command from the build/ directory:

../src/stop.sh

Use the sysvinit script to start or stop individual daemons:

./bin/init-ceph restart osd.0
./bin/init-ceph stop

Running unit tests

To build and run all tests (in parallel using all processors), use ctest:

cd build
ninja
ctest -j$(nproc)

(Note: Many targets built from src/test are not run using ctest. Targets starting with "unittest" are run in ninja check and thus can be run with ctest. Targets starting with "ceph_test" can not, and should be run by hand.)
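For example, a "unittest" target can be built and run through ctest, while a "ceph_test" binary is built and then launched directly from build/bin. The target names below are placeholders, not real targets:

# Build and run a single "unittest" target via ctest
ninja unittest_some_component
ctest -R unittest_some_component

# Build a "ceph_test" target and run the binary by hand
ninja ceph_test_some_component
./bin/ceph_test_some_component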

When failures occur, look in build/Testing/Temporary for logs.
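ctest collects the combined output of the most recent run in LastTest.log, for example:

less build/Testing/Temporary/LastTest.log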

To build and run all tests and their dependencies without other unnecessary targets in Ceph:

cd build
ninja check -j$(nproc)

To run an individual test manually, run ctest with -R (regex matching):

ctest -R [regex matching test name(s)]

(Note: ctest does not build the test it's running or the dependencies needed to run it)

To run an individual test manually and see all the tests output, run ctest with the -V (verbose) flag:

ctest -V -R [regex matching test name(s)]

To run tests manually and run the jobs in parallel, run ctest with the -j flag:

ctest -j [number of jobs]

There are many other flags you can give ctest for better control over manual test execution. To view these options run:

man ctest

Building the Documentation

Prerequisites

The list of package dependencies for building the documentation can be found in doc_deps.deb.txt:

sudo apt-get install `cat doc_deps.deb.txt`

Building the Documentation

To build the documentation, ensure that you are in the top-level /ceph directory, and execute the build script. For example:

admin/build-doc

Reporting Issues

To report an issue and view existing issues, please visit https://tracker.ceph.com/projects/ceph.

ceph-helm's People

Contributors

a-robinson, amandacameron, andresbono, edsiper, electroma, flah00, foxish, gtaylor, h0tbird, jackzampolin, jainishshah17, jotadrilo, kevinschumacher, kfox1111, lachie83, linki, mgoodness, migmartri, nitisht, prydonius, rimusz, rootfs, scottrigby, sebgoa, sstarcher, technosophos, tompizmor, unguiculus, viglesiasce, yuvipanda


ceph-helm's Issues

Deploying ceph multiple times causes authentication error.

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:
kubernetes 1.8.3

Which chart:
ceph

What happened:
I have a kubernetes deployment running. When I try to deploy ceph for the first time, it works fine.
But when I purge the ceph Helm chart and redeploy ceph, ceph-mon and ceph-mon-check work fine, but ceph-mds and ceph-mgr hit an authentication failure and time out connecting to ceph-mon.

What you expected to happen:
I would expect all of the services to be able to connect to the monitor without any authentication issues.

How to reproduce it (as minimally and precisely as possible):
Deploy ceph, purge ceph chart, deploy ceph again.

Anything else we need to know:
Something to note: I am not deleting the hostpath folders on the nodes where ceph is being redeployed. I was expecting that redeploying would preserve the existing Ceph data.
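A possible workaround before redeploying, assuming the chart's default host paths (the values.yaml later on this page uses osd_directory/mon_directory of /var/lib/ceph-helm with $NAMESPACE/{osd,mon} appended); adjust the paths and namespace to match your configuration, and note that this wipes the previous cluster's data:

# Run on every node that hosted mons/osds for the previous release
sudo rm -rf /var/lib/ceph-helm/ceph/mon /var/lib/ceph-helm/ceph/osd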

[secrets "ceph-bootstrap-osd-keyring" not found] ceph-osd fail on k8s 1.10.1 install by kubeadm


kubectl get po -n ceph
NAME                                            READY     STATUS                  RESTARTS   AGE
ceph-mds-696bd98bdb-bnvpg                       0/1       Pending                 0          18m
ceph-mds-keyring-generator-q679r                0/1       Completed               0          18m
ceph-mgr-6d5f86d9c4-nr76h                       1/1       Running                 1          18m
ceph-mgr-keyring-generator-v825z                0/1       Completed               0          18m
ceph-mon-86lth                                  1/1       Running                 0          18m
ceph-mon-check-74d98c5b95-wf9tm                 1/1       Running                 0          18m
ceph-mon-keyring-generator-rfg8j                0/1       Completed               0          18m
ceph-mon-pp5hc                                  1/1       Running                 0          18m
ceph-namespace-client-key-cleaner-g9dri-sjmqd   0/1       Completed               0          1h
ceph-namespace-client-key-cleaner-qwkee-pdkh6   0/1       Completed               0          21m
ceph-namespace-client-key-cleaner-t25ui-5gkb7   0/1       Completed               0          2d
ceph-namespace-client-key-generator-xk4w6       0/1       Completed               0          18m
ceph-osd-dev-sda-6jbgd                          0/1       Init:CrashLoopBackOff   8          18m
ceph-osd-dev-sda-khfhw                          0/1       Init:CrashLoopBackOff   8          18m
ceph-osd-dev-sda-krkjf                          0/1       Init:CrashLoopBackOff   8          18m
ceph-osd-keyring-generator-mvktj                0/1       Completed               0          18m
ceph-rbd-provisioner-b58659dc9-nhx2q            1/1       Running                 0          18m
ceph-rbd-provisioner-b58659dc9-nnlh2            1/1       Running                 0          18m
ceph-rgw-5bd9dd66c5-gh946                       0/1       Pending                 0          18m
ceph-rgw-keyring-generator-dz9kd                0/1       Completed               0          18m
ceph-storage-admin-key-cleaner-1as0t-fq589      0/1       Completed               0          1h
ceph-storage-admin-key-cleaner-oayjp-fglzr      0/1       Completed               0          2d
ceph-storage-admin-key-cleaner-zemvx-jxn7c      0/1       Completed               0          21m
ceph-storage-keys-generator-szps9               0/1       Completed               0          18m

Version of Helm and Kubernetes:

  • k8s 1.10.1
  • helm
helm version
Client: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}

Which chart:

commit 70681c8218e75bb10acc7dc210b791b69545ce0d
Merge: a4fa8a1 6cb2e1e
Author: Huamin Chen <[email protected]>
Date:   Thu Apr 26 12:11:43 2018 -0400

  • pod describe
kubectl describe -n ceph po  ceph-osd-dev-sda-6jbgd
Name:           ceph-osd-dev-sda-6jbgd
Namespace:      ceph
Node:           k8s-2/192.168.16.40
Start Time:     Sat, 28 Apr 2018 15:54:08 +0800
Labels:         application=ceph
                component=osd
                controller-revision-hash=1450926272
                pod-template-generation=1
                release_group=ceph
Annotations:    <none>
Status:         Pending
IP:             192.168.16.40
Controlled By:  DaemonSet/ceph-osd-dev-sda
Init Containers:
  init:
    Container ID:  docker://8c9536cc4c5d811f57ba6349c87245121651841f52db682f858ae0ac70555856
    Image:         docker.io/kolla/ubuntu-source-kubernetes-entrypoint:4.0.0
    Image ID:      docker-pullable://kolla/ubuntu-source-kubernetes-entrypoint@sha256:75116ab2f9f65c5fc078e68ce7facd66c1c57496947f37b7209b32f94925e53b
    Port:          <none>
    Host Port:     <none>
    Command:
      kubernetes-entrypoint
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sat, 28 Apr 2018 15:54:34 +0800
      Finished:     Sat, 28 Apr 2018 15:54:36 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      POD_NAME:              ceph-osd-dev-sda-6jbgd (v1:metadata.name)
      NAMESPACE:             ceph (v1:metadata.namespace)
      INTERFACE_NAME:        eth0
      DEPENDENCY_SERVICE:    ceph-mon
      DEPENDENCY_JOBS:
      DEPENDENCY_DAEMONSET:
      DEPENDENCY_CONTAINER:
      COMMAND:               echo done
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
  ceph-init-dirs:
    Container ID:  docker://1562879ebbc52c47cfd9fb292339e548d26450207846ff6eeb38594569d5ec5f
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          <none>
    Host Port:     <none>
    Command:
      /tmp/init_dirs.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sat, 28 Apr 2018 15:54:38 +0800
      Finished:     Sat, 28 Apr 2018 15:54:39 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /run from pod-run (rw)
      /tmp/init_dirs.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
  osd-prepare-pod:
    Container ID:  docker://2b5bed33de8f35533eb72ef3208010153b904a8ed34c527a4916b88f549d5f6b
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          <none>
    Host Port:     <none>
    Command:
      /start_osd.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 28 Apr 2018 16:11:06 +0800
      Finished:     Sat, 28 Apr 2018 16:11:07 +0800
    Ready:          False
    Restart Count:  8
    Environment:
      CEPH_DAEMON:         osd_ceph_disk_prepare
      KV_TYPE:             k8s
      CLUSTER:             ceph
      CEPH_GET_ADMIN_KEY:  1
      OSD_DEVICE:          /dev/mapper/centos-root
      HOSTNAME:             (v1:spec.nodeName)
    Mounts:
      /common_functions.sh from ceph-bin (ro)
      /dev from devices (rw)
      /etc/ceph/ceph.client.admin.keyring from ceph-client-admin-keyring (ro)
      /etc/ceph/ceph.conf from ceph-etc (ro)
      /etc/ceph/ceph.mon.keyring from ceph-mon-keyring (ro)
      /osd_activate_journal.sh from ceph-bin (ro)
      /osd_disk_activate.sh from ceph-bin (ro)
      /osd_disk_prepare.sh from ceph-bin (ro)
      /osd_disks.sh from ceph-bin (ro)
      /run from pod-run (rw)
      /start_osd.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/lib/ceph/bootstrap-mds/ceph.keyring from ceph-bootstrap-mds-keyring (ro)
      /var/lib/ceph/bootstrap-osd/ceph.keyring from ceph-bootstrap-osd-keyring (ro)
      /var/lib/ceph/bootstrap-rgw/ceph.keyring from ceph-bootstrap-rgw-keyring (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Containers:
  osd-activate-pod:
    Container ID:
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /start_osd.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       tcp-socket :6800 delay=60s timeout=5s period=10s #success=1 #failure=3
    Readiness:      tcp-socket :6800 delay=0s timeout=5s period=10s #success=1 #failure=3
    Environment:
      CEPH_DAEMON:         osd_ceph_disk_activate
      KV_TYPE:             k8s
      CLUSTER:             ceph
      CEPH_GET_ADMIN_KEY:  1
      OSD_DEVICE:          /dev/mapper/centos-root
      HOSTNAME:             (v1:spec.nodeName)
    Mounts:
      /common_functions.sh from ceph-bin (ro)
      /dev from devices (rw)
      /etc/ceph/ceph.client.admin.keyring from ceph-client-admin-keyring (ro)
      /etc/ceph/ceph.conf from ceph-etc (ro)
      /etc/ceph/ceph.mon.keyring from ceph-mon-keyring (ro)
      /osd_activate_journal.sh from ceph-bin (ro)
      /osd_disk_activate.sh from ceph-bin (ro)
      /osd_disk_prepare.sh from ceph-bin (ro)
      /osd_disks.sh from ceph-bin (ro)
      /run from pod-run (rw)
      /start_osd.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/lib/ceph/bootstrap-mds/ceph.keyring from ceph-bootstrap-mds-keyring (ro)
      /var/lib/ceph/bootstrap-osd/ceph.keyring from ceph-bootstrap-osd-keyring (ro)
      /var/lib/ceph/bootstrap-rgw/ceph.keyring from ceph-bootstrap-rgw-keyring (ro)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Conditions:
  Type           Status
  Initialized    False
  Ready          False
  PodScheduled   True
Volumes:
  devices:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:
  pod-var-lib-ceph:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  pod-var-log-ceph:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/ceph/osd
    HostPathType:
  pod-run:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  Memory
  ceph-bin:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-bin
    Optional:  false
  ceph-etc:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-etc
    Optional:  false
  ceph-client-admin-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-client-admin-keyring
    Optional:    false
  ceph-mon-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-mon-keyring
    Optional:    false
  ceph-bootstrap-osd-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-osd-keyring
    Optional:    false
  ceph-bootstrap-mds-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-mds-keyring
    Optional:    false
  ceph-bootstrap-rgw-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-rgw-keyring
    Optional:    false
  default-token-z5m75:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-z5m75
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  ceph-osd=enabled
                 ceph-osd-device-dev-sda=enabled
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:
  Type     Reason                 Age                From            Message
  ----     ------                 ----               ----            -------
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "pod-var-log-ceph"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "devices"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "pod-var-lib-ceph"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "pod-run"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-bin"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-etc"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "default-token-z5m75"
  Warning  FailedMount            19m (x2 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secrets "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount            19m (x2 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secrets "ceph-bootstrap-rgw-keyring" not found
  Warning  FailedMount            19m (x2 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secrets "ceph-client-admin-keyring" not found
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-client-admin-keyring"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-bootstrap-osd-keyring"
  Warning  FailedMount            19m (x4 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-mon-keyring" : secrets "ceph-mon-keyring" not found
  Warning  FailedMount            19m (x4 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secrets "ceph-bootstrap-mds-keyring" not found
  Normal   SuccessfulMountVolume  19m (x2 over 19m)  kubelet, k8s-2  (combined from similar events): MountVolume.SetUp succeeded for volume "ceph-mon-keyring"
  Warning  BackOff                4m (x65 over 18m)  kubelet, k8s-2  Back-off restarting failed container

  • pod log
kubectl  logs  -n ceph   ceph-osd-dev-sda-6jbgd
Error from server (BadRequest): container "osd-activate-pod" in pod "ceph-osd-dev-sda-6jbgd" is waiting to start: PodInitializing

Behavior when removing an OSD node

When removing an OSD node from the cluster:
$ kubectl label node mira115 ceph-osd=disabled --overwrite

The ceph-osd pods are deleted; however, the OSDs are marked down and are still present in the cluster.
Since this is a node removal, the expected behavior should include running the following commands (see the sketch after this list):

  • for each OSD: ceph osd purge <osd>
  • Then: ceph osd crush rm <hostname>
  • maybe: zap the OSD drives?
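A rough sketch of that cleanup, assuming admin access to the cluster; the OSD IDs and hostname below are taken from the example output further down and are placeholders for your own values:

# Purge every OSD that belonged to the removed node, then drop the host from the CRUSH map
for id in 1 2 3 5 6 17 18; do
    ceph osd purge osd.${id} --yes-i-really-mean-it
done
ceph osd crush rm mira115
# Optionally zap the OSD drives on the node before reuse, e.g.:
# ceph-disk zap /dev/sdb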

Otherwise, when adding this node back, we end up with:

-2       12.59991     host mira115
 1   hdd  0.89999         osd.1      down        0 1.00000
 2   hdd  0.89999         osd.2      down        0 1.00000
 3   hdd  0.89999         osd.3      down        0 1.00000
 5   hdd  0.89999         osd.5      down        0 1.00000
 6   hdd  0.89999         osd.6      down        0 1.00000
17   hdd  0.89999         osd.17     down        0 1.00000
18   hdd  0.89999         osd.18     down  1.00000 1.00000
20   hdd  0.89999         osd.20       up  1.00000 1.00000
21   hdd  0.89999         osd.21       up  1.00000 1.00000
22   hdd  0.89999         osd.22       up  1.00000 1.00000
23   hdd  0.89999         osd.23       up  1.00000 1.00000
24   hdd  0.89999         osd.24       up  1.00000 1.00000
25   hdd  0.89999         osd.25       up  1.00000 1.00000
26   hdd  0.89999         osd.26       up  1.00000 1.00000

Change osd and mon path on minikube

Is this a request for help?:

no

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

FEATURE REQUEST

Version of Helm and Kubernetes:

Client: &version.Version{SemVer:"v2.12.0", GitCommit:"d325d2a9c179b33af1a024cdb5a4472b6288016a", GitTreeState:"clean"}
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}

Which chart:

ceph-helm

What happened:

does not work on minikube

What you expected to happen:

work on minikube

How to reproduce it (as minimally and precisely as possible):

follow the install process

Anything else we need to know:

In order to fix this, update:

osd_directory: /var/lib/ceph-helm
mon_directory: /var/lib/ceph-helm

to

osd_directory: /data/ceph-helm
mon_directory: /data/lib/ceph-helm

Perhaps this should be specified in the documentation.

diff --git a/ceph/ceph/values.yaml b/ceph/ceph/values.yaml
index 5831c53..72a74b7 100644
--- a/ceph/ceph/values.yaml
+++ b/ceph/ceph/values.yaml
@@ -254,8 +254,8 @@ ceph:
     mgr: true
   storage:
     # will have $NAMESPACE/{osd,mon} appended
-    osd_directory: /var/lib/ceph-helm
-    mon_directory: /var/lib/ceph-helm
+    osd_directory: /data/ceph-helm
+    mon_directory: /data/lib/ceph-helm
     # use /var/log for fluentd to collect ceph log
     # mon_log: /var/log/ceph/mon
     # osd_log: /var/log/ceph/osd

can't use md0 as device

Is this a request for help?: no

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:
helm

Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

kubectl

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.5-gke.4", GitCommit:"6265b9797fc8680c8395abeab12c1e3bad14069a", GitTreeState:"clean", BuildDate:"2018-08-04T03:47:40Z", GoVersion:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
ceph-helm

What happened:
Tried to set up an mdadm device to stripe two disks in RAID 0 and handle them as a single OSD.
It does not finish the OSD device setup properly.

What you expected to happen:
For the setup to finish and work as well as it does with any other sdb/sdc/sdd device.

How to reproduce it (as minimally and precisely as possible):
Create an md0 device and use it as you would any other sdX device (OSD device).
The setup fails because the osd-activate-pod crashes with:

2018-08-13T17:14:53.260443418Z command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 9 --monmap /var/lib/ceph/tmp/mnt.eawl9p/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.eawl9p --osd-journal /var/lib/ceph/tmp/mnt.eawl9p/journal --osd-uuid 36356a07-a91f-4625-8b6f-864dd991de5f --setuser ceph --setgroup disk
2018-08-13T17:14:53.324461741Z 2018-08-13 17:14:53.324174 7fc7564cee00 -1 filestore(/var/lib/ceph/tmp/mnt.eawl9p) mkjournal(1066): error creating journal on /var/lib/ceph/tmp/mnt.eawl9p/journal: (2) No such file or directory
2018-08-13T17:14:53.324483608Z 2018-08-13 17:14:53.324256 7fc7564cee00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (2) No such file or directory
2018-08-13T17:14:53.324849587Z 2018-08-13 17:14:53.324610 7fc7564cee00 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.eawl9p: (2) No such file or directory
2018-08-13T17:14:53.329122347Z mount_activate: Failed to activate
2018-08-13T17:14:53.329225389Z unmount: Unmounting /var/lib/ceph/tmp/mnt.eawl9p
2018-08-13T17:14:53.3294495Z command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.eawl9p
2018-08-13T17:14:53.375884854Z Traceback (most recent call last):
2018-08-13T17:14:53.375907887Z   File "/usr/sbin/ceph-disk", line 9, in <module>
2018-08-13T17:14:53.375913364Z     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
2018-08-13T17:14:53.375918173Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5717, in run
2018-08-13T17:14:53.377208587Z     main(sys.argv[1:])
2018-08-13T17:14:53.377222215Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5668, in main
2018-08-13T17:14:53.37842527Z     args.func(args)
2018-08-13T17:14:53.378439165Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3758, in main_activate
2018-08-13T17:14:53.379145782Z     reactivate=args.reactivate,
2018-08-13T17:14:53.379156768Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3521, in mount_activate
2018-08-13T17:14:53.379899211Z     (osd_id, cluster) = activate(path, activate_key_template, init)
2018-08-13T17:14:53.379910301Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3698, in activate
2018-08-13T17:14:53.380577968Z     keyring=keyring,
2018-08-13T17:14:53.380589482Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3165, in mkfs
2018-08-13T17:14:53.381196441Z     '--setgroup', get_ceph_group(),
2018-08-13T17:14:53.381206848Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 566, in command_check_call
2018-08-13T17:14:53.381212315Z     return subprocess.check_call(arguments)
2018-08-13T17:14:53.381216601Z   File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
2018-08-13T17:14:53.381482189Z     raise CalledProcessError(retcode, cmd)
2018-08-13T17:14:53.381659138Z subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'9', '--monmap', '/var/lib/ceph/tmp/mnt.eawl9p/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.eawl9p', '--osd-journal', '/var/lib/ceph/tmp/mnt.eawl9p/journal', '--osd-uuid', u'36356a07-a91f-4625-8b6f-864dd991de5f', '--setuser', 'ceph', '--setgroup', 'disk']' returned non-zero exit status 1

Anything else we need to know:
I found a place where it is assumed that partition number X of a device is obtained by simply appending the number to the device name. This is true for sdX1, for example, but not for mdXp1.
I applied the following patch, but it still doesn't work.

diff --git a/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl b/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl
index eda2b3f..88cf800 100644
--- a/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl
+++ b/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl
@@ -27,7 +27,7 @@ function osd_disk_prepare {
     log "Checking if it belongs to this cluster"
     tmp_osd_mount="/var/lib/ceph/tmp/`echo $RANDOM`/"
     mkdir -p $tmp_osd_mount
-    mount ${OSD_DEVICE}1 ${tmp_osd_mount}
+    mount $(dev_part ${OSD_DEVICE} 1) ${tmp_osd_mount}
     osd_cluster_fsid=`cat ${tmp_osd_mount}/ceph_fsid`
     umount ${tmp_osd_mount} && rmdir ${tmp_osd_mount}
     cluster_fsid=`ceph ${CLI_OPTS} --name client.bootstrap-osd --keyring $OSD_BOOTSTRAP_KEYRING fsid`
@@ -56,7 +56,7 @@ function osd_disk_prepare {
     echo "Unmounting LOCKBOX directory"
     # NOTE(leseb): adding || true so when this bug will be fixed the entrypoint will not fail
     # Ceph bug tracker: http://tracker.ceph.com/issues/18944
-    DATA_UUID=$(blkid -o value -s PARTUUID ${OSD_DEVICE}1)
+    DATA_UUID=$(blkid -o value -s PARTUUID $(dev_part ${OSD_DEVICE} 1))
     umount /var/lib/ceph/osd-lockbox/${DATA_UUID} || true
   else
     ceph-disk -v prepare ${CLI_OPTS} --journal-uuid ${OSD_JOURNAL_UUID} ${OSD_DEVICE} ${OSD_JOURNAL}

procMount for k8s 1.12

Current version of ceph-helm fails to install on k8s v1.12. v1.12 requires procMount set in the security context.

The templates for the OSD set a securityContext with privileged: true; procMount: default is now required as well. There is one instance in daemonset-osd.yaml and two instances in daemonset-osd-devices.yaml. I made the edits locally to verify, but am not set up to cleanly make the edits within the git repo.

Secrets generate error

Error from server (BadRequest): error when creating "STDIN": Secret in version "v1" cannot be handled as a Secret: v1.Secret: ObjectMeta: v1.ObjectMeta: TypeMeta: Kind: Data: decode base64: illegal base64 data at input byte 156, parsing 202 ...2QiCgo=\n"... at {"apiVersion":"v1","data":

Ceph helm not running

My env:
CentOS7, kubernetes1.11.1

ceph-overrides.yaml

network:
  public: 192.168.105.0/24
  cluster: 192.168.105.0/24

osd_devices:
  - name: dev-sdb
    device: /dev/sdb
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  #user_id: admin
  user_id: k8s

ceph/values.yaml I modified

deployment:
  ceph: true
  storage_secrets: true
  client_secrets: true
  rbd_provisioner: true
  rgw_keystone_user_and_endpoints: false

images:
  ks_user: docker.io/kolla/centos-source-heat-engine:3.0.3
  ks_service: docker.io/kolla/centos-source-heat-engine:3.0.3
  ks_endpoints: docker.io/kolla/centos-source-heat-engine:3.0.3
  bootstrap: docker.io/ceph/daemon:v3.0.5-stable-3.0-luminous-centos-7
  dep_check: docker.io/kolla/centos-source-kubernetes-entrypoint:4.0.0
  daemon: docker.io/ceph/daemon:v3.0.5-stable-3.0-luminous-centos-7
  ceph_config_helper: docker.io/port/ceph-config-helper:v1.7.5
  rbd_provisioner: quay.io/external_storage/rbd-provisioner:v0.1.1
  minimal: docker.io/alpine:latest
  pull_policy: "IfNotPresent"

kubectl get pod -n ceph

NAME                                        READY     STATUS                  RESTARTS   AGE
ceph-mds-c5c856bb8-rw2vq                    0/1       Pending                 0          13m
ceph-mds-keyring-generator-llhcl            0/1       Completed               0          13m
ceph-mgr-566969ff9f-bhnsz                   0/1       CrashLoopBackOff        6          7m
ceph-mgr-keyring-generator-gplx2            0/1       Completed               0          13m
ceph-mon-check-9fd5797bc-nb5l6              1/1       Running                 0          11m
ceph-mon-fpd6w                              3/3       Running                 0          13m
ceph-mon-keyring-generator-kvsgc            0/1       Completed               0          13m
ceph-namespace-client-key-generator-fg9nv   0/1       Completed               0          7m
ceph-osd-dev-sdb-4qnd9                      0/1       Init:CrashLoopBackOff   6          13m
ceph-osd-dev-sdb-glk52                      0/1       Init:CrashLoopBackOff   6          13m
ceph-osd-keyring-generator-9ztc7            0/1       Completed               0          13m
ceph-rbd-provisioner-5bc57f5f64-pmnr6       1/1       Running                 0          13m
ceph-rbd-provisioner-5bc57f5f64-sllbc       1/1       Running                 0          13m
ceph-rgw-597dcb57f7-9nzrz                   0/1       Pending                 0          13m
ceph-rgw-keyring-generator-5j844            0/1       Completed               0          13m
ceph-storage-keys-generator-t22q7           0/1       Completed               0          13m

kubectl describe pod/ceph-mon-fpd6w -n ceph

Events:
  Type     Reason       Age                From           Message
  ----     ------       ----               ----           -------
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-mon-keyring" : secrets "ceph-mon-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secrets "ceph-bootstrap-mds-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secrets "ceph-bootstrap-rgw-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secrets "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secrets "ceph-client-admin-keyring" not found

But I can get the secrets:

# kubectl get secret -n ceph
NAME                                  TYPE                                  DATA      AGE
ceph-bootstrap-mds-keyring            Opaque                                1         16m
ceph-bootstrap-mgr-keyring            Opaque                                1         16m
ceph-bootstrap-osd-keyring            Opaque                                1         16m
ceph-bootstrap-rgw-keyring            Opaque                                1         16m
ceph-client-admin-keyring             Opaque                                1         16m
ceph-keystone-user-rgw                Opaque                                7         16m
ceph-mon-keyring                      Opaque                                1         16m
default-token-htx2q                   kubernetes.io/service-account-token   3         16m
pvc-ceph-client-key                   kubernetes.io/rbd                     1         10m
pvc-ceph-conf-combined-storageclass   kubernetes.io/rbd                     1         16m

kubectl logs -f ceph-mon-check-9fd5797bc-nb5l6 -n ceph

+ echo '2018-08-16 03:43:15  /watch_mon_health.sh: sleep 30 sec'
+ return 0
+ sleep 30
+ '[' true ']'
+ log 'checking for zombie mons'
+ '[' -z 'checking for zombie mons' ']'
++ date '+%F %T'
2018-08-16 03:43:45  /watch_mon_health.sh: checking for zombie mons
+ TIMESTAMP='2018-08-16 03:43:45'
+ echo '2018-08-16 03:43:45  /watch_mon_health.sh: checking for zombie mons'
+ return 0
+ CLUSTER=ceph
+ /check_zombie_mons.py
2018-08-16 03:43:46.122705 7fb2f3994700  0 librados: client.admin authentication error (1) Operation not permitted
[errno 1] error connecting to the cluster
Traceback (most recent call last):
  File "/check_zombie_mons.py", line 30, in <module>
    current_mons = extract_mons_from_monmap()
  File "/check_zombie_mons.py", line 18, in extract_mons_from_monmap
    monmap = subprocess.check_output(monmap_command, shell=True)
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command 'ceph --cluster=${CLUSTER} mon getmap > /tmp/monmap && monmaptool -f /tmp/monmap --print' returned non-zero exit status 1

Unable to mount volumes : timeout expired waiting for volumes to attach/mount

Is this a request for help?: Yes


Is this a BUG REPORT or FEATURE REQUEST? Bug report

Version of Helm and Kubernetes:

kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"} 
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.7", GitCommit:"dd5e1a2978fd0b97d9b78e1564398aeea7e7fe92", GitTreeState:"clean", BuildDate:"2018-04-18T23:58:35Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"} 
helm version                                                                                                                                     root@kubernetes
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

Which chart: ceph-helm

What happened:

Unable to mount volumes for pod "mypod_default(e68c8e3e-6578-11e8-87c4-e83935e84dc8)": timeout expired waiting for volumes to attach/mount for pod "default"/"mypod". list of unattached/unmounted volumes=[vol1]

How to reproduce it (as minimally and precisely as possible):
http://docs.ceph.com/docs/master/start/kube-helm/

Anything else we need to know:

The ceph cluster is working fine

  ceph -s
  cluster:
    id:     88596d9e-b478-47a9-8208-3a6cea33d1d4
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum kubernetes
    mgr: kubernetes(active)
    mds: cephfs-1/1/1 up  {0=mds-ceph-mds-5696f9df5d-jbsgz=up:active}
    osd: 1 osds: 1 up, 1 in
    rgw: 1 daemon active
 
  data:
    pools:   7 pools, 176 pgs
    objects: 213 objects, 3391 bytes
    usage:   108 MB used, 27134 MB / 27243 MB avail
    pgs:     176 active+clean

Everything in the ceph namespace works fine.
In the mon pod I can see an image created for the PVC:

rbd ls
kubernetes-dynamic-pvc-0077fdf9-6578-11e8-b1f8-b63c3e9e1eaa
kubectl get pvc                                                                                                                                  root@kubernetes
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-pvc              Bound     pvc-c9d07cf9-6578-11e8-87c4-e83935e84dc8   1Gi        RWO            ceph-rbd       29m

I have changed resolv.conf and added kube-dns as a nameserver; I can resolve
ceph-mon.ceph and ceph-mon.ceph.svc.local from the host node.

Some kubelet logs that I found related:
juin 01 11:24:19 kubernetes kubelet[32612]: E0601 11:24:19.587800 32612 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/rbd/[ceph-mon.ceph.svc.cluster.local:6789]:kubernetes-dynamic-pvc-0077fdf9-6578-11e8-b1f8-b63c3e9e1eaa\"" failed. No retries permitted until 2018-06-01 11:24:51.582365588 +0200 CEST m=+162261.330642194 (durationBeforeRetry 32s). Error: "MountVolume.WaitForAttach failed for volume \"pvc-004d66b7-6578-11e8-87c4-e83935e84dc8\" (UniqueName: \"kubernetes.io/rbd/[ceph-mon.ceph.svc.cluster.local:6789]:kubernetes-dynamic-pvc-0077fdf9-6578-11e8-b1f8-b63c3e9e1eaa\") pod \"ldap-ss-0\" (UID: \"f63432e0-6579-11e8-87c4-e83935e84dc8\") : error: exit status 1, rbd output: 2018-06-01 11:19:19.513914 7f1cf1f227c0 -1 did not load config file, using default settings.\n2018-06-01 11:19:19.579955 7f1cf1f20700 0 -- IP@:0/1002573 >> IP@:6789/0 pipe(0x3a2a3f0 sd=3 :53578 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).connect protocol feature mismatch, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:19.580065 7f1cf1f20700 0 -- IP@:0/1002573 >> IP@:6789/0 pipe(0x3a2a3f0 sd=3 :53578 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).fault\n2018-06-01 11:19:19.580437 7f1cf1f20700 0 -- IP@:0/1002573 >> 10.1.0.146:6789/0 pipe(0x3a2a3f0 sd=3 :53580 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).connect protocol feature mismatch, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:19.781427 7f1cf1f20700 0 -- 10.1.0.146:0/1002573 >> 10.1.0.146:6789/0 pipe(0x3a2a3f0 sd=3 :53584 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).**connect protocol feature mismatch**, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:20.182401 7f1cf1f20700 0 -- 10.1.0.146:0/1002573 >> 10.1.0.146:6789/0 pipe(0x3a2a3f0 sd=3 :53588 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).**connect protocol feature mismatch**, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:20.983428 7f1cf1f20700 0 -- IP@:0/1002573 >> ip@:6789/0 pipe(0x3a2a3f0 sd=3 :53610 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).conne

I don't know why it tries to connect to my Kubernetes node's external IP on port 6789; that port is only open to the ceph-mon headless service, which is:

kubectl get svc -n ceph                                                                                                                    root@kubernetes
NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
ceph-mon   ClusterIP   None            <none>        6789/TCP   1h

From the Kubernetes node I can telnet to port 6789:

telnet ceph-mon.ceph 6789                                                                                                                  root@kubernetes 
Trying IP@ ... 
Connected to ceph-mon.ceph. 

The "connect protocol feature mismatch" in the kubelet logs could have something to do with the following note

Important: Kubernetes uses the RBD kernel module to map RBDs to hosts. Luminous requires CRUSH_TUNABLES 5 (Jewel). The minimal kernel version for these tunables is 4.5. If your kernel does not support these tunables, run ceph osd crush tunables hammer

in the ceph-helm doc
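A hedged sketch of acting on that note; run the kernel check on the Kubernetes node that maps the RBD, and run the ceph commands wherever the admin keyring is available (for example inside a mon pod):

# The RBD kernel client needs >= 4.5 for the Luminous CRUSH_TUNABLES 5 (Jewel) profile
uname -r

# Fall back to the hammer tunables profile if the kernel is too old, then verify
ceph osd crush tunables hammer
ceph osd crush show-tunables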

Cannot generate keyring

Is this a request for help?:


BUG REPORT:

When I run the command "kubectl get pods -n ceph" after "helm install ..", I got:

NAME                                   READY   STATUS     RESTARTS   AGE
ceph-mds-75dc968dc7-kvp67              0/1     Init:0/2   0          18m
ceph-mgr-75f4c4dc76-x5r74              0/1     Init:0/2   0          18m
ceph-mon-check-85d59b5fd4-xjczs        0/1     Init:0/2   0          18m
ceph-mon-zjzpp                         0/3     Init:0/2   0          18m
ceph-osd-dev-sdd-clhpt                 0/1     Init:0/3   0          18m
ceph-osd-dev-sde-2fkmt                 0/1     Init:0/3   0          18m
ceph-rbd-provisioner-d59d65f74-nkj48   1/1     Running    0          18m
ceph-rbd-provisioner-d59d65f74-xqhmt   1/1     Running    0          18m
ceph-rgw-7598c7788-mbjt9               0/1     Init:0/2   0          18m

Then I looked at the events of one of these failed pods and got:

Events:
  Type     Reason       Age                      From               Message
  ----     ------       ----                     ----               -------
  Normal   Scheduled    21m                      default-scheduler  Successfully assigned ceph/ceph-rgw-7598c7788-mbjt9 to minion
  Warning  FailedMount  14m (x5 over 14m)        kubelet, minion    MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secret "ceph-bootstrap-rgw-keyring" not found
  Warning  FailedMount  14m (x5 over 14m)        kubelet, minion    MountVolume.SetUp failed for volume "ceph-mon-keyring" : secret "ceph-mon-keyring" not found
  Warning  FailedMount  14m (x5 over 14m)        kubelet, minion    MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secret "ceph-bootstrap-mds-keyring" not found
  Warning  FailedMount  8m15s (x11 over 14m)     kubelet, minion    MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secret "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount  4m11s (x13 over 14m)     kubelet, minion    MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secret "ceph-client-admin-keyring" not found
  Warning  FailedMount  <invalid> (x9 over 12m)  kubelet, minion    Unable to mount volumes for pod "ceph-rgw-7598c7788-mbjt9_ceph(44a7d63a-9230-11e9-9dd2-5820b10bbe42)": timeout expired waiting for volumes to attach or mount for pod "ceph"/"ceph-rgw-7598c7788-mbjt9". list of unmounted volumes=[ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring]. list of unattached volumes=[pod-etc-ceph ceph-bin ceph-etc pod-var-lib-ceph pod-run ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring default-token-rbjr6]

When I run the command "kubectl get job -n ceph", I got:

NAME                                  COMPLETIONS   DURATION   AGE
ceph-mds-keyring-generator            0/1           17m        17m
ceph-mgr-keyring-generator            0/1           17m        17m
ceph-mon-keyring-generator            0/1           17m        17m
ceph-namespace-client-key-generator   0/1           17m        17m
ceph-osd-keyring-generator            0/1           17m        17m
ceph-rgw-keyring-generator            0/1           17m        17m
ceph-storage-keys-generator           0/1           17m        17m

I think the file "ceph/ceph/templates/job-keyring.yaml" format is wrong for k8s v1.14.1.

{{- if .Values.manifests.job_keyring }}
{{- $envAll := . }}
{{- if .Values.deployment.storage_secrets }}
{{- range $key1, $cephBootstrapKey := tuple "mds" "osd" "rgw" "mon" "mgr"}}
{{- $jobName := print $cephBootstrapKey "-keyring-generator" }}
---
apiVersion: batch/v1
kind: Job 
metadata:
  name: ceph-{{ $jobName }}
spec:
  template:
    metadata:
      labels:
{{ tuple $envAll "ceph" $jobName | include "helm-toolkit.snippets.kubernetes_metadata_labels" | indent 8 }}
    spec:
      restartPolicy: OnFailure
      nodeSelector:
        {{ $envAll.Values.labels.jobs.node_selector_key }}: {{ $envAll.Values.labels.jobs.node_selector_value }}      containers:
        - name:  ceph-{{ $jobName }}
          image: {{ $envAll.Values.images.ceph_config_helper }}
          imagePullPolicy: {{ $envAll.Values.images.pull_policy }}
{{ tuple $envAll $envAll.Values.pod.resources.jobs.secret_provisioning | include "helm-toolkit.snippets.kubernetes_resources" | indent 10 }}          env:
            - name: DEPLOYMENT_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CEPH_GEN_DIR 
              value: /opt/ceph
            - name: CEPH_TEMPLATES_DIR
              value: /opt/ceph/templates
            {{- if eq $cephBootstrapKey "mon"}}
            - name: CEPH_KEYRING_NAME
              value: ceph.mon.keyring
            - name: CEPH_KEYRING_TEMPLATE
              value: mon.keyring
            {{- else }}
            - name: CEPH_KEYRING_NAME
              value: ceph.keyring
            - name: CEPH_KEYRING_TEMPLATE
              value: bootstrap.keyring.{{ $cephBootstrapKey }}
            {{- end }}
            - name: KUBE_SECRET_NAME
              value: {{  index $envAll.Values.secrets.keyrings $cephBootstrapKey }}
          command:
            - /opt/ceph/ceph-key.sh
          volumeMounts:
            - name: ceph-bin
              mountPath: /opt/ceph/ceph-key.sh
              subPath: ceph-key.sh
              readOnly: true
            - name: ceph-bin
              mountPath: /opt/ceph/ceph-key.py
              subPath: ceph-key.py
              readOnly: true
            - name: ceph-templates
              mountPath: /opt/ceph/templates
              readOnly: true
      volumes:
        - name: ceph-bin
          configMap:
            name: ceph-bin
            defaultMode: 0555
        - name: ceph-templates
          configMap:
            name: ceph-templates
            defaultMode: 0444
{{- end }}
{{- end }}
{{- end }}

Version of Helm and Kubernetes:

Helm

Client: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}

Kubernetes

Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
local/ceph

What happened:

kubectl describe pod ceph-storage-keys-generator-9x6z8 -n ceph
++ ceph_gen_key
++ python /opt/ceph/ceph-key.py
+ CEPH_CLIENT_KEY=AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==
+ create_kube_key AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w== ceph.client.admin.keyring admin.keyring ceph-client-admin-keyring
+ CEPH_KEYRING=AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==
+ CEPH_KEYRING_NAME=ceph.client.admin.keyring
+ CEPH_KEYRING_TEMPLATE=admin.keyring
+ KUBE_SECRET_NAME=ceph-client-admin-keyring
+ kubectl get --namespace ceph secrets ceph-client-admin-keyring
Error from server (NotFound): secrets "ceph-client-admin-keyring" not found
+ cat
+ kubectl create --namespace ceph -f -
++ kube_ceph_keyring_gen AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w== admin.keyring
++ CEPH_KEY=AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==
++ CEPH_KEY_TEMPLATE=admin.keyring
++ sed 's|{{ key }}|AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==|' /opt/ceph/templates/admin.keyring
++ base64
++ tr -d '\n'
error: error validating "STDIN": error validating data: unknown; if you choose to ignore these errors, turn validation off with --validate=false

What you expected to happen:

When I run the command below, all jobs should be completed.

kubectl get job -n ceph

And after the command below, I should be able to see the various keyrings.

kubectl get secret -n ceph

How to reproduce it (as minimally and precisely as possible):

Just run command

helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml

Anything else we need to know:

Waiting for you to resolve this online. Thanks! @rootfs @zmc @alfredodeza @liewegas @jecluis @ktdreyer

ceph-rbd-provisioner failing to create ceph-rbd storage class

Is this a request for help?: Yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one): Bug Report

Version of Helm and Kubernetes: Helm 2.11 Minikube v0.24.1

Which chart: Not sure

What happened: Trying to create PVC after all pods come up

What you expected to happen: I create my PVC and everything works

How to reproduce it (as minimally and precisely as possible): Not sure

Anything else we need to know: I created a cluster based on your setup guide, but when I go to create a PVC in Kubernetes:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-pvc
spec:
  accessModes:
   - ReadWriteOnce
  resources:
    requests:
       storage: 20Gi
  storageClassName: ceph-rbd

It stays pending forever. This is the error I'm seeing in the logs.

  Warning  ProvisioningFailed  6s                ceph.com/rbd ceph-rbd-provisioner-85d57d8799-j6vbn cb6ca04b-3094-11e9-9c4b-0242ac110006  Failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: signal: segmentation fault (core dumped), command output: 2019-02-14 20:28:34.084001 7f94dca8fd80 -1 did not load config file, using default settings.
2019-02-14 20:28:34.143846 7f94dca8fd80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
*** Caught signal (Segmentation fault) **
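A few hedged diagnostic steps for the missing-keyring error above; the resource names assume the chart defaults seen elsewhere on this page:

# Check that the admin keyring secret was generated in the ceph namespace
kubectl -n ceph get secret ceph-client-admin-keyring

# Inspect the provisioner deployment and its recent logs for mount or auth errors
kubectl -n ceph describe deploy ceph-rbd-provisioner
kubectl -n ceph logs deploy/ceph-rbd-provisioner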

Can we push ceph-helm to the official helm/charts?

This repo was forked from the official helm/charts and has drifted too far from the main repo. Please push it back to the official repo so we can all develop and test there, and finally make the Ceph Helm deployment stable.

ceph-osd fails to initialize when using non-bluestore and journals

Is this a request for help?:NO


Is this a BUG REPORT or FEATURE REQUEST? (choose one):BUG REPORT

Version of Helm and Kubernetes:Kubernetes 1.11, helm: 2.11

Which chart:ceph-helm

What happened:
Tried to use a non-bluestore configuration with separate journals.

What you expected to happen:
OSD should activate.

How to reproduce it (as minimally and precisely as possible):
Simple two-drive system. Set up values.yaml to use non-bluestore object storage.

Anything else we need to know:
Attaching potential "fix" for ceph/ceph/templates/bin/_osd_disk_activate.sh.tpl

#!/bin/bash
set -ex

function osd_activate {
  if [[ -z "${OSD_DEVICE}" ]]; then
    log "ERROR- You must provide a device to build your OSD ie: /dev/sdb"
    exit 1
  fi

  CEPH_DISK_OPTIONS=""
  CEPH_OSD_OPTIONS=""

  DATA_UUID=$(blkid -o value -s PARTUUID ${OSD_DEVICE}1)
  LOCKBOX_UUID=$(blkid -o value -s PARTUUID ${OSD_DEVICE}3 || true)
  JOURNAL_PART=$(dev_part ${OSD_DEVICE} 2)
  ACTUAL_OSD_DEVICE=$(readlink -f ${OSD_DEVICE}) # resolve /dev/disk/by-* names

  # watch the udev event queue, and exit if all current events are handled
  udevadm settle --timeout=600

  # wait till partition exists then activate it
  if [[ -n "${OSD_JOURNAL}" ]]; then
    #wait_for_file /dev/disk/by-partuuid/${OSD_JOURNAL_UUID}
    #chown ceph. /dev/disk/by-partuuid/${OSD_JOURNAL_UUID}
    #CEPH_OSD_OPTIONS="${CEPH_OSD_OPTIONS} --osd-journal /dev/disk/by-partuuid/${OSD_JOURNAL_UUID}"
    CEPH_OSD_OPTIONS="${CEPH_OSD_OPTIONS}"
  else
    wait_for_file $(dev_part ${OSD_DEVICE} 1)
    chown ceph. $JOURNAL_PART
  fi

  chown ceph. /var/log/ceph

  DATA_PART=$(dev_part ${OSD_DEVICE} 1)
  MOUNTED_PART=${DATA_PART}

  if [[ ${OSD_DMCRYPT} -eq 1 ]]; then
    echo "Mounting LOCKBOX directory"
    # NOTE(leseb): adding || true so when this bug will be fixed the entrypoint will not fail
    # Ceph bug tracker: http://tracker.ceph.com/issues/18945
    mkdir -p /var/lib/ceph/osd-lockbox/${DATA_UUID}
    mount /dev/disk/by-partuuid/${LOCKBOX_UUID} /var/lib/ceph/osd-lockbox/${DATA_UUID} || true
    CEPH_DISK_OPTIONS="$CEPH_DISK_OPTIONS --dmcrypt"
    MOUNTED_PART="/dev/mapper/${DATA_UUID}"
  fi

  ceph-disk -v --setuser ceph --setgroup disk activate ${CEPH_DISK_OPTIONS} --no-start-daemon ${DATA_PART}

  OSD_ID=$(grep "${MOUNTED_PART}" /proc/mounts | awk '{print $2}' | grep -oh '[0-9]*')
  OSD_PATH=$(get_osd_path $OSD_ID)
  OSD_KEYRING="$OSD_PATH/keyring"
  OSD_WEIGHT=$(df -P -k $OSD_PATH | tail -1 | awk '{ d= $2/1073741824 ; r = sprintf("%.2f", d); print r }')
  ceph ${CLI_OPTS} --name=osd.${OSD_ID} --keyring=$OSD_KEYRING osd crush create-or-move -- ${OSD_ID} ${OSD_WEIGHT} ${CRUSH_LOCATION}

  log "SUCCESS"
  exec /usr/bin/ceph-osd ${CLI_OPTS} ${CEPH_OSD_OPTIONS} -f -i ${OSD_ID} --setuser ceph --setgroup disk
}
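
For context, a hypothetical ceph-overrides.yaml fragment for this scenario, written as a shell heredoc; the per-device journal key is an assumption about the chart's values schema, so check the chart's values.yaml before relying on it:

# hypothetical values for a filestore OSD with a separate journal device
cat > ceph-overrides.yaml <<'EOF'
osd_devices:
  - name: data
    device: /dev/sdb
    journal: /dev/sdc     # assumed key; verify against the chart's values.yaml
    zap: "1"
EOF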

ceph-etc configmap changes after adding, removing, or changing OSDs

Is this a request for help?:
No

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
It's a bug report

Version of Helm and Kubernetes:
helm version
2.8.0

kubernetes version:
1.9.2

Which chart:
ceph-helm

What happened:
After the deployment, I added a new disk to ceph-overrides.yml and ran:
helm upgrade ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml
This added the new disk, but it also changed the ceph-etc configmap, which caused the cluster fsid to change.
If the machine running the OSD containers gets rebooted, the new containers come up with the new fsid while the mons are still running with the old fsid, which puts the OSD containers into CrashLoopBackOff.

What you expected to happen:
Configmap should have stayed the same.

How to reproduce it (as minimally and precisely as possible):
Finish the deployment, add a new disk or change an existing OSD mapping to a new disk in the ceph-overrides.yml file, then issue the command:
helm upgrade ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml
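
A quick way to confirm the symptom is to compare the fsid in the rendered ConfigMap before and after the upgrade (the ceph-etc ConfigMap carries ceph.conf):

kubectl -n ceph get configmap ceph-etc -o yaml | grep fsid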

Anything else we need to know:

ceph-mgr and ceph-osd are not starting

Is this a request for help?: Yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

$ helm version
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:37:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:30:26Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Which chart: ceph

What happened: ceph-mgr and ceph-osd won't start up.
Output of ceph-mgr:

+ source variables_entrypoint.sh 
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr' 
++ : ceph 
++ : ceph-config/ceph 
++ : 
++ : 
++ : 0 
++ : dockerblade-slot6-oben.example.com 
++ : dockerblade-slot6-oben.example.com 
++ : /etc/ceph/monmap-ceph 
++ : /var/lib/ceph/mon/ceph-dockerblade-slot6-oben.example.com 
++ : 0 
++ : 0 
++ : mds-dockerblade-slot6-oben.example.com 
++ : 0 
++ : 100 
++ : 0 
++ : 0 
+++ uuidgen 
++ : 5700ffd2-02f6-4212-8a76-8a57f3fe2a04 
+++ uuidgen 
++ : 38e7aef4-c42b-457a-af33-fa8dc3ff1eb7 
++ : root=default host=dockerblade-slot6-oben.example.com 
++ : 0 
++ : cephfs 
++ : cephfs_data 
++ : 8 
++ : cephfs_metadata 
++ : 8 
++ : dockerblade-slot6-oben.example.com 
++ : 
++ : 
++ : 8080 
++ : 0 
++ : 9000 
++ : 0.0.0.0 
++ : cephnfs 
++ : dockerblade-slot6-oben.example.com 
++ : 0.0.0.0 
++ CLI_OPTS='--cluster ceph' 
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d' 
++ MOUNT_OPTS='-t xfs -o noatime,inode64' 
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-dockerblade-slot6-oben.example.com/keyring 
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring 
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring 
++ RGW_KEYRING=/var/lib/ceph/radosgw/dockerblade-slot6-oben.example.com/keyring 
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring 
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring 
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring 
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring 
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph 
+ source common_functions.sh 
++ set -ex 
+ [[ ! -e /usr/bin/ceph-mgr ]] 
+ [[ ! -e /etc/ceph/ceph.conf ]] 
+ '[' 0 -eq 1 ']' 
+ '[' '!' -e /var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring ']' 
+ timeout 10 ceph --cluster ceph auth get-or-create mgr.dockerblade-slot6-oben.example.com mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring 

and the output of ceph-osd's osd-prepare-pod:

+ export LC_ALL=C 
+ LC_ALL=C 
+ source variables_entrypoint.sh 
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr' 
++ : ceph 
++ : ceph-config/ceph 
++ : 
++ : osd_ceph_disk_prepare 
++ : 1 
++ : dockerblade-slot5-unten 
++ : dockerblade-slot5-unten 
++ : /etc/ceph/monmap-ceph 
++ : /var/lib/ceph/mon/ceph-dockerblade-slot5-unten 
++ : 0 
++ : 0 
++ : mds-dockerblade-slot5-unten 
++ : 0 
++ : 100 
++ : 0 
++ : 0 
+++ uuidgen 
++ : e101933b-67b3-4267-824f-173d2ef7a47b 
+++ uuidgen 
++ : 10dd57d2-f3c7-4cab-88ea-8e3771baeaa7 
++ : root=default host=dockerblade-slot5-unten 
++ : 0 
++ : cephfs 
++ : cephfs_data 
++ : 8 
++ : cephfs_metadata 
++ : 8 
++ : dockerblade-slot5-unten 
++ : 
++ : 
++ : 8080 
++ : 0 
++ : 9000 
++ : 0.0.0.0 
++ : cephnfs 
++ : dockerblade-slot5-unten 
++ : 0.0.0.0 
++ CLI_OPTS='--cluster ceph' 
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d' 
++ MOUNT_OPTS='-t xfs -o noatime,inode64' 
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-dockerblade-slot5-unten/keyring 
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring 
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring 
++ RGW_KEYRING=/var/lib/ceph/radosgw/dockerblade-slot5-unten/keyring 
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-dockerblade-slot5-unten/keyring 
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring 
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring 
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring 
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph 
+ source common_functions.sh 
++ set -ex 
+ is_available rpm 
+ command -v rpm 
+ is_available dpkg 
+ command -v dpkg 
+ OS_VENDOR=ubuntu 
+ source /etc/default/ceph 
++ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 
+ case "$CEPH_DAEMON" in 
+ OSD_TYPE=prepare 
+ start_osd 
+ [[ ! -e /etc/ceph/ceph.conf ]] 
+ '[' 1 -eq 1 ']' 
+ [[ ! -e /etc/ceph/ceph.client.admin.keyring ]] 
+ case "$OSD_TYPE" in 
+ source osd_disk_prepare.sh 
++ set -ex 
+ osd_disk_prepare 
+ [[ -z /dev/container/block-data ]] 
+ [[ ! -e /dev/container/block-data ]] 
+ '[' '!' -e /var/lib/ceph/bootstrap-osd/ceph.keyring ']' 
+ timeout 10 ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health 
+ exit 1 
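
Both traces die on the same pattern: a ceph CLI call against the monitors times out after 10 seconds, so the mgr cannot fetch its keyring and the OSD prepare step exits. A hedged first check is whether the mons are actually running and reachable from inside the namespace (the component=mon label and the ceph-mon container name follow the chart's mon pods as shown in other reports here; the pod name is a placeholder):

kubectl -n ceph get pods -l component=mon -o wide
kubectl -n ceph logs <mon-pod-name> -c ceph-mon --tail=50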

What you expected to happen: to start up flawlessly

Anything else we need to know:
here is my overrides.yml:

network:
  public: 10.42.0.0/16
  cluster: 10.42.0.0/16

osd_devices:
  - name: block-data
    device: /dev/container/block-data
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

I am using Rancher 2 / RKE on bare metal. I am unsure about the network setup. Maybe I have some issues here:

  • All nodes (6) can see and reach each other by IPv4 address only. Although the nodes have names, there is no DNS set up outside of the cluster.
  • Rancher/RKE sets up a flannel network with CIDR 10.42.0.0/16, which is what I used as network.public and network.cluster.

How can I adjust the number of PGs?

I have a running Ceph deployment in Kubernetes. What is the suggested way to adjust the number of PGs dynamically (as more OSDs/nodes get added) with the ceph-helm deployment?
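
One common approach is to raise pg_num and pgp_num on the affected pool directly with the Ceph CLI as OSDs are added, for example from a pod that has the admin keyring (the pod name and target values below are placeholders):

kubectl -n ceph exec <mon-pod-name> -- ceph osd pool set rbd pg_num 128
kubectl -n ceph exec <mon-pod-name> -- ceph osd pool set rbd pgp_num 128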

Mounting CephFS is giving an error.

Request for Help.
I am using ceph-helm, also running mds with it.
ceph -s shows that everything is up and running.

I am unable to mount CephFS into a pod.

I am using this ceph-test-fs.yaml

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: cephfs
  name: ceph-cephfs-test
  namespace: ceph
spec:
  nodeSelector:
    node-type: storage
  containers:
  - name: cephfs-rw
    image: busybox
    command:
    - sh
    - -c
    - while true; do sleep 1; done
    volumeMounts:
    - mountPath: "/mnt/cephfs"
      name: cephfs
  volumes:
  - name: cephfs
    cephfs:
      monitors:
# This only works if you have SkyDNS resolvable from the Kubernetes node. Otherwise you must manually put in one or more mon pod IPs.
      - ceph-mon.ceph:6789
      user: admin
      secretRef:
        name: ??

I am not sure which keyring to use to be able to mount, and I also tried all the secrets generated by the ceph-helm deployment.

Can anyone help me with mounting CephFS?

Here is the error log:

MountVolume.SetUp failed for volume "cephfs" : CephFS: mount failed: mount failed: exit status 32 Mounting command: systemd-run Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/7d82c944-e01f-11e7-bf03-001c42d61047/volumes/kubernetes.io~cephfs/cephfs --scope -- mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret ceph-mon.ceph:6789:/ /var/lib/kubelet/pods/7d82c944-e01f-11e7-bf03-001c42d61047/volumes/kubernetes.io~cephfs/cephfs Output: Running scope as unit run-10734.scope. mount: wrong fs type, bad option, bad superblock on ceph-mon.ceph:6789:/, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so.
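
On the secretRef question: the kubelet's cephfs volume plugin expects a plain Secret whose key field holds the raw client key, not a full keyring file. A hedged way to build such a secret from the running cluster (the mon pod name and the secret name are placeholders):

ADMIN_KEY=$(kubectl -n ceph exec <ceph-mon-pod> -- ceph auth get-key client.admin)
kubectl -n ceph create secret generic ceph-admin-secret --from-literal=key="${ADMIN_KEY}"
# then set secretRef.name: ceph-admin-secret in the pod spec above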

Integration of Ceph Exporter

Hi,
I was looking for a metrics exporter for Prometheus and found a general exporter at ceph/ceph-container/examples/helm/ceph.

Would it be possible to integrate the exporter into this project? I am not very good at writing helm charts, and the structure of both charts seems to differ a lot.

Thanks,
Stefan

OSD init container fails on minikube

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Bug?

Version of Helm and Kubernetes:

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T09:14:02Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"0b9efaeb34a2fc51ff8e4d34ad9bc6375459c4a4", GitTreeState:"clean", BuildDate:"2017-11-29T22:43:34Z", GoVersion:"go1.9.1", Compiler:"gc", Platform:"linux/amd64"}

Which chart:

ceph

What happened:

The OSD container fails on this line:

timeout 10 ceph ${CLI_OPTS} --name client.bootstrap-osd --keyring $OSD_BOOTSTRAP_KEYRING health || exit 1

What you expected to happen:

Success

How to reproduce it (as minimally and precisely as possible):

  • Start minikube in virtualbox mode. Attach a disk which becomes /dev/sdb. Deploy ceph-helm.

Anything else we need to know:

MON is also failing, but it appears to be failing due to a lack of storage.

$ kubectl logs -n ceph ceph-osd-sdb-jhfd8 -c osd-prepare-pod
+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ :
++ : osd_ceph_disk_prepare
++ : 1
++ : minikube
++ : minikube
++ : /etc/ceph/monmap-ceph
++ : /var/lib/ceph/mon/ceph-minikube
++ : 0
++ : 0
++ : mds-minikube
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : 9c0528e0-da2b-4801-a920-4e476b4c6a69
+++ uuidgen
++ : 19e4efd2-405f-4c70-b9fd-0c18d937fbdb
++ : root=default host=minikube
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : minikube
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : minikube
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-minikube/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/minikube/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-minikube/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ is_available rpm
+ command -v rpm
+ is_available dpkg
+ command -v dpkg
+ OS_VENDOR=ubuntu
+ source /etc/default/ceph
++ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728
+ case "$CEPH_DAEMON" in
+ OSD_TYPE=prepare
+ start_osd
+ [[ ! -e /etc/ceph/ceph.conf ]]
+ '[' 1 -eq 1 ']'
+ [[ ! -e /etc/ceph/ceph.client.admin.keyring ]]
+ case "$OSD_TYPE" in
+ source osd_disk_prepare.sh
++ set -ex
+ osd_disk_prepare
+ [[ -z /dev/sdb ]]
+ [[ ! -e /dev/sdb ]]
+ '[' '!' -e /var/lib/ceph/bootstrap-osd/ceph.keyring ']'
+ timeout 10 ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health
+ exit 1

The DNS pod cannot resolve the ceph-mon service name ceph-mon.ceph.svc.cluster.local

Is this a request for help?: yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

The kube-dns-85bc874cc5-mdzhb pod always shows: "1 dns.go:555] Could not find endpoints for service "ceph-mon" in namespace "ceph". DNS records will be created once endpoints show up."

[root@master ceph]# helm install --name=ceph local/ceph --namespace=ceph
NAME: ceph
LAST DEPLOYED: Tue Jun 12 09:53:41 2018
NAMESPACE: ceph
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
ceph-mon 1 1 0 1 0 ceph-mon=enabled 1s
ceph-osd-dev-sda 1 1 0 1 0 ceph-osd-device-dev-sda=enabled,ceph-osd=enabled 1s

==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
ceph-mds 1 1 1 0 1s
ceph-mgr 1 1 1 0 1s
ceph-mon-check 1 1 1 0 1s
ceph-rbd-provisioner 2 2 2 0 1s
ceph-rgw 1 1 1 0 1s

==> v1/Job
NAME DESIRED SUCCESSFUL AGE
ceph-mon-keyring-generator 1 0 1s
ceph-mds-keyring-generator 1 0 1s
ceph-osd-keyring-generator 1 0 1s
ceph-mgr-keyring-generator 1 0 1s
ceph-rgw-keyring-generator 1 0 1s
ceph-namespace-client-key-generator 1 0 1s
ceph-storage-keys-generator 1 0 1s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
ceph-mon-rsjkn 0/3 Init:0/2 0 1s
ceph-osd-dev-sda-jb8s7 0/1 Init:0/3 0 1s
ceph-mds-696bd98bdb-92tj2 0/1 Init:0/2 0 1s
ceph-mgr-56f45bb99c-pmpfm 0/1 Pending 0 1s
ceph-mon-check-74d98c5b95-k5xc5 0/1 Pending 0 1s
ceph-rbd-provisioner-b58659dc9-llllj 0/1 Pending 0 1s
ceph-rbd-provisioner-b58659dc9-rh4zd 0/1 ContainerCreating 0 1s
ceph-rgw-5bd9dd66c5-q5vzp 0/1 Pending 0 1s
ceph-mon-keyring-generator-nzg2l 0/1 Pending 0 1s
ceph-mds-keyring-generator-cr8ql 0/1 Pending 0 1s
ceph-osd-keyring-generator-z5jrq 0/1 Pending 0 1s
ceph-mgr-keyring-generator-kw2wj 0/1 Pending 0 1s
ceph-rgw-keyring-generator-6kghm 0/1 Pending 0 1s
ceph-namespace-client-key-generator-dk968 0/1 Pending 0 1s
ceph-storage-keys-generator-4mhhk 0/1 Pending 0 1s

==> v1/Secret
NAME TYPE DATA AGE
ceph-keystone-user-rgw Opaque 7 1s

==> v1/ConfigMap
NAME DATA AGE
ceph-bin-clients 2 1s
ceph-bin 26 1s
ceph-etc 1 1s
ceph-templates 5 1s

==> v1/StorageClass
NAME PROVISIONER AGE
ceph-rbd ceph.com/rbd 1s

==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ceph-mon ClusterIP None 6789/TCP 1s
ceph-rgw ClusterIP 10.109.46.173 8088/TCP 1s

[root@master ceph]# kubectl exec kube-dns-85bc874cc5-mdzhb -ti -n kube-system -c kubedns -- sh
/ # ps
PID USER TIME COMMAND
1 root 3:19 /kube-dns --domain=172.16.34.88. --dns-port=10053 --config-dir=/kube-dns-config --v=2
26 root 0:35 ping ceph-mon.ceph.svc.cluster.local
157 root 0:00 sh
161 root 0:00 sh
165 root 0:00 sh
/ #
/ # ping ceph-mon.ceph.svc.cluster.local
ping: bad address 'ceph-mon.ceph.svc.cluster.local'
/ #
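
Since ceph-mon is a headless service (CLUSTER-IP None in the output above), kube-dns only creates its records once a mon pod is Ready and appears as an endpoint; a quick check:

kubectl -n ceph get endpoints ceph-mon
kubectl -n ceph get pods -l component=mon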

[root@master ceph]# kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
ceph ceph-mds-696bd98bdb-rvq42 0/1 CrashLoopBackOff 6 11m
ceph ceph-mds-keyring-generator-6nrct 0/1 Completed 0 11m
ceph ceph-mgr-56f45bb99c-smqqj 0/1 CrashLoopBackOff 6 11m
ceph ceph-mgr-keyring-generator-kdjd4 0/1 Completed 0 11m
ceph ceph-mon-check-74d98c5b95-nqhmg 1/1 Running 0 11m
ceph ceph-mon-keyring-generator-7xmd8 0/1 Completed 0 11m
ceph ceph-mon-m72hp 3/3 Running 0 11m
ceph ceph-namespace-client-key-generator-cvnpw 0/1 Completed 0 11m
ceph ceph-osd-dev-sda-kzn65 0/1 Init:CrashLoopBackOff 6 11m
ceph ceph-osd-keyring-generator-48gb6 0/1 Completed 0 11m
ceph ceph-rbd-provisioner-b58659dc9-7jsnk 1/1 Running 0 11m
ceph ceph-rbd-provisioner-b58659dc9-sf6hr 1/1 Running 0 11m
ceph ceph-rgw-5bd9dd66c5-n25bn 0/1 CrashLoopBackOff 6 11m
ceph ceph-rgw-keyring-generator-vs8th 0/1 Completed 0 11m
ceph ceph-storage-keys-generator-ww7hn 0/1 Completed 0 11m
default busybox 1/1 Running 113 4d
kube-system etcd-master 1/1 Running 8 24d
kube-system heapster-69b5d4974d-9g96p 1/1 Running 10 24d
kube-system kube-apiserver-master 1/1 Running 8 24d
kube-system kube-controller-manager-master 1/1 Running 8 24d
kube-system kube-dns-85bc874cc5-mdzhb 3/3 Running 27 24d
kube-system kube-flannel-ds-b94c4 1/1 Running 12 24d
kube-system kube-flannel-ds-sqzwv 1/1 Running 10 24d
kube-system kube-proxy-9j6sq 1/1 Running 10 24d
kube-system kube-proxy-znkxj 1/1 Running 7 24d
kube-system kube-scheduler-master 1/1 Running 8 24d
kube-system kubernetes-dashboard-7d5dcdb6d9-c2sz6 1/1 Running 10 24d
kube-system monitoring-grafana-69df66f668-fpgn5 1/1 Running 10 24d
kube-system monitoring-influxdb-78d4c6f5b6-hnjg2 1/1 Running 50 24d
kube-system tiller-deploy-f9b8476d-trtml 1/1 Running 0 4d

Version of Helm and Kubernetes:

[root@master ceph]# helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

[root@master ceph]# kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:10:24Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Which chart:

What happened:
DNS pod can not resolve ceph-mon.ceph.svc.cluster.local

What you expected to happen:
DNS pod can resolve ceph-mon.ceph.svc.cluster.local

How to reproduce it (as minimally and precisely as possible):
always

Anything else we need to know:
None

unable to delete helm ceph namespace

Is this a request for help?:
yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:

[root@admin-node ~]# helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

[root@admin-node ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:08:34Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
[root@admin-node ~]#

Which chart:
Ceph

What happened:
Unable to purge the ceph release from Helm after several objects went MISSING in helm status.

[root@admin-node ~]# helm list
NAME    REVISION        UPDATED                         STATUS          CHART           NAMESPACE
ceph    1               Sat Jul  7 02:46:40 2018        DEPLOYED        ceph-0.1.0      ceph
[root@admin-node ~]#
[root@admin-node ~]# helm status ceph
LAST DEPLOYED: Sat Jul  7 02:46:40 2018
NAMESPACE: ceph
STATUS: DEPLOYED

RESOURCES:
==> v1/StorageClass
NAME      PROVISIONER   AGE
ceph-rbd  ceph.com/rbd  5d

==> MISSING
KIND           NAME
secrets        ceph-keystone-user-rgw
configmaps     ceph-bin-clients
configmaps     ceph-bin
configmaps     ceph-etc
configmaps     ceph-templates
services       ceph-mon
services       ceph-rgw
daemonsets     ceph-mon
daemonsets     ceph-osd-dev-vdc
deployments    ceph-mds
deployments    ceph-mgr
deployments    ceph-mon-check
deployments    ceph-rbd-provisioner
deployments    ceph-rgw
jobs           ceph-rgw-keyring-generator
jobs           ceph-osd-keyring-generator
jobs           ceph-mgr-keyring-generator
jobs           ceph-mon-keyring-generator
jobs           ceph-mds-keyring-generator
jobs           ceph-namespace-client-key-generator
jobs           ceph-storage-keys-generator
[root@admin-node ~]# helm delete ceph --purge --debug
[debug] Created tunnel using local port: '42271'

[debug] SERVER: "127.0.0.1:42271"

Error: namespaces "ceph" is forbidden: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "ceph"
[root@admin-node ~]#

There are no resources at the Kubernetes level. I think they got deleted when I first ran helm delete ceph.

[root@admin-node ~]# kubectl get all --namespace ceph
No resources found.
[root@admin-node ~]#

Also tried removing the Helm Tiller deployment and re-initializing:

kubectl delete deployment -n=kube-system tiller-deploy
helm init --upgrade
kubectl get po -n kube-system
helm delete ceph --purge --debug
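
The error itself is an RBAC failure on Tiller's side: the kube-system default service account is not allowed to read the ceph namespace, so the purge aborts. A hedged fix is to re-initialize Tiller with its own cluster-admin service account (the same Tiller setup another report in this collection uses) and retry the purge:

kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --upgrade --service-account tiller
helm delete ceph --purge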

What you expected to happen:
helm delete --purge should remove the ceph release from Helm.

How to reproduce it (as minimally and precisely as possible):

  • Deploy ceph using helm chart
  • helm delete ceph
  • helm delete ceph --purge

Anything else we need to know:

RGW doesn't work

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
This is a BUG REPORT

Version of Helm and Kubernetes:
k8s - 1.8.7
helm - v2.8.2

Which chart:
ceph

What happened:
The rgw container doesn't work.
It writes this log:

Initialization timeout, failed to initialize

What you expected to happen:
rgw works and binds port 8088

How to reproduce it (as minimally and precisely as possible):
just run

helm install --name=ceph local/ceph --namespace=ceph -f ./ceph-overrides.yaml

Anything else we need to know:

ceph-mon, ceph-osd-*, ceph-mgr do not start on k8s v1.9.5

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
bug

Version of Helm and Kubernetes:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.5", GitCommit:"f01a2bf98249a4db383560443a59bed0c13575df", GitTreeState:"clean", BuildDate:"2018-03-19T15:50:45Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

$ helm version
Client: &version.Version{SemVer:"v2.9.0", GitCommit:"f6025bb9ee7daf9fee0026541c90a6f557a3e0bc", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.0", GitCommit:"f6025bb9ee7daf9fee0026541c90a6f557a3e0bc", GitTreeState:"clean"}

Which chart:

$ helm ls
NAME    REVISION        UPDATED                         STATUS          CHART           NAMESPACE
ceph    1               Wed May  9 00:34:18 2018        DEPLOYED        ceph-0.1.0      ceph     

What happened:
ceph-mon, ceph-osd-*, and ceph-mgr do not start. They fail with this error:

container_linux.go:247: starting container process caused "process_linux.go:359: container init caused rootfs_linux.go:54: mounting "/var/lib/kubelet/pods/.../volume-subpaths/ceph-bootstrap-rgw-keyring/ceph-mon/8" to rootfs "/var/lib/docker/aufs/mnt/..." at "/var/lib/docker/aufs/mnt/.../var/lib/ceph/bootstrap-rgw/ceph.keyring" caused "not a directory"

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):
Run the steps from http://docs.ceph.com/docs/master/start/kube-helm/ on k8s 1.9.5

Anything else we need to know:
Looks like it's related to kubernetes/kubernetes#62417.

It's an expected error due to the security fix.

ceph-mon failing: unable to load initial keyring

Is this a request for help?: yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

$ helm version
Client: &version.Version{SemVer:"v2.8+unreleased", GitCommit:"c5f2174f264554c62278c0695d58f250d3e207c8", GitTreeState:"clean"}
Server: &version.Version{SemVer:"canary+unreleased", GitCommit:"fe9d36533901b71923c49142f5cf007f93fa926f", GitTreeState:"clean"}

Kubernetes master > 1.9

Which chart: ceph

What happened:

I compiled k8s master from source (commit 04634cb19843195) and brought up a local cluster with:

RUNTIME_CONFIG=storage.k8s.io/v1alpha1=true ALLOW_PRIVILEGED=1 FEATURE_GATES="BlockVolume=true,MountPropagation=true,CSIPersistentVolume=true,ReadOnlyAPIDataVolumes=false"  hack/local-up-cluster.sh -O

Then I installed Helm and ceph. For Helm I use:

helm init --upgrade --canary-image

#  https://github.com/kubernetes/helm/issues/2224#issuecomment-356344286
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'

For ceph I use these commands:

kubectl create namespace ceph
kubectl create -f /fast/work/ceph-helm/ceph/rbac.yaml
kubectl label node 127.0.0.1 ceph-mon=enabled ceph-mgr=enabled ceph-osd=enabled ceph-osd-device-loop=enabled

truncate -s 10G /fast/work/ceph-loop-storage
sudo touch /dev/ceph-loop-storage
sudo mount -obind /fast/work/ceph-loop-storage /dev/ceph-loop-storage

# The local helm repository contains the ceph chart.
if ! pidof helm >/dev/null; then
    helm serve &
fi

cat >/tmp/ceph-overrides.yaml <<EOF
network:
  public:   192.168.0/20
  cluster:   192.168.0.0/20

osd_devices:
  - name: loop
    device: /dev/ceph-loop-storage
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s
EOF

sudo rm -rf /varlib/ceph-helm
# in ceph-helm
helm install --name=ceph ceph/ceph --namespace=ceph -f /tmp/ceph-overrides.yaml

After this, some pods never start and ceph-mon fails:

$ kubectl -n ceph get pods
NAME                                   READY     STATUS             RESTARTS   AGE
ceph-mds-6f9cb6bd69-dlrgm              0/1       Pending            0          2h
ceph-mgr-776957b4cb-wxlc8              0/1       Init:0/2           0          2h
ceph-mon-check-57b6ddf49d-dg8j6        0/1       Init:0/2           0          2h
ceph-mon-rfzr6                         2/3       CrashLoopBackOff   28         2h
ceph-osd-loop-ctzdc                    0/1       Init:0/3           0          2h
ceph-rbd-provisioner-b58659dc9-m68qv   1/1       Running            0          2h
ceph-rbd-provisioner-b58659dc9-trzmm   1/1       Running            0          2h
ceph-rgw-dcbb695d6-tfhjg               0/1       Pending            0          2h

$ kubectl -n ceph describe pod/ceph-mon-rfzr6 
Name:           ceph-mon-rfzr6
Namespace:      ceph
Node:           127.0.0.1/127.0.0.1
Start Time:     Tue, 13 Mar 2018 17:56:58 +0100
Labels:         application=ceph
                component=mon
                controller-revision-hash=3731361699
                pod-template-generation=1
                release_group=ceph
Annotations:    <none>
Status:         Running
IP:             127.0.0.1
Controlled By:  DaemonSet/ceph-mon
Init Containers:
  init:
    Container ID:  docker://32678bca76683c38ce5cfa6b174037285ecbde2148e97b036f56c70586d34044
    Image:         docker.io/kolla/ubuntu-source-kubernetes-entrypoint:4.0.0
    Image ID:      docker-pullable://kolla/ubuntu-source-kubernetes-entrypoint@sha256:75116ab2f9f65c5fc078e68ce7facd66c1c57496947f37b7209b32f94925e53b
    Port:          <none>
    Command:
      kubernetes-entrypoint
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 13 Mar 2018 17:57:25 +0100
      Finished:     Tue, 13 Mar 2018 17:57:25 +0100
    Ready:          True
    Restart Count:  0
    Environment:
      POD_NAME:              ceph-mon-rfzr6 (v1:metadata.name)
      NAMESPACE:             ceph (v1:metadata.namespace)
      INTERFACE_NAME:        eth0
      DEPENDENCY_SERVICE:    
      DEPENDENCY_JOBS:       
      DEPENDENCY_DAEMONSET:  
      DEPENDENCY_CONTAINER:  
      COMMAND:               echo done
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
  ceph-init-dirs:
    Container ID:  docker://92f77fe2d747af0d51d78916fcba20503a171910fdd89d29346b2cdd20788324
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          <none>
    Command:
      /tmp/init_dirs.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 13 Mar 2018 17:57:27 +0100
      Finished:     Tue, 13 Mar 2018 17:57:27 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /run from pod-run (rw)
      /tmp/init_dirs.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Containers:
  cluster-audit-log-tailer:
    Container ID:  docker://928570a1606a2611e94429ceaa2973155652365c03f79b58e2d88d3471e3a731
    Image:         docker.io/alpine:latest
    Image ID:      docker-pullable://alpine@sha256:7b848083f93822dd21b0a2f14a110bd99f6efb4b838d499df6d04a49d0debf8b
    Port:          <none>
    Command:
      /tmp/log_handler.sh
    Args:
      /var/log/ceph/ceph.audit.log
    State:          Running
      Started:      Tue, 13 Mar 2018 17:57:28 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp/log_handler.sh from ceph-bin (rw)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
  cluster-log-tailer:
    Container ID:  docker://7dc91378bf88b366ca4e3f8eb731ca142a702bd9a5b4125ca5b2bd6adaa7eb1c
    Image:         docker.io/alpine:latest
    Image ID:      docker-pullable://alpine@sha256:7b848083f93822dd21b0a2f14a110bd99f6efb4b838d499df6d04a49d0debf8b
    Port:          <none>
    Command:
      /tmp/log_handler.sh
    Args:
      /var/log/ceph/ceph.log
    State:          Running
      Started:      Tue, 13 Mar 2018 17:57:28 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp/log_handler.sh from ceph-bin (rw)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
  ceph-mon:
    Container ID:  docker://49c26f3a425ac86fcaebbe6a19ed6f17fcfa79f8564c3019185326d162fb5a57
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          6789/TCP
    Command:
      /start_mon.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 13 Mar 2018 20:04:54 +0100
      Finished:     Tue, 13 Mar 2018 20:05:00 +0100
    Ready:          False
    Restart Count:  29
    Liveness:       tcp-socket :6789 delay=60s timeout=5s period=10s #success=1 #failure=3
    Readiness:      tcp-socket :6789 delay=0s timeout=5s period=10s #success=1 #failure=3
    Environment:
      K8S_HOST_NETWORK:     1
      MONMAP:               /var/lib/ceph/mon/monmap
      NAMESPACE:            ceph (v1:metadata.namespace)
      CEPH_DAEMON:          mon
      CEPH_PUBLIC_NETWORK:  192.168.0/20
      KUBECTL_PARAM:        -l application=ceph -l component=mon
      MON_IP:                (v1:status.podIP)
    Mounts:
      /common_functions.sh from ceph-bin (ro)
      /etc/ceph/ceph.client.admin.keyring from ceph-client-admin-keyring (ro)
      /etc/ceph/ceph.conf from ceph-etc (ro)
      /run from pod-run (rw)
      /start_mon.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/lib/ceph/bootstrap-mds/ceph.keyring from ceph-bootstrap-mds-keyring (rw)
      /var/lib/ceph/bootstrap-osd/ceph.keyring from ceph-bootstrap-osd-keyring (rw)
      /var/lib/ceph/bootstrap-rgw/ceph.keyring from ceph-bootstrap-rgw-keyring (rw)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  ceph-bin:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-bin
    Optional:  false
  ceph-etc:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-etc
    Optional:  false
  pod-var-log-ceph:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  pod-var-lib-ceph:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/ceph-helm/ceph/mon
    HostPathType:  
  pod-run:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  Memory
  ceph-client-admin-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-client-admin-keyring
    Optional:    false
  ceph-bootstrap-osd-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-osd-keyring
    Optional:    false
  ceph-bootstrap-mds-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-mds-keyring
    Optional:    false
  ceph-bootstrap-rgw-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-rgw-keyring
    Optional:    false
  default-token-rn798:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rn798
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  ceph-mon=enabled
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:
  Type     Reason   Age                From                Message
  ----     ------   ----               ----                -------
  Normal   Pulled   58m (x19 over 2h)  kubelet, 127.0.0.1  Container image "docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04" already present on machine
  Warning  BackOff  3m (x560 over 2h)  kubelet, 127.0.0.1  Back-off restarting failed container


$ kubectl -n ceph logs ceph-mon-rfzr6 ceph-mon
+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ : 192.168.0/20
++ : mon
++ : 0
++ : pohly-desktop
++ : pohly-desktop
++ : /var/lib/ceph/mon/monmap
++ : /var/lib/ceph/mon/ceph-pohly-desktop
++ : 1
++ : 0
++ : mds-pohly-desktop
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : a22cb2e2-eac5-49ae-85aa-8abf231ee5c1
+++ uuidgen
++ : 03e59aae-9eb7-47c9-9d4e-7348ce4ead15
++ : root=default host=pohly-desktop
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : pohly-desktop
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : pohly-desktop
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-pohly-desktop/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/pohly-desktop/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-pohly-desktop/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ [[ -z 192.168.0/20 ]]
+ [[ -z 127.0.0.1 ]]
+ [[ -z 127.0.0.1 ]]
+ [[ -z 192.168.0/20 ]]
+ get_mon_config
++ ceph-conf --lookup fsid -c /etc/ceph/ceph.conf
+ local fsid=35f6bbc7-9f08-4984-a8c0-e690f95059ca
+ timeout=10
+ MONMAP_ADD=
+ [[ -z '' ]]
+ [[ 10 -gt 0 ]]
+ [[ 1 -eq 0 ]]
++ kubectl get pods --namespace=ceph -l application=ceph -l component=mon -o template '--template={{range .items}}{{if .status.podIP}}--add {{.spec.nodeName}} {{.status.podIP}} {{end}} {{end}}'
+ MONMAP_ADD='--add 127.0.0.1 127.0.0.1  '
+ ((  timeout--  ))
+ sleep 1
+ [[ -z --add127.0.0.1127.0.0.1 ]]
+ [[ -z --add127.0.0.1127.0.0.1 ]]
+ '[' -f /var/lib/ceph/mon/monmap ']'
+ monmaptool --print /var/lib/ceph/mon/monmap
+ grep -q 127.0.0.1:6789
+ '[' 0 -eq 0 ']'
+ log '127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap'
+ '[' -z '127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap' ']'
++ date '+%F %T'
+ TIMESTAMP='2018-03-13 18:59:39'
+ echo '2018-03-13 18:59:39  /start_mon.sh: 127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap'
+ return 0
+ return
2018-03-13 18:59:39  /start_mon.sh: 127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap
+ chown ceph. /var/log/ceph
+ '[' '!' -e /var/lib/ceph/mon/ceph-pohly-desktop/keyring ']'
+ '[' '!' -e /etc/ceph/ceph.mon.keyring ']'
+ touch /etc/ceph/ceph.mon.keyring
+ '[' '!' -e /var/lib/ceph/mon/monmap ']'
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-osd/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-mds/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-mds/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-rgw/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
importing contents of /etc/ceph/ceph.client.admin.keyring into /etc/ceph/ceph.mon.keyring
+ ceph-mon --setuser ceph --setgroup ceph --cluster ceph --mkfs -i pohly-desktop --inject-monmap /var/lib/ceph/mon/monmap --keyring /etc/ceph/ceph.mon.keyring --mon-data /var/lib/ceph/mon/ceph-pohly-desktop
2018-03-13 18:59:39.715279 7fa991335f00 -1 '/var/lib/ceph/mon/ceph-pohly-desktop' already exists and is not empty: monitor may already exist
+ log SUCCESS
+ '[' -z SUCCESS ']'
++ date '+%F %T'
+ TIMESTAMP='2018-03-13 18:59:39'
+ echo '2018-03-13 18:59:39  /start_mon.sh: SUCCESS'
+ return 0
+ exec /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i pohly-desktop --mon-data /var/lib/ceph/mon/ceph-pohly-desktop --public-addr 127.0.0.1:6789
2018-03-13 18:59:39  /start_mon.sh: SUCCESS
2018-03-13 18:59:39.756113 7f3f7a7fff00  0 set uid:gid to 64045:64045 (ceph:ceph)
2018-03-13 18:59:39.756139 7f3f7a7fff00  0 ceph version 12.2.3 (2dab17a455c09584f2a85e6b10888337d1ec8949) luminous (stable), process (unknown), pid 1
2018-03-13 18:59:39.756228 7f3f7a7fff00  0 pidfile_write: ignore empty --pid-file
2018-03-13 18:59:39.764761 7f3f7a7fff00  0 load: jerasure load: lrc load: isa 
2018-03-13 18:59:39.764863 7f3f7a7fff00  0  set rocksdb option compression = kNoCompression
2018-03-13 18:59:39.764877 7f3f7a7fff00  0  set rocksdb option write_buffer_size = 33554432
2018-03-13 18:59:39.764899 7f3f7a7fff00  0  set rocksdb option compression = kNoCompression
2018-03-13 18:59:39.764904 7f3f7a7fff00  0  set rocksdb option write_buffer_size = 33554432
2018-03-13 18:59:39.765053 7f3f7a7fff00  4 rocksdb: RocksDB version: 5.4.0

2018-03-13 18:59:39.765065 7f3f7a7fff00  4 rocksdb: Git sha rocksdb_build_git_sha:@0@
2018-03-13 18:59:39.765067 7f3f7a7fff00  4 rocksdb: Compile date Feb 19 2018
2018-03-13 18:59:39.765068 7f3f7a7fff00  4 rocksdb: DB SUMMARY

2018-03-13 18:59:39.765124 7f3f7a7fff00  4 rocksdb: CURRENT file:  CURRENT

2018-03-13 18:59:39.765131 7f3f7a7fff00  4 rocksdb: IDENTITY file:  IDENTITY

2018-03-13 18:59:39.765136 7f3f7a7fff00  4 rocksdb: MANIFEST file:  MANIFEST-000110 size: 109 Bytes

2018-03-13 18:59:39.765139 7f3f7a7fff00  4 rocksdb: SST files in /var/lib/ceph/mon/ceph-pohly-desktop/store.db dir, Total Num: 1, files: 000004.sst 

2018-03-13 18:59:39.765142 7f3f7a7fff00  4 rocksdb: Write Ahead Log file in /var/lib/ceph/mon/ceph-pohly-desktop/store.db: 000111.log size: 0 ; 

2018-03-13 18:59:39.765144 7f3f7a7fff00  4 rocksdb:                         Options.error_if_exists: 0
2018-03-13 18:59:39.765145 7f3f7a7fff00  4 rocksdb:                       Options.create_if_missing: 0
2018-03-13 18:59:39.765146 7f3f7a7fff00  4 rocksdb:                         Options.paranoid_checks: 1
2018-03-13 18:59:39.765147 7f3f7a7fff00  4 rocksdb:                                     Options.env: 0x557b73957f20
2018-03-13 18:59:39.765149 7f3f7a7fff00  4 rocksdb:                                Options.info_log: 0x557b757689a0
2018-03-13 18:59:39.765150 7f3f7a7fff00  4 rocksdb:                          Options.max_open_files: -1
2018-03-13 18:59:39.765151 7f3f7a7fff00  4 rocksdb:                Options.max_file_opening_threads: 16
2018-03-13 18:59:39.765152 7f3f7a7fff00  4 rocksdb:                               Options.use_fsync: 0
2018-03-13 18:59:39.765153 7f3f7a7fff00  4 rocksdb:                       Options.max_log_file_size: 0
2018-03-13 18:59:39.765155 7f3f7a7fff00  4 rocksdb:                  Options.max_manifest_file_size: 18446744073709551615
2018-03-13 18:59:39.765156 7f3f7a7fff00  4 rocksdb:                   Options.log_file_time_to_roll: 0
2018-03-13 18:59:39.765157 7f3f7a7fff00  4 rocksdb:                       Options.keep_log_file_num: 1000
2018-03-13 18:59:39.765158 7f3f7a7fff00  4 rocksdb:                    Options.recycle_log_file_num: 0
2018-03-13 18:59:39.765159 7f3f7a7fff00  4 rocksdb:                         Options.allow_fallocate: 1
2018-03-13 18:59:39.765160 7f3f7a7fff00  4 rocksdb:                        Options.allow_mmap_reads: 0
2018-03-13 18:59:39.765161 7f3f7a7fff00  4 rocksdb:                       Options.allow_mmap_writes: 0
2018-03-13 18:59:39.765162 7f3f7a7fff00  4 rocksdb:                        Options.use_direct_reads: 0
2018-03-13 18:59:39.765178 7f3f7a7fff00  4 rocksdb:                        Options.use_direct_io_for_flush_and_compaction: 0
2018-03-13 18:59:39.765179 7f3f7a7fff00  4 rocksdb:          Options.create_missing_column_families: 0
2018-03-13 18:59:39.765180 7f3f7a7fff00  4 rocksdb:                              Options.db_log_dir: 
2018-03-13 18:59:39.765181 7f3f7a7fff00  4 rocksdb:                                 Options.wal_dir: /var/lib/ceph/mon/ceph-pohly-desktop/store.db
2018-03-13 18:59:39.765182 7f3f7a7fff00  4 rocksdb:                Options.table_cache_numshardbits: 6
2018-03-13 18:59:39.765183 7f3f7a7fff00  4 rocksdb:                      Options.max_subcompactions: 1
2018-03-13 18:59:39.765185 7f3f7a7fff00  4 rocksdb:                  Options.max_background_flushes: 1
2018-03-13 18:59:39.765186 7f3f7a7fff00  4 rocksdb:                         Options.WAL_ttl_seconds: 0
2018-03-13 18:59:39.765186 7f3f7a7fff00  4 rocksdb:                       Options.WAL_size_limit_MB: 0
2018-03-13 18:59:39.765187 7f3f7a7fff00  4 rocksdb:             Options.manifest_preallocation_size: 4194304
2018-03-13 18:59:39.765189 7f3f7a7fff00  4 rocksdb:                     Options.is_fd_close_on_exec: 1
2018-03-13 18:59:39.765190 7f3f7a7fff00  4 rocksdb:                   Options.advise_random_on_open: 1
2018-03-13 18:59:39.765191 7f3f7a7fff00  4 rocksdb:                    Options.db_write_buffer_size: 0
2018-03-13 18:59:39.765192 7f3f7a7fff00  4 rocksdb:         Options.access_hint_on_compaction_start: 1
2018-03-13 18:59:39.765193 7f3f7a7fff00  4 rocksdb:  Options.new_table_reader_for_compaction_inputs: 0
2018-03-13 18:59:39.765194 7f3f7a7fff00  4 rocksdb:               Options.compaction_readahead_size: 0
2018-03-13 18:59:39.765195 7f3f7a7fff00  4 rocksdb:           Options.random_access_max_buffer_size: 1048576
2018-03-13 18:59:39.765196 7f3f7a7fff00  4 rocksdb:           Options.writable_file_max_buffer_size: 1048576
2018-03-13 18:59:39.765197 7f3f7a7fff00  4 rocksdb:                      Options.use_adaptive_mutex: 0
2018-03-13 18:59:39.765198 7f3f7a7fff00  4 rocksdb:                            Options.rate_limiter: (nil)
2018-03-13 18:59:39.765199 7f3f7a7fff00  4 rocksdb:     Options.sst_file_manager.rate_bytes_per_sec: 0
2018-03-13 18:59:39.765200 7f3f7a7fff00  4 rocksdb:                          Options.bytes_per_sync: 0
2018-03-13 18:59:39.765201 7f3f7a7fff00  4 rocksdb:                      Options.wal_bytes_per_sync: 0
2018-03-13 18:59:39.765202 7f3f7a7fff00  4 rocksdb:                       Options.wal_recovery_mode: 2
2018-03-13 18:59:39.765203 7f3f7a7fff00  4 rocksdb:                  Options.enable_thread_tracking: 0
2018-03-13 18:59:39.765204 7f3f7a7fff00  4 rocksdb:         Options.allow_concurrent_memtable_write: 1
2018-03-13 18:59:39.765205 7f3f7a7fff00  4 rocksdb:      Options.enable_write_thread_adaptive_yield: 1
2018-03-13 18:59:39.765206 7f3f7a7fff00  4 rocksdb:             Options.write_thread_max_yield_usec: 100
2018-03-13 18:59:39.765207 7f3f7a7fff00  4 rocksdb:            Options.write_thread_slow_yield_usec: 3
2018-03-13 18:59:39.765208 7f3f7a7fff00  4 rocksdb:                               Options.row_cache: None
2018-03-13 18:59:39.765209 7f3f7a7fff00  4 rocksdb:                              Options.wal_filter: None
2018-03-13 18:59:39.765210 7f3f7a7fff00  4 rocksdb:             Options.avoid_flush_during_recovery: 0
2018-03-13 18:59:39.765211 7f3f7a7fff00  4 rocksdb:             Options.base_background_compactions: 1
2018-03-13 18:59:39.765212 7f3f7a7fff00  4 rocksdb:             Options.max_background_compactions: 1
2018-03-13 18:59:39.765214 7f3f7a7fff00  4 rocksdb:             Options.avoid_flush_during_shutdown: 0
2018-03-13 18:59:39.765215 7f3f7a7fff00  4 rocksdb:             Options.delayed_write_rate : 16777216
2018-03-13 18:59:39.765216 7f3f7a7fff00  4 rocksdb:             Options.max_total_wal_size: 0
2018-03-13 18:59:39.765217 7f3f7a7fff00  4 rocksdb:             Options.delete_obsolete_files_period_micros: 21600000000
2018-03-13 18:59:39.765218 7f3f7a7fff00  4 rocksdb:                   Options.stats_dump_period_sec: 600
2018-03-13 18:59:39.765219 7f3f7a7fff00  4 rocksdb: Compression algorithms supported:
2018-03-13 18:59:39.765220 7f3f7a7fff00  4 rocksdb: 	Snappy supported: 0
2018-03-13 18:59:39.765221 7f3f7a7fff00  4 rocksdb: 	Zlib supported: 0
2018-03-13 18:59:39.765223 7f3f7a7fff00  4 rocksdb: 	Bzip supported: 0
2018-03-13 18:59:39.765224 7f3f7a7fff00  4 rocksdb: 	LZ4 supported: 0
2018-03-13 18:59:39.765224 7f3f7a7fff00  4 rocksdb: 	ZSTD supported: 0
2018-03-13 18:59:39.765225 7f3f7a7fff00  4 rocksdb: Fast CRC32 supported: 0
2018-03-13 18:59:39.765616 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2609] Recovering from manifest file: MANIFEST-000110

2018-03-13 18:59:39.765753 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/column_family.cc:407] --------------- Options for column family [default]:

2018-03-13 18:59:39.765776 7f3f7a7fff00  4 rocksdb:               Options.comparator: leveldb.BytewiseComparator
2018-03-13 18:59:39.765778 7f3f7a7fff00  4 rocksdb:           Options.merge_operator: 
2018-03-13 18:59:39.765780 7f3f7a7fff00  4 rocksdb:        Options.compaction_filter: None
2018-03-13 18:59:39.765782 7f3f7a7fff00  4 rocksdb:        Options.compaction_filter_factory: None
2018-03-13 18:59:39.765783 7f3f7a7fff00  4 rocksdb:         Options.memtable_factory: SkipListFactory
2018-03-13 18:59:39.765785 7f3f7a7fff00  4 rocksdb:            Options.table_factory: BlockBasedTable
2018-03-13 18:59:39.765815 7f3f7a7fff00  4 rocksdb:            table_factory options:   flush_block_policy_factory: FlushBlockBySizePolicyFactory (0x557b754480f8)
  cache_index_and_filter_blocks: 1
  cache_index_and_filter_blocks_with_high_priority: 1
  pin_l0_filter_and_index_blocks_in_cache: 1
  index_type: 0
  hash_index_allow_collision: 1
  checksum: 1
  no_block_cache: 0
  block_cache: 0x557b75754400
  block_cache_name: LRUCache
  block_cache_options:
    capacity : 134217728
    num_shard_bits : 4
    strict_capacity_limit : 0
    high_pri_pool_ratio: 0.000
  block_cache_compressed: (nil)
  persistent_cache: (nil)
  block_size: 4096
  block_size_deviation: 10
  block_restart_interval: 16
  index_block_restart_interval: 1
  filter_policy: rocksdb.BuiltinBloomFilter
  whole_key_filtering: 1
  format_version: 2

2018-03-13 18:59:39.765873 7f3f7a7fff00  4 rocksdb:        Options.write_buffer_size: 33554432
2018-03-13 18:59:39.765879 7f3f7a7fff00  4 rocksdb:  Options.max_write_buffer_number: 2
2018-03-13 18:59:39.765881 7f3f7a7fff00  4 rocksdb:          Options.compression: NoCompression
2018-03-13 18:59:39.765884 7f3f7a7fff00  4 rocksdb:                  Options.bottommost_compression: Disabled
2018-03-13 18:59:39.765886 7f3f7a7fff00  4 rocksdb:       Options.prefix_extractor: nullptr
2018-03-13 18:59:39.765888 7f3f7a7fff00  4 rocksdb:   Options.memtable_insert_with_hint_prefix_extractor: nullptr
2018-03-13 18:59:39.765889 7f3f7a7fff00  4 rocksdb:             Options.num_levels: 7
2018-03-13 18:59:39.765890 7f3f7a7fff00  4 rocksdb:        Options.min_write_buffer_number_to_merge: 1
2018-03-13 18:59:39.765892 7f3f7a7fff00  4 rocksdb:     Options.max_write_buffer_number_to_maintain: 0
2018-03-13 18:59:39.765893 7f3f7a7fff00  4 rocksdb:            Options.compression_opts.window_bits: -14
2018-03-13 18:59:39.765894 7f3f7a7fff00  4 rocksdb:                  Options.compression_opts.level: -1
2018-03-13 18:59:39.765895 7f3f7a7fff00  4 rocksdb:               Options.compression_opts.strategy: 0
2018-03-13 18:59:39.765897 7f3f7a7fff00  4 rocksdb:         Options.compression_opts.max_dict_bytes: 0
2018-03-13 18:59:39.765898 7f3f7a7fff00  4 rocksdb:      Options.level0_file_num_compaction_trigger: 4
2018-03-13 18:59:39.765899 7f3f7a7fff00  4 rocksdb:          Options.level0_slowdown_writes_trigger: 20
2018-03-13 18:59:39.765901 7f3f7a7fff00  4 rocksdb:              Options.level0_stop_writes_trigger: 36
2018-03-13 18:59:39.765902 7f3f7a7fff00  4 rocksdb:                   Options.target_file_size_base: 67108864
2018-03-13 18:59:39.765903 7f3f7a7fff00  4 rocksdb:             Options.target_file_size_multiplier: 1
2018-03-13 18:59:39.765905 7f3f7a7fff00  4 rocksdb:                Options.max_bytes_for_level_base: 268435456
2018-03-13 18:59:39.765916 7f3f7a7fff00  4 rocksdb: Options.level_compaction_dynamic_level_bytes: 0
2018-03-13 18:59:39.765917 7f3f7a7fff00  4 rocksdb:          Options.max_bytes_for_level_multiplier: 10.000000
2018-03-13 18:59:39.765921 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[0]: 1
2018-03-13 18:59:39.765923 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[1]: 1
2018-03-13 18:59:39.765925 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[2]: 1
2018-03-13 18:59:39.765926 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[3]: 1
2018-03-13 18:59:39.765927 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[4]: 1
2018-03-13 18:59:39.765928 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[5]: 1
2018-03-13 18:59:39.765930 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[6]: 1
2018-03-13 18:59:39.765931 7f3f7a7fff00  4 rocksdb:       Options.max_sequential_skip_in_iterations: 8
2018-03-13 18:59:39.765932 7f3f7a7fff00  4 rocksdb:                    Options.max_compaction_bytes: 1677721600
2018-03-13 18:59:39.765933 7f3f7a7fff00  4 rocksdb:                        Options.arena_block_size: 4194304
2018-03-13 18:59:39.765934 7f3f7a7fff00  4 rocksdb:   Options.soft_pending_compaction_bytes_limit: 68719476736
2018-03-13 18:59:39.765935 7f3f7a7fff00  4 rocksdb:   Options.hard_pending_compaction_bytes_limit: 274877906944
2018-03-13 18:59:39.765937 7f3f7a7fff00  4 rocksdb:       Options.rate_limit_delay_max_milliseconds: 100
2018-03-13 18:59:39.765938 7f3f7a7fff00  4 rocksdb:                Options.disable_auto_compactions: 0
2018-03-13 18:59:39.765940 7f3f7a7fff00  4 rocksdb:                         Options.compaction_style: kCompactionStyleLevel
2018-03-13 18:59:39.765951 7f3f7a7fff00  4 rocksdb:                           Options.compaction_pri: kByCompensatedSize
2018-03-13 18:59:39.765952 7f3f7a7fff00  4 rocksdb:  Options.compaction_options_universal.size_ratio: 1
2018-03-13 18:59:39.765954 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.min_merge_width: 2
2018-03-13 18:59:39.765955 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.max_merge_width: 4294967295
2018-03-13 18:59:39.765957 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.max_size_amplification_percent: 200
2018-03-13 18:59:39.765958 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.compression_size_percent: -1
2018-03-13 18:59:39.765959 7f3f7a7fff00  4 rocksdb: Options.compaction_options_fifo.max_table_files_size: 1073741824
2018-03-13 18:59:39.765961 7f3f7a7fff00  4 rocksdb:                   Options.table_properties_collectors: 
2018-03-13 18:59:39.765962 7f3f7a7fff00  4 rocksdb:                   Options.inplace_update_support: 0
2018-03-13 18:59:39.765963 7f3f7a7fff00  4 rocksdb:                 Options.inplace_update_num_locks: 10000
2018-03-13 18:59:39.765965 7f3f7a7fff00  4 rocksdb:               Options.memtable_prefix_bloom_size_ratio: 0.000000
2018-03-13 18:59:39.765967 7f3f7a7fff00  4 rocksdb:   Options.memtable_huge_page_size: 0
2018-03-13 18:59:39.765968 7f3f7a7fff00  4 rocksdb:                           Options.bloom_locality: 0
2018-03-13 18:59:39.765969 7f3f7a7fff00  4 rocksdb:                    Options.max_successive_merges: 0
2018-03-13 18:59:39.765970 7f3f7a7fff00  4 rocksdb:                Options.optimize_filters_for_hits: 0
2018-03-13 18:59:39.765971 7f3f7a7fff00  4 rocksdb:                Options.paranoid_file_checks: 0
2018-03-13 18:59:39.765972 7f3f7a7fff00  4 rocksdb:                Options.force_consistency_checks: 0
2018-03-13 18:59:39.765983 7f3f7a7fff00  4 rocksdb:                Options.report_bg_io_stats: 0
2018-03-13 18:59:39.767128 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2859] Recovered from manifest file:/var/lib/ceph/mon/ceph-pohly-desktop/store.db/MANIFEST-000110 succeeded,manifest_file_number is 110, next_file_number is 112, last_sequence is 5, log_number is 0,prev_log_number is 0,max_column_family is 0

2018-03-13 18:59:39.767145 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2867] Column family [default] (ID 0), log number is 109

2018-03-13 18:59:39.767239 7f3f7a7fff00  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1520967579767228, "job": 1, "event": "recovery_started", "log_files": [111]}
2018-03-13 18:59:39.767250 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/db_impl_open.cc:482] Recovering log #111 mode 2
2018-03-13 18:59:39.767358 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2395] Creating manifest 113

2018-03-13 18:59:39.769145 7f3f7a7fff00  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1520967579769139, "job": 1, "event": "recovery_finished"}
2018-03-13 18:59:39.773075 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/db_impl_open.cc:1063] DB pointer 0x557b75868000
2018-03-13 18:59:39.773292 7f3f7a7fff00  0 mon.pohly-desktop does not exist in monmap, will attempt to join an existing cluster
2018-03-13 18:59:39.773699 7f3f7a7fff00  0 using public_addr 127.0.0.1:6789/0 -> 127.0.0.1:6789/0
2018-03-13 18:59:39.774640 7f3f7a7fff00  0 starting mon.pohly-desktop rank -1 at public addr 127.0.0.1:6789/0 at bind addr 127.0.0.1:6789/0 mon_data /var/lib/ceph/mon/ceph-pohly-desktop fsid 8811b121-9e17-43f7-a276-87246527f00a
2018-03-13 18:59:39.774806 7f3f7a7fff00  0 starting mon.pohly-desktop rank -1 at 127.0.0.1:6789/0 mon_data /var/lib/ceph/mon/ceph-pohly-desktop fsid 8811b121-9e17-43f7-a276-87246527f00a
2018-03-13 18:59:39.775574 7f3f7a7fff00  1 mon.pohly-desktop@-1(probing) e0 preinit fsid 8811b121-9e17-43f7-a276-87246527f00a
2018-03-13 18:59:39.775702 7f3f7a7fff00  1 mon.pohly-desktop@-1(probing).mds e0 Unable to load 'last_metadata'
2018-03-13 18:59:39.776171 7f3f7a7fff00 -1 auth: error reading file: /var/lib/ceph/mon/ceph-pohly-desktop/keyring: can't open /var/lib/ceph/mon/ceph-pohly-desktop/keyring: (2) No such file or directory
2018-03-13 18:59:39.776181 7f3f7a7fff00 -1 mon.pohly-desktop@-1(probing) e0 unable to load initial keyring /etc/ceph/ceph.mon.pohly-desktop.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
2018-03-13 18:59:39.776185 7f3f7a7fff00 -1 failed to initialize

Note the failure: e0 unable to load initial keyring /etc/ceph/ceph.mon.pohly-desktop.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,

What you expected to happen:

I expect all pods to run normally, except for rgw and mds (no host for those).

How to reproduce it (as minimally and precisely as possible):

See above.

Anything else we need to know:

Note that I am working around issue #50 with ReadOnlyAPIDataVolumes=false

Bad Header Magic on two of three OSD nodes

Is this a request for help?: YES


Is this a BUG REPORT? (choose one):BUG REPORT

Version of Helm and Kubernetes:

Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
chrisp@px-chrisp1:~/APICv2018DevInstall$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:43:26Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
ceph-helm

What happened:
When deploying with three OSD nodes, two of the nodes (and it changes each time) fail to start with a "bad header magic" error. See the log below.

ceph-osd-dev-sdb-6mjwq                      0/1       Running     1          2m
ceph-osd-dev-sdb-9rgmd                      1/1       Running     0          2m
ceph-osd-dev-sdb-nrv8v                      0/1       Running     1          2m
chrisp@px-chrisp1:~/$ kubectl logs -nceph po/ceph-osd-dev-sdb-nrv8v  osd-activate-pod
+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ :
++ : osd_ceph_disk_activate
++ : 1
++ : px-chrisp3
++ : px-chrisp3
++ : /etc/ceph/monmap-ceph
++ : /var/lib/ceph/mon/ceph-px-chrisp3
++ : 0
++ : 0
++ : mds-px-chrisp3
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : 57ea3535-932a-410f-bf05-f6386e6f9b54
+++ uuidgen
++ : 472f01d3-5053-4ab4-9aef-9435fc48c484
++ : root=default host=px-chrisp3
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : px-chrisp3
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : px-chrisp3
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-px-chrisp3/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/px-chrisp3/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-px-chrisp3/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ is_available rpm
+ command -v rpm
+ is_available dpkg
+ command -v dpkg
+ OS_VENDOR=ubuntu
+ source /etc/default/ceph
++ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728
+ case "$CEPH_DAEMON" in
+ OSD_TYPE=activate
+ start_osd
+ [[ ! -e /etc/ceph/ceph.conf ]]
+ '[' 1 -eq 1 ']'
+ [[ ! -e /etc/ceph/ceph.client.admin.keyring ]]
+ case "$OSD_TYPE" in
+ source osd_disk_activate.sh
++ set -ex
+ osd_activate
+ [[ -z /dev/sdb ]]
+ CEPH_DISK_OPTIONS=
+ CEPH_OSD_OPTIONS=
++ blkid -o value -s PARTUUID /dev/sdb1
+ DATA_UUID=5ada2967-155e-4208-86c4-21e7edfae0f1
++ blkid -o value -s PARTUUID /dev/sdb3
++ true
+ LOCKBOX_UUID=
++ dev_part /dev/sdb 2
++ local osd_device=/dev/sdb
++ local osd_partition=2
++ [[ -L /dev/sdb ]]
++ [[ b == [0-9] ]]
++ echo /dev/sdb2
+ JOURNAL_PART=/dev/sdb2
++ readlink -f /dev/sdb
+ ACTUAL_OSD_DEVICE=/dev/sdb
+ udevadm settle --timeout=600
+ [[ -n '' ]]
++ dev_part /dev/sdb 1
++ local osd_device=/dev/sdb
++ local osd_partition=1
++ [[ -L /dev/sdb ]]
++ [[ b == [0-9] ]]
++ echo /dev/sdb1
+ wait_for_file /dev/sdb1
+ timeout 10 bash -c 'while [ ! -e /dev/sdb1 ]; do echo '\''Waiting for /dev/sdb1 to show up'\'' && sleep 1 ; done'
+ chown ceph. /dev/sdb2
+ chown ceph. /var/log/ceph
++ dev_part /dev/sdb 1
++ local osd_device=/dev/sdb
++ local osd_partition=1
++ [[ -L /dev/sdb ]]
++ [[ b == [0-9] ]]
++ echo /dev/sdb1
+ DATA_PART=/dev/sdb1
+ MOUNTED_PART=/dev/sdb1
+ [[ 0 -eq 1 ]]
+ ceph-disk -v --setuser ceph --setgroup disk activate --no-start-daemon /dev/sdb1
main_activate: path = /dev/sdb1
get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
command: Running command: /sbin/blkid -o udev -p /dev/sdb1
command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdb1
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
mount: Mounting /dev/sdb1 on /var/lib/ceph/tmp/mnt.BPKzys with options noatime,inode64
command_check_call: Running command: /bin/mount -t xfs -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/tmp/mnt.BPKzys
activate: Cluster uuid is 56d8e493-f75d-43b0-af75-b5e9ed708416
command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
activate: Cluster name is ceph
activate: OSD uuid is 5ada2967-155e-4208-86c4-21e7edfae0f1
activate: OSD id is 1
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup init
command: Running command: /usr/bin/ceph-detect-init --default sysvinit
activate: Marking with init system none
command: Running command: /bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.BPKzys/none
activate: ceph osd.1 data dir is ready at /var/lib/ceph/tmp/mnt.BPKzys
move_mount: Moving mount to final location...
command_check_call: Running command: /bin/mount -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/osd/ceph-1
command_check_call: Running command: /bin/umount -l -- /var/lib/ceph/tmp/mnt.BPKzys
++ grep /dev/sdb1 /proc/mounts
++ awk '{print $2}'
++ grep -oh '[0-9]*'
+ OSD_ID=1
++ get_osd_path 1
++ echo /var/lib/ceph/osd/ceph-1/
+ OSD_PATH=/var/lib/ceph/osd/ceph-1/
+ OSD_KEYRING=/var/lib/ceph/osd/ceph-1//keyring
++ df -P -k /var/lib/ceph/osd/ceph-1/
++ tail -1
++ awk '{ d= $2/1073741824 ; r = sprintf("%.2f", d); print r }'
+ OSD_WEIGHT=0.09
+ ceph --cluster ceph --name=osd.1 --keyring=/var/lib/ceph/osd/ceph-1//keyring osd crush create-or-move -- 1 0.09 root=default host=px-chrisp3
create-or-move updated item name 'osd.1' weight 0.09 at location {host=px-chrisp3,root=default} to crush map
+ log SUCCESS
+ '[' -z SUCCESS ']'
++ date '+%F %T'
+ TIMESTAMP='2018-08-05 15:05:13'
+ echo '2018-08-05 15:05:13  /start_osd.sh: SUCCESS'
+ return 0
+ exec /usr/bin/ceph-osd --cluster ceph -f -i 1 --setuser ceph --setgroup disk
2018-08-05 15:05:13  /start_osd.sh: SUCCESS
starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
2018-08-05 15:05:13.822997 7fa44b9aae00 -1 journal do_read_entry(323584): bad header magic
2018-08-05 15:05:13.823010 7fa44b9aae00 -1 journal do_read_entry(323584): bad header magic
2018-08-05 15:05:13.835202 7fa44b9aae00 -1 osd.1 10 log_to_monitors {default=true} 

I have a disk on /dev/sdb with no partitions on all nodes; I even remove the partitions on install to ensure they are not there. I also remove /var/lib/ceph-helm.

What you expected to happen:
I would expect all three pods to start.

How to reproduce it (as minimally and precisely as possible):
I followed these instructions https://github.com/helm/helm#docs
Samples from my make script
CleanUp

        ssh $(workerNode) sudo kubeadm reset --force 
        ssh $(workerNode) sudo rm -rf /var/lib/ceph-helm
        ssh $(workerNode) sudo rm -rf /var/kubernetes
        ssh $(workerNode) "( echo d ; echo 1 ; echo d ; echo w ) | sudo fdisk /dev/sdb"
        
        ssh $(workerNode2) sudo kubeadm reset --force  
        ssh $(workerNode2) sudo rm -rf   /var/lib/ceph-helm 
        ssh $(workerNode2) sudo rm -rf /var/kubernetes
        ssh $(workerNode2) "( echo d ; echo 1 ; echo d ; echo w ) | sudo fdisk /dev/sdb"
        ( echo d ; echo 1 ; echo d ; echo w ) | sudo fdisk /dev/sdb
        sudo rm -rf ~/.kube
        sudo rm -rf ~/.helm
        sudo rm -rf /var/kubernetes
        sudo rm -rf  /var/lib/ceph-helm
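
The fdisk piping above only deletes partition table entries; it does not clear old Ceph data or journal signatures left on the disk. A more thorough cleanup would be something like the following (a sketch, assuming gdisk and util-linux are installed on each node; this is not something the chart does):

# zap the GPT/MBR partition tables and wipe filesystem/ceph signatures
sudo sgdisk --zap-all /dev/sdb
sudo wipefs --all /dev/sdb
# optionally zero the first few MiB to clear any leftover journal header
sudo dd if=/dev/zero of=/dev/sdb bs=1M count=10 oflag=direct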

installCeph

        kubectl create namespace ceph
        $(MAKE) -C ceph-helm/ceph/
        kubectl create -f ceph-helm/ceph/rbac.yaml
        kubectl label node px-chrisp1 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-mon=enabled ceph-mgr=enabled
        kubectl label node px-chrisp2 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-mon=enabled ceph-mgr=enabled
        kubectl label node px-chrisp3 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-mon=enabled ceph-mgr=enabled
        helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml || helm upgrade ceph local/ceph -f ~/ceph-overrides.yaml --recreate-pods

Anything else we need to know:
K8s over three nodes with the master on one of the nodes. Trying to set up an HA K8s cluster (management will be done after ceph).

I have also tried with the latest images from docker hub to no avail.

Full k8s artifacts

NAME                                            READY     STATUS      RESTARTS   AGE
pod/ceph-mds-666578c5f5-plknd                   0/1       Pending     0          3m
pod/ceph-mds-keyring-generator-9qvq9            0/1       Completed   0          3m
pod/ceph-mgr-69c4b4d4bb-ptwv5                   1/1       Running     1          3m
pod/ceph-mgr-keyring-generator-bvqjm            0/1       Completed   0          3m
pod/ceph-mon-7lrrk                              3/3       Running     0          3m
pod/ceph-mon-check-59499b664d-c95nf             1/1       Running     0          3m
pod/ceph-mon-fk2qx                              3/3       Running     0          3m
pod/ceph-mon-h727g                              3/3       Running     0          3m
pod/ceph-mon-keyring-generator-stlsf            0/1       Completed   0          3m
pod/ceph-namespace-client-key-generator-hdqs8   0/1       Completed   0          3m
pod/ceph-osd-dev-sdb-pjw7l                      0/1       Running     1          3m
pod/ceph-osd-dev-sdb-rtgnb                      1/1       Running     0          3m
pod/ceph-osd-dev-sdb-vzbp5                      0/1       Running     1          3m
pod/ceph-osd-keyring-generator-jzj2x            0/1       Completed   0          3m
pod/ceph-rbd-provisioner-5bc57f5f64-2x5kp       1/1       Running     0          3m
pod/ceph-rbd-provisioner-5bc57f5f64-x45mz       1/1       Running     0          3m
pod/ceph-rgw-58c67497fb-sdvkp                   0/1       Pending     0          3m
pod/ceph-rgw-keyring-generator-5b4gr            0/1       Completed   0          3m
pod/ceph-storage-keys-generator-qvhzw           0/1       Completed   0          3m

NAME               TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
service/ceph-mon   ClusterIP   None          <none>        6789/TCP   3m
service/ceph-rgw   ClusterIP   10.110.4.11   <none>        8088/TCP   3m

NAME                              DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                                      AGE
daemonset.apps/ceph-mon           3         3         3         3            3           ceph-mon=enabled                                   3m
daemonset.apps/ceph-osd-dev-sdb   3         3         1         3            1           ceph-osd-device-dev-sdb=enabled,ceph-osd=enabled   3m

NAME                                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ceph-mds               1         1         1            0           3m
deployment.apps/ceph-mgr               1         1         1            1           3m
deployment.apps/ceph-mon-check         1         1         1            1           3m
deployment.apps/ceph-rbd-provisioner   2         2         2            2           3m
deployment.apps/ceph-rgw               1         1         1            0           3m

NAME                                              DESIRED   CURRENT   READY     AGE
replicaset.apps/ceph-mds-666578c5f5               1         1         0         3m
replicaset.apps/ceph-mgr-69c4b4d4bb               1         1         1         3m
replicaset.apps/ceph-mon-check-59499b664d         1         1         1         3m
replicaset.apps/ceph-rbd-provisioner-5bc57f5f64   2         2         2         3m
replicaset.apps/ceph-rgw-58c67497fb               1         1         0         3m

NAME                                            DESIRED   SUCCESSFUL   AGE
job.batch/ceph-mds-keyring-generator            1         1            3m
job.batch/ceph-mgr-keyring-generator            1         1            3m
job.batch/ceph-mon-keyring-generator            1         1            3m
job.batch/ceph-namespace-client-key-generator   1         1            3m
job.batch/ceph-osd-keyring-generator            1         1            3m
job.batch/ceph-rgw-keyring-generator            1         1            3m
job.batch/ceph-storage-keys-generator           1         1            3m

pod "ceph-osd-dev-" blocked after the k8s recovered from a crash

mount ${OSD_DEVICE}1 ${tmp_osd_mount}

I installed a ceph cluster using ceph-helm, but my kubernetes cluster crashed for some reason. After I restored the k8s cluster, all the pods recovered except the ceph-osd-dev- pods.

The initContainer osd-prepare-pod logs:

2018-08-25 04:37:41  /start_osd.sh: Checking if it belongs to this cluster
++ echo 2712
+ tmp_osd_mount=/var/lib/ceph/tmp/2712/
+ mkdir -p /var/lib/ceph/tmp/2712/
+ mount /dev/loop01 /var/lib/ceph/tmp/2712/
mount: special device /dev/loop01 does not exist

I checked the devices in /dev/:

$ ls /dev/loop*
/dev/loop0  /dev/loop0p1  /dev/loop0p2  /dev/loop-control

I suspect /dev/loop0p1 is the device wanted, so I made a symbolic link, and IT WORKS!
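
For reference, a minimal sketch of that workaround, using the device names from the listing above (/dev/loop01 is the name the prepare script tried to mount):

# run on the affected node: point the name the script expects at the real partition device
sudo ln -s /dev/loop0p1 /dev/loop01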

I think there is something wrong with it. The ceph daemon image is

registry.docker-cn.com/ceph/daemon                                         tag-build-master-luminous-ubuntu-16.04   c48fa6936ae5        6 months ago        445 MB

Ceph OSD's readiness probe fails.

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:
Latest Master Branch.

Which chart:
Ceph-Helm

What happened:
I have deployed the ceph-helm chart onto a kubernetes cluster with the following overrides.
I am running MDS too.
network:
  public: 10.244.0.0/16
  cluster: 10.244.0.0/16

ceph_mgr_modules_config:
  dashboard:
    port: 7000

osd_directory:
  enabled: true

manifests:
  deployment_rgw: false
  service_rgw: false
  daemonset_osd: true

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

All the services run correctly except the OSDs. I am using osd_directory: enabled.
The OSDs fail on the readiness probe.

Readiness probe failed: dial tcp 10.211.55.186:6800: getsockopt: connection refused Back-off restarting failed container Error syncing pod

What you expected to happen:
I expected all the OSDs to be running too, along with mon, mgr, and mds.

How to reproduce it (as minimally and precisely as possible):
My cluster setup.
1 Master node
2 Worker nodes

  1. Use the following overrides file.

network:
  public: 10.244.0.0/16
  cluster: 10.244.0.0/16

ceph_mgr_modules_config:
  dashboard:
    port: 7000

osd_directory:
  enabled: true

manifests:
  deployment_rgw: false
  service_rgw: false
  daemonset_osd: true

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

  1. Please replace the public/cluster networks with whatever is applicable in your kubernetes cluster.

  2. Add ceph-mon=enabled,ceph-mds=enabled,ceph-mgr=enabled to master node.
    Add ceph-osd=enabled to two worker nodes.

Follow the deployment instructions at http://docs.ceph.com/docs/master/start/kube-helm/ with the above changes.
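
A hedged way to narrow the readiness failure down (the pod name is a placeholder): check whether the OSD daemon inside the failing pod ever starts listening on the port the probe dials.

# inspect the failing OSD pod and its logs (add -c <container> if the pod has several containers)
kubectl -n ceph describe pod <ceph-osd-pod>
kubectl -n ceph logs <ceph-osd-pod>
# check for a listener in the OSD port range (6800-7300) inside the pod, if ss/netstat is available in the image
kubectl -n ceph exec <ceph-osd-pod> -- ss -tln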

Anything else we need to know:

Error on OSD pod: "MountVolume.SetUp failed for volume ..." using ceph-helm on GKE

I'm trying to deploy Ceph on GKE (k8s) using ceph-helm, but after running "helm install ..." the osd pods can't be created due to the error "MountVolume.SetUp failed for volume ...". The full error is below.

GKE env:
2 nodes: n1-standard-1 (1 virtual CPU, 3.75 GB RAM, 100 GB HDD, 2 mounted 375 GB SSDs)
kubernetes: 1.9.7-gke.6 or 1.10.7-gke.6

I used these instructions (my script is below):
http://docs.ceph.com/docs/mimic/start/kube-helm/
but it fails on the "helm install ..." command.

kubectl get pods -n ceph

NAME                                        READY     STATUS                  RESTARTS   AGE
ceph-mds-5696f9df5d-nmgtb                   0/1       Pending                 0          12m
ceph-mds-keyring-generator-rh9tr            0/1       Completed               0          12m
ceph-mgr-8656b978df-w4mt6                   1/1       Running                 2          12m
ceph-mgr-keyring-generator-t2x7j            0/1       Completed               0          12m
ceph-mon-check-7d49bd686c-nmpw5             1/1       Running                 0          12m
ceph-mon-keyring-generator-hpbcg            0/1       Completed               0          12m
ceph-mon-xjjs4                              3/3       Running                 0          12m
ceph-namespace-client-key-generator-np2kv   0/1       Completed               0          12m
ceph-osd-dev-sdb-5wzs6                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-dev-sdb-zwldd                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-dev-sdc-qsqpl                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-dev-sdc-x4722                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-keyring-generator-xlmmb            0/1       Completed               0          12m
ceph-rbd-provisioner-5544dcbcf5-gb9ws       1/1       Running                 0          12m
ceph-rbd-provisioner-5544dcbcf5-hnmjm       1/1       Running                 0          12m
ceph-rgw-65b4bd8cc5-24fxz                   0/1       Pending                 0          12m
ceph-rgw-keyring-generator-4fp2j            0/1       Completed               0          12m
ceph-storage-keys-generator-x5nzl           0/1       Completed               0          12m

Describing a failed osd pod shows:

Events:
  Type     Reason                 Age                From                                                        Message
  ----     ------                 ----               ----                                                        -------
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "run-udev"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "pod-var-lib-ceph"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "pod-run"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "ceph-etc"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "ceph-bin"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "devices"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "default-token-mdmzq"
  Warning  FailedMount            14m (x3 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secrets "ceph-client-admin-keyring" not found
  Warning  FailedMount            14m (x3 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secrets "ceph-bootstrap-mds-keyring" not found
  Warning  FailedMount            14m (x4 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secrets "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount            14m (x4 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-mon-keyring" : secrets "ceph-mon-keyring" not found
  Warning  FailedMount            14m (x4 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secrets "ceph-bootstrap-rgw-keyring" not found
  Normal   Pulled                 9m (x5 over 11m)   kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  Container image "docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04" already present on machine
  Warning  BackOff                4m (x29 over 11m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  Back-off restarting failed container
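
The FailedMount events above all point at missing keyring secrets, which are normally produced by the keyring-generator jobs. A hedged set of checks, using only names that appear in the pod listing above:

# did the generator jobs complete, and did they actually create the secrets?
kubectl -n ceph get jobs
kubectl -n ceph get secrets | grep keyring
kubectl -n ceph logs job/ceph-osd-keyring-generator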

Command "lsblk -f" on both nodes shows

NAME   FSTYPE LABEL           UUID                                 MOUNTPOINT
sdb                                                                /mnt/disks/ssd0
├─sdb2
└─sdb1
sdc                                                                /mnt/disks/ssd1
├─sdc2
└─sdc1
sda                                                                
โ””โ”€sda1 ext4   cloudimg-rootfs 819b0621-c9ea-4d69-b955-966a1b7c9cff /

Command "gdisk -l /dev/sdb" (and sdc) on osd nodes shows

GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdc: 98304000 sectors, 375.0 GiB
Logical sector size: 4096 bytes
Disk identifier (GUID): 86E73D28-8AD8-4F5A-B58C-12C61E508C96
Partition table holds up to 128 entries
First usable sector is 6, last usable sector is 98303994
Partitions will be aligned on 256-sector boundaries
Total free space is 250 sectors (1000.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1         1310976        98303994   370.0 GiB   F804  ceph data
   2             256         1310975   5.0 GiB     F802  ceph journal

Commands below show no error
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c ceph-mon | grep error
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c cluster-log-tailer | grep error
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c cluster-audit-log-tailer | grep error

Output "kubectl logs -n ceph pod/ceph-mon-xjjs4 -c ceph-mon"

+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ : 172.21.0.0/20
++ : mon
++ : 0
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : /var/lib/ceph/mon/monmap
++ : /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 1
++ : 0
++ : mds-gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : 63d877a7-1d14-4882-bdde-a6995abbf4a3
+++ uuidgen
++ : d7796d8c-7c16-4765-bcb4-e49b4e34c8cf
++ : root=default host=gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ [[ -z 172.21.0.0/20 ]]
+ [[ -z 10.140.0.2 ]]
+ [[ -z 10.140.0.2 ]]
+ [[ -z 172.21.0.0/20 ]]
+ get_mon_config
++ ceph-conf --lookup fsid -c /etc/ceph/ceph.conf
+ local fsid=ba3982a0-3a07-45b6-b69b-02bc37deeb00
+ timeout=10
+ MONMAP_ADD=
+ [[ -z '' ]]
+ [[ 10 -gt 0 ]]
+ [[ 1 -eq 0 ]]
++ kubectl get pods --namespace=ceph -l application=ceph -l component=mon -o template '--template={{range .items}}{{if .status.podIP}}--add {{.spec.nodeName}} {{.status.podIP}} {{end}} {{end}}'
+ MONMAP_ADD='--add gke-standard-cluster-2-default-pool-8b55990f-bj85 10.140.0.2  '
+ ((  timeout--  ))
+ sleep 1
+ [[ -z --addgke-standard-cluster-2-default-pool-8b55990f-bj8510.140.0.2 ]]
+ [[ -z --addgke-standard-cluster-2-default-pool-8b55990f-bj8510.140.0.2 ]]
+ '[' -f /var/lib/ceph/mon/monmap ']'
+ monmaptool --create --add gke-standard-cluster-2-default-pool-8b55990f-bj85 10.140.0.2 --fsid ba3982a0-3a07-45b6-b69b-02bc37deeb00 /var/lib/ceph/mon/monmap --clobber
monmaptool: monmap file /var/lib/ceph/mon/monmap
monmaptool: set fsid to ba3982a0-3a07-45b6-b69b-02bc37deeb00
monmaptool: writing epoch 0 to /var/lib/ceph/mon/monmap (1 monitors)
+ chown ceph. /var/log/ceph
+ [[ ! -e /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring ]]
+ [[ ! -e /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/done ]]
+ '[' '!' -e /etc/ceph/ceph.mon.keyring.seed ']'
+ cp -vf /etc/ceph/ceph.mon.keyring.seed /etc/ceph/ceph.mon.keyring
'/etc/ceph/ceph.mon.keyring.seed' -> '/etc/ceph/ceph.mon.keyring'
+ '[' '!' -e /var/lib/ceph/mon/monmap ']'
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-osd/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-mds/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-mds/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-rgw/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
importing contents of /etc/ceph/ceph.client.admin.keyring into /etc/ceph/ceph.mon.keyring
+ ceph-mon --setuser ceph --setgroup ceph --cluster ceph --mkfs -i gke-standard-cluster-2-default-pool-8b55990f-bj85 --monmap /var/lib/ceph/mon/monmap --keyring /etc/ceph/ceph.mon.keyring --mon-data /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85
+ touch /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/done
+ log SUCCESS
+ '[' -z SUCCESS ']'
++ date '+%F %T'
2018-10-19 18:44:44  /start_mon.sh: SUCCESS
+ TIMESTAMP='2018-10-19 18:44:44'
+ echo '2018-10-19 18:44:44  /start_mon.sh: SUCCESS'
+ return 0
+ exec /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i gke-standard-cluster-2-default-pool-8b55990f-bj85 --mon-data /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85 --public-addr 10.140.0.2:6789
2018-10-19 18:44:44.694633 7fcdefd43f00  0 set uid:gid to 64045:64045 (ceph:ceph)
2018-10-19 18:44:44.694866 7fcdefd43f00  0 ceph version 12.2.3 (2dab17a455c09584f2a85e6b10888337d1ec8949) luminous (stable), process (unknown), pid 1
2018-10-19 18:44:44.695016 7fcdefd43f00  0 pidfile_write: ignore empty --pid-file
2018-10-19 18:44:44.702045 7fcdefd43f00  0 load: jerasure load: lrc load: isa
2018-10-19 18:44:44.702378 7fcdefd43f00  0  set rocksdb option compression = kNoCompression
2018-10-19 18:44:44.702472 7fcdefd43f00  0  set rocksdb option write_buffer_size = 33554432
2018-10-19 18:44:44.702548 7fcdefd43f00  0  set rocksdb option compression = kNoCompression
2018-10-19 18:44:44.702609 7fcdefd43f00  0  set rocksdb option write_buffer_size = 33554432
2018-10-19 18:44:44.702874 7fcdefd43f00  4 rocksdb: RocksDB version: 5.4.0

2018-10-19 18:44:44.702934 7fcdefd43f00  4 rocksdb: Git sha rocksdb_build_git_sha:@0@
2018-10-19 18:44:44.702970 7fcdefd43f00  4 rocksdb: Compile date Feb 19 2018
2018-10-19 18:44:44.703023 7fcdefd43f00  4 rocksdb: DB SUMMARY

2018-10-19 18:44:44.703128 7fcdefd43f00  4 rocksdb: CURRENT file:  CURRENT

2018-10-19 18:44:44.703183 7fcdefd43f00  4 rocksdb: IDENTITY file:  IDENTITY

2018-10-19 18:44:44.703225 7fcdefd43f00  4 rocksdb: MANIFEST file:  MANIFEST-000001 size: 13 Bytes

2018-10-19 18:44:44.703281 7fcdefd43f00  4 rocksdb: SST files in /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/store.db dir, Total Num: 0, files:

2018-10-19 18:44:44.703333 7fcdefd43f00  4 rocksdb: Write Ahead Log file in /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/store.db: 000003.log size: 1103 ;

2018-10-19 18:44:44.703369 7fcdefd43f00  4 rocksdb:                         Options.error_if_exists: 0
2018-10-19 18:44:44.703422 7fcdefd43f00  4 rocksdb:                       Options.create_if_missing: 0
2018-10-19 18:44:44.703457 7fcdefd43f00  4 rocksdb:                         Options.paranoid_checks: 1
...

My script for ceph installation

#INSTALL AND START HELM
sudo apt-get update
mkdir ~/helm
cd ~/helm
wget https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-386.tar.gz
tar -zxvf helm-v2.11.0-linux-386.tar.gz
sudo mv linux-386/helm /usr/local/bin/helm

# RUN PROXY
kubectl proxy --port=8080

#CONFIGURE YOUR CEPH CLUSTER
cd ~
cat<<EOF>tiller-rbac-config.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
EOF
kubectl create -f tiller-rbac-config.yaml
helm init --service-account tiller

helm serve
helm repo add local http://localhost:8879/charts

# ADD CEPH-HELM TO HELM LOCAL REPOS
git clone https://github.com/ceph/ceph-helm
cd ceph-helm/ceph
make

#CONFIGURE YOUR CEPH CLUSTER
cd ~
cat<<EOF>ceph-overrides.yaml
network:
  public:   172.21.0.0/20
  cluster:   172.21.0.0/20
osd_devices:
  - name: dev-sdb
    device: /dev/sdb
    zap: "1"
  - name: dev-sdc
    device: /dev/sdc
    zap: "1"
storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s
EOF

#CREATE THE CEPH CLUSTER NAMESPACE
kubectl create namespace ceph

#CONFIGURE RBAC PERMISSIONS
kubectl create clusterrolebinding test --clusterrole=cluster-admin [email protected]
kubectl create -f ~/ceph-helm/ceph/rbac.yaml

#LABEL KUBELETS
kubectl label node gke-standard-cluster-2-default-pool-8b55990f-bj85 ceph-mon=enabled ceph-mgr=enabled
kubectl label node gke-standard-cluster-2-default-pool-8b55990f-bj85 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-osd-device-dev-sdc=enabled
kubectl label node gke-standard-cluster-2-default-pool-8b55990f-s264 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-osd-device-dev-sdc=enabled

#CEPH DEPLOYMENT
helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml 

PS: I had read these issues, but it looks like they aren't my case:
#55
#51
#48
#45

And if somebody knows another (tested and working) way to deploy Ceph on K8s, please tell me.

ceph-helm for single node deployment

Is this a request for help?: yes

I am looking to use the ceph-helm chart to deploy on a single node kubernetes cluster. I have followed this tutorial and successfully got ceph running without kubernetes on a single node. Now I want to get it running within kubernetes using ceph-helm.

To clarify my understanding: I can basically follow the tutorial in the ceph documentation for kubernetes + helm. The only differences are:

  1. the single node must be labeled with both
    • ceph-monitor: ceph-mon=enabled ceph-mgr=enabled
    • ceph-osd: ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-osd-device-dev-sdc=enabled ...
  2. I must set osd pool default size = 2 and osd crush chooseleaf type = 0 in ceph-overrides.yaml

How would I complete step 2? Please correct me if step 1 is not correct.
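
For step 2, a hedged runtime alternative (this is not the chart's overrides mechanism; the pool name rbd, the mon pod name, and the crush rule name are assumptions): once the cluster is up, the same effect can be obtained from inside a mon pod with the ceph CLI.

# shrink the default pool replication to 2
kubectl -n ceph exec <ceph-mon-pod> -c ceph-mon -- ceph osd pool set rbd size 2
# replicate across OSDs instead of hosts (the runtime equivalent of chooseleaf type = 0)
kubectl -n ceph exec <ceph-mon-pod> -c ceph-mon -- ceph osd crush rule create-replicated replicated-by-osd default osd
kubectl -n ceph exec <ceph-mon-pod> -c ceph-mon -- ceph osd pool set rbd crush_rule replicated-by-osd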

ceph osd fails due to /dev/sdX being changed across reboots.

Is this a request for help?:No


Is this a BUG REPORT or FEATURE REQUEST? (choose one):BUG REPORT

Version of Helm and Kubernetes:Helm: 2.11.0, Kubernetes 1.11.6

Which chart: ceph-helm

What happened:
Servers were configured with SAS controllers and an onboard ATA controller, i.e. two sets of SSD/HDD controllers. Across reboots the drives' /dev/ names changed, e.g. the drive on SAS controller port 1 became /dev/sdc when prior to the reboot it was /dev/sda. This is not uncommon.
The values.yaml file was configured to avoid this situation by using by-path rather than /dev/sdX values.

osd_devices:
  - name: nvsedcog-osd-1
    device: /dev/disk/by-path/pci-0000:00:11.4-ata-1
    journal: /dev/disk/by-path/pci-0000:00:1f.2-ata-2
  - name: nvsedcog-osd-2
    device: /dev/disk/by-path/pci-0000:00:11.4-ata-3
    journal: /dev/disk/by-path/pci-0000:00:1f.2-ata-2

What you expected to happen:
_osd_disk_activate.sh.tpl, _osd_disk_prepare.sh.tpl should have found the correct device name using readlink and used the corresponding /dev/sdX device.
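
A minimal sketch of the resolution the templates are expected to perform, using the by-path values from the configuration above (purely illustrative):

# resolve the stable by-path name to whatever /dev/sdX it currently points at
OSD_DEVICE=$(readlink -f /dev/disk/by-path/pci-0000:00:11.4-ata-1)
JOURNAL_DEVICE=$(readlink -f /dev/disk/by-path/pci-0000:00:1f.2-ata-2)
echo "using ${OSD_DEVICE} with journal ${JOURNAL_DEVICE}"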

How to reproduce it (as minimally and precisely as possible):

A SAS controller is not necessary: given 3 drives (/dev/sda, /dev/sdb, /dev/sdc), install ceph on /dev/sda and /dev/sdc.
Shut down the server and remove /dev/sdb.
On restart, osd1, or the osd attached to /dev/sdc, will fail.

Anything else we need to know:
I'm attaching the "fixes" I made to support by-path names in the values.yaml file:

_osd_disk_prepare.sh.tpl.txt
_osd_disk_activate.sh.tpl.txt

secrets must be read-only

Is this a request for help?: no

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

$ helm version
Client: &version.Version{SemVer:"v2.8+unreleased", GitCommit:"c5f2174f264554c62278c0695d58f250d3e207c8", GitTreeState:"clean"}
Server: &version.Version{SemVer:"canary+unreleased", GitCommit:"fe9d36533901b71923c49142f5cf007f93fa926f", GitTreeState:"clean"}

Kubernetes master > 1.9

Which chart: ceph

What happened:

I compiled k8s master from source (commit 04634cb19843195) and brought up a local cluster with:

RUNTIME_CONFIG=storage.k8s.io/v1alpha1=true ALLOW_PRIVILEGED=1 FEATURE_GATES="BlockVolume=true,MountPropagation=true,CSIPersistentVolume=true," hack/local-up-cluster.sh -O

Then I followed http://docs.ceph.com/docs/master/start/kube-helm/#configure-your-ceph-cluster to install the ceph chart.

In that installation, the start_mon.sh script in the ceph-mon pod fails with:

+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-osd/ceph.keyring into /etc/ceph/ceph.mon.keyring
bufferlist::write_file(/etc/ceph/ceph.mon.keyring): failed to open file: (30) Read-only file system
could not write /etc/ceph/ceph.mon.keyring

What you expected to happen:

The script shouldn't write into a secret. The modification is not stored permanently in older Kubernetes releases, and starting with 1.10 the default will be to mount secrets as read-only, even if "readOnly: false" is used - see kubernetes/kubernetes#58720.

Anything else we need to know:

@intlabs said on Slack that he's going to fix this for openstack-helm/ceph. In the meantime one can use ReadOnlyAPIDataVolumes=false in FEATURE_GATES to restore the old behavior.
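
A minimal sketch of that interim workaround, reusing the local-up-cluster.sh invocation shown earlier and only adding the extra feature gate:

RUNTIME_CONFIG=storage.k8s.io/v1alpha1=true ALLOW_PRIVILEGED=1 \
  FEATURE_GATES="BlockVolume=true,MountPropagation=true,CSIPersistentVolume=true,ReadOnlyAPIDataVolumes=false" \
  hack/local-up-cluster.sh -O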

Here's a fix that worked for me. It's intentionally very minimal, perhaps the right solution also has to clean up the usage of secret in other pods:

diff --git a/ceph/ceph/templates/bin/_start_mon.sh.tpl b/ceph/ceph/templates/bin/_start_mon.sh.tpl
index 50e4bfd..5b3330c 100644
--- a/ceph/ceph/templates/bin/_start_mon.sh.tpl
+++ b/ceph/ceph/templates/bin/_start_mon.sh.tpl
@@ -62,8 +62,7 @@ chown ceph. /var/log/ceph
 # If we don't have a monitor keyring, this is a new monitor
 if [ ! -e "$MON_DATA_DIR/keyring" ]; then
   if [ ! -e $MON_KEYRING ]; then
-    log "ERROR- $MON_KEYRING must exist.  You can extract it from your current monitor by running 'ceph auth get mon. -o $MON_KEYRING' or use a KV Store"
-    exit 1
+    touch $MON_KEYRING
   fi
 
   if [ ! -e $MONMAP ]; then
diff --git a/ceph/ceph/templates/daemonset-mon.yaml b/ceph/ceph/templates/daemonset-mon.yaml
index 4b9c90d..3c26211 100644
--- a/ceph/ceph/templates/daemonset-mon.yaml
+++ b/ceph/ceph/templates/daemonset-mon.yaml
@@ -141,10 +141,6 @@ spec:
               mountPath: /etc/ceph/ceph.client.admin.keyring
               subPath: ceph.client.admin.keyring
               readOnly: true
-            - name: ceph-mon-keyring
-              mountPath: /etc/ceph/ceph.mon.keyring
-              subPath: ceph.mon.keyring
-              readOnly: false
             - name: ceph-bin
               mountPath: /variables_entrypoint.sh
               subPath: variables_entrypoint.sh
@@ -195,9 +191,6 @@ spec:
         - name: ceph-client-admin-keyring
           secret:
             secretName: {{ .Values.secrets.keyrings.admin }}
-        - name: ceph-mon-keyring
-          secret:
-            secretName: {{ .Values.secrets.keyrings.mon }}
         - name: ceph-bootstrap-osd-keyring
           secret:
             secretName: {{ .Values.secrets.keyrings.osd }}

PVC stuck in pending state

Below are the logs of the rbd provisioner:
kubectl logs -f -n ceph ceph-rbd-provisioner-5b9bfb859d-2wvd9

  • exec /usr/local/bin/rbd-provisioner -id ceph-rbd-provisioner-5b9bfb859d-2wvd9
    I0514 08:59:52.174721 1 main.go:84] Creating RBD provisioner with identity: ceph-rbd-provisioner-5b9bfb859d-2wvd9
    I0514 08:59:52.225339 1 controller.go:407] Starting provisioner controller a2b23724-7626-11e9-9f90-0242ac1e4803!
    I0514 09:13:36.115718 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:36.222196 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:36.300492 1 leaderelection.go:156] attempting to acquire leader lease...
    E0514 09:13:36.371424 1 leaderelection.go:273] Failed to update lock: Operation cannot be fulfilled on persistentvolumeclaims "ceph-pvc": the object has been modified; please apply your changes to the latest version and try again
    I0514 09:13:37.253079 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:38.408582 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:52.253423 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:07.254063 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:22.254484 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:24.849756 1 leaderelection.go:178] successfully acquired lease to provision for pvc default/ceph-pvc
    I0514 09:14:24.850364 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:37.254732 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:52.255130 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:55.219073 1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/ceph-pvc, timeout reached
    I0514 09:15:07.255355 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:15:22.255579 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:15:22.267831 1 leaderelection.go:156] attempting to acquire leader lease...
    I0514 09:15:37.256247 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:15:52.256693 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:04.686503 1 leaderelection.go:178] successfully acquired lease to provision for pvc default/ceph-pvc
    I0514 09:16:04.686626 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:07.256853 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:22.257263 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:35.201268 1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/ceph-pvc, timeout reached
    I0514 09:16:37.257458 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:52.258008 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:52.275487 1 leaderelection.go:156] attempting to acquire leader lease...
    I0514 09:16:52.302071 1 leaderelection.go:178] successfully acquired lease to provision for pvc default/ceph-pvc
    I0514 09:16:52.302256 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:17:07.258228 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:17:22.258531 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:17:22.658162 1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/ceph-pvc, timeout reached

All pods are running well
kubectl get pods -n ceph
NAME                                        READY     STATUS      RESTARTS   AGE
ceph-mds-68c79b5cc-jt2m9                    0/1       Pending     0          47m
ceph-mds-keyring-generator-jmj5g            0/1       Completed   0          47m
ceph-mgr-6c687f5964-7g8fq                   1/1       Running     1          47m
ceph-mgr-keyring-generator-4jtbm            0/1       Completed   0          47m
ceph-mon-check-676d984874-mnscz             1/1       Running     0          47m
ceph-mon-fl2jl                              3/3       Running     0          48m
ceph-mon-keyring-generator-cv8ff            0/1       Completed   0          47m
ceph-namespace-client-key-generator-klnmt   0/1       Completed   0          47m
ceph-osd-dev-sdb-hnkd4                      1/1       Running     0          47m
ceph-osd-keyring-generator-7h7dz            0/1       Completed   0          47m
ceph-rbd-provisioner-5b9bfb859d-2wvd9       1/1       Running     0          47m
ceph-rbd-provisioner-5b9bfb859d-r7khc       1/1       Running     0          47m
ceph-rgw-6d946b-ztwnx                       0/1       Pending     0          47m
ceph-rgw-keyring-generator-ndzbz            0/1       Completed   0          47m
ceph-storage-keys-generator-2wnrq           0/1       Completed   0          47m

3 nodes: one node for the osd, one for the mon, and one master node for kubernetes.
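
Given the lease churn in the provisioner log above, a hedged set of checks (the storage class name ceph-rbd is an assumption based on the overrides used in the other reports; the mon pod name is a placeholder):

# the claim's events usually name the exact provisioning error
kubectl describe pvc ceph-pvc
# confirm the storage class the claim references and its parameters
kubectl get storageclass ceph-rbd -o yaml
# confirm the ceph cluster itself is healthy
kubectl -n ceph exec <ceph-mon-pod> -c ceph-mon -- ceph -s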

MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secret "ceph-bootstrap-osd-keyring" not found

Is this a request for help?: yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one):BUG REPORT

Version of Helm and Kubernetes:
kubeadm version 1.12.1
helm 2.9.1

Which chart:
I just followed http://docs.ceph.com/docs/master/start/kube-helm/.
I would like to have an osd on /dev/sdb on every node.

What happened:
I tried many times; the osd pod can't start.
Events:
Type Reason Age From Message


Normal Scheduled 17m default-scheduler Successfully assigned ceph/ceph-osd-dev-sdb-4fd4c to pro-docker-2-64
Warning FailedMount 17m (x5 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secret "ceph-bootstrap-osd-keyring" not found
Warning FailedMount 17m (x5 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secret "ceph-bootstrap-mds-keyring" not found
Warning FailedMount 17m (x5 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-mon-keyring" : secret "ceph-mon-keyring" not found
Warning FailedMount 11m (x11 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secret "ceph-client-admin-keyring" not found
Warning FailedMount 7m8s (x13 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secret "ceph-bootstrap-rgw-keyring" not found
Warning FailedMount 103s (x7 over 15m) kubelet, pro-docker-2-64 Unable to mount volumes for pod "ceph-osd-dev-sdb-4fd4c_ceph(ede3590e-d027-11e8-a389-005056b224f1)": timeout expired waiting for volumes to attach or mount for pod "ceph"/"ceph-osd-dev-sdb-4fd4c". list of unmounted volumes=[ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring]. list of unattached volumes=[devices pod-var-lib-ceph pod-run ceph-bin ceph-etc ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring run-udev default-token-k2x7h]

What you expected to happen:
I expect to run 'helm status ceph' and see ceph ready to deploy.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:
I am using VMware to add the virtual disk /dev/sdb. The entire k8s master and nodes run Ubuntu 18.04 with the latest updates.

Many thanks,

Not compatible with latest kubernetes

If this is a BUG REPORT, please:
- The deployment isn't compatible with Kubernetes 1.16.

What happened:

secret/ceph-keystone-user-rgw unchanged
configmap/ceph-bin-clients unchanged
configmap/ceph-bin unchanged
configmap/ceph-etc configured
configmap/ceph-templates unchanged
storageclass.storage.k8s.io/general unchanged
service/ceph-mon unchanged
service/ceph-rgw unchanged
job.batch/ceph-storage-admin-key-cleaner-z0qki created
job.batch/ceph-mds-keyring-generator unchanged
job.batch/ceph-osd-keyring-generator unchanged
job.batch/ceph-rgw-keyring-generator unchanged
job.batch/ceph-mon-keyring-generator unchanged
job.batch/ceph-mgr-keyring-generator unchanged
job.batch/ceph-namespace-client-key-cleaner-xkx0p created
job.batch/ceph-namespace-client-key-generator unchanged
job.batch/ceph-storage-keys-generator unchanged
unable to recognize "STDIN": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "extensions/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"

What you expected to happen:
I expected a cluster to be deployed.

How to reproduce it (as minimally and precisely as possible):
Install the latest Kubernetes (or k3s), download the charts, and run them.
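
The "no matches for kind" errors above come from API groups that Kubernetes 1.16 removed (extensions/v1beta1 and apps/v1beta1 for Deployments and DaemonSets). A hedged, untested local workaround (the template file globs are assumptions, and apps/v1 also requires spec.selector, so editing apiVersion alone may not be enough):

# find the templates still using the removed API groups
grep -rl -e 'apiVersion: extensions/v1beta1' -e 'apiVersion: apps/v1beta1' ceph-helm/ceph/ceph/templates/
# point them at apps/v1, then re-run make and helm upgrade
sed -i -e 's#apiVersion: extensions/v1beta1#apiVersion: apps/v1#' \
       -e 's#apiVersion: apps/v1beta1#apiVersion: apps/v1#' \
       ceph-helm/ceph/ceph/templates/daemonset-*.yaml ceph-helm/ceph/ceph/templates/deployment-*.yaml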

No secrets found in helm-toolkit during "make" and ceph-mon pod is going in "CrashLoopBackOff"

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:

Kubernetes v1.13.0
Client Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.0-dirty", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"dirty", BuildDate:"2019-01-31T06:07:25Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.0-dirty", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"dirty", BuildDate:"2019-01-31T06:07:25Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Helm
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}

What happened:
I am trying to install Ceph using helm charts in a k8s cluster. I followed this document http://docs.ceph.com/docs/master/start/kube-helm/ and am facing these 2 major issues:

  1. If we run "make" it shows no secrets found in helm-toolkit

  2. After the install step, i.e.

     helm install --name=ceph local/ceph --namespace=ceph -f  ceph-overrides.yaml
    

the ceph-mon pod goes into the "CrashLoopBackOff" state:

    NAMESPACE          NAME                                        READY   STATUS             RESTARTS   AGE
   ceph               ceph-mds-85b4fbb478-26sw8                   0/1     Pending            0          4h56m
   ceph               ceph-mds-keyring-generator-w6xqz            0/1     Completed          0          4h56m
   ceph               ceph-mgr-588577d89f-rrd84                   0/1     Init:0/2           0          4h56m
   ceph               ceph-mgr-keyring-generator-sg75h            0/1     Completed          0          4h56m
   ceph               ceph-mon-82rtj                              2/3     CrashLoopBackOff   57         4h56m
   ceph               ceph-mon-check-549b886885-x4m7d             0/1     Init:0/2           0                 4h56m
   ceph               ceph-mon-keyring-generator-d5txp            0/1     Completed          0          4h56m
   ceph               ceph-namespace-client-key-generator-rqd2m   0/1     Completed          0          4h56m
   ceph               ceph-osd-dev-sdb-9fpd9                      0/1     Init:0/3           0          4h56m
   ceph               ceph-osd-keyring-generator-m44l4            0/1     Completed          0          4h56m
   ceph               ceph-rbd-provisioner-5cf47cf8d5-gwfnj       1/1     Running            0          4h56m
   ceph               ceph-rbd-provisioner-5cf47cf8d5-s9vvg       1/1     Running            0          4h56m
   ceph               ceph-rgw-7b9677854f-9tdwt                   0/1     Pending            0          4h56m
   ceph               ceph-rgw-keyring-generator-chm89            0/1     Completed          0          4h56m
   ceph               ceph-storage-keys-generator-sqwb2           0/1     Completed          0          4h56m
   kube-system        kube-dns-8f7866879-28pq7                    3/3     Running            0          6h2m
   kube-system        tiller-deploy-dbb85cb99-68xmk               1/1     Running            0          104m

What you expected to happen:

We want the ceph-mon pod in a running state so that we can create secrets and keyrings; without ceph-mon running we can't create secrets. Please let me know if I am missing anything.
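
A hedged first step for the CrashLoopBackOff (the pod and container names are taken from the listing above and from the mon pod layout shown in the other reports): pull the logs of the crashing container and of its previous instance.

kubectl -n ceph logs ceph-mon-82rtj -c ceph-mon
kubectl -n ceph logs ceph-mon-82rtj -c ceph-mon --previous
kubectl -n ceph describe pod ceph-mon-82rtj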

Anything else we need to know:

Secret not found for ceph helm

Describe the bug

Secret not found

Version of Helm and Kubernetes:
k8s: v1.15.7
Helm: v2.16.1

Which chart:
ceph-helm

What happened:
Secret is missing

What you expected to happen
I expected the outcome to look like the listing below, but nothing is running except the rbd provisioner pods.

$ kubectl -n ceph get pods
NAME                                    READY     STATUS    RESTARTS   AGE
ceph-mds-3804776627-976z9               0/1       Pending   0          1m
ceph-mgr-3367933990-b368c               1/1       Running   0          1m
ceph-mon-check-1818208419-0vkb7         1/1       Running   0          1m
ceph-mon-cppdk                          3/3       Running   0          1m
ceph-mon-t4stn                          3/3       Running   0          1m
ceph-mon-vqzl0                          3/3       Running   0          1m
ceph-osd-dev-sdd-6dphp                  1/1       Running   0          1m
ceph-osd-dev-sdd-6w7ng                  1/1       Running   0          1m
ceph-osd-dev-sdd-l80vv                  1/1       Running   0          1m
ceph-osd-dev-sde-6dq6w                  1/1       Running   0          1m
ceph-osd-dev-sde-kqt0r                  1/1       Running   0          1m
ceph-osd-dev-sde-lp2pf                  1/1       Running   0          1m
ceph-rbd-provisioner-2099367036-4prvt   1/1       Running   0          1m
ceph-rbd-provisioner-2099367036-h9kw7   1/1       Running   0          1m
ceph-rgw-3375847861-4wr74               0/1       Pending   0          1m

How to reproduce it (as minimally and precisely as possible):
cd /tmp
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > install-helm.sh
bash install-helm.sh
helm init
helm serve
helm repo add local http://localhost:8879/charts
git clone https://github.com/ceph/ceph-helm
cd ceph-helm/ceph
sudo apt install -y make
make

sudo nano ceph-overrides.yaml
network:
  public: 172.16.0.0/16
  cluster: 172.16.0.0/16

osd_devices:
  - name: dev-sdb1
    device: /dev/sdb
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

kubectl create namespace ceph
kubectl create -f /home/vagrant/ceph-helm/ceph/rbac.yaml
kubectl label node node-1 ceph-mon=enabled ceph-mgr=enabled
kubectl label node node-1 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled
kubectl label node node-2 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled
kubectl label node node-3 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled

helm install --name=ceph /home/vagrant/ceph-helm/ceph/ceph --namespace=ceph -f /home/vagrant/ceph-helm/ceph/ceph-overrides.yaml

Anything else we need to know:
I followed the steps on https://docs.ceph.com/docs/mimic/start/kube-helm/
