spdk / spdk-csi

CSI driver to bring SPDK to Kubernetes storage through NVMe-oF or iSCSI. Supports dynamic volume provisioning and enables Pods to use SPDK storage transparently.

License: Apache License 2.0


spdk-csi's Introduction

Storage Performance Development Kit


NOTE: The SPDK mailing list has moved to a new location. Please visit this URL to subscribe at the new location. Subscribers from the old location will not be automatically migrated to the new location.

The Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high performance, scalable, user-mode storage applications. It achieves high performance by moving all of the necessary drivers into userspace and operating in a polled mode instead of relying on interrupts, which avoids kernel context switches and eliminates interrupt handling overhead.

The development kit currently includes drivers and targets such as the userspace NVMe driver and the NVMe-oF, iSCSI, and vhost targets described below.

Documentation

Doxygen API documentation is available, as well as a Porting Guide for porting SPDK to different frameworks and operating systems.

Source Code

git clone https://github.com/spdk/spdk
cd spdk
git submodule update --init

Prerequisites

Dependencies can be installed automatically by scripts/pkgdep.sh, which installs the bare minimum required to build SPDK. Use --help to see how to install dependencies for optional components.

./scripts/pkgdep.sh

Build

Linux:

./configure
make

FreeBSD: make sure you have the matching kernel source in /usr/src/, and note that the CONFIG_COVERAGE option is currently not available for FreeBSD builds.

./configure
gmake

Unit Tests

./test/unit/unittest.sh

You will see several error messages when running the unit tests, but they are part of the test suite. The final message at the end of the script indicates success or failure.

Vagrant

A Vagrant setup is also provided to create a Linux VM with a virtual NVMe controller to get up and running quickly. Currently this has been tested on macOS, Ubuntu 16.04.2 LTS, and Ubuntu 18.04.3 LTS with the VirtualBox and libvirt providers. The VirtualBox Extension Pack or [Vagrant Libvirt](https://github.com/vagrant-libvirt/vagrant-libvirt) must also be installed in order to get the required NVMe support.

Details on the Vagrant setup can be found in the SPDK Vagrant documentation.

AWS

The following setup is known to work on AWS:

Image: Ubuntu 18.04

Before running setup.sh, run modprobe vfio-pci, then:

DRIVER_OVERRIDE=vfio-pci ./setup.sh

Advanced Build Options

Optional components and other build-time configuration are controlled by settings in the Makefile configuration file in the root of the repository. CONFIG contains the base settings for the configure script. This script generates a new file, mk/config.mk, that contains final build settings. For advanced configuration, there are a number of additional options to configure that may be used, or mk/config.mk can simply be created and edited by hand. A description of all possible options is located in CONFIG.

Boolean (on/off) options are configured with a 'y' (yes) or 'n' (no). For example, this line of CONFIG controls whether the optional RDMA (libibverbs) support is enabled:

CONFIG_RDMA?=n

To enable RDMA, this line may be added to mk/config.mk with a 'y' instead of 'n'. For the majority of options this can be done using the configure script. For example:

./configure --with-rdma

Additionally, CONFIG options may also be overridden on the make command line:

make CONFIG_RDMA=y

Users may wish to use a version of DPDK different from the submodule included in the SPDK repository. Note, this includes the ability to build not only from DPDK sources, but also just with the includes and libraries installed via the dpdk and dpdk-devel packages. To specify an alternate DPDK installation, run configure with the --with-dpdk option. For example:

Linux:

./configure --with-dpdk=/path/to/dpdk/x86_64-native-linuxapp-gcc
make

FreeBSD:

./configure --with-dpdk=/path/to/dpdk/x86_64-native-bsdapp-clang
gmake

The options specified on the make command line take precedence over the values in mk/config.mk. This can be useful if you, for example, generate a mk/config.mk using the configure script and then have one or two options (e.g. debug builds) that you wish to turn on and off frequently.
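For illustration, a hand-edited mk/config.mk that enables RDMA while leaving debug off might contain lines like the following. This is a sketch following the CONFIG syntax shown above; the available option names vary between SPDK releases:

```makefile
# mk/config.mk -- local build settings (hand-edited sketch)
CONFIG_RDMA?=y
CONFIG_DEBUG?=n
```

Per the precedence rule above, `make CONFIG_DEBUG=y` would still override this file for a one-off debug build.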

Shared libraries

By default, the SPDK build yields static libraries, against which the SPDK applications and examples are linked. The configure option --with-shared produces SPDK shared libraries in addition to the default static ones, and also links the SPDK executables against the shared versions. SPDK shared libraries are located in ./build/lib by default. This includes a single shared library encompassing all of the SPDK static libraries (libspdk.so), as well as an individual shared library corresponding to each static one.

In order to start an SPDK app linked with SPDK shared libraries, make sure to do the following:

  • run ldconfig specifying the directory containing SPDK shared libraries
  • provide proper LD_LIBRARY_PATH

If DPDK shared libraries are used, you may also need to add the DPDK shared libraries to LD_LIBRARY_PATH.

Linux:

./configure --with-shared
make
ldconfig -v -n ./build/lib
LD_LIBRARY_PATH=./build/lib/:./dpdk/build/lib/ ./build/bin/spdk_tgt

Hugepages and Device Binding

Before running an SPDK application, some hugepages must be allocated and any NVMe and I/OAT devices must be unbound from the native kernel drivers. SPDK includes a script to automate this process on both Linux and FreeBSD. This script should be run as root.

sudo scripts/setup.sh

Users may wish to configure a specific memory size. Below is an example of configuring 8192 MB of memory.

sudo HUGEMEM=8192 scripts/setup.sh

Many other environment variables can be set to configure setup.sh for advanced users. To see the full list, run:

scripts/setup.sh --help

Target applications

After completing the build process, SPDK target applications can be found in the spdk/build/bin directory:

  • nvmf_tgt - the SPDK NVMe over Fabrics target presents block devices over a fabric,
  • iscsi_tgt - the SPDK iSCSI target serves I/O operations remotely over TCP/IP,
  • vhost - the vhost target provides a local storage service to processes running on the same machine,
  • spdk_tgt - combines the capabilities of all three applications.

SPDK runs in polled mode, which means it continuously checks for operation completions. This approach gives faster responses than interrupt mode, but it also lessens the usefulness of tools like top, which only shows 100% CPU usage for SPDK-assigned cores. spdk_top is a top-like program that uses SPDK's JSON-RPC interface to present statistics about SPDK threads, pollers, and CPU cores as an interactive list.

Example Code

Example code is located in the examples directory. The examples are compiled automatically as part of the build process. Simply call any of the examples with no arguments to see the help output. You'll likely need to run the examples as a privileged user (root) unless you've done additional configuration to grant your user permission to allocate huge pages and map devices through vfio.

Contributing

For additional details on how to get more involved in the community, including contributing code and participating in discussions and other activities, please refer to spdk.io.

spdk-csi's People

Contributors

askervin, avalluri, cyb70289, haichaoli01, halfzebra, karlatec, peluse, rollandf, trochumski, xinydev, yanjing1104


spdk-csi's Issues

Pod hot plug?

When I created testpod.yaml and then manually ran nvme disconnect, the disk was lost. I then reattached the volume with nvme connect, but the spdkvol in the Pod is still inaccessible. Why, and how can I fix this?

timed out waiting device ready:

When I deploy spdk-test, its status is always "ContainerCreating", and kubectl describe says:

MountVolume.MountDevice failed for volume "pvc-114cc322-8a71-4a11-be1a-2970bd11cf4c" : rpc error: code = Internal desc = timed out waiting device ready: /dev/disk/by-id/df8f0a4e-1458-488b-8793-ed2222c0f25d

How can I fix it? I followed the test guide exactly, except that I used my own Kubernetes cluster rather than minikube. dmesg also shows related errors.
kubectl apply -f testpod.yaml

The spdkcsi-test pod couldn't come up; its STATUS is ContainerCreating.

kubectl describe po spdkcsi-test
Events:
Type Reason Age From Message

Warning FailedScheduling 111s default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Warning FailedScheduling 110s default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Normal Scheduled 108s default-scheduler Successfully assigned default/spdkcsi-test to host
Normal SuccessfulAttachVolume 109s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-8a3aa437-a775-4a82-8e49-f1075d3b6e74"
Warning FailedMount 17s (x4 over 84s) kubelet MountVolume.MountDevice failed for volume "pvc-8a3aa437-a775-4a82-8e49-f1075d3b6e74" : rpc error: code = Internal desc = timed out waiting device ready: /dev/disk/by-id/ddb26aa1-3a89-4354-9336-26e087ebb07d

error: the namespace from the provided object "kube-system" does not match the namespace "default"

from https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v6.2.2/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml

metadata:
  name: snapshot-controller
  namespace: kube-system

but from

SNAPSHOT_VERSION="v6.2.2" ./scripts/install-snapshot.sh install

+ kubectl apply -f /tmp/tmp.g6inKF778O/snapshot-controller.yaml -n default

resulting in error:

error: the namespace from the provided object "kube-system" does not match the namespace "default". You must pass '--namespace=kube-system' to perform this operation.

This works:

$ SNAPSHOT_VERSION="v6.2.2" ./scripts/install-snapshot.sh install kube-system

Cannot pull the spdkcsi image

Failed to pull image "spdkcsi/spdkcsi:canary": rpc error: code = Unknown desc = Error response from daemon: pull access denied for spdkcsi/spdkcsi, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

CI doesn't checkout spdk-csi sources correctly

See: https://review.spdk.io/gerrit/c/spdk/spdk-csi/+/18520
https://ci.spdk.io/public_build/spdk-csi-per-patch_144.html

CI fails to check out and test the change because of:

08:31:37  Setting http proxy: proxy-dmz.intel.com:911
08:31:37   > git fetch --tags --force --progress -- https://review.spdk.io/gerrit/a/spdk/spdk refs/changes/20/18520/2 +refs/heads/master:refs/remotes/origin/master # timeout=10
08:31:38  ERROR: Error fetching remote repo 'origin'
08:31:38  hudson.plugins.git.GitException: Failed to fetch from https://review.spdk.io/gerrit/a/spdk/spdk
08:31:38  	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:999)
08:31:38  	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1241)
08:31:38  	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1305)
08:31:38  	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:129)
08:31:38  	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:97)
08:31:38  	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:84)
08:31:38  	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
08:31:38  	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
08:31:38  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
08:31:38  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
08:31:38  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
08:31:38  	at java.base/java.lang.Thread.run(Thread.java:840)
08:31:38  Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --force --progress -- https://review.spdk.io/gerrit/a/spdk/spdk refs/changes/20/18520/2 +refs/heads/master:refs/remotes/origin/master" returned status code 128:
08:31:38  stdout: 
08:31:38  stderr: fatal: couldn't find remote ref refs/changes/20/18520/2

It seems something is wrong with the Jenkins build parameters: the job tries to check out spdk/spdk instead of spdk/spdk-csi.

When I mounted the file system, I found that the file system is broken

cat /etc/redhat-release
CentOS Linux release 7.7.1908 (BonusCloud) 1688368866
package main

import (
	"fmt"

	"k8s.io/utils/exec"
	"k8s.io/utils/mount"
)

func main() {
	mntFlags := []string{}
	mounter := mount.SafeFormatAndMount{Interface: mount.New(""), Exec: exec.New()}
	err := mounter.FormatAndMount("/dev/nvme0n1", "/data", "ext4", mntFlags)
	if err != nil {
		fmt.Println("---:", err)
	}
}

[root@node2 src]# go run test.go
I0830 18:09:37.757907 1646 mount_linux.go:367] Disk "/dev/nvme0n1" appears to be unformatted, attempting to format as type: "ext4" with options: [-F -m0 /dev/nvme0n1]
I0830 18:09:37.841104 1646 mount_linux.go:377] Disk successfully formatted (mkfs): ext4 - /dev/nvme0n1 /data
[root@node2 src]# blkid -p -s TYPE -s PTTYPE -o export /dev/nvme0n1
[root@node2 src]# lsbkl
-bash: lsbkl: command not found
[root@node2 src]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 64M 0 disk /data
sda 8:0 0 111.3G 0 disk
├─sda2 8:2 0 11.1G 0 part
├─sda3 8:3 0 50G 0 part /
└─sda1 8:1 0 1G 0 part /boot
[root@node2 src]# blkid -p -s TYPE -s PTTYPE -o export /dev/nvme0n1
[root@node2 src]#

support volume expansion

I would like to ask what you think about the high availability of the CSI driver. For example, if SPDK and the CSI node plugin are restarted at the same time, how can they be restored to the state they were in before the restart?

Enable RWX support for raw block volumes

Hello everyone, this is two questions in one.

I've gone through the SPDK-CSI design document today, and I've read that support for raw block volume is present in the to-do tasks. Is there any plan for adding this inside spdk-csi?

Plus, if this is planned, do you plan to support RWX?

Kudos to those who'll answer :)

CI Intermittent Failure - Network issue

CI Intermittent Failure

https://review.spdk.io/gerrit/c/spdk/spdk-csi/+/16096

Link to the failed CI build

https://ci.spdk.io/public_build/spdk-csi-upstream_606.html

Execution failed at

00:01:34.259 + vagrant up
00:01:35.636 Bringing machine 'default' up with 'libvirt' provider...
00:01:36.586 ==> default: Box 'fedora/34-cloud-base' could not be found. Attempting to find and install...
00:01:36.586 default: Box Provider: libvirt
00:01:36.586 default: Box Version: >= 0
00:01:37.528 ==> default: Loading metadata for box 'fedora/34-cloud-base'
00:01:37.528 default: URL: https://vagrantcloud.com/fedora/34-cloud-base
00:01:38.902 ==> default: Adding box 'fedora/34-cloud-base' (v34.20210423.0) for provider: libvirt
00:01:38.902 default: Downloading: https://vagrantcloud.com/fedora/boxes/34-cloud-base/versions/34.20210423.0/providers/libvirt.box
00:01:39.729
 default: Progress: 0% (Rate: 0*/s, Estimated time remaining: --:--:--)
 default: Progress: 0% (Rate: 0*/s, Estimated time remaining: --:--:--)
 default: Progress: 100% (Rate: 230/s, Estimated time remaining: --:--:--)
 default: Download redirected to host: download.fedoraproject.org
00:01:40.297
 default: Progress: 100% (Rate: 178/s, Estimated time remaining: --:--:--)
 default: Progress: 0% (Rate: 0*/s, Estimated time remaining: --:--:--)
An error occurred while downloading the remote file. The error
00:01:40.297 message, if any, is reproduced below. Please fix this error and try
00:01:40.297 again.
00:01:40.297
00:01:40.297 HTTP/2 stream 0 was not closed cleanly: PROTOCOL_ERROR (err 1)
00:01:40.570 [Pipeline] }
00:01:40.590 [Pipeline] // stage
00:01:40.598 [Pipeline] }
00:01:40.601 ERROR: script returned exit code 1
00:01:40.619 [Pipeline] // catchError
00:01:40.629 [Pipeline] stage
00:01:40.631 [Pipeline] { (Epilogue)
00:01:40.646 [Pipeline] sh
00:01:40.926 + vagrant destroy -f
00:01:43.467 ==> default: Remove stale volume...
00:01:43.467 ==> default: Domain is not created. Please run vagrant up first.
00:01:43.482 [Pipeline] cleanWs
00:01:43.491 [WS-CLEANUP] Deleting project workspace...
00:01:43.491 [WS-CLEANUP] Deferred wipeout is used...
00:01:43.498 [WS-CLEANUP] done
00:01:43.501 [Pipeline] }
00:01:43.526 [Pipeline] // stage
00:01:43.532 [Pipeline] }
00:01:43.557 [Pipeline] // node
00:01:43.565 [Pipeline] End of Pipeline
00:01:43.585 Finished: FAILURE

example compiling inside docker as well

Just an example for people who don't want to install a specific Go version on their local machine...

diff --git a/deploy/image/Dockerfile b/deploy/image/Dockerfile
index 9773a1e..f44d4a5 100644
--- a/deploy/image/Dockerfile
+++ b/deploy/image/Dockerfile
@@ -4,11 +4,15 @@
 #
 # XXX: pin alpine to 3.8 with e2fsprogs-1.44
 # e2fsprogs-1.45+ crashes my test vm when running mkfs.ext4
+FROM docker.io/library/golang:1.22.0-alpine3.19 as builder
+COPY . .
+RUN CGO_ENABLED=0 GOOS=linux go build -buildvcs=false -o /tmp/spdkcsi ./cmd/
+
 FROM alpine:3.8
 LABEL maintainers="SPDK-CSI Authors"
 LABEL description="SPDK-CSI Plugin"

-COPY spdkcsi /usr/local/bin/spdkcsi
+COPY --from=builder /tmp/spdkcsi /usr/local/bin/spdkcsi

 RUN apk add nvme-cli open-iscsi e2fsprogs xfsprogs blkid


and then

docker build -t spdkcsi/spdkcsi:canary -f deploy/image/Dockerfile .

Flag --short has been deprecated, and will be removed in the future

if ! get_kube_version=$(kubectl version --short) ||

$ SNAPSHOT_VERSION="v6.2.2" ./scripts/install-snapshot.sh cleanup
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
serviceaccount "snapshot-controller" deleted
clusterrole.rbac.authorization.k8s.io "snapshot-controller-runner" deleted
clusterrolebinding.rbac.authorization.k8s.io "snapshot-controller-role" deleted
role.rbac.authorization.k8s.io "snapshot-controller-leaderelection" deleted
rolebinding.rbac.authorization.k8s.io "snapshot-controller-leaderelection" deleted
deployment.apps "snapshot-controller" deleted
customresourcedefinition.apiextensions.k8s.io "volumesnapshotclasses.snapshot.storage.k8s.io" deleted
customresourcedefinition.apiextensions.k8s.io "volumesnapshotcontents.snapshot.storage.k8s.io" deleted
customresourcedefinition.apiextensions.k8s.io "volumesnapshots.snapshot.storage.k8s.io" deleted

about external-resizer

I want to add the volume expansion function on this basis, but after adding v1.4.0 of external-resizer, I found that updatedPVC, err := ctrl.patchClaim(pvc, newPVC, true /* addResourceVersionCheck */) reports an error: Error syncing PVC: marking pvc "kube-csi/volume-pvc" as resizing failed: can't patch status of PVC kube-csi/volume-pvc with persistentvolumeclaims "volume-pvc" is forbidden. My guess is that this resizer version may be too new for my cluster; could that be the reason?

Error when building spdkcsi/spdkcsi:canary image

Here we can find a command to build a local version of spdkcsi image:

  docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock golang:1.14 \
  bash -c "apt update && apt install -y make git docker.io && \
  git clone https://review.spdk.io/gerrit/spdk/spdk-csi && cd spdk-csi && make image"

This command fails with error:

=== building spdkcsi binary
flag provided but not defined: -buildvcs
usage: go build [-o output] [-i] [build flags] [packages]
Run 'go help build' for details.
make: *** [Makefile:45: spdkcsi] Error 2

one idea

I thought of a way to combine the lvolID into rpcUrl-username-userpass-uuid, so that the key information can be stored in Kubernetes. Then, for DeleteVolume, the first three pieces of information can be used to build a client, which can look up and delete the lvolID. Do you think this will work?

SPDK environment /root/spdk/scripts/rpc.py bdev_malloc_create issue

I know this is not really an issue with this repository, but it looks like the scripts in SPDK v20.01 can't run on the latest Python version (3.9.5).

Running sudo docker exec -it spdkdev /root/spdk/scripts/rpc.py bdev_malloc_create -b Malloc0 1024 4096 fails with:

Traceback (most recent call last):
  File "/root/spdk/scripts/rpc.py", line 2340, in <module>
    call_rpc_func(args)
  File "/root/spdk/scripts/rpc.py", line 2311, in call_rpc_func
    args.func(args)
  File "/root/spdk/scripts/rpc.py", line 256, in bdev_malloc_create
    print_json(rpc.bdev.bdev_malloc_create(args.client,
  File "/root/spdk/scripts/rpc/bdev.py", line 184, in bdev_malloc_create
    return client.call('bdev_malloc_create', params)
  File "/root/spdk/scripts/rpc/client.py", line 151, in call
    response = self.recv()
  File "/root/spdk/scripts/rpc/client.py", line 123, in recv
    start_time = time.clock()
AttributeError: module 'time' has no attribute 'clock'

The reason for this is:

The function time.clock() has been removed, after having been deprecated since Python 3.3: use time.perf_counter() or time.process_time() instead, depending on your requirements, to have well-defined behavior. (Contributed by Matthias Bussonnier in bpo-36895.)
https://docs.python.org/3/whatsnew/3.8.html#api-and-feature-removals

The workaround I'm using right now is to run sed -i 's/clock/perf_counter/g' /root/spdk/scripts/rpc/client.py before running the script.

I might be missing something, but my best guesses for possible solutions are:

  • Update the SPDK version used in the image to a later one
    Risks: untested
  • Run sed -i 's/clock/perf_counter/g' /root/spdk/scripts/rpc/client.py if the version is v20.01
    Risks: fragile, also needs to cover v20.01-v20.04
  • Downgrade Python version if running v20.01-v20.04
    Risks: may need lots of extra deps and logic if using Anaconda

Is there a simpler solution to this?

Let me know what you think!

It could be a bug

[2021-12-10 08:30:24.880721] ctrlr.c:2400:nvmf_ctrlr_identify: *ERROR*: Identify command with unsupported CNS 0x06
I didn't use minikube; I used a standalone K8s environment. When I created the test Pod, there was an error on the SPDK side. Although it does not affect disks mounted to the Pod, I would like to raise this issue, which may be a bug and may cause problems in the future.
In my testing, this error is reported after the PVC is created and the nvme connect command connects successfully.

Helm Chart: Invalid Chart Version

Since Helm version 3.5.2, only valid Semver 2 versions are allowed for chart version.
latest is not a valid version for the latest Helm versions.

 $ helm version
version.BuildInfo{Version:"v3.10.2", GitCommit:"50f003e5ee8704ec937a756c646870227d7c8b58", GitTreeState:"clean", GoVersion:"go1.18.8"}
$ helm install spdk-csi ./spdk-csi --namespace spdk-csi
Error: INSTALLATION FAILED: validation: chart.metadata.version "latest" is invalid

Questions about formatting and mounting

  1. After the PVC is created, NVMe disks are displayed on the host.
  2. NodeStageVolume formats the disk and mounts it at a staging path.
  3. NodePublishVolume actually mounts it into the Pod.

What I don't understand is the path change in the second step when the disk is formatted, instead of mounting directly into the Pod.

func (ns *nodeServer) stageVolume(devicePath string, req *csi.NodeStageVolumeRequest) (string, error):

klog.Infof("mount %s to %s, fstype: %s, flags: %v", devicePath, stagingPath, fsType, mntFlags)
mounter := mount.SafeFormatAndMount{Interface: ns.mounter, Exec: exec.New()}
err = mounter.FormatAndMount(devicePath, stagingPath, fsType, mntFlags)

and

func (ns *nodeServer) publishVolume(stagingPath string, req *csi.NodePublishVolumeRequest) error:

klog.Infof("mount %s to %s, fstype: %s, flags: %v", stagingPath, targetPath, fsType, mntFlags)
return ns.mounter.Mount(stagingPath, targetPath, fsType, mntFlags)

Wouldn't it be better to format and mount directly into the Pod? I find this very difficult to understand. Can you clear up this doubt?

"timed out waiting device ready" while running the example

Hello friends, thanks for your work on the SPDK CSI Plugin! 🙌

I'm experiencing issues running the example from README.md and it seems like I'm getting the error timed out waiting device ready, which probably originates from here.

I'm running the SPDK (v20.01) target in a separate container as described in the guide, on Ubuntu 20.04 and Docker 20.10.6.

Could you give me some pointers on what I might be doing wrong or how to debug this further?

Here is the output for spdkcsi-test from sudo kubectl describe pods:

Name:         spdkcsi-test                                                                                                                                                                                         
Namespace:    default                                                                                                                                                                                              
Priority:     0                                                                                                                                                                                                    
Node:        redacted-username/192.168.8.103                                                                                                                                                           
Start Time:   Thu, 03 Jun 2021 15:40:16 +0200                                                                                                                                                                      
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:           
IPs:          <none>
Containers:
  alpine:
    Container ID:  
    Image:         alpine:3
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      365d
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /spdkvol from spdk-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-m8lzx (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  spdk-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  spdkcsi-pvc
    ReadOnly:   false
  default-token-m8lzx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-m8lzx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                  From                     Message
  ----     ------                  ----                 ----                     -------
  Warning  FailedScheduling        42m (x2 over 42m)    default-scheduler        0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
  Normal   Scheduled               42m                  default-scheduler        Successfully assigned default/spdkcsi-test to redacted-username
  Normal   SuccessfulAttachVolume  42m                  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-afcedf6f-5907-4016-a7d4-061b519a0281"
  Warning  FailedMount             17m (x2 over 37m)    kubelet                  Unable to attach or mount volumes: unmounted volumes=[spdk-volume], unattached volumes=[default-token-m8lzx spdk-volume]: timed out waiting for the condition
  Warning  FailedMount             15m (x18 over 41m)   kubelet                  MountVolume.MountDevice failed for volume "pvc-afcedf6f-5907-4016-a7d4-061b519a0281" : rpc error: code = Internal desc = timed out waiting device ready: /dev/disk/by-id/*1385c688-8325-4caa-83f1-9504853e02d0*
  Warning  FailedMount             101s (x16 over 40m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[spdk-volume], unattached volumes=[spdk-volume default-token-m8lzx]: timed out waiting for the condition

How to deploy the SPDK docker image to k8s? Can anyone give me a yaml file?

# SPDX-License-Identifier: Apache-2.0
# Copyright (c) Arm Limited and Contributors
# Copyright (c) Intel Corporation

FROM fedora:33

ARG TAG=v20.01
ARG ARCH=native

WORKDIR /root
RUN dnf install -y git
RUN git clone https://github.com/spdk/spdk --branch ${TAG} --depth 1 && \
    cd spdk && git submodule update --init --depth 1 && scripts/pkgdep.sh
RUN cd spdk && \
    ./configure --disable-tests --without-vhost --without-virtio \
    --with-rdma --target-arch=${ARCH} && \
    make
I am using this Docker image and want to deploy it on k8s. I am able to run the container using the following command:

sudo docker run -it --rm --name spdkdev --privileged --net host -v /dev/hugepages:/dev/hugepages -v /dev/shm:/dev/shm spdkdev /root/spdk/app/spdk_tgt/spdk_tgt

How can I deploy it to k8s? Thanks in advance.
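There is no official manifest in this thread, but the docker run invocation above maps roughly onto a Pod spec like the following. This is a sketch, not a supported manifest: it assumes the image is available on the node under the name spdkdev, and it reuses host hugepages and shm the same way the docker command does:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: spdkdev
spec:
  hostNetwork: true            # mirrors --net host
  containers:
  - name: spdkdev
    image: spdkdev             # assumed to be present on the node
    imagePullPolicy: IfNotPresent
    command: ["/root/spdk/app/spdk_tgt/spdk_tgt"]
    securityContext:
      privileged: true         # mirrors --privileged
    volumeMounts:
    - name: hugepages
      mountPath: /dev/hugepages
    - name: shm
      mountPath: /dev/shm
  volumes:
  - name: hugepages
    hostPath:
      path: /dev/hugepages     # mirrors -v /dev/hugepages:/dev/hugepages
  - name: shm
    hostPath:
      path: /dev/shm           # mirrors -v /dev/shm:/dev/shm
```

For a target on every node, the same template can be wrapped in a DaemonSet; hugepages must still be reserved on the host (e.g. via setup.sh) before the pod starts.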

Unknown revisions for few libraries

go list -m all results in:

go: k8s.io/[email protected]: invalid version: unknown revision v0.0.0
go: k8s.io/[email protected]: invalid version: unknown revision v0.0.0

This issue mainly concerns GoLand users, because GoLand runs go list -m all when updating go.mod. As a result, the IDE is unable to fetch dependencies for the project.

Fix: force specific versions for those libraries.
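As a sketch of that fix, a replace block in go.mod pins such modules to concrete versions. The module names and versions below are placeholders (the exact modules are redacted in the error output above); substitute the modules go list -m all complains about, and the versions the rest of your k8s.io dependencies expect:

```
replace (
	k8s.io/api => k8s.io/api v0.20.4
	k8s.io/apimachinery => k8s.io/apimachinery v0.20.4
)
```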

two Pods `spdkcsi-controller-0` and `spdkcsi-node-xxxxx` restart

One more question I'd like to ask: when all the components are running and the two Pods spdkcsi-controller-0 and spdkcsi-node-xxxxx crash, the previously provisioned PVC resources and mount information are lost, and kubectl delete -f testpod.yaml cannot complete. After executing that command, the Pod spdkcsi-test stays in the Terminating state. Have you considered this case?

My understanding is that when resources are provisioned, the relevant information is stored in an in-memory cache. When the two Pods crash, the information in the cache is lost, and after restarting them there is no previous state, so I can't stop the test pod.

snapshot and clone

spdk-csi does not yet support cloning a volume from a snapshot, and does not support cloning another volume directly; is that right?

why not a device-plugin

SPDK with NVMe is like a userspace driver plus a raw device, so why not use a device plugin to allocate and initialize NVMe devices (like GPUs)?

The main difference between a storage device and a computing device is that storage devices are stateful: when a pod is recreated, it should bind to the originally allocated device.

If the pod scheduler and device plugin can persist the pod-to-NVMe-device mapping, are there any other technical issues preventing us from using a device plugin to set up SPDK in Kubernetes?

go1.22 compilation errors

with make all I see:

e2e/utils.go:397:2: undefined: Expect (typecheck)
        Expect(podList.Items).NotTo(BeNil())
        ^
pkg/spdk/controllerserver.go:114:33: cs.Driver undefined (type *controllerServer has no field or method Driver) (typecheck)
                for _, accessMode := range cs.Driver.GetVolumeCapabilityAccessModes() {
                                            ^
pkg/util/opiinitiator_test.go:43:12: m.Called undefined (type *MockNvmeRemoteControllerServiceClient has no field or method Called) (typecheck)
        args := m.Called(ctx, in)
                  ^
pkg/util/opiinitiator_test.go:179:12: i.Called undefined (type *MockFrontendNvmeServiceClient has no field or method Called) (typecheck)
        args := i.Called(ctx, in)
../../../usr/local/go/src/compress/flate/inflate.go:720:18: f.r.ReadByte undefined (type Reader has no field or method ReadByte) (typecheck)
                        c, err := f.r.ReadByte()
                                      ^
../../../usr/local/go/src/encoding/json/encode.go:427:4: e.WriteString undefined (type *encodeState has no field or method WriteString) (typecheck)
        e.WriteString("null")
