
cloudfoundry / nfs-volume-release


License: Apache License 2.0

Shell 54.29% HTML 9.67% Ruby 15.59% Dockerfile 5.26% Go 15.05% Makefile 0.14%
cloudfoundry nfs-server bosh diego nfs-service cff-wg-service-management

nfs-volume-release's Introduction

NFS volume release

This is a bosh release that packages:

  • an NFS service broker and a volume driver for Diego cells
  • a test NFS server
  • a test LDAP server

The broker and driver allow you to provision existing NFS volumes and bind those volumes to your applications for shared file access.

The test NFS and LDAP servers provide easy test targets with which you can try out volume mounts.

Deploying to Cloud Foundry

As of release v1.2.0 we no longer support old cf-release deployments with bosh v1 manifests. Nfs-volume-release jobs should be added to your cf-deployment using provided ops files.

Pre-requisites

  1. Install Cloud Foundry, or start from an existing CF deployment. If you are starting from scratch, the article Overview of Deploying Cloud Foundry provides detailed instructions.

  2. If you plan to deploy with bosh-lite please note the limitations described below.

Redeploy Cloud Foundry with nfs enabled

  1. You should already have the cf-deployment repository from when you deployed Cloud Foundry; if not, clone it from git:

    $ cd ~/workspace
    $ git clone https://github.com/cloudfoundry/cf-deployment.git
    $ cd ~/workspace/cf-deployment
  2. Now redeploy your cf-deployment while including the nfs ops file:

    $ bosh -e my-env -d cf deploy cf.yml -v deployment-vars.yml -o operations/enable-nfs-volume-service.yml

    Note: the above command is an example, but your deployment command should match the one you used to deploy Cloud Foundry initially, with the addition of a -o operations/enable-nfs-volume-service.yml option.

  3. The above ops file will deploy the nfsbrokerpush bosh errand. You must run the errand to push the broker to Cloud Foundry, where it will run as an application.

    $ bosh -e my-env -d cf run-errand nfs-broker-push

Your CF deployment will now have a running service broker and volume drivers, ready to mount nfs volumes.
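A minimal sketch of a first use of the service, assuming a service offering named nfs with the Existing plan (the names used in the examples later on this page); the app name, share address, and uid/gid values are placeholders:

    $ cf enable-service-access nfs
    $ cf create-service nfs Existing myVolume -c '{"share":"<nfs-server-ip>/export/vol1"}'
    $ cf bind-service my-app myVolume -c '{"uid":"1000","gid":"1000","mount":"/var/vol1"}'
    $ cf restage my-app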

Security note: because connecting to NFS shares requires you to open your NFS mountpoint to all Diego cells, and outbound traffic from application containers is NATed to the Diego cell IP address, there is a risk that an application could initiate a connection to your share and gain unauthorized access to data.

To mitigate this risk, consider one or more of the following steps:

  • Avoid using the insecure NFS export option, as it allows non-root users to connect to your share on port 2049.
  • Avoid enabling Docker application support, as that allows root users to connect on port 111 even when your share is not exported as insecure.
  • Use CF Security groups to block direct application access to your NFS server IP, especially on ports 111 and 2049.

NFS Test Server

If you wish to also deploy the NFS test server, you can include its operations file with an additional -o flag. That will create a separate VM with NFS exports that you can use to experiment with volume mounts.

Note: by default, the nfs test server expects that your CF deployment is deployed to a 10.x.x.x subnet. If you are deploying to a subnet that is not 10.x.x.x (e.g. 192.168.x.x), you will need to override the export_cidr property: edit the generated manifest and replace the line nfstestserver: {} with something like nfstestserver: {export_cidr: 192.168.0.0/16}.

Testing and General Usage

You can refer to the Cloud Foundry docs for testing and general usage information.

BBR Support

If you are using Bosh Backup and Restore (BBR) to keep backups of your Cloud Foundry deployment, consider including the enable-nfs-broker-backup.yml operations file from cf-deployment when you redeploy Cloud Foundry. This file will install the requisite backup and restore scripts for nfs service broker metadata on the backup/restore VM.

Bosh-lite deployments

NFS volume services can be deployed with bosh lite, with some caveats:

  1. The nfstestserver job cannot be started in bosh-lite because the containers supplied by the warden CPI do not have permission to start services, so the job fails when attempting to start the NFS service. Testing in a bosh-lite environment is still possible, but requires an external NFS server to test against.
  2. NFSv3 connections fail in bosh-lite because the rpcbind service is required to implement the NFSv3 out-of-band file locking protocol, and that service is not available within the bosh-lite container. NFSv4 inlines the locking protocol and does not require rpcbind, so version 4 connections work in bosh-lite. You can create NFSv4 mounts by including "version":"4" in your create-service or bind-service configuration, as in the example below.
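For example (the share address is a placeholder; the service name and plan follow the create-service examples elsewhere on this page):

    $ cf create-service nfs Existing myVolume -c '{"share":"<nfs-server-ip>/export/vol1","version":"4"}'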

Troubleshooting

If you have trouble getting this release to operate properly, try consulting the Volume Services Troubleshooting Page.

nfs-volume-release's People

Contributors

alamages, blgm, cryogenics-ci, davewalter, dennisdenuto, dennisjbell, dependabot[bot], dlresende, fnaranjo-vmw, gmrodgers, ifindlay-cci, jeffpak, jenspinney, jhvhs, julian-hj, kaixiang, kkallday, lwoydziak, mariash, metron2, mook-as, nouseforaname, paulcwarren, samze, t0fff, totherme, wendorf



nfs-volume-release's Issues

permission denied when writing to file system

First of all, thanks for the great work.

We are on PCF 1.10 on vSphere and have enabled the NFS service broker flag in Elastic Runtime. We have also deployed the nfstestserver from this project in a separate bosh deployment.

The following commands result in a permission denied:

$ cf create-service nfs Existing myVolume -c '{"share":"<ip>/export/vol1"}'
$ cf bind-service pora myVolume -c '{"uid":"0","gid":"0"}'
$ cf restage pora
$ curl -k https://pora.apps.<domain>/write
Writing     
open /var/vcap/data/<UUID>/poratest-<random-string>: 
permission denied

Any ideas?

How to specify "share" json when create service instance for NFS volume service

Hi All

I am using v1.1.3 of nfs-volume-release. My nfsbroker was deployed as a cf app (using 'cf push'), and the nfsv3driver job was added to the diego cells via the bosh add-ons method; both succeeded. The next step is to use this service to create a service instance and bind it to an app, but I don't know which value is correct for the "share" key in the command below. Could you please help explain?

Create service instance :
$ cf create-service nfs Existing myVolume -c '{"share":"<PRIVATE_IP>/export/vol1"}'
---> How can I get <PRIVATE_IP> for this command if my nfsbroker was deployed as above?
What is "/export/vol1" in the above command? Is it required to be an existing file or folder?

Besides, at the step to bind the service instance to an app:
$ cf bind-service pora myVolume -c '{"uid":"1000","gid":"1000"}'
If this command executes successfully, the shared volume will be mounted under /var/vcap/data in the cell where the 'pora' app is running, right?

Thanks
Hong

nfsv3driver pre-start fail with "cp: cannot create regular file '/usr/bin/fusermount': Text file busy"

Just about every time I re-deploy nfsv3driver to an existing vm the pre-start script fails with the log data:

+ cp lib/.libs/libfuse.so lib/.libs/libulockmgr.so /usr/lib
+ cp util/fusermount /usr/bin
cp: cannot create regular file '/usr/bin/fusermount': Text file busy

Watching processes while doing a deploy, I noticed that fuse-nfs is still running when the pre-start script executes. I believe the pkill at https://github.com/cloudfoundry-incubator/nfs-volume-release/blob/master/jobs/nfsv3driver/templates/install.erb#L6 returns before fuse-nfs has stopped, and therefore fusermount is still busy when the script reaches https://github.com/cloudfoundry-incubator/nfs-volume-release/blob/master/jobs/nfsv3driver/templates/install.erb#L11

It might be a good idea to wait until no fuse-nfs processes are running before continuing with the pre-start script.
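For illustration, a rough sketch of that kind of wait (the 30-second timeout is an arbitrary choice, not taken from the release):

pkill fuse-nfs || true
# wait (up to ~30s) for every fuse-nfs process to exit before copying fusermount
for i in $(seq 1 30); do
  pgrep fuse-nfs > /dev/null || break
  sleep 1
done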

You may not see this in your environments because mine might be faster at copying files or slower at killing fuse-nfs than your infrastructure.

nfsbroker fails when using an Azure mysql db

Using PAS v2.3.5 with an external MySQL v5.7 on Azure, nfsbroker fails to connect to the database with the error "this user requires mysql native password authentication". There is no option to set the MySQL driver's allowNativePassword flag to true instead of false, which causes the problem.

Please configure GITBOT

Pivotal uses GITBOT to synchronize Github issues and pull requests with Pivotal Tracker.
Please add your new repo to the GITBOT config-production.yml in the Gitbot configuration repo.
If you don't have access you can send an ask ticket to the CF admins. We prefer teams to submit their changes via a pull request.

Steps:

  • Fork this repo: cfgitbot-config
  • Add your project to config-production.yml file
  • Submit a PR

If there are any questions, please reach out to [email protected].

Binding failed

nfs-service broker in pcf 1.10.3 (seems version 0.1.5)

Binding service tstnfs to app pyapp in org testorg / space testapp-sps as admin...
FAILED
Server error, status code: 502, error code: 10001, message: The service broker returned an invalid response for the request to https://nfsbroker.sys.pcfpoc.com/v2/service_instances/c0499800-ff21-429f-ab82-ee674f71407c/service_bindings/c81a6e37-5aa7-409d-83ba-28650eaa481a. Status Code: 502 Bad Gateway, Body: 502 Bad Gateway: Registered endpoint failed to handle the request.

v0.1.5 broken release on github

Hi,
https://github.com/cloudfoundry-incubator/nfs-volume-release/releases/download/v0.1.5/nfs-volume-0.1.5.tgz is broken. The sha1 sums of the fuse-nfs and golang-nfsvolume packages are not correct. Bosh fails with:

  Started compiling packages
  Started compiling packages > golang-nfsvolume/bfef694ebb81e79df9fa1b52485520e8f3bca68f
  Started compiling packages > fuse-nfs/f6c002e343774fefff5ccc674c68237491228729. Failed: Action Failed get_task: Task 0769a1ce-573d-47ba-6754-4ce286507792 result: Compiling package fuse-nfs: Fetching package fuse-nfs: Fetching package blob bb773859-f1f6-462b-a03c-89b285db2468: Getting blob from inner blobstore: Checking downloaded blob 'bb773859-f1f6-462b-a03c-89b285db2468': Expected stream to have digest 'b88450cc7f216def918fd059cee5101ba25ca901' but was '35dadc916582206499cd15e9a2c2d84e77e9e92a' (00:01:14)
   Failed compiling packages > golang-nfsvolume/bfef694ebb81e79df9fa1b52485520e8f3bca68f: Action Failed get_task: Task 9d985d78-6158-4be1-5ded-9de0ce56b4ea result: Compiling package golang-nfsvolume: Fetching package golang-nfsvolume: Fetching package blob 086e6726-0ad4-4417-ae89-bc82010f38c0: Getting blob from inner blobstore: Checking downloaded blob '086e6726-0ad4-4417-ae89-bc82010f38c0': Expected stream to have digest 'af0d9f7e1ef518750e072e6acab61c33db7b8c15' but was 'e0ffb7559ffc73d9a1095441039208c8b4f22916' (00:01:22)

Error 450001: Action Failed get_task: Task 0769a1ce-573d-47ba-6754-4ce286507792 result: Compiling package fuse-nfs: Fetching package fuse-nfs: Fetching package blob bb773859-f1f6-462b-a03c-89b285db2468: Getting blob from inner blobstore: Checking downloaded blob 'bb773859-f1f6-462b-a03c-89b285db2468': Expected stream to have digest 'b88450cc7f216def918fd059cee5101ba25ca901' but was '35dadc916582206499cd15e9a2c2d84e77e9e92a'

The corresponding release on bosh.io has the packages with the correct sha1.

NFS mount Access using API Calls

Version of nfs-volume-release
Which version did you deploy with?
2.3.1

Version of Cloudfoundry
Which version of CF are you running?
2.3.12

Kindly help with the queries below.

  1. What is the standard procedure/template to add the NFS service to the manifest file?

  2. How can we put/get data on a mounted NFS share from inside the app container using API calls?

nfsdriver mounts are single-threaded

The underlying fuse-nfs process we currently use for id-mapped NFS mounts is not thread safe, and it uses libnfs, which is also not thread safe. As a result, NFS RPCs must be serialized. If the NFS server is busy or slow and the client makes multiple concurrent requests to the file system, those requests queue up and are processed one at a time. This can lead to very slow access times and eventual timeouts.

lock not propagated between instances

The NFS mount does not propagate file locks between instances.

Test, on the same instance:
run: (flock 9 && echo success1; sleep 100) 9> $NFS_MOUNT_POINT/lockfile &
then run: (flock 9 && echo success2; sleep 100) 9> $NFS_MOUNT_POINT/lockfile &
You will see success1, and after 100 seconds, success2.
The lock works on the same instance.

On different instances:
instance 1:
run: (flock 9 && echo success1; sleep 100) 9> $NFS_MOUNT_POINT/lockfile &
You will see success1 immediately.
Then, on instance 2:
run: (flock 9 && echo success2; sleep 100) 9> $NFS_MOUNT_POINT/lockfile &
You will still see success2 immediately,
which means the flock does not propagate to instance 2.

In CF, applications typically run multiple instances that all mount the same volume, so these applications have no proper mechanism to avoid competing writes, which can cause data corruption.

Volume doesn't recover from crashed fuse-nfs process

I'm running nfs-volume-release 0.1.6. I'm not sure which component this issue belongs in so I'm starting here.

While doing some testing I tried killing a fuse-nfs process on one of my cells and discovered that process seemingly never comes back for that volume on that cell.

During this time if I ssh to the application and try to access the share I see:

vcap@362b4acd-8180-4826-5ce4-0e19375ecbeb:~$ cd /var/vcap/data
bash: cd: /var/vcap/data: Transport endpoint is not connected

At first I thought it would be awesome if the dead process were detected and restarted. When that didn't happen, I tried restarting the application instance and was surprised that not even that recovered the process.

SSHing into the container after a restart, I can now cd into the mount directory, but the directory is empty. (I didn't really expect this to work, but found it interesting that the result was different.)

Anyway, I would expect, at the very least, that restarting the application instance would be a way to recover from a crashed fuse-nfs process.

Thoughts?

Error filling in template 'db_ca.crt.erb' (line 1: Can't find property '["nfsbrokerpush.db.ca_cert"]'

PAS 2.6.0 and 2.6.1 run into the following error when an external database is configured without TLS (i.e. the "Database CA certificate" field is left blank):

Task 3885233 | 21:59:29 | Preparing deployment: Rendering templates (00:01:34)
                    L Error: Unable to render instance groups for deployment. Errors are:
 - Unable to render jobs for instance group 'clock_global'. Errors are:
   - Unable to render templates for job 'nfsbrokerpush'. Errors are:
     - Error filling in template 'db_ca.crt.erb' (line 1: Can't find property '["nfsbrokerpush.db.ca_cert"]')
Task 3885233 | 22:01:03 | Error: Unable to render instance groups for deployment. Errors are:
 - Unable to render jobs for instance group 'clock_global'. Errors are:
   - Unable to render templates for job 'nfsbrokerpush'. Errors are:
     - Error filling in template 'db_ca.crt.erb' (line 1: Can't find property '["nfsbrokerpush.db.ca_cert"]')

This issue is due to the way the PAS tile populates the ca_cert for nfsvolume

for Internal DB:
https://github.com/pivotal-cf/p-runtime/blob/76152a110c58950bc8a593b89d742a0697fbb596/properties/system_database.yml#L221-L238

for External DB:
https://github.com/pivotal-cf/p-runtime/blob/76152a110c58950bc8a593b89d742a0697fbb596/properties/system_database.yml#L471-L488

which the "nfsbrokerpush" uses for its database configuration

https://github.com/pivotal-cf/p-runtime/blob/76152a110c58950bc8a593b89d742a0697fbb596/jobs/nfsbrokerpush.yml#L9

In the case of an internal DB without TLS, the ca_cert is set to an empty string and in the case of an external DB without TLS, the ca_cert is set to NULL.

The nfs-volume-release db_ca.crt.erb as currently written can't deal with the NULL

<%= p("nfsbrokerpush.db.ca_cert") %>

Looking at the bosh docs it looks like the template should be updated to deal with NULL similar to

<% if_p('nfsbrokerpush.db.ca_cert') do |value| %><%= value %><% end %>

nfsbroker-bbr may have a similar issue in its config.json.erb json template.

`nfsbrokerpush` uses a `CF_HOME` dir that collides with other errands

Errands may fail with the following error upon subsequent deploys when CF_HOME is set to the same directory as another errand:

           cf version 6.41.0+dd4c76cdd.2018-11-28  
           API endpoint: https://api.sys.maraudon.cf-app.com  
             
             
           API endpoint:   https://api.sys.maraudon.cf-app.com (API version: 2.128.0)  
           Not logged in. Use 'cf login' to log in.  
           FAILED  
           Service account currently logged in. Use 'cf logout' to log out service account and try again. 

Steps to reproduce:

  • Run smbbrokerpush errand followed by nfsbrokerpush (this should fail with the above error)

We're feeling pretty confident that this is the root cause:

  • The NFS errand sets CF_HOME to /var/vcap/bosh/home/cf/.cf here as does the SMB errand (we'll open a separate issue on those repos)
  • If you login as a CF UAA client, then login as a CF UAA user, the CF CLI will throw the above error
  • We configure the NFS errand with a user, but the SMB errand with a client, which causes the NFS errand to fail

Workaround is to either disable one of the errands or SSH onto the co-located errand VM and run rm -rf /var/vcap/bosh/home/cf/.cf.

The way other components avoid this is to create a CF_HOME dir that is unique to their release, e.g. export CF_HOME=/var/vcap/data/push-apps-manager/. Could y'all do something similar with NFS errand?

service bind to pora fails

$ cf bind-service pora myVolume
Binding service myVolume to app pora in org spring / space spring as admin...
FAILED
Server error, status code: 502, error code: 10001, message: Service broker error: config requires a "uid"
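For reference, the bind examples elsewhere on this page pass the uid and gid the broker is asking for, e.g. (the values are placeholders):

$ cf bind-service pora myVolume -c '{"uid":"1000","gid":"1000"}'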

nfsbroker can push to cf/249, diego/1.0.0, nfs-volume/1.0.6

I am trying to push nfsbroker as a cf app.
The app builds fine and pushes to cf without problems. However, it fails to start, with or without a cf bind to a postgresql service instance.

cf logs nfsbroker --recent shows the following error:

2017-08-22T14:37:54.64-0700 [APP/PROC/WEB/0] OUT Exit status 2
2017-08-22T14:37:54.66-0700 [CELL/0] OUT Exit status 0
2017-08-22T14:37:54.67-0700 [CELL/0] OUT Destroying container
2017-08-22T14:37:54.69-0700 [CELL/0] OUT Creating container
2017-08-22T14:37:54.95-0700 [CELL/0] OUT Successfully destroyed container
2017-08-22T14:37:54.99-0700 [CELL/0] OUT Successfully created container
2017-08-22T14:37:55.21-0700 [CELL/0] OUT Starting health monitoring of container
2017-08-22T14:37:55.30-0700 [APP/PROC/WEB/0] OUT {"data":{},"log_level":1,"message":"nfsbroker.starting","source":"nfsbroker","timestamp":"1503437875.308967352"}
2017-08-22T14:37:55.30-0700 [APP/PROC/WEB/0] OUT {"data":{},"log_level":1,"message":"nfsbroker.ends","source":"nfsbroker","timestamp":"1503437875.309250116"}
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR panic: interface conversion: interface {} is string, not float64
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR goroutine 1 [running]:
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR main.parseVcapServices(0xa7d980, 0xc42001e420)
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR /home/user01/manifest/nfs-volume-release/src/code.cloudfoundry.org/nfsbroker/main.go:183 +0x65d
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR main.createServer(0xa7d980, 0xc42001e420, 0xc42001e420, 0x89491b)
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR /home/user01/manifest/nfs-volume-release/src/code.cloudfoundry.org/nfsbroker/main.go:192 +0x77e
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR main.main()
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] ERR /home/user01/manifest/nfs-volume-release/src/code.cloudfoundry.org/nfsbroker/main.go:118 +0x1bc
2017-08-22T14:37:55.31-0700 [APP/PROC/WEB/0] OUT Exit status 2
2017-08-22T14:37:55.32-0700 [CELL/0] OUT Exit status 0
2017-08-22T14:37:55.33-0700 [CELL/0] OUT Destroying container
2017-08-22T14:37:55.61-0700 [CELL/0] OUT Successfully destroyed container

snippet of manifest.yml:

  buildpack: binary_buildpack
  env:
    SERVICENAME: nfs            # service name to publish in the marketplace
    USERNAME: admin
    PASSWORD: goodpassword
    LOGLEVEL: info              # error, warn, info, debug
    DBDRIVERNAME: postgres      # mysql or postgres

    # if the database is provided via cf bind, DBSERVICENAME must be set; otherwise
    # it should be omitted and the other db connection parameters should be set.
    DBSERVICENAME: postgresql   # service name for the db service as seen in `cf marketplace`
    ALLOWED_OPTIONS: "uid,gid,auto_cache,username,password"

    DBHOST: 10.8.8.8
    DBPORT: 9999
    DBNAME: something
    DB_USERNAME: something
    DB_PASSWORD: something
    DBCACERT: something

Thank you for your help!

Test server example volumes only work on 10.0.0.0/8 networks

Hi,

When deploying the nfs test server, the example volumes located in /etc/exports only allow connections from 10.0.0.0/8 networks. This means that you have to apply a manual update on the server, or apply a hotfix to the file 'nfs-volume-release/jobs/nfstestserver/templates/install.erb' before creating the release. This can be fixed by changing these values to use a wildcard (*) or the 0.0.0.0/0 network, i.e.:

/export 10.0.0.0/8(rw,fsid=0,no_subtree_check,async)

becomes

/export 0.0.0.0/0(rw,fsid=0,no_subtree_check,async)

LDAP Server

I installed CF and the NFS test server. Is there a way to install just the LDAP server without touching the existing deployment?

App crashes when attaching to docker app

I am trying to attach an NFS volume to a docker based app but this causes the app to crash and fail. The application runs fine when the volume is not attached, but when I stop the app and attach a volume I cannot start it again.

  • CF v265
  • DIEGO v1.19.0
  • NFS-VOLUME v1.0.6
$ cf bind-service lattice-app lattice-volume -c '{"uid":"1000","gid":"1000"}'
Binding service lattice-volume to app lattice-app in org demo / space demo as demo...
OK
TIP: Use 'cf restage lattice-app' to ensure your env variable changes take effect

$ cf start lattice-app
Starting app lattice-app in org demo / space demo as demo...

0 of 3 instances running, 3 starting
0 of 3 instances running, 3 starting
0 of 3 instances running, 3 starting
0 of 3 instances running, 3 starting
0 of 3 instances running, 3 starting
0 of 3 instances running, 3 starting
0 of 3 instances running, 3 starting
0 of 3 instances running, 1 starting, 2 crashed
FAILED
Start unsuccessful

TIP: use 'cf logs lattice-app --recent' for more information

$ cf logs lattice-app --recent
Connected, dumping recent logs for app lattice-app in org demo / space demo as demo...

2017-07-13T14:48:21.04+0100 [CELL/1]     OUT Destroying container
2017-07-13T14:48:21.04+0100 [CELL/1]     OUT Successfully destroyed container
2017-07-13T14:48:21.09+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:48:21.10+0100 [CELL/1]     OUT Creating container
2017-07-13T14:48:21.12+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"6797fea2-2e66-4bbf-66b3-7638", "index"=>1, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>2, "crash_timestamp"=>1499953700980159051, "version"=>"fd03e786-fae1-4e43-bd85-32df5e4e848e"}
2017-07-13T14:48:21.27+0100 [CELL/0]     ERR Failed to create container
2017-07-13T14:48:21.34+0100 [CELL/0]     OUT Destroying container
2017-07-13T14:48:21.34+0100 [CELL/0]     OUT Successfully destroyed container
2017-07-13T14:48:21.36+0100 [CELL/2]     ERR Failed to create container
2017-07-13T14:48:21.40+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:48:21.43+0100 [CELL/2]     OUT Destroying container
2017-07-13T14:48:21.43+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"160b2f59-eed6-4f6e-448b-a67e", "index"=>0, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>2, "crash_timestamp"=>1499953701286492340, "version"=>"fd03e786-fae1-4e43-bd85-32df5e4e848e"}
2017-07-13T14:48:21.43+0100 [CELL/2]     OUT Successfully destroyed container
2017-07-13T14:48:21.44+0100 [CELL/0]     OUT Creating container
2017-07-13T14:48:21.47+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:48:21.59+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"2f48ddd8-67ee-49fe-6e54-7e55", "index"=>2, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>2, "crash_timestamp"=>1499953701371752041, "version"=>"fd03e786-fae1-4e43-bd85-32df5e4e848e"}
2017-07-13T14:48:21.61+0100 [CELL/2]     OUT Creating container
2017-07-13T14:48:29.36+0100 [CELL/1]     ERR Failed to create container
2017-07-13T14:48:29.39+0100 [CELL/1]     OUT Destroying container
2017-07-13T14:48:29.39+0100 [CELL/1]     OUT Successfully destroyed container
2017-07-13T14:48:29.44+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:48:29.47+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"82f3ecc3-758e-465b-52cf-45c3", "index"=>1, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>3, "crash_timestamp"=>1499953709377010910, "version"=>"fd03e786-fae1-4e43-bd85-32df5e4e848e"}
2017-07-13T14:48:29.70+0100 [CELL/0]     ERR Failed to create container
2017-07-13T14:48:29.72+0100 [CELL/0]     OUT Destroying container
2017-07-13T14:48:29.73+0100 [CELL/0]     OUT Successfully destroyed container
2017-07-13T14:48:29.76+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:48:29.79+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"cc9a8c54-0ae7-4f34-750b-3221", "index"=>0, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>3, "crash_timestamp"=>1499953709711415445, "version"=>"fd03e786-fae1-4e43-bd85-32df5e4e848e"}
2017-07-13T14:48:30.05+0100 [CELL/2]     ERR Failed to create container
2017-07-13T14:48:30.07+0100 [CELL/2]     OUT Destroying container
2017-07-13T14:48:30.07+0100 [CELL/2]     OUT Successfully destroyed container
2017-07-13T14:48:30.11+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:48:30.13+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"ef488043-3516-40f8-72cf-b241", "index"=>2, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>3, "crash_timestamp"=>1499953710059562820, "version"=>"fd03e786-fae1-4e43-bd85-32df5e4e848e"}
2017-07-13T14:49:15.82+0100 [CELL/0]     OUT Creating container
2017-07-13T14:49:15.83+0100 [CELL/1]     OUT Creating container
2017-07-13T14:49:15.86+0100 [CELL/2]     OUT Creating container
2017-07-13T14:49:24.18+0100 [CELL/0]     ERR Failed to create container
2017-07-13T14:49:24.23+0100 [CELL/0]     OUT Destroying container
2017-07-13T14:49:24.23+0100 [CELL/0]     OUT Successfully destroyed container
2017-07-13T14:49:24.56+0100 [CELL/2]     ERR Failed to create container
2017-07-13T14:49:24.59+0100 [CELL/2]     OUT Destroying container
2017-07-13T14:49:24.59+0100 [CELL/2]     OUT Successfully destroyed container
2017-07-13T14:49:24.66+0100 [CELL/1]     ERR Failed to create container
2017-07-13T14:49:24.67+0100 [API/0]      OUT Updated app with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 ({"state"=>"STOPPED"})
2017-07-13T14:49:24.70+0100 [CELL/1]     OUT Destroying container
2017-07-13T14:49:24.70+0100 [CELL/1]     OUT Successfully destroyed container
2017-07-13T14:50:55.04+0100 [API/0]      OUT Updated app with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 ({"state"=>"STARTED"})
2017-07-13T14:50:55.21+0100 [CELL/1]     OUT Creating container
2017-07-13T14:50:55.23+0100 [CELL/2]     OUT Creating container
2017-07-13T14:50:55.24+0100 [CELL/0]     OUT Creating container
2017-07-13T14:51:07.13+0100 [CELL/1]     ERR Failed to create container
2017-07-13T14:51:07.19+0100 [CELL/1]     OUT Destroying container
2017-07-13T14:51:07.20+0100 [CELL/1]     OUT Successfully destroyed container
2017-07-13T14:51:07.23+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:07.25+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"a05d42b9-aba3-4539-7e49-39f9", "index"=>1, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>1, "crash_timestamp"=>1499953867139187688, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:07.27+0100 [CELL/1]     OUT Creating container
2017-07-13T14:51:07.27+0100 [CELL/2]     ERR Failed to create container
2017-07-13T14:51:07.30+0100 [CELL/0]     ERR Failed to create container
2017-07-13T14:51:07.33+0100 [CELL/2]     OUT Destroying container
2017-07-13T14:51:07.34+0100 [CELL/2]     OUT Successfully destroyed container
2017-07-13T14:51:07.36+0100 [CELL/0]     OUT Destroying container
2017-07-13T14:51:07.37+0100 [CELL/0]     OUT Successfully destroyed container
2017-07-13T14:51:07.38+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:07.42+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:07.42+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"f4c37e52-b715-41a2-7aad-ec6a", "index"=>2, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>1, "crash_timestamp"=>1499953867289139364, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:07.43+0100 [CELL/2]     OUT Creating container
2017-07-13T14:51:07.45+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"0eb93fdb-e2c3-4778-425f-7fa0", "index"=>0, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>1, "crash_timestamp"=>1499953867317046507, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:07.45+0100 [CELL/0]     OUT Creating container
2017-07-13T14:51:18.37+0100 [CELL/2]     ERR Failed to create container
2017-07-13T14:51:18.43+0100 [CELL/2]     OUT Destroying container
2017-07-13T14:51:18.43+0100 [CELL/2]     OUT Successfully destroyed container
2017-07-13T14:51:18.45+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:18.48+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"a52f0e58-4326-4b9b-4e4c-7a53", "index"=>2, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>2, "crash_timestamp"=>1499953878384243087, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:18.50+0100 [CELL/2]     OUT Creating container
2017-07-13T14:51:18.81+0100 [CELL/1]     ERR Failed to create container
2017-07-13T14:51:18.88+0100 [CELL/1]     OUT Destroying container
2017-07-13T14:51:18.88+0100 [CELL/1]     OUT Successfully destroyed container
2017-07-13T14:51:18.93+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:18.95+0100 [CELL/1]     OUT Creating container
2017-07-13T14:51:18.97+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"911a85a7-1f8a-476e-57db-1a77", "index"=>1, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>2, "crash_timestamp"=>1499953878823882841, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:23.59+0100 [CELL/0]     ERR Failed to create container
2017-07-13T14:51:23.66+0100 [CELL/0]     OUT Destroying container
2017-07-13T14:51:23.67+0100 [CELL/0]     OUT Successfully destroyed container
2017-07-13T14:51:23.72+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:23.75+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"d38afb1a-6701-4ad7-6c7d-4f7c", "index"=>0, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>2, "crash_timestamp"=>1499953883608024285, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:23.77+0100 [CELL/0]     OUT Creating container
2017-07-13T14:51:31.00+0100 [CELL/2]     ERR Failed to create container
2017-07-13T14:51:31.03+0100 [CELL/2]     OUT Destroying container
2017-07-13T14:51:31.03+0100 [CELL/2]     OUT Successfully destroyed container
2017-07-13T14:51:31.07+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:31.11+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"1ec13499-90b5-482f-56b3-3095", "index"=>2, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>3, "crash_timestamp"=>1499953891012606481, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:31.21+0100 [CELL/1]     ERR Failed to create container
2017-07-13T14:51:31.24+0100 [CELL/1]     OUT Destroying container
2017-07-13T14:51:31.24+0100 [CELL/1]     OUT Successfully destroyed container
2017-07-13T14:51:31.28+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:31.32+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"e95342f4-cee8-4867-4a9f-a697", "index"=>1, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>3, "crash_timestamp"=>1499953891220437015, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}
2017-07-13T14:51:37.26+0100 [CELL/0]     ERR Failed to create container
2017-07-13T14:51:37.29+0100 [CELL/0]     OUT Destroying container
2017-07-13T14:51:37.30+0100 [CELL/0]     OUT Successfully destroyed container
2017-07-13T14:51:37.35+0100 [API/0]      OUT Process has crashed with type: "web"
2017-07-13T14:51:37.38+0100 [API/0]      OUT App instance exited with guid 788000fa-402d-4e4f-a8f3-ff5085de6636 payload: {"instance"=>"9c6dd52e-5ebc-47ee-4b97-9278", "index"=>0, "reason"=>"CRASHED", "exit_description"=>"failed to initialize container", "crash_count"=>3, "crash_timestamp"=>1499953897280095251, "version"=>"2dbbf32c-eb0a-4102-b3df-a2046f640405"}

If I detach the volume the app runs fine:

$ cf unbind-service lattice-app lattice-volume
Unbinding app lattice-app from service lattice-volume in org demo / space demo as demo...
OK

$ cf restart lattice-app
Stopping app lattice-app in org demo / space demo as demo...
OK

Starting app lattice-app in org demo / space demo as demo...

0 of 3 instances running, 3 starting
0 of 3 instances running, 3 starting
3 of 3 instances running

App started

OK

I have tested omitting the UID/GID settings, which had the same outcome as in my example above, and I have confirmed that attaching the volume to buildpack-based applications works without any issues.

NetApp NFSv3 shares with "Allow Superuser Access" disabled don't work with experimental=true

Using NFS Volume Release: 1.7.6

Netapp has a setting on nfs v3 volumes called "Allow Superuser Access" https://docs.netapp.com/ontap-9/index.jsp?topic=%2Fcom.netapp.doc.exp-nfsv3-cg%2FGUID-22C99AFB-C64C-4E99-8DD0-8F705BC803F8.html . With the nfsdriver experimental=false we recommended users "uncheck" that box and everything works fine.

With experimental=true and "Allow Superuser Access" unchecked users cannot cd to the mapfs mount. But, they can ls the mount. With "Allow Superuser Access" checked everything appears to work fine.

I'm not sure what actual benefit "Allow Superuser Access" provides from a security standpoint. But, it may be something to note for users migrating from experimental=false to experimental=true.

This also presents an edge case that seems to fail the mount verification test done at container start time. I would consider the mount inaccessible if I cannot cd into the directory.

thoughts about a possible security exploit

hi all

we're thinking about offering nfs volumes in our CF.
however, we're very concerned about the security:

  • on our NFS server, we'd have to open the exports to all the Diego cells, since traffic is NATted (as described in the README)
  • thus, every app or cell could access all NFS shares of all users
  • to mitigate this, we would disallow the apps from accessing the NFS server's IP by using security groups

but: it looks to me like the nfs volume driver mounts all service bindings that follow a specific format (volume_mounts). What if a user pushes their own service broker that gives out service bindings with volume_mounts that point to a victim's NFS share IP? Would the driver then happily mount the NFS share?

in nfsv3driver, can I somehow whitelist/blacklist brokers so that only 'our' broker is allowed to give out nfs mount points? Or can we somehow lock down the driver so that the above attack vector would be impossible?

p.s. using LDAP is not an option, since the user could use the above 'adversary broker' to circumvent the LDAP check by giving out a binding that doesn't contain the LDAP settings. This could probably be mitigated if we could force LDAP at the nfsv3driver level, but as far as I can see, this is not possible at the moment.

Add documentation binding service through manifest

Hello,

The documentation is great for setting things up and getting started, but I think it would be nice if you added to your documentation how to bind the nfs service using a sample manifest file.

For example:

cf bind-service pora nfs_test -c '{"uid":"1000","gid":"1000","mount":"/dir"}' <-- How would that be done in a manifest file?
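For what it's worth, newer cf CLI versions (v7 and later) accept per-service binding parameters in the application manifest; a hedged sketch of what that could look like for the example above (not verified against older CLI versions):

cat > manifest.yml <<'EOF'
applications:
- name: pora
  services:
  - name: nfs_test
    parameters:
      uid: "1000"
      gid: "1000"
      mount: /dir
EOF
cf push pora -f manifest.yml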

Thanks

Multitenancy support?

The current implementation of nfs-volume-release does not seem to support "multitenancy" the way cephfs-broker does.

With the same NFS export used, even when two different service instances are created and bound to two different apps, one of the apps can see files created on its NFS volume by the other.

The things I want to know are:

  1. Is this just a tentative state? (Will multitenancy be supported in the future?)
  2. If the answer to the first question is 'no', what is the best way to use the NFS volume service securely with multiple developers?
    • Providing different NFS exports for each developer
    • Using space-scoped service brokers
    • Managing UID/GID pairs on NFS appropriately
    • or something else?
  3. Is there any other production-level volume service provided as open source? (I know this is not an appropriate question to ask in an issue on this repository. Sorry.)

Broken: "description": "Service broker error: Not allowed options: gid, uid",

Environment
Pivotal Application Service: 2.0.16

I cannot supply uid, gid parameters when binding to a service:

$ cf bind-service pora NFS001 -c '{"uid":"65534","gid":"65534","mount":"/mnt","readonly":false}'

Binding service NFS001 to app pora in org Development / space ALEX001 as alex001...
Unexpected Response
Response code: 502
CC code:       0
CC error code:
Request ID:    b2dc0cc3-9c27-4f29-7b90-b0951c9d7153::d8e132c4-0f59-4d4e-9b8d-c9b284dde6ac
Description:   {
  "description": "Service broker error: Not allowed options: gid, uid",
  "error_code": "CF-ServiceBrokerBadResponse",
  "code": 10001,
  "http": {
    "uri": "https://nfsbroker.system.lab-cg.localnet.local/v2/service_instances/85a2e72c-41c4-4c47-b71b-d97e5ea9f30b/service_bindings/08924272-1485-4a7e-b970-3261ad2abad8",
    "method": "PUT",
    "status": 500
  }
}

FAILED
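For context, the bind options the broker accepts are controlled by its allowed-options configuration; when the broker is pushed as a CF app, that surfaces as the ALLOWED_OPTIONS environment variable shown in the manifest snippet earlier on this page. A hedged sketch, assuming the broker runs as an app named nfsbroker (a placeholder name):

$ cf set-env nfsbroker ALLOWED_OPTIONS "uid,gid,mount,readonly"
$ cf restage nfsbroker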

Volume attachment fails without rpcbind

We set up the nfs broker, driver, and test server to test volume services and found that the nfs driver couldn't attach shares unless we ran service rpcbind start on the test server. Did we miss something, or should the nfstestserver job also run rpcbind?

Container successfully starts when using experimental driver with non root share

I have a share that doesn't have root mount permissions turned on. My understanding of the experimental driver is that this shouldn't work. And it doesn't. However, my container successfully starts and when I attempt to access the mount directory (via sshing into the container) I get a permission denied error.

In this scenario I would expect the container to fail to start. Thoughts?

ldap Authentication on NFS volume services

We are testing our LDAP integration with the nfs shared volume. On reviewing the documents, I found the below note:

If your LDAP server password changes, you must re-bind your app to the service and restage. If you do not, your app will fail during restart or scaling. This is because user credentials are stored as part of the service binding and checked whenever an app is placed on a cell.

We have more than 1000 applications, and if we integrate with LDAP, then per AD policy the account password expires after 90 days and a new password is set. Do we then need to re-bind and restage all of the apps? That is not something we can accept in production, so how can we overcome it? We would prefer the NAS share to be mounted with a specific AD account, so that others cannot access that NAS mount.
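For reference, the re-bind flow described in the quoted note looks roughly like this (names are placeholders, following the LDAP bind examples elsewhere on this page):

$ cf unbind-service my-app nfs_service_instance
$ cf bind-service my-app nfs_service_instance -c '{"username":"<ldap-user>","password":"<new-password>","mount":"/var/vol1"}'
$ cf restage my-app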

nfsv3driver job ctl script errors aren't sent to log files

My nfsv3driver got into a state where it wouldn't start. Tailing the logs showed no log messages; the job simply wouldn't start. Running the ctl script manually (nfsv3driver_ctl start) showed me:

+ RUN_DIR=/var/vcap/sys/run/nfsv3driver
+ LOG_DIR=/var/vcap/sys/log/nfsv3driver
+ PIDFILE=/var/vcap/sys/run/nfsv3driver/nfsv3driver.pid
+ case $1 in
+ id -u cvcap
+ mkdir -p /var/vcap/sys/run/nfsv3driver
+ chown -R cvcap:cvcap /var/vcap/sys/run/nfsv3driver
+ mkdir -p /var/vcap/sys/log/nfsv3driver
+ chown -R cvcap:cvcap /var/vcap/sys/log/nfsv3driver
+ mkdir -p /var/vcap/data/voldrivers
+ chmod 777 /var/vcap/data/voldrivers
+ mkdir -p /var/vcap/data/volumes/nfs
+ chown -R cvcap:cvcap /var/vcap/data/volumes/nfs
chown: cannot access ‘/var/vcap/data/volumes/nfs/2f5673df-3f08-4ae1-98e8-06819a5f6f6b658484ad-867a-4e2e-9c97-3edce02a92dc’: Transport endpoint is not connected

I'm still tracking down how I got into this state and may create another issue for that. But, at the very least I think this is some information that should be appearing in the nfsv3driver logs.

Other cf jobs use a syslog_utils.sh script to tee all console output to files in the /var/vcap/sys/log directory.
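For illustration, one common bash pattern for that (a sketch, not the actual syslog_utils.sh helper):

# tee all subsequent stdout/stderr of this ctl script to log files as well as the console
LOG_DIR=/var/vcap/sys/log/nfsv3driver
mkdir -p "$LOG_DIR"
exec 1> >(tee -a "$LOG_DIR/nfsv3driver_ctl.stdout.log") \
     2> >(tee -a "$LOG_DIR/nfsv3driver_ctl.stderr.log" >&2)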

Something to consider.

nfsbrokerpush errand fails on v2.0.2

What is this issue about?

The nfsbrokerpush errand fails on a new CF deployment.

What version of cf-deployment are you using?

cf-deployment 7.6.0 which includes nfs-volume-service 2.0.2

Please include the bosh deploy... command, including all the operations files:

Using the deploy-cf task from cf-deployment-concourse-tasks we see the following command to interpolate the deployment manifest:

$ bosh -n interpolate \
  -v system_domain=drats-with-config-manx.cf-app.com \
  -o ops-files/operations/scale-to-one-az.yml \
  -o ops-files/operations/use-compiled-releases.yml \
  -o ops-files/operations/backup-and-restore/enable-backup-restore.yml \
  -o ops-files/operations/enable-nfs-volume-service.yml \
  -o ops-files/operations/backup-and-restore/enable-backup-restore-nfs-broker.yml \
  -o ops-files/operations/experimental/disable-interpolate-service-bindings.yml \
  -o ops-files/operations/experimental/enable-traffic-to-internal-networks.yml \
  -o ops-files/operations/experimental/enable-smb-volume-service.yml \
  cf-deployment/cf-deployment.yml

Please provide output that helps describe the issue:

Output of the nfsbrokerpush errand:

$ bosh -d cf run-errand nfsbrokerpush
Using environment 'https://10.0.0.6:25555' as client 'admin'

Using deployment 'cf'

Task 563

Task 563 | 09:53:04 | Preparing deployment: Preparing deployment
Task 563 | 09:53:15 | Preparing package compilation: Finding packages to compile (00:00:01)
Task 563 | 09:53:16 | Preparing deployment: Preparing deployment (00:00:12)
Task 563 | 09:53:16 | Creating missing vms: nfs-broker-push/268b1094-e4fa-41c3-be00-316c11874244 (0) (00:00:46)
Task 563 | 09:54:02 | Updating instance nfs-broker-push: nfs-broker-push/268b1094-e4fa-41c3-be00-316c11874244 (0) (canary) (00:02:26)
Task 563 | 09:56:28 | Running errand: nfs-broker-push/268b1094-e4fa-41c3-be00-316c11874244 (0) (00:00:03)
Task 563 | 09:56:31 | Fetching logs for nfs-broker-push/268b1094-e4fa-41c3-be00-316c11874244 (0): Finding and packing log files (00:00:01)

Task 563 Started  Wed Mar  6 09:53:04 UTC 2019
Task 563 Finished Wed Mar  6 09:56:32 UTC 2019
Task 563 Duration 00:03:28
Task 563 done

Instance   nfs-broker-push/268b1094-e4fa-41c3-be00-316c11874244  
Exit Code  2  
Stdout     cf version 6.43.0+815ea2f3d.2019-02-20  
           Logging out ...  
           OK  
             
           Setting api endpoint to https://api.drats-with-config-manx.cf-app.com...  
           OK  
             
           api endpoint:   https://api.drats-with-config-manx.cf-app.com  
           api version:    2.132.0  
           Not logged in. Use 'cf login' to log in.  
           API endpoint: https://api.drats-with-config-manx.cf-app.com  
           Authenticating...  
           OK  
             
           Use 'cf target' to view or set your target org and space.  
           Creating org system as nfs-broker-push-client...  
           OK  
             
           api endpoint:   https://api.drats-with-config-manx.cf-app.com  
           api version:    2.132.0  
           user:           nfs-broker-push-client  
           org:            system  
           space:          nfs-broker-space  
           Creating space nfs-broker-space in org system as nfs-broker-push-client...  
           OK  
             
           api endpoint:   https://api.drats-with-config-manx.cf-app.com  
           api version:    2.132.0  
           user:           nfs-broker-push-client  
           org:            system  
           space:          nfs-broker-space  
           ---  
           applications:  
           - name: "nfs-broker"  
             buildpacks:  
             - binary_buildpack  
             routes:  
             - route: "nfs-broker.drats-with-config-manx.cf-app.com"  
             memory: "256M"  
             env:  
               USERNAME: "nfs-broker"  
               PASSWORD: ****  
                 
               DB_USERNAME: "nfs-broker"  
               DB_PASSWORD: ****  
                 
           Creating security group credhub_open as   
           OK  
           Security group credhub_open already exists  
           Updating security group credhub_open as   
           OK  
             
             
             
           TIP: Changes will not apply to existing running applications until they are restarted.  
           Assigning security group credhub_open to space nfs-broker-space in org system as nfs-broker-push-client...  
           OK  
             
           TIP: Changes require an app restart (for running) or restage (for staging) to apply to existing applications.  
           {"timestamp":"1551866190.857934475","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.migrating","log_level":1,"data":{}}  
           {"timestamp":"1551866190.858255625","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.initialize-database.start","log_level":1,"data":{"session":"1"}}  
           {"timestamp":"1551866190.858333588","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.initialize-database.mysql-connection-connect.start","log_level":1,"data":{"session":"1.1"}}  
           {"timestamp":"1551866190.858571291","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.initialize-database.mysql-connection-connect.end","log_level":1,"data":{"session":"1.1"}}  
           {"timestamp":"1551866190.924384832","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.initialize-database.end","log_level":1,"data":{"session":"1"}}  
           {"timestamp":"1551866190.924566269","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.sql-failed-to-initialize-database","log_level":2,"data":{"error":"Error 1045: Access denied for user 'nfs-broker'@'10.0.1.2' (using password: YES)"}}  
           {"timestamp":"1551866190.924676895","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.failed-to-initialize-sql-store","log_level":3,"data":{"error":"Error 1045: Access denied for user 'nfs-broker'@'10.0.1.2' (using password: YES)","trace":"goroutine 1 [running]:\ncode.cloudfoundry.org/lager.(*logger).Fatal(0xc0000603c0, 0x867b97, 0x1e, 0x8c7dc0, 0xc000170980, 0x0, 0x0, 0x0)\n\t/var/vcap/packages/migrate_mysql_to_credhub/src/code.cloudfoundry.org/lager/logger.go:138 +0xcd\nmain.main()\n\t/var/vcap/packages/migrate_mysql_to_credhub/src/code.cloudfoundry.org/migrate_mysql_to_credhub/main.go:98 +0x3bb\n"}}  
           {"timestamp":"1551866190.924763918","source":"migrate_mysql_to_credhub","message":"migrate_mysql_to_credhub.ends","log_level":1,"data":{}}  
             
Stderr     Org system already exists.  
             
           Space nfs-broker-space already exists  
             
           App nfs-broker not found  
           panic: Error 1045: Access denied for user 'nfs-broker'@'10.0.1.2' (using password: YES)  
             
           goroutine 1 [running]:  
           code.cloudfoundry.org/lager.(*logger).Fatal(0xc0000603c0, 0x867b97, 0x1e, 0x8c7dc0, 0xc000170980, 0x0, 0x0, 0x0)  
           	/var/vcap/packages/migrate_mysql_to_credhub/src/code.cloudfoundry.org/lager/logger.go:162 +0x609  
           main.main()  
           	/var/vcap/packages/migrate_mysql_to_credhub/src/code.cloudfoundry.org/migrate_mysql_to_credhub/main.go:98 +0x3bb  

What IaaS is this issue occurring on?

GCP

Is there anything else unique or special about your setup?

We noticed that we did not delete credentials generated for the previous CF deployment from CredHub.

Tag your pair, your PM, and/or team!

Josh and @terminatingcode
cc @cloudfoundry/bosh-backup-and-restore-team

"Service broker error: Not allowed options: username, password"

PAS is enabled with LDAP, and I am not able to bind the service using a username and password.

cf bind-service my-test-nfs nfs_service_instance -c '{"username":"username","password":"******","mount":"/var/vol1","readonly":false}'

Binding service nfs_service_instance to app my-test-nfs in org org-name / space development as username...
Unexpected Response
Response code: 502
CC code: 0
CC error code:
Request ID: 7f1570f2-cd1b-4cd3-5b4d-7d74c4b50694::ec7824a6-871e-4b65-b805-582b50fcd397
Description:   {
  "description": "Service broker error: Not allowed options: password, username",
  "error_code": "CF-ServiceBrokerBadResponse",
  "code": 10001,
  "http": {
    "uri": "https://nfsbroker.system.us-chd1-np2.1dc.com/v2/service_instances/6da57ba5-9909-41b8-ac66-de381396aaf4/service_bindings/842c4588-1255-4307-9de0-e0922e8b2d26",
    "method": "PUT",
    "status": 500
  }
}

FAILED

Any input can be helpful to fix this issue.

Error after restaging the application

hi,
We have deployed the nfs broker along with the test server and tried pushing a volume application after binding it to the nfs service, and it worked.
But now, when I try to push the same application again, it gives me an error while restaging the application.

Any suggestions for troubleshooting?

Thanks in advance.

UID/GID based mapping NFS-experimental


Version of nfs-volume-release
Which version did you deploy with?
2.3.1

Version of Cloudfoundry
Which version of CF are you running?
6.38

I am implementing NFS shared volumes in our environment using UID/GID-based mapping.

Apprestage-logs.txt
nfs - I am able to mount the nfs share successfully.
nfs-experimental - Failed to mount the nfs share.

On the diego cell, I am able to mount both NFS version 3 and version 4 without any issues, using the steps below:
mkdir foo
sudo mount -t nfs -o vers=3 <server>:<share> foo
sudo umount foo
sudo mount -t nfs -o vers=4 <server>:<share> foo
sudo umount foo

Steps Followed:

  1. cf create-service nfs-experimental Existing nfs_service_instance-test -c '{"uid":"0","gid":"0","share":"10.xxx/QVOL_PCFCloud","version":"4.0"}'
  2. cf bind-service my-test-nfs nfs_service_instance -c '{"uid":"0","gid":"0","mount":"/var/vol1","readonly":false}'
  3. cf restage my-test-nfs -> Failed to mount NFS

Any idea or suggestion will be highly useful.

nfsv3driver is using a shared bpm volume

Hi,

While working on a completely unrelated bug, we (CF Garden) noticed that the nfsv3driver job declares the nfsv3driver.cell_mount_path BPM volume as shared. Looking at the history, this seems to be spike work that made its way to master.
So far this has had no impact, because BPM did not support shared volumes and the shared attribute was simply ignored. As of BPM 1.1.0, shared volumes are officially supported, and I wanted to give you a heads-up to make sure that you do not encounter undesired behaviour.

Submodules -> Versioned Deps

It would be useful for downstream consumers if there was a decoupling of the bosh release from the service broker itself. This would allow for the service broker to progress independently of the bosh release, allowing for a bit more flexibility when dealing with bugs and issues.

From cloudfoundry/nfsbroker#5:

we treat nfs-volume-release as a GOPATH and just gather all the dependencies in the correct versions as git submodules

With Go modules, the need for GOPATH munging is mostly over, as code can be built in place. Decoupling the git submodules from the development of the broker aligns much better with the Go philosophy moving forward. It also makes working with the broker much more lightweight, as you aren't carrying a lot of different submodules forward all the time.

cloudfoundry/nfsbroker#8 allows for the broker to be independently managed, which is a dependency of this issue.

enable-nfs-volume-service.yml

Hi,

I was looking to enable nfs on my open source Cloud Foundry distribution deployed to Azure (using https://github.com/cloudfoundry/bosh-azure-cpi-release/tree/master/docs )

Looking at your readme, you reference an ops file "enable-nfs-volume-services.yml"; however, I noticed this file no longer appears in your operations directory.

Two questions: firstly, am I doing the right thing, and secondly, am I following the right instructions?

I appreciate that this is more of a usage question and am happy to relocate it if you think there is somewhere more appropriate, in which case this may just be a pointer to update the readme.

Thanks!

NFS Broker doesn't currently support Postgres

The nfsbroker itself has CLI support baked into it for Postgres, but when you try and leverage it, you get this error:

   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR panic: sql: unknown driver "postgres" (forgotten import?)
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR goroutine 1 [running]:
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR code.cloudfoundry.org/lager.(*logger).Fatal(0xc420064600, 0x91893a, 0x19, 0xb5b7c0, 0xc420047c30, 0x0, 0x0, 0x0)
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR 	/tmp/build/1110d2e9/release-src/src/code.cloudfoundry.org/lager/logger.go:162 +0x611
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR code.cloudfoundry.org/service-broker-store/brokerstore.NewStore(0xb648e0, 0xc420064600, 0x7ffe50a5b562, 0x8, 0xc420016c40, 0x10, 0xc420014900, 0x19, 0xc42001c4b0, 0x4d, ...)
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR 	/tmp/build/1110d2e9/release-src/src/code.cloudfoundry.org/service-broker-store/brokerstore/store.go:54 +0x645
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR main.createServer(0xb648e0, 0xc420064600, 0xc420064600, 0x90b08e)
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR 	/tmp/build/1110d2e9/release-src/src/code.cloudfoundry.org/nfsbroker/main.go:288 +0x38f
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR main.main()
   2019-05-30T16:07:55.29-0600 [APP/PROC/WEB/0] ERR 	/tmp/build/1110d2e9/release-src/src/code.cloudfoundry.org/nfsbroker/main.go:159 +0x110

Upon further investigation, the cause is that there is no driver support for Postgres in cloudfoundry/service-broker-store.

In order to resolve this, cloudfoundry/goshims#5 needs to first be addressed to add core shim support. Once that's been resolved, cloudfoundry/service-broker-store#4 needs to be addressed. Once those are handled, in that order, this issue should go away and a new release can be cut.

Detect stale nfs mounts and mark as unhealthy

This might be a rare problem for some, but we make extensive use of this driver and occasionally diego-cells get a stale mount. New containers will fail to start with "failed to mount volume".

I wrote a script to detect this case and I'd like it added to the health check for the daemon so that bosh can automatically recreate the cell.

MOUNTS=($(mount -v | grep -E -o '/var/vcap/data/volumes/nfs/[a-f0-9\-]+'))
for mount in "${MOUNTS[@]}"
do
   timeout 5s ls "$mount" > /dev/null
   exit_code=$?
   if [ $exit_code -ne 0 ]; then
     echo "stale mount $mount $exit_code";
     exit $exit_code
   fi
done

The idea would be to have a monit check that fails when one of the mounts goes stale

Attempt to improve mount failure message reported to user in case of access denied

Today, when an nfsv3 volume fails to mount because of access denied, all the user gets back is a crash description of "failed to mount volume". If I attempt to mount on the command line, I see a more useful error like "mount.nfs: access denied by server while mounting". It would be nice if we could expose more useful failure reasons to the user, as in this access-denied case.

Proxy properties ignored

The http_proxy, https_proxy and no_proxy properties are ignored, therefore the nfstestserver job fails when running apt-get.

ldap Authentication on NFS volume services

Hi Team

I am getting the below error when restaging apps after binding the nfs volume. I am using nfs-volume version 2.3.1.

cf create-service nfs Existing nfs_service_instance -c '{"share":"10.x.x.x/QVOL_PCFcloud"}'
cf bind-service test-java nfs_service_instance -c '{"username":"xxx-xxxxx","password":"xxx-xxxxx","mount":"/var/vol1","readonly":false}'
cf restage test-java --> Failed with below logs

2019-10-16T12:33:24.44+0000 [API/0] OUT App instance exited with guid e099054d-744e-4fee-8d37-66279cd24281 payload: {"instance"=>"fe4ae7ce-a992-4163-630a-82a1", "index"=>0, "cell_id"=>"f7dcc0db-4bf6-4a58-9ee3-86057fce2acf", "reason"=>"CRASHED", "exit_description"=>"failed to mount volume, errors: LDAP server could not be reached, please contact your system administrator", "crash_count"=>3, "crash_timestamp"=>1571229204423131446, "version"=>"01edd9bc-b045-49b3-ad10-71e2b8878fc4"}

I am able to mount and unmount the nfs share properly on the diego cell. We can have a Webex to troubleshoot this further.

Any troubleshooting steps would be much appreciated.

nfs-volume/1.6.0 support Xenial Stemcell

Hi Team,

I am using nfs-volume/1.6.0 in our environment, with stemcell
bosh-vsphere-esxi-ubuntu-trusty-go_agent/3541.34.

Kindly confirm whether nfs-volume/1.6.0 supports the Xenial stemcell,
and also from which version onwards the Xenial stemcell is supported.

Error: Transport endpoint is not connected and how to configure NFS mount options

We have been trying to use nfs-volume-release (v1.1.0) as the persistent store for a Postgres container (a Docker-based application) in CF. The Postgres container runs fine for a while, but as soon as it comes under any load, the Postgres server fails with this error:

cannot access /data: Transport endpoint is not connected

The error occurs frequently, and the problem goes away if we restart the Postgres container (app).

The Postgres docs (https://www.postgresql.org/docs/9.5/static/creating-cluster.html) suggest that we should use the sync and hard options for the NFS mount when running Postgres on a network file system (NFS), and another source (https://dhelios.blogspot.com/2016/07/nfs-mount-options.html) suggests some additional NFS mount options.

If nfs-volume-release is using fuse-nfs (https://github.com/sahlberg/fuse-nfs), how do we configure these NFS mount options (hard, sync, bg, etc.) when mounting NFS shares into Docker containers (apps) in CF?

NFS Mount issue using LDAP

Hi Team

We are using NFS shared volumes, version 2.3.1. We have enabled LDAP authentication, and our environment uses Windows AD, so we enabled Unix attributes for the account in AD:

uidNumber: 10001
gidNumber: 10001
loginshell: /bin/bash
unixHomeDirectory: /bin/nologin

While restaging the app, the NFS share failed to mount; the nfsv3driver log is attached for reference. Kindly let me know where it went wrong. Thanks in advance.
nfsv3driver.log
