
Comments (4)

edmc-ss commented on September 22, 2024

I did just verify that, in my containerized environment at least, all three size= values succeed on stable/HEAD (2.01.00).

from proxyfs.

SadeghTb commented on September 22, 2024

1 MiB sample result:

4KiB_write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
4KiB_read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.16
Starting 2 processes
4KiB_write: Laying out IO file (1 file / 1MiB)
4KiB_read: Laying out IO file (1 file / 1MiB)

4KiB_write: (groupid=0, jobs=2): err= 0: pid=1621: Sat Apr  9 17:06:41 2022
  read: IOPS=1753, BW=7014KiB/s (7182kB/s)(1024KiB/146msec)
    clat (nsec): min=1800, max=33275k, avg=552421.08, stdev=2065224.15
     lat (nsec): min=1869, max=33276k, avg=552658.18, stdev=2065271.34
    clat percentiles (usec):
     |  1.00th=[   46],  5.00th=[  227], 10.00th=[  273], 20.00th=[  306],
     | 30.00th=[  343], 40.00th=[  371], 50.00th=[  392], 60.00th=[  420],
     | 70.00th=[  453], 80.00th=[  490], 90.00th=[  570], 95.00th=[  693],
     | 99.00th=[ 1647], 99.50th=[ 2606], 99.90th=[33162], 99.95th=[33162],
     | 99.99th=[33162]
  write: IOPS=522, BW=2090KiB/s (2140kB/s)(1024KiB/490msec); 0 zone resets
    clat (usec): min=215, max=105894, avg=1387.33, stdev=9403.43
     lat (usec): min=215, max=105895, avg=1387.69, stdev=9403.47
    clat percentiles (usec):
     |  1.00th=[   231],  5.00th=[   281], 10.00th=[   289], 20.00th=[   322],
     | 30.00th=[   338], 40.00th=[   367], 50.00th=[   388], 60.00th=[   416],
     | 70.00th=[   437], 80.00th=[   482], 90.00th=[   578], 95.00th=[   840],
     | 99.00th=[ 33817], 99.50th=[103285], 99.90th=[105382], 99.95th=[105382],
     | 99.99th=[105382]
  lat (usec)   : 2=0.20%, 10=0.20%, 50=0.20%, 250=3.91%, 500=77.73%
  lat (usec)   : 750=12.89%, 1000=1.56%
  lat (msec)   : 2=1.76%, 4=0.78%, 50=0.39%, 250=0.39%
  cpu          : usr=0.00%, sys=4.42%, ctx=1112, majf=0, minf=24
  IO depths    : 1=100.2%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=256,256,0,1 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=7014KiB/s (7182kB/s), 7014KiB/s-7014KiB/s (7182kB/s-7182kB/s), io=1024KiB (1049kB), run=146-146msec
  WRITE: bw=2090KiB/s (2140kB/s), 2090KiB/s-2090KiB/s (2140kB/s-2140kB/s), io=1024KiB (1049kB), run=490-490msec

10MiB sample result:

4KiB_write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
4KiB_read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.16
Starting 2 processes
4KiB_write: Laying out IO file (1 file / 10MiB)
4KiB_read: Laying out IO file (1 file / 10MiB)
^C^C
fio: terminating on signal 2


Run status group 0 (all jobs):


edmc-ss commented on September 22, 2024

Nice detail in the issue, SadeghTb. I believe the published fio.conf file specifies size=100Mi, rather than the 1Mi or 10Mi, correct?

I ran with each of the size= settings of 1Mi, 10Mi, and 100Mi... and it worked just fine in my container setup. I will say that, running in the containerized environment (currently), the lowly 10MiB/sec does mean the test pauses for a bit while it lays out the 100MiB file... and then takes 10-ish seconds to complete:

=====

4KiB_write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
4KiB_read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.28
Starting 2 processes
4KiB_write: Laying out IO file (1 file / 100MiB)
4KiB_read: Laying out IO file (1 file / 100MiB)
Jobs: 2 (f=2): [W(1),R(1)][15.0%][r=11.7MiB/s,w=6956KiB/s][r=2996,w=1739 IOPS]
Jobs: 2 (f=2): [W(1),R(1)][23.5%][r=9722KiB/s,w=8247KiB/s][r=2430,w=2061 IOPS]
Jobs: 2 (f=2): [W(1),R(1)][31.2%][r=13.0MiB/s,w=7492KiB/s][r=3323,w=1873 IOPS]
Jobs: 2 (f=2): [W(1),R(1)][40.0%][r=10.3MiB/s,w=7808KiB/s][r=2641,w=1952 IOPS]
Jobs: 2 (f=2): [W(1),R(1)][46.7%][r=9100KiB/s,w=7480KiB/s][r=2275,w=1870 IOPS]
Jobs: 2 (f=2): [W(1),R(1)][53.3%][r=9502KiB/s,w=7272KiB/s][r=2375,w=1818 IOPS]
Jobs: 2 (f=2): [W(1),R(1)][60.0%][r=12.9MiB/s,w=7480KiB/s][r=3291,w=1870 IOPS]
Jobs: 1 (f=1): [W(1),_(1)][66.7%][r=12.4MiB/s,w=7444KiB/s][r=3173,w=1861 IOPS]
Jobs: 1 (f=1): [W(1),_(1)][78.6%][w=9285KiB/s][w=2321 IOPS][eta 00m:03s]
Jobs: 1 (f=1): [W(1),_(1)][100.0%][w=8960KiB/s][w=2240 IOPS][eta 00m:00s]
4KiB_write: (groupid=0, jobs=2): err= 0: pid=3191: Sun Apr 10 06:03:07 2022
  read: IOPS=2874, BW=11.2MiB/s (11.8MB/s)(100MiB/8906msec)
    clat (nsec): min=474, max=83337k, avg=346269.03, stdev=1326771.80
     lat (nsec): min=497, max=83338k, avg=346427.62, stdev=1326782.20
    clat percentiles (nsec):
     |  1.00th=[     498],  5.00th=[     516], 10.00th=[     540], 20.00th=[     612],
     | 30.00th=[   31616], 40.00th=[  220160], 50.00th=[  309248], 60.00th=[  378880],
     | 70.00th=[  436224], 80.00th=[  493568], 90.00th=[  561152], 95.00th=[  643072],
     | 99.00th=[ 1011712], 99.50th=[ 1843200], 99.90th=[11599872], 99.95th=[12255232],
     | 99.99th=[71827456]
   bw (  KiB/s): min= 8495, max=15032, per=98.49%, avg=11324.24, stdev=1948.10, samples=17
   iops        : min= 2123, max= 3758, avg=2831.06, stdev=487.02, samples=17
  write: IOPS=1989, BW=7957KiB/s (8148kB/s)(100MiB/12869msec); 0 zone resets
    clat (usec): min=64, max=83442, avg=496.97, stdev=1562.39
     lat (usec): min=64, max=83442, avg=497.19, stdev=1562.39
    clat percentiles (usec):
     |  1.00th=[  137],  5.00th=[  223], 10.00th=[  269], 20.00th=[  326],
     | 30.00th=[  367], 40.00th=[  396], 50.00th=[  416], 60.00th=[  445],
     | 70.00th=[  474], 80.00th=[  515], 90.00th=[  578], 95.00th=[  652],
     | 99.00th=[ 1090], 99.50th=[ 2073], 99.90th=[11863], 99.95th=[12911],
     | 99.99th=[79168]
   bw (  KiB/s): min= 6432, max=10416, per=100.00%, avg=7987.68, stdev=1022.27, samples=25
   iops        : min= 1608, max= 2604, avg=1996.84, stdev=255.54, samples=25
  lat (nsec)   : 500=0.83%, 750=11.23%, 1000=0.15%
  lat (usec)   : 2=1.09%, 4=0.05%, 10=0.75%, 20=0.34%, 50=0.80%
  lat (usec)   : 100=0.60%, 250=9.54%, 500=53.92%, 750=18.02%, 1000=1.57%
  lat (msec)   : 2=0.60%, 4=0.07%, 10=0.02%, 20=0.38%, 100=0.03%
  cpu          : usr=0.94%, sys=9.84%, ctx=84140, majf=0, minf=26
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=25600,25600,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=11.2MiB/s (11.8MB/s), 11.2MiB/s-11.2MiB/s (11.8MB/s-11.8MB/s), io=100MiB (105MB), run=8906-8906msec
  WRITE: bw=7957KiB/s (8148kB/s), 7957KiB/s-7957KiB/s (8148kB/s-8148kB/s), io=100MiB (105MB), run=12869-12869msec

=====
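
For reference, here is a minimal fio job file consistent with the job names and parameters visible in the logs above. This is a sketch, not the published fio.conf; the directory= path in particular is an assumption and should be pointed at your ProxyFS mount.

```ini
; Sketch of a job file matching the output above -- NOT the published fio.conf.
; directory= is an assumed path; set it to your ProxyFS FUSE mount point.
[global]
ioengine=psync
iodepth=1
bs=4k
size=100Mi
directory=/mnt/proxyfs

[4KiB_write]
rw=write

[4KiB_read]
rw=read
```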

When I've seen hangs like this in the past, it's always boiled down to one of two things:

  1. I've got some "lock" I've neglected to release whereupon a subsequent attempt to re-acquire the lock hangs...

  2. I've run out of disk space in my Swift instance...

To get to the bottom of #1, just dumping the stack traces for all goroutines is sufficient to nail the culprit. #2 is exposed with df commands on the Swift Object Server(s), of course.

As you've probably seen, ProxyFS is composed of two or three components:

ickpt - this guy works around the "eventual consistency" of Swift Object PUT behavior: the CheckPoint Record is PUT to Object "0000000000000000" each time, so a subsequent GET (upon restarting) is not guaranteed to return the last CheckPoint Record that was PUT. You can certainly run without ickpt - I do this all the time in test environments where the eventual consistency of Swift is very unlikely to bite us (i.e. since there are no failures of Swift nodes during the test sequence).
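
The stale-read hazard described above can be modeled with a toy in-memory store; the types below are illustrative only and are not the ickpt implementation.

```go
package main

import "fmt"

// Toy model of an eventually consistent object store: a PUT may not be
// visible to the very next GET. Illustrative only -- not ickpt.
type checkpoint struct {
	seq    uint64
	rootID uint64 // e.g. a B+Tree root object number
}

type store struct {
	committed checkpoint // what a GET might still return
	latest    checkpoint // what was most recently PUT
}

func (s *store) put(c checkpoint) { s.latest = c } // replication lags behind
func (s *store) get() checkpoint  { return s.committed }

func main() {
	s := &store{}
	s.put(checkpoint{seq: 1, rootID: 42})
	s.put(checkpoint{seq: 2, rootID: 43})

	got := s.get()
	if got.seq < s.latest.seq {
		// prints: stale checkpoint: got seq 0, latest is 2
		fmt.Printf("stale checkpoint: got seq %d, latest is %d\n", got.seq, s.latest.seq)
	}
}
```

Embedding a sequence number in the record, as sketched here, is one way a restarting reader can at least detect that it received a stale checkpoint.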

imgr - this is the "centralized" metadata server. Well, really, it is just the entity recording updates to the Inode Table... a simple mapping from Inode Number to an Object Number in the Swift Container. It is also responsible for tracking "leases" (the distributed lock manager) and open/close references. The only Objects it PUTs are the CheckPoint Record Object and the B+Tree "pages" that get modified when the Inode Table is updated.

iclient - this is the FUSE file system driver that really does all the heavy lifting of modifying the Objects making up the file system elements (i.e. directories, files, and symlinks).

BTW - cool that you enabled tracing. You can do that on the imgr side as well... it kind of paints both ends of the RPC exchanges between iclient & imgr that way.

I'll keep trying to reproduce your issue over the next couple of days... but any further info you might provide would be extremely helpful. One such critical piece of info would be the commit tag you are running. Here's mine (this is currently development/HEAD):

$ git describe
2.01.0-10-gb00f4d7f

I suppose you might be using the HEAD of stable:

$ git describe
2.01.0

I haven't tested that one but will endeavor to do so at my end. There have certainly been improvements in the 10 commits on the development branch. Frankly, we are nearly done getting through the portion of the standard PJD test suite (https://github.com/pjd/pjdfstest) that is pertinent to the feature set of ProxyFS (i.e. only dirs, files, and symlinks... no atime, no sticky/setuid/setgid, etc.). The intention is to do a 2.02.00 tagged release once we get to that point... probably this upcoming week.


SadeghTb commented on September 22, 2024

Thank you so much for your complete answer, edmc-ss.
Yeah, it turned out not to be an issue with ProxyFS; the problem was my Swift instance, whose backend storage had filled up since it had a capacity of only a few megabytes.

