Hello,
In my application I need to access byte arrays stored on disk concurrently and as quickly as possible.
I have a multi-threaded method where each thread performs read/write access to its own byte array, one array per thread. The method is executed in a loop: at each iteration I need to read the previous state and save the next one. The read is performed at the beginning of the method, to initialize it, and the write at the end, to store the iteration's state. The length of each byte array may change from one iteration to the next.
To satisfy these requirements I use an ESENT database where each byte array is stored in a separate table with a single row and a single column.
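For reference, the layout described above can be set up with ManagedEsent roughly as follows. This is a simplified sketch, not my actual code: the instance, table, and column names ("demo", "array_0", "data") are placeholders.

```csharp
using System;
using Microsoft.Isam.Esent.Interop; // ManagedEsent NuGet package

class EsentLayoutSketch
{
    static void Main()
    {
        using (var instance = new Instance("demo"))
        {
            instance.Init();
            using (var session = new Session(instance))
            {
                JET_DBID dbid;
                Api.JetCreateDatabase(session, "demo.edb", null, out dbid,
                                      CreateDatabaseGrbit.OverwriteExisting);

                // One table per byte array; each table holds the whole
                // array in a single long-binary column.
                JET_TABLEID tableid;
                Api.JetCreateTable(session, dbid, "array_0", 0, 100, out tableid);
                JET_COLUMNID columnid;
                Api.JetAddColumn(session, tableid, "data",
                    new JET_COLUMNDEF { coltyp = JET_coltyp.LongBinary },
                    null, 0, out columnid);

                // Insert the single row that later iterations overwrite.
                using (var update = new Update(session, tableid, JET_prep.Insert))
                {
                    Api.SetColumn(session, tableid, columnid, new byte[16]);
                    update.Save();
                }
                Api.JetCloseTable(session, tableid);
            }
        }
    }
}
```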
Everything works well on my development machine, with a relatively low degree of parallelism and a fast disk (SSD), but, as often happens, the problem arose in the production environment: a virtualized machine with 12 cores and a possibly slower, but still SSD, disk.
The size of the .edb file on the production machine reached ~50GB after a few iterations, while in the dev environment it never exceeded 15GB.
I also reproduced the problem in my dev environment by writing a program that just writes byte arrays of random length, filled with random data, thus maximizing the concurrent write accesses.
I temporarily solved the problem with a lock statement around the write operation, but I'm looking for a better solution (e.g. using a number of databases equal to the degree of parallelism).
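To illustrate the idea hinted at in the parenthesis, one possible shape for it (which I have not implemented; the instance names and paths are placeholders) would be one ESENT instance and one database file per partition, so that concurrent writers never share a .edb file:

```csharp
using System;
using System.IO;
using Microsoft.Isam.Esent.Interop; // ManagedEsent NuGet package

class PerPartitionDatabases
{
    static void Main()
    {
        int partitions = Environment.ProcessorCount;
        var instances = new Instance[partitions];

        // One independent ESENT instance (and .edb file) per partition.
        for (int p = 0; p < partitions; p++)
        {
            string dir = "part_" + p;
            Directory.CreateDirectory(dir);

            instances[p] = new Instance("part_" + p);
            // Keep each instance's logs and system files in its own directory.
            instances[p].Parameters.LogFileDirectory = dir;
            instances[p].Parameters.SystemDirectory = dir;
            instances[p].Parameters.CircularLog = true;
            instances[p].Init();

            using (var session = new Session(instances[p]))
            {
                JET_DBID dbid;
                Api.JetCreateDatabase(session, Path.Combine(dir, "data.edb"),
                    null, out dbid, CreateDatabaseGrbit.OverwriteExisting);
            }
        }

        foreach (var instance in instances)
            instance.Dispose();
    }
}
```

Each thread (or group of threads) would then write only to its own partition's database, avoiding lock contention on a single file.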
The following is a snippet from my test application, showing both the fully parallel and the locked versions:
string parallel = "parallel.csv";
using (StreamWriter outfile = new StreamWriter(parallel))
{
    for (int i = 0; i < iterations; i++)
    {
        Parallel.For(0, tables, j =>
        {
            // One Random instance per table index, to avoid sharing state.
            Random r = rr[j];
            int size = r.Next(minBytes, maxBytes);
            byte[] buffer = new byte[size];
            r.NextBytes(buffer);
            esentDb.Write(j, buffer); // save the data on disk
        });
        // Log the database size after each iteration
        // (edbPath is the path to the .edb file, not shown here).
        outfile.WriteLine("{0}", new FileInfo(edbPath).Length);
    }
}
string locked = "locked.csv";
using (StreamWriter outfile = new StreamWriter(locked))
{
    for (int i = 0; i < iterations; i++)
    {
        Parallel.For(0, tables, j =>
        {
            Random r = rr[j];
            int size = r.Next(minBytes, maxBytes);
            byte[] buffer = new byte[size];
            r.NextBytes(buffer);
            lock (lockObject)
            {
                esentDb.Write(j, buffer); // save the data on disk
            }
        });
        // Log the database size after each iteration
        // (edbPath is the path to the .edb file, not shown here).
        outfile.WriteLine("{0}", new FileInfo(edbPath).Length);
    }
}
With this example I obtained a .edb file size of ~1.5GB in the fully parallel case and ~300MB in the locked one, both after 100 iterations.