Author: Alexandre Bagnoud ([email protected]) version: 29th of octobre 2019
This script automatically analyses daily reports generated by the AMPTS II system in order to measure the biomethane potential (BMP) of substrate. The pipeline relies on the biogas R package from Sasha D. Hafner et al.
All the inputs and outputs related to this script can be found in this repository. The R version of this script is also available here.
Providing a an AMPTS II daily report (as csv format) and another csv file that describes the BMP test setup, the script will automatically generate the following outputs:
- A table summarizing for each reactor and each day the gross methane production (in NmL CH4);
- A plot showing the cumulated methane production for each reactor (in NmL CH4);
- A plot showing for each reactor the methane production rate (i.e. the percentage of the daily CH4 production over the total CH4 production);
- A plot showing for each reactor and for each day the BMP value (in NmL CH4 g-1 vs);
- A table indicating for each reactor and for each day some useful test indicators (such as the CH4 rate production, the relative standard deviation of the inoculum, and the CH4 produced by the inoculum);
- A table summarizing the BMP values for each substrate tested and for three termination criteria:
- end of the test (all date are used);
- after 25 days of test;
- when the daily CH4 production rate stays below 1% of the total CH4 production during three consecutive days;
- A barplot showing the BMP (expressed as NmL CH4 g-1 vs) for each sample tested;
- A barplot showing the BMP (expressed as NmL CH4 g-1 ww) for each sample tested;
- A table which reports the BMP values for each substrates, alongside with other useful (RSD of cellulose, RSD of substrates, fraction of CH4 produced by the inoculum, etc.).
Use here the daily report generated by AMPTS II. If not in csv format, you'll have to convert it by opening the xml file with Excel, and then by saving it as csv file.
Place all the report files into a subfolder named log_files
.
The setup file bmp_setup.csv
is a table that includes the following information (for each reactor):
bottle.num
: the bottle number, which should correspond to the bottle number of the AMPTS II assay;bottle.id
: a unique identifier for each reactor;inoc.only
: 1 for inoculum-only reactors, 0 for the others;pos.control
: 1 for the positive control reactors (e.g. cellulose), 0 for the others;inoc.mass
: the mass of the inoculum used in each reactor (ww- or vs-based);substrate.mass.vs
: the mass of the substrate used in each reactor (VS-based only);sample.group
: a name for each group of sample replicates;plot.label
: the label that will be displayed on the plots for each group of sample replicates;plot.color
: the color (in hex format) that will be used for each reactors in the plots;dw
: the dry weight of each substrate used (in % [g DW g-1 ww]);loi
: the loss on ignition of each substrate used (in % [g loi g-1 dw]).
Set below your R variables:
bmp.folder
: the folder that contains thelog_files/
folder and the setup file. The outputs will be generated here.log.folder
: the name of the folder containing the daily reports.setup.file
: the path to the setup file (relative tobmp.folder
).output.folder
: the name of the folder in which outputs will be saved. This folder will be created inbmp.folder
.inoculum
: a short description of the inoculum used for the BMP assay. It will be only used for generating the final report.start.time
: the starting date and time of the BMP assay (in"%Y-%m-%d_%H%M"
format).
# Input files and output folder
bmp.folder <- "D:/eistore2_local/BMP/BMP_R_analysis/ampts2_scripts/github_repo/"
log.folder <- "log_files/"
setup.file <- "setup_bmp.csv"
output.folder <- "output_R_v1/"
inoculum <- "Anaerobic sludge from WWTP"
start.time <- "2019-10-07 11:30:00"
# Load R packages
library("biogas")
library("ggplot2")
library("RColorBrewer")
library("reshape2")
setwd(bmp.folder)
In case log_files/
contains multiple daily reports, only the most recent one will be considered. Outputs from previous days won't be erased, but kept in a separate folder.
l <- list.files(path = log.folder, pattern = "report_.*csv")
log.file.day <- paste0(log.folder, l[length(l)])
daily.log <- read.table(log.file.day, header = T, as.is = T, skip = 16, fill = T, sep = ",")
# Starting date and time
s <- as.POSIXct(start.time, tz = "Europe/Berlin")
# End time
e <- sub(".*report_", "", log.file.day)
e <- sub(" day RS.txt", "", e)
e <- strptime(e, "%Y-%m-%d_%H%M", tz = "Europe/Berlin")
# Number of complete days
last.day <- as.numeric(floor(e-s))
Based on the test duration, all the uncomplete days in the daily report are dropped:
# Parse the log file
daily.log <- daily.log[daily.log$X0 <= last.day, c(1,17:31)]
colnames(daily.log) <- c("Days", 1:15)
Here xCH4
is the proportion of methane in the measured gas. As CO2 is trapped in an alakine solution, the biogas produced is expected to contain only CH4, hence xCH4
is set to 1
.
# Load setup data
setup.data <- read.table(setup.file, header = TRUE, sep = ",", nrows = 15, as.is = TRUE, comment.char = "@")
setup.data$channel <- row.names(setup.data)
setup.data$xCH4 <- 1
The results of these calculation will be used later for the final report and for calculating BMP on a wet weight basis.
# Compute VS mass (on wet mass basis)
setup.data$vs.obs <- setup.data$dw * setup.data$loi / 100
setup.data$vs.mean <- NA
for (s in unique(setup.data$sample.group)) {
setup.data$vs.mean[grepl(s, setup.data$sample.group)] <- mean(setup.data$vs.obs[grepl(s, setup.data$sample.group)])
}
# Compute ISR
setup.data$isr.obs <- setup.data$inoc.mass / setup.data$substrate.mass.vs
setup.data$isr.mean <- NA
for (s in unique(setup.data$sample.group)) {
setup.data$isr.mean[grepl(s, setup.data$sample.group)] <- mean(setup.data$isr.obs[grepl(s, setup.data$sample.group)])
}
# Melt the dataframe
daily.log <- melt(daily.log, id.vars = 1)
colnames(daily.log) <- c("days", "channel", "CH4")
# Merge both dataframes
daily.log$bottle.id <- NA
daily.log$inoc.mass <- NA
daily.log$substrate.mass.vs <- NA
daily.log$xCH4 <- NA
daily.log$sample.group <- NA
daily.log$vs.mean <- NA
daily.log$isr.mean <- NA
daily.log$plot.label <- NA
daily.log$plot.color <- NA
daily.log$pos.control <- NA
daily.log$inoc.only <- NA
daily.log$pos.control <- NA
for (n in setup.data$bottle.num) {
daily.log$bottle.id[n == daily.log$channel] <- setup.data$bottle.id[setup.data$bottle.num == n]
daily.log$inoc.mass[n == daily.log$channel] <- setup.data$inoc.mass[setup.data$bottle.num == n]
daily.log$substrate.mass.vs[n == daily.log$channel] <- setup.data$substrate.mass.vs[setup.data$bottle.num == n]
daily.log$xCH4[n == daily.log$channel] <- setup.data$xCH4[setup.data$bottle.num == n]
daily.log$sample.group[n == daily.log$channel] <- setup.data$sample.grou[setup.data$bottle.num == n]
daily.log$vs.mean[n == daily.log$channel] <- setup.data$vs.mean[setup.data$bottle.num == n]
daily.log$isr.mean[n == daily.log$channel] <- setup.data$isr.mean[setup.data$bottle.num == n]
daily.log$plot.label[n == daily.log$channel] <- setup.data$plot.label[setup.data$bottle.num == n]
daily.log$plot.color[n == daily.log$channel] <- setup.data$plot.color[setup.data$bottle.num == n]
daily.log$pos.control[n == daily.log$channel] <- setup.data$pos.control[setup.data$bottle.num == n]
daily.log$inoc.only[n == daily.log$channel] <- setup.data$inoc.only[setup.data$bottle.num == n]
daily.log$pos.control[n == daily.log$channel] <- setup.data$pos.control[setup.data$bottle.num == n]
}
As the methane volume of the AMPTS II daily report are already normalized, we can set the temperature as 0
degrees Celsius and the pressure as 1013.25
hPa
daily.log$temp.celsius <- 0
daily.log$pressure.hPa <- 1013.25
We can use here the cumBg
function of the biogas package to do this calculation.
# Compute and plot cumulative methane production
cum.prod <- cumBg(dat=daily.log, dat.type = "vol", temp = "temp.celsius", pres = "pressure.hPa", data.struct = "longcombo",
id.name = "bottle.id", time.name = "days", dat.name = "CH4", comp.name = "xCH4",
unit.pres = "hPa", addt0 = TRUE, dry = TRUE)
## Working with volume data, applying volumetric method.
## Using a standard pressure of 1013.25 hPa and standard temperature of 0 C for standardizing volume.
Here we fill the NA gaps of the time zero lines
for (n in 1:nrow(cum.prod)) {
if (is.na(cum.prod$channel[n])){
cum.prod[n, 2:15] <- cum.prod[n+1, 2:15]
}
}
# Create the output folder (if it does not exist)
if(!dir.exists(output.folder)) {dir.create(output.folder)}
dir.create(file.path(paste0(output.folder, "day_", last.day, "/")), showWarnings = FALSE)
# Save cum.prod as a csv file
output1 <- cum.prod[,-c(3,5:7, 9:17, 19, 21)]
write.table(output1, file = paste0(output.folder, "day_", last.day, "/1-cumulative_production_day",last.day, ".csv"), row.names = FALSE, quote = FALSE, sep = ",")
head(output1)
## days channel bottle.id sample.group vCH4 cvCH4 rvCH4
## 1 0 4 cellulose 1 Cellulose 0.0 0.0 NA
## 2 1 4 cellulose 1 Cellulose 93.3 93.3 93.3
## 3 2 4 cellulose 1 Cellulose 239.2 332.5 239.2
## 4 3 4 cellulose 1 Cellulose 392.2 724.7 392.2
## 5 4 4 cellulose 1 Cellulose 398.3 1123.0 398.3
## 6 5 4 cellulose 1 Cellulose 150.7 1273.7 150.7
In this output :
vCH4
is the volume of the produced CH4 (in NmL) during the time interval (which is 1 day);cCH4
is the cumulated volume of the produced CH4 (in NmL);rvCH4
is the production rate of CH4 (in NmL day-1).
plot <- ggplot(cum.prod, aes(x=days, y=cvCH4, group = channel, color = sample.group)) +
geom_line(size = 1) +
theme_bw() +
ylab("Cumulative methane production (NmL)") +
scale_x_continuous(name = "Time (days)", breaks = seq(0, max(cum.prod$days), by = 2)) +
scale_colour_manual(values= unique(cum.prod$plot.color), name = "Substrates:", labels = unique(cum.prod$plot.label))
print(plot)
# Save the plot
png(filename = paste0(output.folder, "day_", last.day, "/2-cumulative_production_day", last.day, ".png"), units = "in", res = 300, width = 7, height = 5)
print(plot)
dev.off()
First, rates are calculated with summBg
from the biogas package:
summBg.rates <- summBg(vol = cum.prod, setup = setup.data, id.name = "bottle.id", time.name = "days",
descrip.name = "sample.group",
inoc.name = unique(setup.data$sample.group[setup.data$inoc.only == 1]),
norm.name = "substrate.mass.vs",
inoc.m.name = "inoc.mass", show.obs = TRUE, show.rates = TRUE)
## Response variable (volume) is cum.prod$cvCH4.
Then, the rates are plotted. Be aware that the y-axis has a sqrt scale. The 1% threshold is highlighted with a thicker horizontal line.
rates.plot <- ggplot(summBg.rates, aes(x=days, y=rrvCH4, group = channel, color = sample.group)) +
geom_line(size = 1) +
theme_bw() +
ylab("Daily methane production compared to total methane production (%)") +
scale_x_continuous(name = "Time (days)", breaks = seq(0, max(cum.prod$days), by = 2)) +
scale_y_sqrt(breaks = c(0, 1, 10, 25, 50, 75, 100, 125)) +
scale_colour_manual(values = unique(summBg.rates$plot.color[summBg.rates$inoc.only == 0]), name = "Substrates:", labels = unique(summBg.rates$plot.label[summBg.rates$inoc.only == 0])) +
geom_hline(yintercept=1, size = 1)
print(rates.plot)
The plot is saved
png(filename = paste0(output.folder, "day_", last.day, "/3-daily_CH4_production_day", last.day, ".png"), units = "in", res = 300, width = 7, height = 5)
print(rates.plot)
dev.off()
The plot that will be computed here will show us what would have been the BMP value (of each reactor) if we would have stop the test earlier.
First, we compute BMP values for each reactor and for each day with summBg
function from the biogas package:
summBg.bmp.curves <- summBg(vol = cum.prod, setup = setup.data, id.name = "bottle.id", time.name = "days",
descrip.name = "sample.group",
inoc.name = unique(setup.data$sample.group[setup.data$inoc.only == 1]),
norm.name = "substrate.mass.vs",
inoc.m.name = "inoc.mass", show.obs = TRUE, when = "meas")
## Response variable (volume) is cum.prod$cvCH4.
## Inoculum contribution subtracted based on setup$inoc.mass.
## Response normalized by setup$substrate.mass.vs.
# Here we simply copy the methane rates production from 'summBg.rates'
summBg.bmp.curves$rrvCH4 <- summBg.rates$rrvCH4
BMP values can be plotted:
bmp.curves <- ggplot(summBg.bmp.curves, aes(x=days, y=cvCH4, group = channel, color = sample.group)) +
geom_line(size = 1) +
theme_bw() +
ylab("BMP (NmL CH4 / g vs substrate)") +
scale_x_continuous(name = "Time (days)", breaks = seq(0, max(cum.prod$days), by = 2)) +
scale_colour_manual(values = unique(summBg.rates$plot.color[summBg.rates$inoc.only == 0]), name = "Substrates:", labels = unique(summBg.rates$plot.label[summBg.rates$inoc.only == 0]))
print(bmp.curves)
The plot is saved:
png(filename = paste0(output.folder, "day_", last.day, "/4-bmp_curves_day", last.day, ".png"), units = "in", res = 300, width = 7, height = 5)
print(bmp.curves)
dev.off()
summBg.bmp.curves
is saved as a csv file:
output5 <- summBg.bmp.curves[,-c(3:6,8:17)]
write.table(output5, file = paste0(output.folder, "day_", last.day, "/5-bmp_obs_and_rates_day", last.day, ".csv"), row.names = FALSE, quote = FALSE, sep = ",")
head(output5)
## days bottle.id substrate.mass.vs isr.mean cvCH4 vol.mi.mn
## 1 0 cellulose 1 3.071 3.233826 0.000000 0.000000
## 2 1 cellulose 1 3.071 3.233826 5.497507 7.642569
## 3 2 cellulose 1 3.071 3.233826 60.072174 14.803489
## 4 3 cellulose 1 3.071 3.233826 170.693639 20.052223
## 5 4 cellulose 1 3.071 3.233826 285.855052 24.516652
## 6 5 cellulose 1 3.071 3.233826 322.386436 28.368294
## vol.mi.se n rsd.inoc cvCH4.tot cvCH4.inoc fv.inoc se.inoc cvCH4.se
## 1 0.0000000 3 NaN 0.00000 0.00000 NaN 0.0000000 0
## 2 0.2528687 3 5.730813 30.38098 24.88348 0.8190478 0.8233162 0
## 3 0.4824469 3 5.644767 108.27092 48.19875 0.4451680 1.5708009 0
## 4 0.7028500 3 6.071007 235.98176 65.28813 0.2766660 2.2884124 0
## 5 0.9380567 3 6.627177 365.67893 79.82388 0.2182895 3.0542231 0
## 6 1.1171079 3 6.820599 414.75090 92.36446 0.2226986 3.6371968 0
## rrvCH4
## 1 NA
## 2 100.00
## 3 90.85
## 4 64.81
## 5 40.29
## 6 11.33
In this output, some columns intersting columns include:
rsd.inoc
: relative standard deviation of the inoculum-only bottles (in %);fv.inoc
: fraction of methane in each bottle that comes from the inoculum;rrvCH4
: Daily methane production rate (in % of the total methane prodcution; what is plotted inrates.plot
).
More information about these values can be found by typing ?summBg
in the console.
First, we calculate the BMP values based on all data available:
bmp.end <- summBg(vol = cum.prod, setup = setup.data, id.name = "bottle.id", time.name = "days",
descrip.name = "sample.group",
inoc.name = unique(setup.data$sample.group[setup.data$inoc.only == 1]),
norm.name = "substrate.mass.vs",
inoc.m.name = "inoc.mass", when = "end", show.more = TRUE)
## Response variable (volume) is cum.prod$cvCH4.
## Inoculum contribution subtracted based on setup$inoc.mass.
## Response normalized by setup$substrate.mass.vs.
Then, we calculate the BMP values after 25 days. If the test did not last 25 days, we replaced all output values by NA
:
if (last.day >= 25) {
bmp.25d <- summBg(vol = cum.prod, setup = setup.data, id.name = "bottle.id", time.name = "days",
descrip.name = "sample.group",
inoc.name = unique(setup.data$sample.group[setup.data$inoc.only == 1]),
norm.name = "substrate.mass.vs",
inoc.m.name = "inoc.mass", when = 25, show.more = TRUE)
} else {
bmp.25d <- bmp.end
bmp.25d$days <- 25
bmp.25d[,3:11] <- NA
}
Then, the BMP test of (all replicates of) a given substrate is terminated when the daily methane production drops below 1% of the total production for three consecutive days:
bmp.1p3d <- summBg(vol = cum.prod, setup = setup.data, id.name = "bottle.id", time.name = "days",
descrip.name = "sample.group",
inoc.name = unique(setup.data$sample.group[setup.data$inoc.only == 1]),
norm.name = "substrate.mass.vs",
inoc.m.name = "inoc.mass", when = "1p3d", show.more = TRUE)
## Response variable (volume) is cum.prod$cvCH4.
## Inoculum contribution subtracted based on setup$inoc.mass.
## Response normalized by setup$substrate.mass.vs.
## Warning in summBg(vol = cum.prod, setup = setup.data, id.name =
## "bottle.id", : You selected 1p3d option for "when" argument but there are
## no observations that meet the criterion for the following bottles. Instead,
## the latest time was selected. cellulose 1,
The chunk below is a bit tricky. It basically replace output value by NA
in case the termination criterion is not achieved:
c.vector <- vector(mode = "character")
g.vector <- vector(mode = "character")
r1.vector <- vector(mode = "numeric")
r2.vector <- vector(mode = "numeric")
r3.vector <- vector(mode = "numeric")
for (c in unique(summBg.rates$channel)) {
c.vector <- c(c.vector, c)
group <- unique(summBg.rates$sample.group[summBg.rates$channel == c])
g.vector <- c(g.vector, group)
r1 <- summBg.rates$rrvCH4[summBg.rates$channel == c & summBg.rates$days == last.day-2]
r1.vector <- c(r1.vector, r1)
r2 <- summBg.rates$rrvCH4[summBg.rates$channel == c & summBg.rates$days == last.day-1]
r2.vector <- c(r2.vector, r2)
r3 <- summBg.rates$rrvCH4[summBg.rates$channel == c & summBg.rates$days == last.day]
r3.vector <- c(r3.vector, r3)
}
if(length(r1.vector) == 0) {r1.vector <- rep(NA, length(unique(summBg.rates$channel)))}
if(length(r2.vector) == 0) {r2.vector <- rep(NA, length(unique(summBg.rates$channel)))}
if(length(r3.vector) == 0) {r3.vector <- rep(NA, length(unique(summBg.rates$channel)))}
rates.df <- data.frame(channel = c.vector, group = g.vector, r1 = r1.vector, r2 = r2.vector, r3 = r3.vector)
for (g in unique(bmp.1p3d$sample.group)){
m <- max(rates.df[rates.df$group == g,3:5])
if (m >= 1 | is.na(m)) {
bmp.1p3d[bmp.1p3d$sample.group == g, 2:11] <- NA
}
}
Finally, the 3 dataframes can be merged and saved as csv file:
bmp <- rbind(bmp.end, bmp.25d, bmp.1p3d)
bmp$term <- c(rep("end", nrow(bmp.end)), rep("25d", nrow(bmp.25d)), rep("1p3d", nrow(bmp.end)))
write.table(bmp, file = paste0(output.folder, "day_", last.day, "/6-bmp_values_day", last.day, ".csv"), row.names = FALSE, quote = FALSE, sep = ",")
bmp
## sample.group days mean se sd n rsd.inoc fv.inoc
## 1 Cellulose 13 356.5648 10.267664 17.784116 3 6.397894 0.3001387
## 2 Sample 1 13 429.3575 7.081338 12.265238 3 6.397894 0.2519656
## 3 Sample 2 13 455.3702 9.924730 17.190137 3 6.397894 0.2363577
## 4 Sample 3 13 439.8379 5.559033 9.628527 3 6.397894 0.2382067
## 5 Cellulose 25 NA NA NA NA NA NA
## 6 Sample 1 25 NA NA NA NA NA NA
## 7 Sample 2 25 NA NA NA NA NA NA
## 8 Sample 3 25 NA NA NA NA NA NA
## 9 Cellulose NA NA NA NA NA NA NA
## 10 Sample 1 11 425.3428 6.760054 11.708756 3 6.627789 0.2393978
## 11 Sample 2 12 453.0848 9.689786 16.783201 3 6.469474 0.2309978
## 12 Sample 3 12 437.7448 5.370338 9.301698 3 6.469474 0.2327726
## se1 se2 se3 term
## 1 8.577060 5.644374 0 end
## 2 4.646466 5.343754 0 end
## 3 8.449819 5.205845 0 end
## 4 2.250100 5.083296 0 end
## 5 NA NA NA 25d
## 6 NA NA NA 25d
## 7 NA NA NA 25d
## 8 NA NA NA 25d
## 9 NA NA NA 1p3d
## 10 4.408944 5.124406 0 1p3d
## 11 8.249357 5.083311 0 1p3d
## 12 2.050059 4.963646 0 1p3d
More details about the values of this output can be found by typing ?summBg
in the console.
In the last output, up to three BMP values were calculated according to 3 different termination criteria. Here only one will be kept, following this rationale:
- If the test lasted for more than 25 days and the production rate went below 1% before 25 days: the BMP test is stopped after 25 days;
- If the test lasted for more than 25 days and the production rate went below 1% after 25 days: the BMP test is stopped when the rates dropped below 1%;
- If the test did not last for 25 days and the production rate went below 1%: the BMP test is stopped when the rates dropped below 1%;
- If the test did not last for 25 days and the production rate did not go below 1%: all the production data are used to calculate the BMP values.
bmp.test.end <- as.data.frame(matrix(ncol = 12, nrow = 0))
colnames(bmp.test.end) <- colnames(bmp)
for (s in unique(bmp$sample.group)) {
bmp.sample <- bmp[bmp$sample.group == s,]
if (!is.na(bmp.sample$mean[3]) & !is.na(bmp.sample$mean[2])) {
if (bmp.sample$days[3] < 25) {test.end <- 2}
else {test.end <- 3}
}
else if (!is.na(bmp.sample$mean[3])) {test.end <- 3}
else {test.end <- 1}
bmp.test.end <- rbind(bmp.test.end, bmp.sample[test.end,])
}
Here we replace the labels "25d"
by "1p3d+25d"
, as they are all expected to have a production rate lower than 1%. We also replace the labels "1p3d"
by "1p3d+25d"
in case the BMP test lasted for more than 25 days:
bmp.test.end$term <- gsub("25d", "1p3d+25d",x = bmp.test.end$term)
for (r in 1:nrow(bmp.test.end)) {
if (bmp.test.end$days[r] > 25) {
bmp.test.end$term[r] <- gsub("1p3d", "1p3d+25d", x = bmp.test.end$term[r])
}
}
For each substrate, the BMP termination day according to bmp.test.end
is read. Then, for this substrate and for this day, the BMP value of all replicates is extracted from summBg.bmp.curves
:
bmp.obs.test.end <- as.data.frame(matrix(ncol = 31, nrow = 0))
colnames(bmp.obs.test.end) <- colnames(summBg.bmp.curves)
for(s in bmp.test.end$sample.group) {
for (t in bmp.test.end$days[bmp.test.end$sample.group == s]) {
log <- summBg.bmp.curves$days == t & summBg.bmp.curves$sample.group == s
bmp.obs.test.end <- rbind(bmp.obs.test.end, summBg.bmp.curves[log,])
}
}
And here is the corresponding barplot, including the standard deviation (as error bars) and all individual measurments (as open circles):
barplot.vs <- ggplot(data=bmp.test.end, aes(x=sample.group, y=mean)) +
geom_bar(stat="identity", width=0.5, fill="steelblue") +
geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width=.1) +
geom_jitter(data=bmp.obs.test.end, aes(x=sample.group, y=cvCH4), width = 0, shape = 21, size = 3) +
xlab("") +
ylab("BMP (NmL CH4 / g vs substrate)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
theme_bw() +
scale_x_discrete(labels = unique(summBg.rates$plot.label))
print(barplot.vs)
Finally, the plot is saved:
png(filename = paste0(output.folder, "day_", last.day, "/7-bmp_barplot_vs_day", last.day, ".png"), units = "in", res = 300, width = 6, height = 4)
print(barplot.vs)
dev.off()
It can be useful to compute the BMP of substrate on a wet basis. We exclude the positive control from this calculation:
sample.list <- unique(setup.data$sample.group[setup.data$inoc.only + setup.data$pos.control == 0])
The final vs-based BMP values can be multiplied by the vs content (% g vs g-1 ww) measured in substrates. This is done on the mean BMP values, and on the individual BMP values:
# mean ww-based BMP values
bmp.test.end.ww <- bmp.test.end[bmp.test.end$sample.group %in% sample.list , c(1,2,3,5,6,8,12)]
for (r in 1:nrow(bmp.test.end.ww)) {
s <- bmp.test.end.ww$sample.group[r]
vs.f <- unique(setup.data$vs.mean[setup.data$sample.group == s])
bmp.test.end.ww$mean[r] <- bmp.test.end.ww$mean[r] * vs.f / 100
bmp.test.end.ww$sd[r] <- bmp.test.end.ww$sd[r] * vs.f / 100
}
# Individual ww-based BMP values
bmp.obs.test.end.ww <- bmp.obs.test.end[bmp.obs.test.end$sample.group %in% sample.list,]
for (r in 1:nrow(bmp.obs.test.end.ww)) {
s <- bmp.obs.test.end.ww$sample.group[r]
vs.f <- unique(setup.data$vs.mean[setup.data$sample.group == s])
bmp.obs.test.end.ww$cvCH4[r] <- bmp.obs.test.end.ww$cvCH4[r] * vs.f / 100
}
And here is the corresponding barplot, including the standard deviation (as error bars) and all individual measurments (as open circles):
barplot.ww <- ggplot(data=bmp.test.end.ww, aes(x=sample.group, y=mean)) +
geom_bar(stat="identity", width=0.5, fill="steelblue") +
geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width=.1) +
geom_jitter(data=bmp.obs.test.end.ww, aes(x=sample.group, y=cvCH4), width = 0, shape = 21, size = 3) +
xlab("") +
ylab("BMP (NmL CH4 / g ww substrate)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
theme_bw() +
scale_x_discrete(labels = unique(summBg.rates$plot.label[summBg.rates$pos.control==0]))
print(barplot.ww)
Finally, the plot is saved:
png(filename = paste0(output.folder, "day_", last.day, "/8-bmp_barplot_ww_day", last.day, ".png"), units = "in", res = 300, width = 5, height = 4)
print(barplot.ww)
dev.off()
Here, we will generate a report as table that will summarize the BMP assay.
First, we define a function that print from a vector of values its mean value plus/minus a 95% confidence interval:
print.mean.ic <- function(vec, quantile = 0.975, decimals = 1, suffix = "") {
m <- mean(vec)
sd <- sd(vec)
n <- length(vec)
error <- qt(quantile,df=n-1)*sd/sqrt(n)
paste0(round(m, digits = decimals) , " ± ", round(error, digits = decimals), suffix)
}
The report will be generated as in a new dataframe summary
:
summary <- as.data.frame(matrix(nrow = length(sample.list), ncol = 16))
colnames(summary) <- c("tag", "Substrate", "dw (g/g ww)", "vs (g/g dw)", "vs (g/g ww)", "BMP (NmL CH4/g vs)", "BMP (NmL CH4/g ww)",
"Replicates number", "ISR (g vs/g vs)", "BMP cellulose (NmL CH4/g vs)",
"RSD cellulose", "CH4 fraction from inoculum", "RSD BMP", "Test duration (days)", "Termination criterion", "Inoculum source")
The values that are identical for each tested substrate are filled in:
cpos.tag <- unique(setup.data$sample.group[setup.data$pos.control == 1])
summary$tag <- unique(setup.data$sample.group[setup.data$inoc.only == 0 & setup.data$pos.control == 0])
summary$`Inoculum source` <- inoculum
summary$`BMP cellulose (NmL CH4/g vs)` <- print.mean.ic(bmp.obs.test.end$cvCH4[bmp.obs.test.end$sample.group == cpos.tag])
summary$`RSD cellulose`<- paste0(round(bmp.test.end$sd[bmp.test.end$sample.group == cpos.tag] / bmp.test.end$mean[bmp.test.end$sample.group == cpos.tag] * 100,1), "%")
And then the other values are filled in, by picking them from other datframes:
sub.summary <- NULL
ms.summary <- NULL
vs.ms.summary <- NULL
vs.ww.summary <- NULL
bmp.vs.summary <- NULL
bmp.ww.summary <- NULL
n.summary <- NULL
duration.summary <- NULL
isr.summary <- NULL
f.inoc.summary <- NULL
rsd.summary <- NULL
end.criterion <- NULL
for (s in summary$tag) {
sub.summary <- c(sub.summary, unique(setup.data$plot.label[setup.data$sample.group == s]))
ms.summary <- c(ms.summary, print.mean.ic(setup.data$dw[setup.data$sample.group == s], suffix = "%"))
vs.ms.summary <- c(vs.ms.summary, print.mean.ic(setup.data$loi[setup.data$sample.group == s], suffix = "%"))
vs.ww.summary <- c(vs.ww.summary, print.mean.ic(setup.data$vs.obs[setup.data$sample.group == s], suffix = "%"))
bmp.vs.summary <- c(bmp.vs.summary, print.mean.ic(bmp.obs.test.end$cvCH4[bmp.obs.test.end$sample.group == s]))
bmp.ww.summary <- c(bmp.ww.summary, print.mean.ic(bmp.obs.test.end.ww$cvCH4[bmp.obs.test.end.ww$sample.group == s]))
n.summary <- c(n.summary, sum(setup.data$sample.group == s))
duration.summary <- c(duration.summary, bmp.test.end$days[bmp.test.end$sample.group == s])
isr.summary <- c(isr.summary, round(unique(setup.data$isr.mean[setup.data$sample.group == s]),2))
f.inoc.summary <- c(f.inoc.summary, paste0(round(bmp.test.end$fv.inoc[bmp.test.end$sample.group == s] * 100,1), "%"))
rsd.summary <- c(rsd.summary, paste0(round(bmp.test.end$sd[bmp.test.end$sample.group == s] / bmp.test.end$mean[bmp.test.end$sample.group == s] * 100,1), "%"))
end.criterion <- c(end.criterion, bmp.test.end$term[bmp.test.end$sample.group == s])
}
summary$Substrate <- sub.summary
summary$`dw (g/g ww)` <- ms.summary
summary$`vs (g/g dw)` <- vs.ms.summary
summary$`vs (g/g ww)` <- vs.ww.summary
summary$`BMP (NmL CH4/g vs)` <- bmp.vs.summary
summary$`BMP (NmL CH4/g ww)` <- bmp.ww.summary
summary$`Replicates number` <- n.summary
summary$`Test duration (days)` <- duration.summary
summary$`ISR (g vs/g vs)` <- isr.summary
summary$`CH4 fraction from inoculum` <- f.inoc.summary
summary$`RSD BMP` <- rsd.summary
summary$`Termination criterion` <- end.criterion
And here is the final polishing of the report dataframe, which is saved as csv file:
summary$`Termination criterion` <- gsub("1p3d+25d", "<1% 3d, >= 25 d", summary$`Termination criterion`)
summary$`Termination criterion` <- gsub("1p3d", "<1% 3d", summary$`Termination criterion`)
summary$`Termination criterion` <- gsub("end", "end of test", summary$`Termination criterion`)
print(summary)
## tag Substrate dw (g/g ww) vs (g/g dw) vs (g/g ww)
## 1 Sample 1 Sample 1 8.7 ± 0% 74.9 ± 0% 6.5 ± 0%
## 2 Sample 2 Sample 2 11.6 ± 0% 93.5 ± 0% 10.8 ± 0%
## 3 Sample 3 Sample 3 8.4 ± 0% 93.5 ± 0% 7.9 ± 0%
## BMP (NmL CH4/g vs) BMP (NmL CH4/g ww) Replicates number ISR (g vs/g vs)
## 1 425.3 ± 19 27.7 ± 1.2 3 3.06
## 2 453.1 ± 35.5 49.1 ± 3.8 3 2.98
## 3 437.7 ± 8.8 34.4 ± 0.7 3 2.91
## BMP cellulose (NmL CH4/g vs) RSD cellulose CH4 fraction from inoculum
## 1 356.6 ± 36.9 5% 23.9%
## 2 356.6 ± 36.9 5% 23.1%
## 3 356.6 ± 36.9 5% 23.3%
## RSD BMP Test duration (days) Termination criterion
## 1 2.8% 11 <1% 3d
## 2 3.7% 12 <1% 3d
## 3 2.1% 12 <1% 3d
## Inoculum source
## 1 Anaerobic sludge from WWTP
## 2 Anaerobic sludge from WWTP
## 3 Anaerobic sludge from WWTP
write.table(summary, file = paste0(output.folder, "day_", last.day, "/9-bmp_summary_day", last.day, ".csv"), row.names = FALSE, quote = FALSE, sep = ",")