Hello, I am a user of your great package.
I have two questions about the use of ref_batch.
As I understand it, the ref_batch option sets the reference batch, i.e. the data used to calculate the grand mean, so that the data in the selected batch are not modified by pycombat.
First, I would like to know whether this understanding is correct.
Second, and this may come from my incorrect use of the package: when I pass a batch list composed of 0 and 1, like [0, 0, ..., 1, 1], and set ref_batch=0, I get an IndexError.
On the other hand, it works well when I use a batch list composed of 1 and 2 with ref_batch=1.
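For reference, the relabeling that works for me can be sketched as below. `shift_batch_labels` is a hypothetical helper of my own, not part of pycombat, and the batch sizes are toy values standing in for `test0.shape[1]` and `test1.shape[1]`. The commented-out call mirrors the one in my traceback.

```python
def shift_batch_labels(batch, ref_batch, offset=1):
    """Shift integer batch labels by `offset` so that no label equals 0.

    Hypothetical helper (not part of pycombat). Returns the shifted
    batch list together with the correspondingly shifted reference label.
    """
    shifted = [b + offset for b in batch]
    return shifted, ref_batch + offset


# Toy batch list: 3 samples in batch 0, 2 samples in batch 1 (sizes are assumptions).
batch = [0] * 3 + [1] * 2

shifted_batch, shifted_ref = shift_batch_labels(batch, ref_batch=0)
print(shifted_batch, shifted_ref)

# With the real data this would be passed on as in my original call:
# res = pycombat(test, shifted_batch, par_prior=True, ref_batch=shifted_ref)
```

This only avoids the error on my side; I do not know whether labels equal to 0 are meant to be supported.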
I would be glad if you had any idea about what is going on here.
```
C:\Users\{xxxx}\miniconda3\envs\pipenv\lib\site-packages\combat\pycombat.py:358: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
batches = np.asarray(batches)
Found 2 batches.
Adjusting for 0 covariate(s) or covariate level(s).
Standardizing Data across genes.
IndexError: arrays used as indices must be of integer (or boolean) type
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-63-012733a94a1e> in <module>
5 test = pd.concat([test0,test1],axis=1,join='inner')
6 batch = [0] * test0.shape[1] + [1] * test1.shape[1]
----> 7 res = pycombat(test,batch,par_prior=True,ref_batch=0)
8 res
~\miniconda3\envs\pipenv\lib\site-packages\combat\pycombat.py in pycombat(data, batch, mod, par_prior, prior_plots, mean_only, ref_batch, precision, **kwargs)
659 design = treat_covariates(batchmod, mod, ref, n_batch)
660 NAs = check_NAs(dat)
--> 661 B_hat, grand_mean, var_pooled = calculate_mean_var(
662 design, batches, ref, dat, NAs, ref_batch, n_batches, n_batch, n_array)
663 stand_mean = calculate_stand_mean(
~\miniconda3\envs\pipenv\lib\site-packages\combat\pycombat.py in calculate_mean_var(design, batches, ref, dat, NAs, ref_batch, n_batches, n_batch, n_array)
460 if not NAs: # NAs not supported
461 if ref_batch is not None: # depending on ref batch
--> 462 ref_dat = np.transpose(np.transpose(dat)[batches[ref]])
463 var_pooled = np.dot(np.square(ref_dat - np.transpose(np.dot(np.transpose(
464 design)[batches[ref]], B_hat))), [1/n_batches[ref]]*n_batches[ref])
IndexError: arrays used as indices must be of integer (or boolean) type
```
My environment, briefly: Windows 10 and Python 3.9.5.
Best,