Python software for morphometric stain analysis. It measures the area stained by a given stain (for example, DAB chromogen) in samples prepared with typical immunohistochemistry protocols. After the analysis, the user can compare the protein content of the tested samples.
- Application
- Requirements
- Installation
- Basic principles
- Image samples requirements
- Interface type
- Composite image examples
- Summary statistics
- Log
- CSV output
- Statistical data output
- Command line arguments
- Typical options usage
- Authorship
- Acknowledgements
Quantitative analysis of extracellular matrix proteins in immunohistochemistry (IHC), designed for scientists in the biotech field. It can also be used for general morphometric analysis.
Python 2.7 or Python 3.4-3.5
Python libraries: numpy, scipy, skimage, matplotlib, pandas
Optional (for group analysis): seaborn
Install pip using your system package manager. For example, on Debian/Ubuntu:
sudo apt-get install python3-pip
Install the package from the PyPI repository:
sudo pip3 install morphostain
Alternative installation
Clone this repository.
In the root folder of the cloned repository, run:
sudo pip3 install .
Note: matplotlib may require additional libraries:
sudo apt-get install libfreetype6 libpng3
Uninstall:
sudo pip3 uninstall morphostain
No GUI, command line interface only.
The script uses the color deconvolution method, which was described in detail by G. Landini. The Python port of his algorithm from the skimage package is used. See also: Ruifrok AC, Johnston DA. Quantification of histochemical staining by color deconvolution. Anal Quant Cytol Histol 23: 291-299, 2001.
Color deconvolution separates the stains in a multi-stained sample. This software is mainly applied to Hematoxylin + DAB staining. The script uses either a hardcoded stain matrix or a custom one in JSON format. For better results, determine your own matrix using ImageJ and the color deconvolution reference above, then use it in place of the default one. For additional information, see the comments in the code.
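The stain separation step can be illustrated for a single pixel. The sketch below follows the general Ruifrok and Johnston scheme (convert RGB to optical density, then unmix with the inverse of the stain matrix) and uses the stain vectors from the JSON example in this README. It is an illustration only, not the package's code: the function names `invert_3x3`, `rgb_to_od`, `unmix_od` and `separate_stains` are hypothetical, the `+1` offset in the optical density conversion is an assumption, and plain Python is used to keep the example dependency-free. For whole images, `skimage.color.separate_stains` does the same job on arrays.

```python
import math

# Stain vectors copied from the JSON example in this README
# (rows: Hematoxylin, DAB-chromogen, supplementary channel).
STAIN_VECTORS = [
    [0.66504073, 0.61772484, 0.41968665],
    [0.4100872, 0.5751321, 0.70785],
    [0.6241389, 0.53632, 0.56816506],
]


def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("stain matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def rgb_to_od(rgb):
    """Convert 8-bit RGB values to optical density.

    The +1 offset (an assumption of this sketch) avoids log(0) for black pixels.
    """
    return [-math.log10((v + 1) / 256.0) for v in rgb]


def unmix_od(od, vectors=STAIN_VECTORS):
    """Solve od = concentrations * vectors for the per-stain concentrations."""
    inv = invert_3x3(vectors)
    return [sum(od[k] * inv[k][j] for k in range(3)) for j in range(3)]


def separate_stains(rgb, vectors=STAIN_VECTORS):
    """Return the per-stain amounts (channel_0, channel_1, channel_2) for one pixel."""
    return unmix_od(rgb_to_od(rgb), vectors)
```

A pure white pixel has zero optical density, so `separate_stains((255, 255, 255))` yields approximately zero for every channel, while a brown DAB-like pixel gives its largest value in the second channel.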
After stain separation, the script determines the stain-positive area using the default or a user-defined threshold. Empty areas can be excluded from the final relative area measurement (this filter is disabled by default), because free space in the sample would otherwise affect the accuracy of the result.
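The relative area measurement can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `stain_positive_percent` is hypothetical, pixels are passed as flat lists, a pixel is treated as stain-positive when its stain value exceeds the threshold, and a pixel is treated as empty when its brightness exceeds the empty threshold. The real comparison directions and value scales may differ.

```python
def stain_positive_percent(stain, brightness, stain_thresh=40, empty_thresh=None):
    """Return the stain-positive area as a percentage of the non-empty area.

    stain        -- per-pixel values of the separated stain channel
    brightness   -- per-pixel brightness, used only to detect empty background
    stain_thresh -- pixels above this stain value count as stain-positive (assumption)
    empty_thresh -- pixels brighter than this are ignored; None disables the filter
    """
    if len(stain) != len(brightness):
        raise ValueError("stain and brightness must have the same length")

    positive = 0
    tissue = 0
    for s, b in zip(stain, brightness):
        if empty_thresh is not None and b > empty_thresh:
            continue  # empty background: excluded from both counts
        tissue += 1
        if s > stain_thresh:
            positive += 1

    if tissue == 0:
        return 0.0  # nothing but empty background
    return 100.0 * positive / tissue
```

With four pixels of which two are stained and one is empty background, the result is 50% without the empty filter and about 66.7% with it, which shows why free space in a sample shifts the measurement.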
The script creates a result folder inside the directory given by --path. Statistics, the log, and a composite image for each sample are saved there.
- The white balance of the sample images should be normalized! It is important to get the right stain colors before separation. Free software such as RawTherapee can be used for this.
- Images should be acquired using the same exposure values.
- The threshold should be the same for the whole image sequence if you want to compare the images.
- It is better to use manual mode on the microscope camera to be sure that all images are taken with the same parameters.
- Don't change the light intensity of the microscope while acquiring the sequence.
- Correct file naming is required if group analysis is active. Everything before the _ symbol is recognized as the group name.
The input directory should contain files with the following extensions: '.jpg', '.jpeg', '.tif', '.tiff', '.png', '.bmp'
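The file naming rules above can be sketched as two small helpers. The names `is_supported_image` and `group_name` are hypothetical, and two behaviours are assumptions of this sketch rather than documented facts: extensions are matched case-insensitively, and when a name contains several underscores, the text before the first one is taken as the group.

```python
import os

# Extensions listed in this README.
SUPPORTED_EXTENSIONS = ('.jpg', '.jpeg', '.tif', '.tiff', '.png', '.bmp')


def is_supported_image(filename):
    """Return True if the file extension is one of the supported image types."""
    extension = os.path.splitext(filename)[1].lower()  # case-insensitive (assumption)
    return extension in SUPPORTED_EXTENSIONS


def group_name(filename):
    """Return everything before the first '_' in the file name (without extension)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    return stem.split('_', 1)[0]
```

For example, `group_name("Native_10.jpg")` and `group_name("Native_11.jpg")` both return `"Native"`, so the two files fall into one group.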
You can find test images in this repository.
The script renders this type of image for each of your samples. Check the result to be sure that the threshold values are right.
Images for analysis: 62
Stain threshold = 40, Empty threshold = 101
Empty area filtering is disabled.
It should be adjusted in a case of hollow organ or unavoidable edge defects
CPU cores used: 2
Image saved: /home/meklon/temp/sample_native/result/Col1_02_analysis.png
Image saved: /home/meklon/temp/sample_native/result/Col1_01_analysis.png
Image saved: /home/meklon/temp/sample_native/result/Col4_02_analysis.png
Image saved: /home/meklon/temp/sample_native/result/Col4_03_analysis.png
Group analysis is active
Statistical data for each group was saved as stats.csv
Boxplot with statistics was saved as summary_statistics.png
Analysis time: 44.3 seconds
Average time per image: 0.7 seconds
Filename | Stain+ area, % |
---|---|
Alex_Pan_06.jpg | 61.55 |
Native_Pan_05.jpg | 14.23 |
Native_Trop_02.jpg | 10.83 |
Group | mean | std | median | amin | amax |
---|---|---|---|---|---|
Col1 | 38.906666666666666 | 11.818569075823012 | 37.16 | 24.58 | 61.12 |
Col4 | 30.514444444444443 | 9.177221953171763 | 30.12 | 16.62 | 45.66 |
Fibr | 38.287499999999994 | 7.836421832881198 | 34.875 | 30.41 | 53.51 |
Lam | 34.327777777777776 | 8.20530130125911 | 33.02 | 21.88 | 46.8 |
Pan | 10.21375 | 7.495407998997023 | 7.29 | 2.92 | 21.97 |
Trop | 13.702000000000002 | 3.9725329171421317 | 14.235 | 7.22 | 20.34 |
VEGF | 6.644444444444444 | 5.6577117969880515 | 4.84 | 0.96 | 16.7 |
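The per-group columns in the table above (mean, std, median, minimum, maximum) can be reproduced in principle with a sketch like the one below. It uses only the standard library (`statistics`, available from Python 3.4) instead of pandas, the function name `summarize_groups` is hypothetical, and the use of the sample standard deviation is an assumption about how the `std` column is computed.

```python
import statistics
from collections import defaultdict


def summarize_groups(rows):
    """Aggregate (group, stain_positive_percent) pairs into per-group statistics."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)

    summary = {}
    for group, values in sorted(groups.items()):
        summary[group] = {
            "mean": statistics.mean(values),
            # Sample standard deviation (assumption); undefined for a single value.
            "std": statistics.stdev(values) if len(values) > 1 else float("nan"),
            "median": statistics.median(values),
            "amin": min(values),
            "amax": max(values),
        }
    return summary
```

The input rows would typically pair the group name derived from each file name with that image's stain-positive area.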
Place all the sample images (8-bit) in a separate folder. Subdirectories are excluded from the analysis. Use the following options:
-p, --path (required) - path to the target directory with samples.
-t0, --thresh0 (optional) - global threshold for the stain-positive area of the channel_0 stain.
-t1, --thresh1 (optional) - global threshold for the stain-positive area of the channel_1 stain.
-e, --empty (optional) - threshold for empty area separation. If no value is given, the default value is used (threshEmptyDefault = 101). Empty area filtering is disabled by default and should be used only for hollow organs or unavoidable edge defects.
-s, --silent (optional) - if set, the real-time composite image visualisation is suppressed. The output is only saved in the result folder.
-a, --analyze (optional) - add group analysis after the individual image processing. The groups are created from the file names. Everything before the _ symbol is recognized as the group name. Example: Native_10.jpg and Native_11.jpg are counted as a single group, Native.
-m, --matrix (optional) - your matrix in a JSON-formatted file. Can be used for alternative stain vectors. Not for regular use yet; testing is in progress.
-d, --dpi (optional) - DPI of the output images. 900 is recommended for print quality. High resolution can significantly slow down the process.
-rs, --resize (optional) - image resolution used to speed up processing. Higher resolution increases accuracy but can significantly slow down the process. Default value is (768,1024). Pass values like --resize 768 1024
-sc, --save_channels (optional) - save the separate stain channels to a subfolder. Useful if you plan to process the deconvolution result with other software, for example to count separated nuclei with CellProfiler.
-n, --notch (optional) - draw notches on the boxplots in group analysis to show the confidence interval.
morphostain -p /home/meklon/Data/sample/test/ -t0 35 -t1 40 -e 89 -s -a --dpi 600 --resize 480 640
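To make the option list concrete, here is a sketch of an `argparse` parser that accepts the documented flags and parses the example command above. It is not the package's actual parser. The option types are assumptions, as are the defaults other than the documented resize default and the empty threshold of 101, and so is the reading that `-e` without a value falls back to 101.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options documented in this README (sketch)."""
    parser = argparse.ArgumentParser(prog="morphostain")
    parser.add_argument("-p", "--path", required=True,
                        help="path to the target directory with samples")
    # None means "use the threshold stored in the stain JSON" (assumption).
    parser.add_argument("-t0", "--thresh0", type=int, default=None,
                        help="global threshold for the channel_0 stain")
    parser.add_argument("-t1", "--thresh1", type=int, default=None,
                        help="global threshold for the channel_1 stain")
    # Disabled when absent; 101 when given without a value (assumption
    # based on the option description).
    parser.add_argument("-e", "--empty", type=int, nargs="?", const=101,
                        default=None, help="threshold for empty area separation")
    parser.add_argument("-s", "--silent", action="store_true",
                        help="suppress real-time composite image display")
    parser.add_argument("-a", "--analyze", action="store_true",
                        help="run group analysis after image processing")
    parser.add_argument("-m", "--matrix", default=None,
                        help="path to a JSON file with custom stain vectors")
    parser.add_argument("-d", "--dpi", type=int, default=None,
                        help="DPI of the output images")
    parser.add_argument("-rs", "--resize", type=int, nargs=2,
                        default=[768, 1024], help="processing resolution")
    parser.add_argument("-sc", "--save_channels", action="store_true",
                        help="save separate stain channels to a subfolder")
    parser.add_argument("-n", "--notch", action="store_true",
                        help="draw notches on the group boxplots")
    return parser


if __name__ == "__main__":
    example = ("-p /home/meklon/Data/sample/test/ -t0 35 -t1 40 -e 89 "
               "-s -a --dpi 600 --resize 480 640")
    print(build_parser().parse_args(example.split()))
```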
Stain vectors and predefined values such as typical thresholds are stored in JSON format. By default, dab.json is loaded. You can also use your own file with the --matrix option. Histogram shift values are used to normalize the image histogram. Typical structure:
{
"channel_0":"Hematoxylin",
"channel_1":"DAB-chromogen",
"channel_2":"Supplementary channel",
"vector": [[0.66504073, 0.61772484, 0.41968665],
[0.4100872, 0.5751321, 0.70785],
[0.6241389, 0.53632, 0.56816506]],
"thresh_0":30,
"thresh_1":40,
"hist_shift_0":5,
"hist_shift_1":18,
"hist_shift_2":0
}
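Before passing a custom file to --matrix, it can help to check that it has the same shape as the structure above. The sketch below loads the JSON text and performs basic checks. The function name `load_stain_config` is hypothetical, and which keys are strictly required by the software is an assumption of this example.

```python
import json

# The structure documented above, embedded here so the example is self-contained.
EXAMPLE_JSON = """
{
  "channel_0": "Hematoxylin",
  "channel_1": "DAB-chromogen",
  "channel_2": "Supplementary channel",
  "vector": [[0.66504073, 0.61772484, 0.41968665],
             [0.4100872, 0.5751321, 0.70785],
             [0.6241389, 0.53632, 0.56816506]],
  "thresh_0": 30,
  "thresh_1": 40,
  "hist_shift_0": 5,
  "hist_shift_1": 18,
  "hist_shift_2": 0
}
"""

# Keys treated as required by this sketch (assumption).
REQUIRED_KEYS = ("channel_0", "channel_1", "channel_2", "vector",
                 "thresh_0", "thresh_1")


def load_stain_config(text):
    """Parse a stain definition and check that it has the expected shape."""
    config = json.loads(text)
    for key in REQUIRED_KEYS:
        if key not in config:
            raise KeyError("missing key: " + key)
    vector = config["vector"]
    if len(vector) != 3 or any(len(row) != 3 for row in vector):
        raise ValueError("'vector' must be a 3x3 matrix")
    return config


if __name__ == "__main__":
    config = load_stain_config(EXAMPLE_JSON)
    print(config["channel_1"], config["thresh_1"])
```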
Gumenyuk Ivan, Kuban state medical university, Russia.
Special thanks to my teammates from our lab: @radioxoma (Eugene Dvoretsky), @direvius (Alexey Lavrenuke), @freuser for his GUI (sorry, I haven't implemented it yet), and everyone who helped me with this work.