Last modified: 2021-11-04 21:30:18
Compiled: 2021-11-04 21:31:28

Getting started

Load packages

library(ExperimentHub)
library(rMiW)
library(EBImage)
library(keras)

packageVersion("keras")
packageVersion("tensorflow")

Optional: Update (ver 3.14)

BiocManager::install(version = "3.14")

Optional: Python environment on R

#On MacOSX
library(reticulate)
#reticulate::install_miniconda(force = T)
#reticulate::use_python("~/Library/r-miniconda/envs/r-reticulate/bin/python")
reticulate::py_config()

#install pydot
#reticulate::py_install("pydot")

#On CentOS
library(reticulate)
#reticulate::install_miniconda(force = T)
#reticulate::use_python("~/.local/share/r-miniconda/envs/r-reticulate/bin/python")
reticulate::py_config()

Optional: install python packages for R keras / tensorflow (ver 3.14)

#For CPU
keras::install_keras()

Optional: Removes all files in the cache directory

install.packages("R.cache")
#system("open ~/Library/Caches/org.R-project.R/R/")
R.cache::clearCache("~/Library/Caches/org.R-project.R/R/ExperimentHub")

Optional: extract R script from the Rmd file

knitr::purl("./rMiW/vignettes/rMiW_02_BioImageDbs.Rmd", output="./rMiW/vignettes/rMiW_02_BioImageDbs.R")

Obtain 2D image dataset via BioImageDbs

About the BioImageDbs package, please ckeck Providing Bioimage Dataset for ExperimentHub document for more information.

Please check the metadata (CSV) of BioImageDbs in GitHub.

#Description: Providing Bioimage Dataset for ExperimentHub
browseURL("https://bioconductor.org/packages/release/data/experiment/vignettes/BioImageDbs/inst/doc/BioImageDbs.html")

#Metadata (CSV) for BioImageDbs
browseURL("https://github.com/kumeS/BioImageDbs/blob/main/inst/extdata/v02/metadata_v02.csv")

Search query for the BioImageDbs

Via the ExperimentHub function, we can obtain the supervised image data as a list of R arrays and their metadata.

Here shows an example of a search query for the BioImageDbs (Currently, snapshotDate(): 2021-10-18 for version 3.14).

#Set the ExperimentHub function
eh <- ExperimentHub::ExperimentHub()

#All entities of BioImageDbs
AnnotationHub::query(eh, c("BioImageDbs"))

#Query with LM_id0001 (Light Microscopy ID 0001)
AnnotationHub::query(eh, c("BioImageDbs", "LM_id0001"))

#check 4d tensor of LM_id0001
(qr <- AnnotationHub::query(eh, c("BioImageDbs", "LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary")))

#Select their metadata using `qr$`
#show title
qr$title

#show description
qr$description[3]

Note: small .rds data does not work. They will be removed in future.

Optional: Download from Google Drive

file <- system.file("script", "gdrive_download.sh", package="rMiW")
system(paste0("source ", file, " ; gdrive_download 1J-wR0icTCpFgeKPP0iF4cyzD-b1m3tOO ./output.Rds"))
ImgData <- readRDS("output.Rds")
str(ImgData)

https://drive.google.com/file/d//view?usp=sharing

Acquire the image arrays

We use [] to access its metadata while [[]] to get its data instance.

We could load from cache (~/Library/Caches/org.R-project.R/R/) once the data was downloaded.

#Access metadata
qr[3]

#Show metadata 
qr[3]$title
qr[3]$description

#Download the dataset of LM_id0001 (LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary.rds)
ImgData <- qr[[3]]
str(ImgData)
#List of 2
# $ Train:List of 2
#  ..$ Train_Original          : num [1:84, 1:512, 1:512, 1] 0.518 0.455 0.455 0.447 0.439 ...
#  ..$ Train_GroundTruth_Binary: num [1:84, 1:512, 1:512, 1] 0 0 0 0 0 0 0 0 0 0 ...
# $ Test :List of 2
#  ..$ Test_Original          : num [1:84, 1:512, 1:512, 1] 0.604 0.467 0.459 0.435 0.408 ...
#  ..$ Test_GroundTruth_Binary: num [1:84, 1:512, 1:512, 1] 0 1 1 1 1 1 1 0 0 0 ...

LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary.Rds is a list of 4D arrays with the binary labels for the image segmentation of Human HeLa cells on a flat glass.

Show the gif animation

Here we will get a gif animation and check the result of data visualization.

#Access metadata
qr[2]

#show metadata
qr[2]$title
qr[2]$description
    
#Get gif animation
GifData <- qr[[2]]
str(GifData)  # Data path
magick::image_read(GifData)

LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary_train_dataset.gif is an animation file (.gif) of the train dataset of LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary.rds

Currently, only magick::image_read is supported to view gif animation files.

Image Segmentation for cell division images with two class / binary class

Check dimensions of images

We will use dimensions of images (Width, Height, Channel(Gray)) for the model construction.

#Dimensions of ImgData
#Image number, Width, Height, Channel(Gray)
str(ImgData)
dim(ImgData$Train$Train_Original)

#Use Width, Height, Channel(Gray)
ImgShape <- dim(ImgData$Train$Train_Original)[-1]
ImgShape
#[1] 512 512   1

Create an U-NET-based model

We will make the U-Net model with dropout layers.

model <- rMiW::unet2D_v01(shape = ImgShape)
model

Vidualize the model

Here visualizes the U-NET network.

rMiW::plot_model(model=model)

#OR
#use plot_model in tensorflow
rMiW::Py_plot_model(model=model)
EBImage::display(EBImage::readImage("Model.png"))
#Alternatively, perform this if do not work above.
source("https://gist.githubusercontent.com/kumeS/41fed511efb45bd55d468d4968b0f157/raw/b7205c6285422e5166f70b770e1e8674d65f5ea2/DL_plot_modi_v1.2.R")
plot_model_modi(model=model)

Set compile parameters for

Compile the model

Here we will choose the optimizer and loss function.

model <- model %>%
     keras::compile(
       optimizer = keras::optimizer_rmsprop(learning_rate = 0.01),
       loss = rMiW::bce_dice_loss,
       metrics = rMiW::dice_coef
     )
  • Parameters:
    • optimizer: optimizer instance / (最適化アルゴリズム)
      • learning_rate: Learning rate (float >= 0) / 学習率
    • loss: Objective function / Loss function / 損失関数 (評価指標)
    • metrics: Evaluated function / 評価関数

Check the reference sheet in keras.rstudio.com:

Fit the model using 20 images for training

We should use drop=F to avoid any change of array shape.

#Create Train Data
X <- ImgData$Train$Train_Original[1:20,,,,drop=FALSE]
str(X)
Y <- ImgData$Train$Train_GroundTruth_Binary[1:20,,,,drop=FALSE]
str(Y)

history <- model %>%
  keras::fit(x = X, 
             y = Y,
             batch_size = 2,
             epochs = 2,
             verbose = 1)
  • Parameters:
    • batch_size: Number of samples per gradient update / 1度に計算するサンプル数
    • epochs: Number of epochs to train the model / エポック数:一つの訓練データを何回繰り返して学習させるか
    • verbose: Verbosity mode (0 = silent, 1 = progress bar, 2 = one line per epoch) / 表示モード
  • Training Speed (3rd Nov 2021):
    • Orchestra: 15.7 s/step
    • MacOSX (2.3 GHz quad-core Intel Core i7, KUME): 2.15 s/step (7.3-fold faster)
    • GPU (Quadro RTX 8000): 51 ms/step (307.8-fold faster)

Save the model by save_model_hdf5()

model %>% 
  keras::save_model_hdf5("model_v01.h5")

#Model weights as R arrays
keras::get_weights(model)[[1]]

The save_model_hdf5 function can save all information of the model; the weight values, the model’s configuration(architecture), and the optimizer configuration.

Re-load, re-compile and re-fit

We will load the saved model and run compile and fit.

We can see that the calculation is done from the continuation.

model_v01.h5 (training: 60 epochs)

#Re-read model
file <- system.file("extdata", "model_v01.h5", package="rMiW")

#Re-load
modelR <- keras::load_model_hdf5(file, compile=F)
summary(modelR)
keras::get_weights(modelR)[[1]]

#Re-compile
modelR <- modelR %>%
     keras::compile(
       optimizer = keras::optimizer_rmsprop(learning_rate = 0.01),
       loss = rMiW::bce_dice_loss,
       metrics = rMiW::dice_coef
     )

#Re-fit: Do not re-fit in this section
if(F){
history <- modelR %>%
  keras::fit(x = X, 
             y = Y,
             batch_size = 2,
             epochs = 1,
             verbose = 1)
}

model_v02.h5 (training: 2000 epochs)

#Re-read model
file <- system.file("extdata", "model_v02.h5", package="rMiW")

#Re-load
modelR2 <- keras::load_model_hdf5(file, compile=F)
summary(modelR2)
keras::get_weights(modelR2)[[1]]

#Re-compile
modelR2 <- modelR2 %>%
     keras::compile(
       optimizer = keras::optimizer_rmsprop(learning_rate = 0.01),
       loss = rMiW::bce_dice_loss,
       metrics = rMiW::dice_coef
     )

Model evaluation of model_v01.h5

Here we evaluate the model object using keras::evaluate function.

## Model evaluation
Score <- modelR %>% 
  keras::evaluate(X,
                  Y, 
                  verbose = 1)

cat(paste0('Train loss:', round(Score[[1]], 4), 
           '\nTrain accuracy:', round(Score[[2]], 4)))

#model_v01 (training: 60 epochs)
#Train loss:1.3279
#Train accuracy:0.8672

Model prediction at image pixel level

The model is used to predict the binarization at the pixel level.

Y_pred <- rMiW::model.pred(model=modelR, 
                           x=X)

Visualization of training results

We use ImageView2D function for the visualization.

for(n in 1:20){
#n <- 2
rMiW::ImageView2D(X,
            Y_pred,
            ImgN=n)
}

We can visualize the results using another function.

#Imge: 2
ImageView2D_pred(ImgArray_x=X,
                 ImgArray_y=Y,
                 ImgArray_pred=Y_pred,
                 ImgN=2)

#Imge: 6
ImageView2D_pred(ImgArray_x=X,
                 ImgArray_y=Y,
                 ImgArray_pred=Y_pred,
                 ImgN=6)

#Image: All
for(N in 1:20){
ImageView2D_pred(ImgArray_x=X,
                 ImgArray_y=Y,
                 ImgArray_pred=Y_pred,
                 ImgN=N)
}

Prediction for test dataset (20 images)

#Data
Test_X <- ImgData$Train$Train_Original[21:40,,,,drop=FALSE]
str(Test_X)
Test_Y <- ImgData$Train$Train_GroundTruth_Binary[21:40,,,,drop=FALSE]
str(Test_Y)

## Model evaluation
Score <- modelR %>% 
  keras::evaluate(Test_X, 
                  Test_Y, 
                  verbose = 1)

cat(paste0('Train loss:', round(Score[[1]], 4), 
           '\nTrain accuracy:', round(Score[[2]], 4)))

#model_v01 (training: 60 epochs)
#Train loss:1.1545
#Train accuracy:0.8758

Visualization of test results

We use ImageView2D function for the visualization.

Test_Y_pred <- rMiW::model.pred(model=modelR, 
                                x=Test_X)

#visualization
for(N in 1:20){
ImageView2D_pred(ImgArray_x=Test_X,
                 ImgArray_y=Test_Y,
                 ImgArray_pred=Test_Y_pred,
                 ImgN=N)
}

Model evaluation of model_v02.h5

Here we evaluate the model object using keras::evaluate function.

## Model evaluation
Score <- modelR2 %>% 
  keras::evaluate(X,
                  Y, 
                  verbose = 1)

cat(paste0('Train loss:', round(Score[[1]], 4), 
           '\nTrain accuracy:', round(Score[[2]], 4)))

#model_v02 (training: 2000 epochs)
#Train loss:0.0051
#Train accuracy:0.9978

Model prediction at image pixel level

The model is used to predict the binarization at the pixel level.

Y_pred <- rMiW::model.pred(model=modelR2, 
                           x=X)

Visualization of results

We use ImageView2D function for the visualization.

for(n in 1:20){
#n <- 2
rMiW::ImageView2D(X,
            Y_pred,
            ImgN=n)
}

We can visualize the results using another function.

#Imge: 2
ImageView2D_pred(ImgArray_x=X,
                 ImgArray_y=Y,
                 ImgArray_pred=Y_pred,
                 ImgN=2)

#Imge: 6
ImageView2D_pred(ImgArray_x=X,
                 ImgArray_y=Y,
                 ImgArray_pred=Y_pred,
                 ImgN=6)

#Image: All
for(N in 1:20){
ImageView2D_pred(ImgArray_x=X,
                 ImgArray_y=Y,
                 ImgArray_pred=Y_pred,
                 ImgN=N)
}

Prediction for test dataset (20 images)

#Data
Test_X <- ImgData$Train$Train_Original[21:40,,,,drop=FALSE]
str(Test_X)
Test_Y <- ImgData$Train$Train_GroundTruth_Binary[21:40,,,,drop=FALSE]
str(Test_Y)

## Model evaluation
Score <- modelR2 %>% 
  keras::evaluate(Test_X, 
                  Test_Y, 
                  verbose = 1)

cat(paste0('Train loss:', round(Score[[1]], 4), 
           '\nTrain accuracy:', round(Score[[2]], 4)))

#model_v01 (training: 2000 epochs)
#Train loss:0.8893
#Train accuracy:0.9292

Visualization of test results

We use ImageView2D function for the visualization.

Test_Y_pred <- rMiW::model.pred(model=modelR2, 
                                x=Test_X)

#visualization
for(N in 1:20){
ImageView2D_pred(ImgArray_x=Test_X,
                 ImgArray_y=Test_Y,
                 ImgArray_pred=Test_Y_pred,
                 ImgN=N)
}

Session information

## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] BiocStyle_2.20.2
## 
## loaded via a namespace (and not attached):
##  [1] bookdown_0.24       rprojroot_2.0.2     digest_0.6.28      
##  [4] crayon_1.4.2        R6_2.5.1            magrittr_2.0.1     
##  [7] evaluate_0.14       stringi_1.7.5       rlang_0.4.12       
## [10] cachem_1.0.6        fs_1.5.0            jquerylib_0.1.4    
## [13] ragg_1.2.0          rmarkdown_2.11      pkgdown_1.6.1      
## [16] textshaping_0.3.6   desc_1.4.0          tools_4.1.1        
## [19] stringr_1.4.0       yaml_2.2.1          xfun_0.27          
## [22] fastmap_1.1.0       compiler_4.1.1      systemfonts_1.0.3  
## [25] BiocManager_1.30.16 memoise_2.0.0       htmltools_0.5.2    
## [28] knitr_1.36