Tutorial of flowSpy

Introduction

Multidimensional flow and mass cytometric assays are widely used for cellular subpopulation identification, tissue microenvironment composition determination, clinical immunophenotyping and differential lineage reconstruction[1]. Modern fluorescence-based flow cytometers typically can detect up to 20 features at the single-cell level in one routine experiment, whereas mass cytometers increase the capacity to nearly 50 features[2]. Because of the traditional manual gating strategies used for flow cytometry data, effective analysis methods for multidimensional cytometric data still face challenges in terms of trajectory inference, pseudotime estimation, visualization and workflow standardization.

Here we present flowSpy, a trajectory inference, pseudotime estimation and visualization toolkit for flow and mass cytometry data. The flowSpy package is built in R and offers a complete up-to-date analysis workflow for flow and mass cytometry data that includes subpopulation classification, dimensionality reduction, trajectory construction, differentially expressed marker calculation, pseudotime estimation, intermediate state identification and visualization. The flowSpy runs on several platforms, such as UNIX, Windows and MacOS, and provides an up-to-date, feature-rich and readily scalable workflow.

The flowSpy package was developed to provide a complete standardized analysis and visualization workflow for FCS data. In flowSpy workflow, an R S4 object is built to integrated all computational modules into one single channel, which is named as an FSPY object. This design not only packages most statistical and computational approaches into a single comprehensive analysis workflow but also provides a convenient way for users to adjust parameters and obtain results. The computational modules of flowSpy can be mainly divided into four main categories: preprocessing, trajectory, analysis and visualization.

Installation

The flowSpy is freely available at GitHub and Bioconductor.

Installation via Bioconductor

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("flowSpy")

Installation via GitHub

This requires the devtools package to be installed first.

# If not already installed
install.packages("devtools") 
devtools::install_github("JhuangLab/flowSpy")

library(flowSpy)

Quick guide

This is the quick guide of flowSpy workflow. All datasets and source code can be downloaded via git clone https://github.com/JhuangLab/flowSpy-dataset.git.

# Loading packages
suppressMessages({
library(ggplot2)
library(flowSpy)
library(flowCore)
library(stringr)
})

#######################################################
##### Preprocessing
#######################################################
# Read fcs files
fcs.path <- system.file("extdata", package = "flowSpy")
fcs.files <- list.files(fcs.path, pattern = '.FCS$', full = TRUE)

fcs.data <- runExprsMerge(fcs.files, comp = FALSE, transformMethod = "none")

# Refine colnames of fcs data
recol <- c(`FITC-A<CD43>` = "CD43", `APC-A<CD34>` = "CD34", 
           `BV421-A<CD90>` = "CD90", `BV510-A<CD45RA>` = "CD45RA", 
           `BV605-A<CD31>` = "CD31", `BV650-A<CD49f>` = "CD49f",
           `BV 735-A<CD73>` = "CD73", `BV786-A<CD45>` = "CD45", 
           `PE-A<FLK1>` = "FLK1", `PE-Cy7-A<CD38>` = "CD38")
colnames(fcs.data)[match(names(recol), colnames(fcs.data))] = recol
fcs.data <- fcs.data[, recol]

day.list <- c("D0", "D2", "D4", "D6", "D8", "D10")
meta.data <- data.frame(cell = rownames(fcs.data),
                        stage = str_replace(rownames(fcs.data), regex(".FCS.+"), "") )
meta.data$stage <- factor(as.character(meta.data$stage), levels = day.list)

markers <- c("CD43","CD34","CD90","CD45RA","CD31","CD49f","CD73","CD45","FLK1","CD38")

#######################################################
####  Standard workflow of flowSpy
#######################################################

# Build the FSPY object
fspy <- createFSPY(raw.data = fcs.data, markers = markers,
                   meta.data = meta.data,
                   normalization.method = "log",
                   verbose = TRUE)

# See information
fspy

fspy <- runCluster(fspy)
fspy <- processingCluster(fspy)
fspy <- runFastPCA(fspy)
fspy <- runTSNE(fspy)
fspy <- runDiffusionMap(fspy)
fspy <- runUMAP(fspy)
fspy <- buildTree(fspy, dim.type = "umap", dim.use = 1:2)
fspy <- defRootCells(fspy, root.cells = "Root cells")
fspy <- runPseudotime(fspy)
fspy <- defLeafCells(fspy, leaf.cells = "Leaf cells")
fspy <- runWalk(fspy)

Preprocessing

This is preprocessing step for flow cytometry data analysis. In flowSpy workflow, data import, compensation, quality control, filtration, normalization and merge cells from different samples can be implemented in the preprocessing module. In this tutorial, we provided two methods to perform compensation and filtration by using flowCore and flowSpy.

We compared the visualization step and gating control between flowJO and flowSpy using the rectangular gate. And there were no differences.

Preprocessing using flowCore

# Loading packages
suppressMessages({
  library(flowCore)
  library(LSD)
  library(flowSpy)
})

#########################
# Read Flow Cytometry Data
# It can be downloaded via `git clone https://github.com/JhuangLab/flowSpy-dataset.git` 
# fcs.path musted be modified based on the download directory from GitHub
fcs.path <- "FCS/preprocessing/"
fcs.file <- paste0(fcs.path, "D10_raw.fcs")

###########################################
#   Get the expression matrix from FCS file
###########################################

fcs.data.raw <- flowCore::read.FCS(filename = fcs.file)
head(fcs.data.raw)

##         FSC-A FSC-H    FSC-W    SSC-A SSC-H    SSC-W FITC-A PerCP-Cy5-5-A
## [1,] 13192.68 10463 82633.61 45715.02 39837 75205.95 364.26         70.20
## [2,]  9069.84  8948 66428.37  4715.10  4624 66827.16 233.22         35.10
## [3,] 78563.80 62530 82340.60 76214.58 61874 80725.32 112.32        270.66
## [4,]  7031.20  6800 67764.23  2695.68  2740 64475.95   2.34         36.66
## [5,]  5493.04  5289 68064.26  4618.38  4605 65726.42  17.94        161.46
## [6,]  7221.60  7400 63956.05  8058.18  8396 62899.10   3.90        -23.40
##        APC-A APC-Alexa 700-A  BV421-A BV510-A BV605-A  BV650-A BV 735-A
## [1,]   76.70           288.6   929.90  652.80  271.15   997.90   394.40
## [2,]  386.75           330.2   770.10 1691.50  844.90  1602.25   154.70
## [3,]  518.70          1677.0 16720.35 4182.85 4102.95  6295.95  2409.75
## [4,]  264.55           249.6   374.85   57.80  -20.40   656.20   237.15
## [5,] 3781.70          4778.8  4025.60  746.30 3818.20 41966.20  5752.80
## [6,]   46.15            19.5   266.05    0.00  140.25   922.25    22.95
##       BV786-A    PE-A PE-Cy7-A  Time
## [1,]  1128.80  119.19   317.55 162.3
## [2,]  1204.45   60.90   -49.59 169.4
## [3,]  7450.25 6202.23   575.94 171.4
## [4,]   892.50   40.89   -85.26 171.5
## [5,] 14499.30  105.27   412.38 171.6
## [6,]   881.45   -6.09    46.98 174.1

# Need compensation
# If `flow.data@description$SPILL` is not empty, 
# the matrix of flow cytometry need compensation
fcs.data.raw@description$SPILL

##       FITC-A PerCP-Cy5-5-A        APC-A APC-Alexa 700-A     BV421-A     BV510-A
##  [1,]      1  0.0008271705 0.0000000000    0.0000000000 0.001175325 0.073618328
##  [2,]      0  1.0000000000 0.2460516378    0.4907612047 0.000000000 0.000000000
##  [3,]      0  0.0063469268 1.0000000000    0.8748031399 0.000000000 0.000000000
##  [4,]      0  0.0050851828 0.0575297726    1.0000000000 0.000000000 0.000000000
##  [5,]      0  0.0000000000 0.0000000000    0.0042450318 1.000000000 0.111592399
##  [6,]      0  0.0000000000 0.0000000000    0.0006181738 0.093627063 1.000000000
##  [7,]      0  0.0017350493 0.0000000000    0.0000000000 0.054255325 0.000000000
##  [8,]      0  0.0009159050 0.0591153931    0.0529769694 0.170006804 0.022248638
##  [9,]      0  0.0346316604 0.0338806325    0.8703499551 0.176845618 0.019127528
## [10,]      0  0.0000000000 0.0002144425    0.0072909975 0.047762006 0.006277304
## [11,]      0  0.0306223175 0.0000000000    0.0001743376 0.000000000 0.003106608
## [12,]      0  0.0000000000 0.0000000000    0.0029470824 0.000000000 0.000000000
##          BV605-A     BV650-A    BV 735-A      BV786-A        PE-A    PE-Cy7-A
##  [1,] 0.00000000 0.000000000 0.000000000 -0.000465226 0.002444011 0.000000000
##  [2,] 0.00000000 0.852924776 0.292659109  0.836982296 0.000000000 0.290178244
##  [3,] 0.00000000 0.136793110 0.013027907  0.027057954 0.000000000 0.076383907
##  [4,] 0.00000000 0.003422615 0.014423912  0.027747696 0.000000000 0.079759361
##  [5,] 0.00000000 0.000000000 0.000000000  0.000000000 0.000000000 0.000000000
##  [6,] 0.13139261 0.085759242 0.010228166  0.028323530 0.000000000 0.000000000
##  [7,] 1.00000000 0.879787196 0.121276600  0.285106045 0.196998454 0.025642355
##  [8,] 0.14538044 1.000000000 0.121942938  0.264435008 0.001030994 0.005458056
##  [9,] 0.00000000 0.161073808 1.000000000  2.662247223 0.000000000 0.036941120
## [10,] 0.00000000 0.005185583 0.024017466  1.000000000 0.000000000 0.000000000
## [11,] 0.03506033 0.014645453 0.001331401  0.003217543 1.000000000 0.024660047
## [12,] 0.00000000 0.000000000 0.000000000  0.103773120 0.014549372 1.000000000

fcs.data <- flowCore::compensate(fcs.data.raw, spillover = fcs.data.raw@description$SPILL)
head(fcs.data)

##         FSC-A FSC-H    FSC-W    SSC-A SSC-H    SSC-W FITC-A PerCP-Cy5-5-A
## [1,] 13192.68 10463 82633.61 45715.02 39837 75205.95 364.26      56.99024
## [2,]  9069.84  8948 66428.37  4715.10  4624 66827.16 233.22      34.71020
## [3,] 78563.80 62530 82340.60 76214.58 61874 80725.32 112.32      35.84098
## [4,]  7031.20  6800 67764.23  2695.68  2740 64475.95   2.34      28.23213
## [5,]  5493.04  5289 68064.26  4618.38  4605 65726.42  17.94      75.19601
## [6,]  7221.60  7400 63956.05  8058.18  8396 62899.10   3.90     -20.50499
##            APC-A APC-Alexa 700-A     BV421-A    BV510-A      BV605-A    BV650-A
## [1,]    8.691788       -23.21742   690.77902  525.45814    84.727633   784.0919
## [2,]  321.458333        38.41145   397.71232 1604.20581   495.386369   963.1438
## [3,]  298.102478      -222.83079 15491.69177 2325.22489  3183.661078  2879.0558
## [4,]  219.328059      -127.16685   221.75943   12.51574  -123.691747   683.2365
## [5,] 1092.971385       923.10096 -3508.90937  134.32904 -2603.277916 43910.7076
## [6,]   -5.643600        73.34697    82.29203  -34.33112     6.933851   950.2434
##        BV 735-A   BV786-A        PE-A   PE-Cy7-A  Time
## [1,]  264.08833  102.4878   96.673813  283.60862 162.3
## [2,]  -76.64074  939.0676  -36.776549 -101.50172 169.4
## [3,] 1596.90907 1386.5800 5567.927719  266.89475 171.4
## [4,]  151.44369  328.5026   66.116028 -107.84506 171.5
## [5,]  617.55458 1859.6867  572.452755   23.60561 171.6
## [6,] -110.83642  934.7309   -9.121238   46.46614 174.1

###########################################
#   Gating
###########################################

fcs.exp <- fcs.data@exprs

# Plot by FSC-A and SSC-A
heatscatter(fcs.exp[, "FSC-A"], 
            fcs.exp[, "SSC-A"],
            cexplot = 0.3, main = "Raw FCS data", 
            xlab = "FSC-A", ylab = "SSC-A",
            xlim = c(0, 250000), ylim = c(0, 250000))

fcs.exp <- fcs.exp[which((fcs.exp[, "FSC-A"] > 70000) & (fcs.exp[, "FSC-A"] < 180000)), ]
fcs.exp <- fcs.exp[which((fcs.exp[, "SSC-A"] > 30000) & (fcs.exp[, "SSC-A"] < 150000)), ]

heatscatter(fcs.exp[, "FSC-A"], 
            fcs.exp[, "SSC-A"],
            cexplot = 0.3, main = "Filtered by FSC-A and SSC-A",
            xlab = "FSC-A", ylab = "SSC-A",
            xlim = c(0, 250000), ylim = c(0, 250000))

# Plot by FSC-H and FSC-W
heatscatter(fcs.exp[, "FSC-H"], 
            fcs.exp[, "FSC-W"],
            cexplot = 0.3, main = "Filtered by FSC-A and SSC-A",
            xlab = "FSC-H", ylab = "FSC-W",
            xlim = c(0, 250000), ylim = c(0, 250000))

fcs.exp <- fcs.exp[which((fcs.exp[, "FSC-H"] > 40000) & (fcs.exp[, "FSC-H"] < 120000)), ]
fcs.exp <- fcs.exp[which((fcs.exp[, "FSC-W"] > 60000) & (fcs.exp[, "FSC-W"] < 120000)), ]

# Plot by FSC-H and FSC-W
heatscatter(fcs.exp[, "FSC-H"], 
            fcs.exp[, "FSC-W"],
            cexplot = 0.3, main = "Filtered by FSC-H and FSC-W",
            xlab = "FSC-H", ylab = "FSC-W",
            xlim = c(0, 250000), ylim = c(0, 250000))

# Plot by SSC-H and SSC-w
heatscatter(fcs.exp[, "SSC-H"], 
            fcs.exp[, "SSC-W"],
            cexplot = 0.3, main = "Filtered by FSC-H and FSC-W",
            xlab = "SSC-H", ylab = "SSC-W",
            xlim = c(0, 250000), ylim = c(0, 250000))

fcs.exp <- fcs.exp[which((fcs.exp[, "SSC-H"] > 20000) & (fcs.exp[, "SSC-H"] < 120000)), ]
fcs.exp <- fcs.exp[which((fcs.exp[, "SSC-W"] > 60000) & (fcs.exp[, "SSC-W"] < 110000)), ]

# Plot by SSC-H and SSC-w
heatscatter(fcs.exp[, "SSC-H"], 
            fcs.exp[, "SSC-W"],
            cexplot = 0.3, main = "Filtered by SSC-H and SSC-W",
            xlab = "SSC-H", ylab = "SSC-W",
            xlim = c(0, 250000), ylim = c(0, 250000))

# Plot by CD43 and CD31
heatscatter(log10(abs(fcs.exp[, "FITC-A"])+1), 
            log10(abs(fcs.exp[, "BV605-A"])+1),
            cexplot = 0.3, main = "After gating", 
            xlab = "CD43", ylab = "CD31",
            xlim = c(0, 5), ylim = c(0, 5))

# Plot by CD43 and CD31
heatscatter(log10(abs(fcs.exp[, "APC-A"])+1), 
            log10(abs(fcs.exp[, "BV650-A"])+1),
            cexplot = 0.3, main = "After gating", 
            xlab = "CD34", ylab = "CD49f",
            xlim = c(0, 5), ylim = c(0, 5))

# Plot by CD43 and CD31
heatscatter(log10(abs(fcs.exp[, "PE-Cy7-A"])+1), 
            log10(abs(fcs.exp[, "BV421-A"])+1),
            cexplot = 0.3, main = "After gating", 
            xlab = "CD38", ylab = "CD90",
            xlim = c(0, 5), ylim = c(0, 5))

# Output FCS file
fcs.data@exprs <- fcs.exp
flowCore::write.FCS(fcs.data, filename = "FCS/basic/D10.fcs")

## [1] "FCS/basic/D10.fcs"

# Read FCS file and then start your analysis
fcs.exp <- flowSpy::runExprsExtract("FCS/basic/D10.fcs", 
                                    transformMethod = "none", comp = F, showDesc = F)

# Show marker description in each panel
recol <- c(`FITC-A` = "CD43", `APC-A` = "CD34", 
           `BV421-A` = "CD90", `BV510-A` = "CD45RA", 
           `BV605-A` = "CD31", `BV650-A` = "CD49f",
           `BV 735-A` = "CD73", `BV786-A` = "CD45", 
           `PE-A` = "FLK1", `PE-Cy7-A` = "CD38")

colnames(fcs.exp)[match(names(recol), colnames(fcs.exp))] = recol
fcs.exp <- fcs.exp[, recol]

# build FSPY object
meta.data <- data.frame(cell = rownames(fcs.exp),
                        stage = "D10" )
fspy <- createFSPY(raw.data = fcs.exp, markers = colnames(fcs.exp),
                   meta.data = meta.data,
                   normalization.method = "log")

Preprocessing using flowSpy

# Loading packages
suppressMessages({
  library(flowCore)
  library(LSD)
  library(flowSpy)
})

#########################
# Read Flow Cytometry Data
# It can be downloaded via `git clone https://github.com/JhuangLab/flowSpy-dataset.git` 
# fcs.path musted be modified based on the download directory from GitHub
fcs.path <- "FCS/preprocessing/"
fcs.file <- paste0(fcs.path, "D10_raw.fcs")

###########################################
#   Get the expression matrix from FCS file
###########################################

# Need compensation
# If the flow cytometry need compensation, set `comp = TRUE`
fspy.data <- flowSpy::runExprsExtract(fcs.file, comp = TRUE, 
                                      transformMethod = "none", showDesc = FALSE)

##     Compensation is applied on FCS/preprocessing/D10_raw.fcs

head(fspy.data)

##              FSC-A FSC-H    FSC-W    SSC-A SSC-H    SSC-W FITC-A PerCP-Cy5-5-A
## D10_raw_1 13192.68 10463 82633.61 45715.02 39837 75205.95 364.26      56.99024
## D10_raw_2  9069.84  8948 66428.37  4715.10  4624 66827.16 233.22      34.71020
## D10_raw_3 78563.80 62530 82340.60 76214.58 61874 80725.32 112.32      35.84098
## D10_raw_4  7031.20  6800 67764.23  2695.68  2740 64475.95   2.34      28.23213
## D10_raw_5  5493.04  5289 68064.26  4618.38  4605 65726.42  17.94      75.19601
## D10_raw_6  7221.60  7400 63956.05  8058.18  8396 62899.10   3.90     -20.50499
##                 APC-A APC-Alexa 700-A     BV421-A    BV510-A      BV605-A
## D10_raw_1    8.691788       -23.21742   690.77902  525.45814    84.727633
## D10_raw_2  321.458333        38.41145   397.71232 1604.20581   495.386369
## D10_raw_3  298.102478      -222.83079 15491.69177 2325.22489  3183.661078
## D10_raw_4  219.328059      -127.16685   221.75943   12.51574  -123.691747
## D10_raw_5 1092.971385       923.10096 -3508.90937  134.32904 -2603.277916
## D10_raw_6   -5.643600        73.34697    82.29203  -34.33112     6.933851
##              BV650-A   BV 735-A   BV786-A        PE-A   PE-Cy7-A
## D10_raw_1   784.0919  264.08833  102.4878   96.673813  283.60862
## D10_raw_2   963.1438  -76.64074  939.0676  -36.776549 -101.50172
## D10_raw_3  2879.0558 1596.90907 1386.5800 5567.927719  266.89475
## D10_raw_4   683.2365  151.44369  328.5026   66.116028 -107.84506
## D10_raw_5 43910.7076  617.55458 1859.6867  572.452755   23.60561
## D10_raw_6   950.2434 -110.83642  934.7309   -9.121238   46.46614

heatscatter(fspy.data[, "FSC-A"], 
            fspy.data[, "SSC-A"],
            cexplot = 0.3, main = "Raw FCS data", 
            xlab = "FSC-A", ylab = "SSC-A")

###########################################
#   Gating
###########################################

# Gating using the sample parameters
fspy.data.gating <- gatingMatrix(fspy.data, 
                                 lower.gate = c(`FSC-A` = 70000, `SSC-A` = 30000,
                                                `FSC-H` = 40000, `FSC-W` = 60000,
                                                `SSC-H` = 20000, `SSC-W` = 60000),
                                 upper.gate = c(`FSC-A` = 180000, `SSC-A` = 150000,
                                                `FSC-H` = 120000, `FSC-W` = 120000,
                                                `SSC-H` = 120000, `SSC-W` = 110000))

# Plot by CD43 and CD31
heatscatter(log10(abs(fspy.data.gating[, "FITC-A"])+1), 
            log10(abs(fspy.data.gating[, "BV605-A"])+1),
            cexplot = 0.3, main = "After gating", 
            xlab = "CD43", ylab = "CD31",
            xlim = c(0, 5), ylim = c(0, 5))

# Plot by CD43 and CD31
heatscatter(log10(abs(fspy.data.gating[, "APC-A"])+1), 
            log10(abs(fspy.data.gating[, "BV650-A"])+1),
            cexplot = 0.3, main = "After gating", 
            xlab = "CD34", ylab = "CD49f",
            xlim = c(0, 5), ylim = c(0, 5))

# Plot by CD43 and CD31
heatscatter(log10(abs(fspy.data.gating[, "PE-Cy7-A"])+1), 
            log10(abs(fspy.data.gating[, "BV421-A"])+1),
            cexplot = 0.3, main = "After gating", 
            xlab = "CD38", ylab = "CD90",
            xlim = c(0, 5), ylim = c(0, 5))

# Show marker description in each panel
recol <- c(`FITC-A` = "CD43", `APC-A` = "CD34", 
           `BV421-A` = "CD90", `BV510-A` = "CD45RA", 
           `BV605-A` = "CD31", `BV650-A` = "CD49f",
           `BV 735-A` = "CD73", `BV786-A` = "CD45", 
           `PE-A` = "FLK1", `PE-Cy7-A` = "CD38")

colnames(fspy.data.gating)[match(names(recol), colnames(fspy.data.gating))] = recol
fspy.data.gating <- fspy.data.gating[, recol]

# build FSPY object and start your analysis
meta.data <- data.frame(cell = rownames(fspy.data.gating),
                        stage = "D10" )
fspy <- createFSPY(raw.data = fspy.data.gating, 
                   markers = colnames(fspy.data.gating),
                   meta.data = meta.data,
                   normalization.method = "log")

Trajectory

The aim of the trajectory module is to construct trajectory that reveals subpopulation connections and cellular dynamic processes using the clean matrix input. First, all cells included in the FSPY object are unsupervised classified into different clusters based on the expression levels of the markers. The flowSpy provides multiple methods to cluster cells by choosing different parameters, such as self-organizing maps (SOM)[3], k-means clustering (kmeans)[4], clara, phenoGraph[5] and hierarchical clustering (hclust). The default clustering method is SOM, for better performance in precision, coherence, and stability than other unsupervised tools[6]. After clustering, if the total cell size is too large, cluster-dependent downsampling is recommended to reduce the total cell number and avoid small cluster deletion. Dimensionality reduction for both cells and clusters is also implemented in the clustering procedure. Four popular dimensionality reduction method are integrated in flowSpy, namely PCA, tSNE, diffusion maps and UMAP. In the flowSpy workflow, we use a tree-shaped method to identify trajectory topologies, and a minimum spanning tree (MST) will be generated to construct the trajectory for all clusters.

Clustering

# Loading packages
suppressMessages({
  library(flowSpy)
  library(ggplot2)
  library(ggthemes)
})

#########################
# Read Flow Cytometry Data
# It can be downloaded via `git clone https://github.com/JhuangLab/flowSpy-dataset.git` 
# fcs.path musted be modified based on the download directory from GitHub
fcs.path <- "FCS/basic/"
fcs.file <- paste0(fcs.path, "FR-FCM-ZY9R-Bone_Marrow_cytof.fcs")

###########################################
#   Get the expression matrix from FCS file
###########################################

# If you want to see the description of each panel, Set showDesc = T.
fspy.data <- flowSpy::runExprsExtract(fcs.file, comp = FALSE, 
                                      transformMethod = "autoLgcl", showDesc = FALSE)
head(fspy.data)

##                                       CD3    CD45RA         CD19     CD11b
## FR-FCM-ZY9R-Bone_Marrow_cytof_1 2.0801438 2.8996987  2.826880635 1.7313151
## FR-FCM-ZY9R-Bone_Marrow_cytof_2 0.1794195 1.0531011  0.170012264 3.6600487
## FR-FCM-ZY9R-Bone_Marrow_cytof_3 0.8502627 0.5733971 -0.004228061 0.3995415
## FR-FCM-ZY9R-Bone_Marrow_cytof_4 1.2229717 2.5799890  0.614976380 0.7735211
## FR-FCM-ZY9R-Bone_Marrow_cytof_5 0.3870464 2.7158426  2.314903378 0.3524348
## FR-FCM-ZY9R-Bone_Marrow_cytof_6 0.3394628 2.6505533  0.516789259 1.8510639
##                                        CD4       CD8        CD34      CD20
## FR-FCM-ZY9R-Bone_Marrow_cytof_1  1.6550040 2.8491347  1.21746892 2.5478819
## FR-FCM-ZY9R-Bone_Marrow_cytof_2  1.7254517 0.5985356  2.02360240 1.5617139
## FR-FCM-ZY9R-Bone_Marrow_cytof_3  0.2101132 0.2484007  1.45166551 0.2555268
## FR-FCM-ZY9R-Bone_Marrow_cytof_4  2.7294033 0.8376250  1.56778527 0.3729493
## FR-FCM-ZY9R-Bone_Marrow_cytof_5  0.3855824 0.3538444  0.22402275 2.3792987
## FR-FCM-ZY9R-Bone_Marrow_cytof_6 -0.1749056 1.6696339 -0.06868238 0.4264604
##                                      CD33     CD123      CD38      CD90
## FR-FCM-ZY9R-Bone_Marrow_cytof_1 2.3765256 2.5996726 2.1321229 1.7025328
## FR-FCM-ZY9R-Bone_Marrow_cytof_2 3.8344027 2.3662138 2.4821806 0.3923201
## FR-FCM-ZY9R-Bone_Marrow_cytof_3 0.3135138 1.0034261 0.5358351 0.9306099
## FR-FCM-ZY9R-Bone_Marrow_cytof_4 0.5969381 0.2945977 2.1253160 1.2073872
## FR-FCM-ZY9R-Bone_Marrow_cytof_5 0.9107694 1.7835893 1.9741379 1.1260590
## FR-FCM-ZY9R-Bone_Marrow_cytof_6 0.4441221 0.6914673 2.3116921 0.6731089
##                                      CD45
## FR-FCM-ZY9R-Bone_Marrow_cytof_1 3.3546790
## FR-FCM-ZY9R-Bone_Marrow_cytof_2 2.9798902
## FR-FCM-ZY9R-Bone_Marrow_cytof_3 0.5436509
## FR-FCM-ZY9R-Bone_Marrow_cytof_4 2.7200903
## FR-FCM-ZY9R-Bone_Marrow_cytof_5 2.8910706
## FR-FCM-ZY9R-Bone_Marrow_cytof_6 2.8762927

dim(fspy.data)

## [1] 236187     13

# build FSPY object and start your analysis
# If you don't want to see the running log information, set verbose FALSE
# If there is only one case in your analysis workflow, you can just set stage = "D0"
meta.data <- data.frame(cell = rownames(fspy.data),
                        stage = "D0" )
fspy <- createFSPY(raw.data = fspy.data, 
                   markers = colnames(fspy.data),
                   meta.data = meta.data,
                   normalization.method = "none")

fspy

## FSPY Information:
##  Input cell number: 236187  cells 
##  Enroll marker number: 13  markers 
##  Cells after downsampling: 236187  cells

# The flowSpy provides multiple methods to cluster cells by 
# choosing different parameters, som, kmeans, clara, phenoGraph, 
# and hclust. By default is som.
set.seed(1)
fspy <- runCluster(fspy, verbose = T)

## 2020-07-09 21:48:09 [INFO] Calculating FlowSOM.

## 2020-07-09 21:48:11 [INFO] Calculating FlowSOM completed.

table(fspy@meta.data$cluster.id)

## 
##     1     2     3     4     5     6     7     8     9    10    11    12    13 
##  9399 12006  5491  7918  4401  6309  8609  6141  6524  9410  1661  7545  5552 
##    14    15    16    17    18    19    20    21    22    23    24    25    26 
##  5582  6932  6505  4836  7943  6955  6767  7213  6806  5595  7612  5345  4016 
##    27    28    29    30    31    32    33    34    35    36 
##  7369  8440 11386  2255   973  6674  6343  4394  7316  7964

# You can set xdim and ydim to specify the number of clusters
# the cluster number is xdim * ydim
set.seed(1)
fspy <- runCluster(fspy, cluster.method = "som", xdim = 10, ydim = 10, verbose = T)

## 2020-07-09 21:48:11 [INFO] Calculating FlowSOM.

## 2020-07-09 21:48:15 [INFO] Calculating FlowSOM completed.

table(fspy@meta.data$cluster.id)

## 
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
## 2476 3082 2282 2324 2413 3496 2604 3286 2464 2122 1774 2018 2007 2151 2322 1799 
##   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32 
## 2015 1817 2203 1893 2312 1998 3239 2813 2290 1086 3120 2191 3335 1437 2223 2159 
##   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48 
## 1939 1878 2848 2032 2461 2411 2333 1739 2610 2164 2705 3175 2293 3052 1888 1284 
##   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   64 
## 2555 2039 3319 2489 2426 2555 3584 2020 2188 2849 3289 3761 2712 2651 3264 2440 
##   65   66   67   68   69   70   71   72   73   74   75   76   77   78   79   80 
## 1521  766  942 2429 2401 2264 2444 2092 2579 2738 2292 2094 1874 2601 2948 2113 
##   81   82   83   84   85   86   87   88   89   90   91   92   93   94   95   96 
## 1934 2345 2026 3006 2124 2417 1137 2624 2660 3947 2670 2559 2082 2376 2353 1384 
##   97   98   99  100 
## 1547 3195 1244 2755

# Kmeans cluster, you can set k to specify the number of clusters
set.seed(1)
fspy <- runCluster(fspy, cluster.method = "kmeans", k = 100, verbose = T)

## 2020-07-09 21:48:15 [INFO] Calculating Kmeans.

## 2020-07-09 21:48:23 [INFO] Calculating Kmeans completed.

table(fspy@meta.data$cluster.id)

## 
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
## 2873 2832 1501 3089  993 2365 2828 2002 2374 2200 2305 3106 1880 2039 3260 2100 
##   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32 
## 2001 1765 2377 1425 2227 1797 2221 2781 2092 1687 2783 2318 3617 1434 1949 2187 
##   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48 
## 1945 1790 2115 2094 2808 2671 1589 1777 2563 3476 3608 2713 2693 3239 2018 1830 
##   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   64 
## 1717 2147 3108 2946 1833 2388 3580 1984 2220 1912 2766 4466 2348 2307 3269 2227 
##   65   66   67   68   69   70   71   72   73   74   75   76   77   78   79   80 
## 2023  765 2720 2744 2588 2290 2541 3048 3217 2791 1123 2095 1765 1507 2935 2937 
##   81   82   83   84   85   86   87   88   89   90   91   92   93   94   95   96 
## 2244 2415 2693 3781 1547 2150 1182 2552 1495 4019 2867 2488 2370 2780 2173 2915 
##   97   98   99  100 
## 1575 2153 1265 1879

# Clara cluster, you can set k to specify the number of clusters
set.seed(1)
fspy <- runCluster(fspy, cluster.method = "clara", k = 100, verbose = T)

## 2020-07-09 21:48:23 [INFO] Calculating Clara

## 2020-07-09 21:48:28 [INFO] Calculating Clara completed.

table(fspy@meta.data$cluster.id)

## 
##     1     2     3     4     5     6     7     8     9    10    11    12    13 
##   591 10008  2664  5313  4374  4085  6511  2645  3951  1034  4366 12654  3593 
##    14    15    16    17    18    19    20    21    22    23    24    25    26 
##  5619  3481  4845  4326  5349  1748  2524  4862  2447  1404  2252  5009  2609 
##    27    28    29    30    31    32    33    34    35    36    37    38    39 
##  1577   480  1400  3494  2939   505  1281  3544  1796  1581   571  5337   582 
##    40    41    42    43    44    45    46    47    48    49    50    51    52 
##  3216  4262   231  4296  2645  3007  3713  2683   614  4178  7815  3319  2426 
##    53    54    55    56    57    58    59    60    61    62    63    64    65 
##  5778  1053  2244  2080  2970  4972  4710  3365   451  4480  1079   874  1195 
##    66    67    68    69    70    71    72    73    74    75    76    77    78 
##  1334  1069   583   342   397   366  1035   260   312  3240   264  1965   710 
##    79    80    81    82    83    84    85    86    87    88    89    90    91 
##  1928   844   465   727   760  1224   432  1301   698   886   474   623   624 
##    92    93    94    95    96    97    98    99   100 
##   321   206   193   842   108   178   146   263    60

# Hclust cluster, you can set k to specify the number of clusters
# Hclust runs only the cell number is less than 50,000. 
# Or it will take lots of time
if (dim(fspy.data)[1] < 10000) {
  set.seed(1)
  fspy <- runCluster(fspy, cluster.method = "hclust", k = 100, verbose = T)
  table(fspy@meta.data$cluster.id)
}


# phenoGraph cluster. The number of clusters can not be modified
# phenoGraph runs only the cell number is less than 10,000. 
# Or it will take lots of time
if (dim(fspy.data)[1] < 10000) {
fspy <- runCluster(fspy, cluster.method = "phenograph", verbose = T)
table(fspy@meta.data$cluster.id)
}

# By default is som, so we change the clustering method to som
set.seed(8)
fspy <- runCluster(fspy, cluster.method = "som", xdim = 10, ydim = 10, verbose = T)

## 2020-07-09 21:48:28 [INFO] Calculating FlowSOM.

## 2020-07-09 21:48:32 [INFO] Calculating FlowSOM completed.

table(fspy@meta.data$cluster.id)

## 
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
## 1427 3505 1996 2344 3109 3749 3047 1963 2286 2852  767 2002 3445 1982 1778 1825 
##   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32 
## 2319 4014 2235 2337 3058 1420 1805 2744 3318 2873 3671 1288 2145 2647 2442 2410 
##   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48 
## 1184 1814 3064 1563 2187 1870 2988 1598 2356 1850 1847 2308 1622 1898 2241 2849 
##   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   64 
## 2784 2224 1032 2123 2821 3179 2191 3428 2436 2416 2428 2570 3021 1964 2464 3520 
##   65   66   67   68   69   70   71   72   73   74   75   76   77   78   79   80 
## 2090 2522 3285 2888 1453 2913 2818 1250 1982 1237 3179 2146 2905 3136 4410 2774 
##   81   82   83   84   85   86   87   88   89   90   91   92   93   94   95   96 
## 1299 3198 2420 2144 2194 1745 1797 1997 1066 2048 2023 3074 2265 2225 1884 2311 
##   97   98   99  100 
## 2645 1633 2775 1813

# Preprocessing of Clusters
# If the cell number is too large, for example, more than 50,000, 
# cluster-based downsampling is recommended to reduce computation 
# load and save computation time.
# If the downsampling.size is 0.1, it means 10% cell 
# will be kept in the further analysis. By default is 1.
fspy <- processingCluster(fspy, perplexity = 5, downsampling.size = 0.1, 
                          force.resample = TRUE, verbose = T)

## 2020-07-09 21:48:33 [INFO] Calculating PCA

## 2020-07-09 21:48:33 [INFO] Calculating tSNE

## 2020-07-09 21:48:33 [INFO] Calculating Diffusion Map

## 2020-07-09 21:48:33 [INFO] Calculating UMAP

fspy

## FSPY Information:
##  Input cell number: 236187  cells 
##  Enroll marker number: 13  markers 
##  Cells after downsampling: 23662  cells

###################################
#### Visualization
###################################

plotCluster(fspy, item.use = c("PC_1", "PC_2"), category = "numeric",
            size = 100, color.by = "CD4", show.cluser.id = TRUE)

plotCluster(fspy, item.use = c("tSNE_1", "tSNE_2"), category = "numeric",
            size = 100, color.by = "CD4", show.cluser.id = TRUE) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

plotCluster(fspy, item.use = c("DC_1", "DC_2"), category = "numeric",
            size = 100, color.by = "CD4", show.cluser.id = TRUE) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

plotCluster(fspy, item.use = c("UMAP_1", "UMAP_2"), category = "numeric",
            size = 100, color.by = "CD4", show.cluser.id = FALSE) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

Dimensionality Reduction

# Four popular dimensionality reduction method are integrated 
# in flowSpy, namely PCA, tSNE, diffusion maps and UMAP.

# run Principal Component Analysis (PCA)
fspy <- runFastPCA(fspy, verbose = T)

## 2020-07-09 21:48:41 [INFO] Calculating PCA.

## 2020-07-09 21:48:41 [INFO] Calculating PCA completed.

# run t-Distributed Stochastic Neighbor Embedding (tSNE)
set.seed(1)
fspy <- runTSNE(fspy, dims = 2, verbose = T)

## 2020-07-09 21:48:41 [INFO] Calculating tSNE.

## 2020-07-09 21:51:22 [INFO] Calculating tSNE completed.

# run Diffusion map
fspy <- runDiffusionMap(fspy)

# run Uniform Manifold Approximation and Projection (UMAP)
fspy <- runUMAP(fspy)

###################################
#### Visualization
###################################

plot2D(fspy, item.use = c("PC_1", "PC_2"), color.by = "CD3", 
       alpha = 1, main = "PCA", category = "numeric") + 
  scale_colour_gradientn(colors = c("#00599F","#EEEEEE","#FF3222"))

plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), color.by = "CD3", 
       alpha = 1, main = "tSNE", category = "numeric") + 
  scale_colour_gradientn(colors = c("#00599F","#EEEEEE","#FF3222"))

plot2D(fspy, item.use = c("DC_1", "DC_2"), color.by = "CD3", 
       alpha = 1, main = "Diffusion Maps", category = "numeric") + 
  scale_colour_gradientn(colors = c("#00599F","#EEEEEE","#FF3222"))

plot2D(fspy, item.use = c("UMAP_1", "UMAP_2"), color.by = "CD3", 
       alpha = 1, main = "UMAP", category = "numeric") + 
  scale_colour_gradientn(colors = c("#00599F","#EEEEEE","#FF3222"))

plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), color.by = "cluster.id", 
       alpha = 1, main = "tSNE", category = "categorical", show.cluser.id = T)

plot3D(fspy, item.use = c("DC_1", "DC_2", "DC_3"), color.by = "CD3", 
       main = "Diffusion Maps CD3", category = "numeric", size = 0.2, 
       color.theme = c("#00599F","#EEEEEE","#FF3222"))

plot3D(fspy, item.use = c("PC_1", "PC_2", "PC_3"), color.by = "CD3", 
       main = "PCA CD3", category = "numeric", size = 0.2, 
       color.theme = c("#00599F","#EEEEEE","#FF3222"))

plot3D(fspy, item.use = c("PC_1", "PC_2", "CD4"), color.by = "CD8", 
       main = "PCA relation with CD8", category = "numeric", size = 0.2, 
       color.theme = c("#00599F","#EEEEEE","#FF3222"))

plot3D(fspy, item.use = c("CD45", "CD4", "CD8"), color.by = "CD45", 
       main = "marker expression by CD45", category = "numeric", size = 0.2, 
       color.theme = c("#00599F","#EEEEEE","#FF3222"))

Trajectory Reconstruction

# flowSpy provides five method to build the tree-shaped trajectory: 
# 1. Raw expression matrix
# 2. PCA
# 3. tSNE
# 4. Diffusion maps
# 5. UMAP

# 1. Raw expression matrix
fspy <- buildTree(fspy, dim.type = "raw")
# Tree plot
plotTree(fspy, color.by = "CD3", show.node.name = F, cex.size = 1) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

# 2. PCA
fspy <- buildTree(fspy, dim.type = "pca", dim.use = 1:4)
# Tree plot
plotTree(fspy, color.by = "CD3", show.node.name = F, cex.size = 1) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

# 3. tSNE
fspy <- buildTree(fspy, dim.type = "tsne", dim.use = 1:2)
# Tree plot
plotTree(fspy, color.by = "CD3", show.node.name = F, cex.size = 1) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

# 4. Diffusion maps
fspy <- buildTree(fspy, dim.type = "dc", dim.use = 1:5)
# Tree plot
plotTree(fspy, color.by = "CD3", show.node.name = F, cex.size = 1) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

# 5. UMAP
fspy <- buildTree(fspy, dim.type = "umap", dim.use = 1:2)
# Tree plot
plotTree(fspy, color.by = "CD3", show.node.name = F, cex.size = 1) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

# By combining with biological significance, we choose tsne to build 
# the trajectory
fspy <- buildTree(fspy, dim.type = "tsne", dim.use = 1:2)
fspy@meta.data$branch.id <- paste0("B", fspy@meta.data$branch.id)

plotTree(fspy, color.by = "branch.id", show.node.name = T, cex.size = 1)

plotTree(fspy, color.by = "branch.id", show.node.name = T, cex.size = 1)

############# Modify branch id
fspy@meta.data$branch.id[fspy@meta.data$branch.id %in% c("B5", "B2", "B10")] = "CD4 T cells"
fspy@meta.data$branch.id[fspy@meta.data$branch.id %in% c("B7", "B13")] = "CD8 T cells"
fspy@meta.data$branch.id[fspy@meta.data$branch.id %in% c("B1","B6","B12")] = "Megakaryocytic"
fspy@meta.data$branch.id[fspy@meta.data$branch.id %in% c("B3")] = "DCs"
fspy@meta.data$branch.id[fspy@meta.data$branch.id %in% c("B11")] = "B cells"
fspy@meta.data$branch.id[fspy@meta.data$branch.id %in% c("B4","B8","B9","B14")] = "Myeloid"

# In the biological analysis, we may found some clusters are
# in the wrong branch, or division of the branch is insufficient.
# We recommend modify the branch based on the marker expression
fspy@meta.data$branch.id[fspy@meta.data$cluster.id %in% c(74,36,89,11)] = "HSCs"
fspy@meta.data$branch.id[fspy@meta.data$cluster.id %in% c(62,14)] = "CD8 T cells"
fspy@meta.data$branch.id[fspy@meta.data$cluster.id %in% c(72)] = "B cells"

# Plot tree
plotTree(fspy, color.by = "branch.id", show.node.name = T, cex.size = 1)

plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), color.by = "branch.id", 
       alpha = 1, main = "tSNE", category = "categorical", show.cluser.id = F)

# Run differential expressed markers of different branch
diff.info <- runDiff(fspy)
head(diff.info)

##             logFC  AveExpr         t       P.Value     adj.P.Val          B
## CD8     1.9703425 1.077508 196.15117  0.000000e+00  0.000000e+00 11415.3546
## CD45RA  0.8177313 1.618059  59.23528  0.000000e+00  0.000000e+00  1625.7650
## CD45    0.7563940 2.342068  55.48696  0.000000e+00  0.000000e+00  1437.0040
## CD33   -0.7952377 1.794166 -43.38217  0.000000e+00  0.000000e+00   895.2394
## CD4    -0.6033601 1.250727 -37.34068 1.403120e-296 3.648112e-296   667.1611
## CD3     0.4563564 1.097588  35.70954 4.560437e-272 9.880948e-272   610.7608
##             branch.contrast   Gene
## CD8    CD8 T cells_vs_other    CD8
## CD45RA CD8 T cells_vs_other CD45RA
## CD45   CD8 T cells_vs_other   CD45
## CD33   CD8 T cells_vs_other   CD33
## CD4    CD8 T cells_vs_other    CD4
## CD3    CD8 T cells_vs_other    CD3

# plot heatmap of clusters and branches
plotClusterHeatmap(fspy)

plotClusterHeatmap(fspy, color = colorRampPalette(c("purple","white","yellow"))(100))

plotBranchHeatmap(fspy, colorRampPalette(c("#00599F", "#FFFFFF", "#FF3222"))(100), clustering_method = "complete")

Analysis

This module is designed for feature extraction and cell state refinement. When the tree is built, all branches are extracted to analyze the community structure of the trajectory topologies. The differentially expressed markers in each branch will be calculated in this module, which can be used to further define the subbranches. For the special analysis of FCS data, such as tracing the cell-of-origin during cell differentiation and reprogramming, the pseudotime can be estimated in the next step to reconstruct the processes of cell state changes based on dynamically expressed markers. The flowSpy provided an algorithm to estimate pseudotime and inference intermediate state cells based on prior knowledge (Figure 1, the Analysis panel). After trajectory analysis, the adjacency matrix was calculated using the KNN algorithm, which could be built based on either the expression matrix or the dimensionality reduction coordinates. Cells in the origin state were identified as the root cells. All shortest paths from the root cells to all other cells were calculated, and the mean value of the distances was computed as the pseudotime. The pseudotime of all clusters was then calculated in the flowSpy analysis module. Additionally, the analysis module provides functions to identify intermediate state cells, which may play an important role in the differentiation/reprogramming process.

Pseudotime

###########################################
#   Pseudotime 
###########################################

# Set HSPCs as root cells
fspy <- defRootCells(fspy, root.cells = c(36,89,11))
fspy <- runPseudotime(fspy, verbose = T, dim.type = "raw")

## 2020-07-09 21:58:44 [INFO] Calculating Pseudotime.

## 2020-07-09 21:58:44 [INFO] Pseudotime exists in meta.data, it will be replaced.

## 2020-07-09 21:58:44 [INFO] The log data will be used to calculate pseudotime

## 2020-07-09 22:00:26 [INFO] Calculating Pseudotime completed.

# Plot 2D tSNE. 
fspy@meta.data$stage <- fspy@meta.data$branch.id

plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), category = "numeric",
       size = 1, color.by = "pseudotime") + 
  scale_colour_gradientn(colors = c("#F4D31D", "#FF3222","#7A06A0"))

plotTree(fspy, color.by = "pseudotime", cex.size = 1) + 
  scale_colour_gradientn(colors = c("#F4D31D","#FF3222","#7A06A0"))

plotPseudotimeDensity(fspy, adjust = 2)

plotPseudotimeTraj(fspy, var.cols = T) + 
  scale_colour_gradientn(colors = c("#F4D31D", "#FF3222","#7A06A0"))

## `geom_smooth()` using formula 'y ~ x'

Intermediate States Analysis

###########################################
#   Intermediate States Analysis
###########################################

###### Intermediate state cells for CD8 T cells
fspy <- defLeafCells(fspy, leaf.cells = c(99,97))
fspy <- runWalk(fspy, verbose = TRUE)

## 2020-07-09 22:01:12 [INFO] Calculating walk between root.cells and leaf.cells .

## 2020-07-09 22:01:16 [INFO] Generating an adjacency matrix.

## 2020-07-09 22:02:41 [INFO] Walk forward.

## 2020-07-09 22:02:47 [INFO] Calculating walk completed.

fspy@meta.data$traj.value.log.CD8T <- fspy@meta.data$traj.value.log

### fetch plot information
plot.meta <- fetchPlotMeta(fspy, markers = colnames(fspy.data))

# heatmap for CD8 T cells
library(pheatmap)

plot.meta.sub <- plot.meta[which(plot.meta$traj.value.log.CD8T > 0), ]
plot.meta.sub <- plot.meta.sub[1:1000, ]
plot.meta.sub <- plot.meta.sub[order(plot.meta.sub$pseudotime), ]
pheatmap(t(plot.meta.sub[, colnames(fspy.data)]), scale  = "row",
         cluster_rows = T, cluster_cols = F, cluster_method = "ward.D",
         color = colorRampPalette(c("blue","blue","blue","white","red","red","red"))(100),
         fontsize_col = 0.01)

Bug Reports

If there is any error in installing or librarying the flowSpy package, please contact us via e-mail forlynna@sjtu.edu.cn

Link to the quick start tutorial

The quick start tutorial provides a quick-reading version of flowSpy workflow. See the quick start tutorial of flowSpy, please visit Quick start of flowSpy.

Link to the time-course tutorial

The time-course tutorial provides a more detailed version of how to deal with time-course FCS data using flowSpy. See time-course data analysis of flowSpy, please visit Time-course workflow of flowSpy.

Note

Dear flowSpy users:

To improve the identification of this package and avoid awkward duplication of names in some situations, we decided to change the name of flowSpy to CytoTree. The package name of CytoTree more fits the functional orientation of this software. The usage and update of flowSpy and CytoTree will be consistent until the end of Bioc 3.11. And for the 3.12 devel, flowSpy will be deprecated.

The package CytoTree has been added to Bioconductor (https://bioconductor.org/packages/CytoTree/), we recommend that users can download this package and replace flowSpy as soon as possible.

We apologized for the inconvenience.

flowSpy team

2020-07-09

Session Information

# Show session information
sessionInfo()

## R version 4.0.0 (2020-04-24)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] pheatmap_1.0.12 ggthemes_4.2.0  ggplot2_3.3.1   flowSpy_1.2.3  
## [5] igraph_1.2.5    LSD_4.0-0       flowCore_2.0.1 
## 
## loaded via a namespace (and not attached):
##   [1] reticulate_1.16             RUnit_0.4.32               
##   [3] tidyselect_1.1.0            RSQLite_2.2.0              
##   [5] AnnotationDbi_1.50.0        grid_4.0.0                 
##   [7] ranger_0.12.1               BiocParallel_1.22.0        
##   [9] Rtsne_0.15                  scatterpie_0.1.4           
##  [11] munsell_0.5.0               destiny_3.2.0              
##  [13] codetools_0.2-16            umap_0.2.5.0               
##  [15] withr_2.2.0                 colorspace_1.4-1           
##  [17] Biobase_2.48.0              knitr_1.28                 
##  [19] stats4_4.0.0                SingleCellExperiment_1.10.1
##  [21] robustbase_0.93-6           vcd_1.4-7                  
##  [23] VIM_6.0.0                   TTR_0.23-6                 
##  [25] labeling_0.3                GenomeInfoDbData_1.2.3     
##  [27] polyclip_1.10-0             bit64_0.9-7                
##  [29] farver_2.0.3                flowWorkspace_4.0.6        
##  [31] vctrs_0.3.1                 generics_0.0.2             
##  [33] xfun_0.14                   R6_2.4.1                   
##  [35] GenomeInfoDb_1.24.0         RcppEigen_0.3.3.7.0        
##  [37] rmdformats_0.3.7            locfit_1.5-9.4             
##  [39] bitops_1.0-6                DelayedArray_0.14.0        
##  [41] scales_1.1.1                nnet_7.3-14                
##  [43] gtable_0.3.0                sva_3.36.0                 
##  [45] RProtoBufLib_2.0.0          rlang_0.4.6                
##  [47] genefilter_1.70.0           scatterplot3d_0.3-41       
##  [49] flowUtils_1.52.0            splines_4.0.0              
##  [51] hexbin_1.28.1               BiocManager_1.30.10        
##  [53] yaml_2.2.1                  abind_1.4-5                
##  [55] RBGL_1.64.0                 tools_4.0.0                
##  [57] bookdown_0.19               ellipsis_0.3.1             
##  [59] RColorBrewer_1.1-2          proxy_0.4-24               
##  [61] BiocGenerics_0.34.0         Rcpp_1.0.4.6               
##  [63] plyr_1.8.6                  base64enc_0.1-3            
##  [65] zlibbioc_1.34.0             purrr_0.3.4                
##  [67] RCurl_1.98-1.2              FlowSOM_1.20.0             
##  [69] openssl_1.4.1               S4Vectors_0.26.1           
##  [71] zoo_1.8-8                   SummarizedExperiment_1.18.1
##  [73] haven_2.3.1                 cluster_2.1.0              
##  [75] magrittr_1.5                ncdfFlow_2.34.0            
##  [77] data.table_1.12.8           RSpectra_0.16-0            
##  [79] openxlsx_4.1.5              gmodels_2.18.1             
##  [81] lmtest_0.9-37               RANN_2.6.1                 
##  [83] pcaMethods_1.80.0           matrixStats_0.56.0         
##  [85] hms_0.5.3                   evaluate_0.14              
##  [87] xtable_1.8-4                smoother_1.1               
##  [89] XML_3.99-0.3                rio_0.5.16                 
##  [91] jpeg_0.1-8.1                mclust_5.4.6               
##  [93] readxl_1.3.1                IRanges_2.22.2             
##  [95] gridExtra_2.3               ggcyto_1.16.0              
##  [97] compiler_4.0.0              tibble_3.0.1               
##  [99] crayon_1.3.4                htmltools_0.4.0            
## [101] mgcv_1.8-31                 corpcor_1.6.9              
## [103] tidyr_1.1.0                 RcppParallel_5.0.1         
## [105] DBI_1.1.0                   tweenr_1.0.1               
## [107] MASS_7.3-51.6               boot_1.3-25                
## [109] Matrix_1.2-18               car_3.0-8                  
## [111] gdata_2.18.0                parallel_4.0.0             
## [113] GenomicRanges_1.40.0        forcats_0.5.0              
## [115] pkgconfig_2.0.3             rvcheck_0.1.8              
## [117] prettydoc_0.3.1             foreign_0.8-80             
## [119] laeken_0.5.1                sp_1.4-2                   
## [121] xml2_1.3.2                  annotate_1.66.0            
## [123] XVector_0.28.0              stringr_1.4.0              
## [125] digest_0.6.25               tsne_0.1-3                 
## [127] ConsensusClusterPlus_1.52.0 graph_1.66.0               
## [129] rmarkdown_2.2               cellranger_1.1.0           
## [131] edgeR_3.30.3                curl_4.3                   
## [133] gtools_3.8.2                ggplot.multistats_1.0.0    
## [135] nlme_3.1-148                lifecycle_0.2.0            
## [137] jsonlite_1.6.1              carData_3.0-4              
## [139] BiocNeighbors_1.6.0         askpass_1.1                
## [141] limma_3.44.3                pillar_1.4.4               
## [143] lattice_0.20-41             DEoptimR_1.0-8             
## [145] survival_3.2-3              glue_1.4.1                 
## [147] xts_0.12-0                  zip_2.0.4                  
## [149] png_0.1-7                   bit_1.1-15.2               
## [151] Rgraphviz_2.32.0            ggforce_0.3.1              
## [153] class_7.3-17                stringi_1.4.6              
## [155] blob_1.2.1                  RcppHNSW_0.2.0             
## [157] CytoML_2.0.5                latticeExtra_0.6-29        
## [159] memoise_1.1.0               dplyr_1.0.0                
## [161] cytolib_2.0.3               knn.covertree_1.0          
## [163] irlba_2.3.3                 e1071_1.7-3

Reference

[1] Liu Q, Herring CA, Sheng Q, Ping J, Simmons AJ, Chen B, et al. Quantitative assessment of cell population diversity in single-cell landscapes. PLoS Biol. 2018;16:e2006687.

[2] Olsen LR, Leipold MD, Pedersen CB, Maecker HT. The anatomy of single cell mass cytometry data. Cytometry A. 2019;95:156-72.

[3] Van Gassen S, Callebaut B, Van Helden MJ, Lambrecht BN, Demeester P, Dhaene T, et al. FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data. Cytometry Part A. 2015;87:636-45.

[4] Aghaeepour N, Nikolic R, Hoos HH, Brinkman RR. Rapid cell population identification in flow cytometry data. Cytometry A. 2011;79:6-13.

[5] Chen H, Lau MC, Wong MT, Newell EW, Poidinger M, Chen J. Cytofkit: A Bioconductor Package for an Integrated Mass Cytometry Data Analysis Pipeline. PLoS Comput Biol. 2016;12:e1005112.

[6] Liu X, Song W, Wong BY, Zhang T, Yu S, Lin GN, et al. A comparison framework and guideline of clustering methods for mass cytometry data. Genome Biol. 2019;20:297.