Title: | Haplotype-Aware CNV Analysis from scRNA-Seq |
---|---|
Description: | A computational method that infers copy number variations (CNVs) in cancer scRNA-seq data and reconstructs the tumor phylogeny. 'numbat' integrates signals from gene expression, allelic ratio, and population haplotype structures to accurately infer allele-specific CNVs in single cells and reconstruct their lineage relationship. 'numbat' can be used to: 1. detect allele-specific copy number variations from single cells; 2. differentiate tumor versus normal cells in the tumor microenvironment; 3. infer the clonal architecture and evolutionary history of profiled tumors. 'numbat' does not require tumor/normal-paired DNA or genotype data, but operates solely on the donor scRNA-seq data (for example, 10x Cell Ranger output). Additional examples and documentation are available at <https://kharchenkolab.github.io/numbat/>. For details on the method, see Gao et al. Nature Biotechnology (2022) <doi:10.1038/s41587-022-01468-y>. |
Authors: | Teng Gao [cre, aut], Ruslan Soldatov [aut], Hirak Sarkar [aut], Evan Biederstedt [aut], Peter Kharchenko [aut] |
Maintainer: | Teng Gao <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.4.2 |
Built: | 2024-11-18 05:26:04 UTC |
Source: | https://github.com/kharchenkolab/numbat |
centromere regions (hg19)
acen_hg19
An object of class tbl_df (inherits from tbl, data.frame) with 22 rows and 3 columns.
centromere regions (hg38)
acen_hg38
An object of class tbl_df (inherits from tbl, data.frame) with 22 rows and 3 columns.
Utility function to make reference gene expression profiles
aggregate_counts(count_mat, annot, normalized = TRUE, verbose = TRUE)
count_mat |
matrix/dgCMatrix Gene expression counts |
annot |
dataframe Cell annotation with columns "cell" and "group" |
normalized |
logical Whether to return normalized expression values |
verbose |
logical Verbosity |
matrix Reference gene expression levels
ref_custom = aggregate_counts(count_mat_ref, annot_ref, verbose = FALSE)
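To make the expected inputs concrete, here is a minimal base-R sketch of what aggregating counts into per-group reference profiles can look like. It is not the package's implementation: the toy matrix, the group labels, and the normalization (each group's profile scaled to sum to 1) are assumptions made for illustration only.

```r
# Toy inputs (hypothetical): a genes x cells count matrix and an annotation
# dataframe with the documented "cell" and "group" columns.
set.seed(1)
count_mat <- matrix(
  rpois(5 * 6, lambda = 3), nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:6))
)
annot <- data.frame(
  cell  = colnames(count_mat),
  group = rep(c("T_cell", "B_cell"), each = 3)
)

# Look up the group of every column, then sum counts over cells per group.
group <- annot$group[match(colnames(count_mat), annot$cell)]
agg <- t(rowsum(t(count_mat), group = group))  # genes x groups

# Assumed normalization: scale each group's profile to sum to 1.
ref_sketch <- sweep(agg, 2, colSums(agg), "/")
```

The result has one column per group and one row per gene, which matches the shape described for the reference expression matrix.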
Call CNVs in a pseudobulk profile using the Numbat joint HMM
analyze_bulk( bulk, t = 1e-05, gamma = 20, theta_min = 0.08, logphi_min = 0.25, nu = 1, min_genes = 10, exp_only = FALSE, allele_only = FALSE, bal_cnv = TRUE, retest = TRUE, find_diploid = TRUE, diploid_chroms = NULL, classify_allele = FALSE, run_hmm = TRUE, prior = NULL, exclude_neu = TRUE, phasing = TRUE, verbose = TRUE )
bulk |
dataframe Pseudobulk profile |
t |
numeric Transition probability |
gamma |
numeric Dispersion parameter for the Beta-Binomial allele model |
theta_min |
numeric Minimum imbalance threshold |
logphi_min |
numeric Minimum log expression deviation threshold |
nu |
numeric Phase switch rate |
min_genes |
integer Minimum number of genes to call an event |
exp_only |
logical Whether to run expression-only HMM |
allele_only |
logical Whether to run allele-only HMM |
bal_cnv |
logical Whether to call balanced amplifications/deletions |
retest |
logical Whether to retest CNVs after Viterbi decoding |
find_diploid |
logical Whether to run diploid region identification routine |
diploid_chroms |
character vector User-given chromosomes that are known to be in diploid state |
classify_allele |
logical Whether to only classify allele (internal use only) |
run_hmm |
logical Whether to run HMM (internal use only) |
prior |
numeric vector Prior probabilities of states (internal use only) |
exclude_neu |
logical Whether to exclude neutral segments from retesting (internal use only) |
phasing |
logical Whether to use phasing information (internal use only) |
verbose |
logical Verbosity |
a pseudobulk profile dataframe with called CNV information
bulk_analyzed = analyze_bulk(bulk_example, t = 1e-5, find_diploid = FALSE, retest = FALSE)
example reference cell annotation
annot_ref
An object of class data.frame with 50 rows and 2 columns.
Annotate genes on allele dataframe
annotate_genes(df, gtf)
df |
dataframe Allele count dataframe |
gtf |
dataframe Gene gtf |
dataframe Allele dataframe with gene column
example pseudobulk dataframe
bulk_example
An object of class tbl_df (inherits from tbl, data.frame) with 3935 rows and 83 columns.
chromosome sizes (hg19)
chrom_sizes_hg19
An object of class data.table (inherits from data.frame) with 22 rows and 2 columns.
chromosome sizes (hg38)
chrom_sizes_hg38
An object of class data.table (inherits from data.frame) with 22 rows and 2 columns.
Plot CNV heatmap
cnv_heatmap( segs, var = "group", label_group = TRUE, legend = TRUE, exclude_gap = TRUE, genome = "hg38" )
segs |
dataframe Segments to plot. Need columns "seg_start", "seg_end", "cnv_state" |
var |
character Column to facet by |
label_group |
logical Label the groups |
legend |
logical Display the legend |
exclude_gap |
logical Whether to mark gap regions |
genome |
character Genome build, either 'hg38' or 'hg19' |
ggplot Heatmap of CNVs along the genome
p = cnv_heatmap(segs_example)
example gene expression count matrix
count_mat_example
An object of class dgCMatrix with 1024 rows and 173 columns.
example reference count matrix
count_mat_ref
An object of class dgCMatrix with 1000 rows and 50 columns.
Call clonal LOH using SNP density. Recommended for cell lines or tumor samples with no normal cells.
detect_clonal_loh(bulk, t = 1e-05, snp_rate_loh = 5, min_depth = 0)
bulk |
dataframe Pseudobulk profile |
t |
numeric Transition probability |
snp_rate_loh |
numeric The assumed SNP density in clonal LOH regions |
min_depth |
integer Minimum coverage to filter SNPs |
dataframe LOH segments
segs_loh = detect_clonal_loh(bulk_example)
example allele count dataframe
df_allele_example
An object of class data.frame with 41167 rows and 11 columns.
genome gap regions (hg19)
gaps_hg19
An object of class data.table (inherits from data.frame) with 28 rows and 3 columns.
genome gap regions (hg38)
gaps_hg38
An object of class data.table (inherits from data.frame) with 30 rows and 3 columns.
Aggregate single-cell data into combined bulk expression and allele profile
get_bulk( count_mat, lambdas_ref, df_allele, gtf, subset = NULL, min_depth = 0, nu = 1, segs_loh = NULL, verbose = TRUE )
count_mat |
dgCMatrix Gene expression counts |
lambdas_ref |
matrix Reference expression profiles |
df_allele |
dataframe Single-cell allele counts |
gtf |
dataframe Transcript gtf |
subset |
vector Subset of cells to aggregate |
min_depth |
integer Minimum coverage to filter SNPs |
nu |
numeric Phase switch rate |
segs_loh |
dataframe Segments with clonal LOH to be excluded |
verbose |
logical Verbosity |
dataframe Pseudobulk gene expression and allele profile
bulk_example = get_bulk( count_mat = count_mat_example, lambdas_ref = ref_hca, df_allele = df_allele_example, gtf = gtf_hg38)
Specify either max_cost or n_cut. max_cost works similarly as h and n_cut works similarly as k in stats::cutree. The top-level normal diploid clone is always included.
get_gtree(tree, P, n_cut = 0, max_cost = 0)
tree |
phylo Single-cell phylogenetic tree |
P |
matrix Genotype probability matrix |
n_cut |
integer Number of cuts on the phylogeny to define subclones |
max_cost |
numeric Likelihood threshold to collapse internal branches |
tbl_graph Phylogeny annotated with branch lengths and mutation events
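Since the description compares max_cost to h and n_cut to k in stats::cutree, a small base-R example of those two cutree arguments may help. It uses ordinary hierarchical clustering on made-up one-dimensional data. It only illustrates the analogy and does not call get_gtree or reproduce its cost function.

```r
# Toy data (hypothetical): three well-separated groups on a line.
x <- c(0, 0.1, 0.2, 5, 5.1, 20)
hc <- hclust(dist(x), method = "average")

# k: ask for a fixed number of clusters (analogous to n_cut, which fixes
# the number of cuts on the phylogeny).
by_k <- cutree(hc, k = 3)

# h: cut wherever merges exceed a height threshold (analogous to max_cost,
# which collapses internal branches below a likelihood threshold).
by_h <- cutree(hc, h = 1)

print(by_k)
print(by_h)
```

On this toy data both calls give the same three groups. In general the two criteria can disagree, which is why only one of them should be specified.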
example smoothed gene expression dataframe
gexp_roll_example
An object of class data.frame with 10 rows and 2000 columns.
gene model (hg19)
gtf_hg19
An object of class data.table (inherits from data.frame) with 26841 rows and 5 columns.
gene model (hg38)
gtf_hg38
An object of class data.table (inherits from data.frame) with 26807 rows and 5 columns.
gene model (mm10)
gtf_mm10
An object of class data.table (inherits from data.frame) with 30336 rows and 5 columns.
example hclust tree
hc_example
An object of class hclust of length 7.
example joint single-cell cnv posterior dataframe
joint_post_example
An object of class data.table (inherits from data.frame) with 3806 rows and 71 columns.
example mutation graph
mut_graph_example
An object of class igraph of length 5.
Used to allow users to plot results
a new 'Numbat' object
label
character Sample name
gtf
dataframe Transcript annotation
joint_post
dataframe Joint posterior
exp_post
dataframe Expression posterior
allele_post
dataframe Allele posterior
bulk_subtrees
dataframe Bulk profiles of lineage subtrees
bulk_clones
dataframe Bulk profiles of clones
segs_consensus
dataframe Consensus segments
tree_post
list Tree posterior
mut_graph
igraph Mutation history graph
gtree
tbl_graph Single-cell phylogeny
clone_post
dataframe Clone posteriors
gexp_roll_wide
matrix Smoothed expression of single cells
P
matrix Genotype probability matrix
treeML
phylo Maximum likelihood tree
hc
hclust Initial hierarchical clustering
new()
initialize Numbat class
Numbat$new(out_dir, i = 2, gtf = gtf_hg38, verbose = TRUE)
out_dir
character string Output directory
i
integer Get results from which iteration (default=2)
gtf
dataframe Transcript gtf (default=gtf_hg38)
verbose
logical Whether to output verbose results (default=TRUE)
a new 'Numbat' object
plot_phylo_heatmap()
Plot the single-cell CNV calls in a heatmap and the corresponding phylogeny
Numbat$plot_phylo_heatmap(...)
...
additional parameters passed to plot_phylo_heatmap()
plot_exp_roll()
Plot window-smoothed expression profiles
Numbat$plot_exp_roll(k = 3, n_sample = 300, ...)
k
integer Number of clusters
n_sample
integer Number of cells to subsample
...
additional parameters passed to plot_exp_roll()
plot_mut_history()
Plot the mutation history of the tumor
Numbat$plot_mut_history(...)
...
additional parameters passed to plot_mut_history()
plot_sc_tree()
Plot the single cell phylogeny
Numbat$plot_sc_tree(...)
...
additional parameters passed to plot_sc_tree()
plot_consensus()
Plot consensus segments
Numbat$plot_consensus(...)
...
additional parameters passed to plot_consensus()
plot_clone_profile()
Plot clone cnv profiles
Numbat$plot_clone_profile(...)
...
additional parameters passed to plot_clone_profile()
cutree()
Re-define subclones on the phylogeny.
Numbat$cutree(max_cost = 0, n_cut = 0)
max_cost
numeric Likelihood threshold to collapse internal branches
n_cut
integer Number of cuts on the phylogeny to define subclones
clone()
The objects of this class are cloneable with this method.
Numbat$clone(deep = FALSE)
deep
Whether to make a deep clone.
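The class is meant to be pointed at the output directory of a finished run and then used for plotting. A hedged usage sketch follows. The directory name is hypothetical, and the block does nothing unless the package is installed and that directory already exists. Only the constructor arguments and methods listed above are used.

```r
# Hypothetical location of results written by a previous run_numbat() call.
out_dir <- file.path(tempdir(), "numbat_results")

nb <- NULL
if (requireNamespace("numbat", quietly = TRUE) && dir.exists(out_dir)) {
  library(numbat)
  # Load results from iteration 2 (the documented default).
  nb <- Numbat$new(out_dir = out_dir, i = 2, gtf = gtf_hg38, verbose = TRUE)

  # Plot single-cell CNV calls alongside the phylogeny.
  nb$plot_phylo_heatmap()

  # Re-define subclones, then plot the mutation history.
  nb$cutree(n_cut = 2)
  nb$plot_mut_history()
}
```

The value n_cut = 2 is chosen only for illustration.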
example single-cell phylogeny
phylogeny_example
An object of class tbl_graph (inherits from igraph) of length 345.
Plot a group of pseudobulk HMM profiles
plot_bulks(bulks, ..., ncol = 1, title = TRUE, title_size = 8)
bulks |
dataframe Pseudobulk profiles annotated with "sample" column |
... |
additional parameters passed to plot_psbulk() |
ncol |
integer Number of columns |
title |
logical Whether to add titles to individual plots |
title_size |
numeric Size of titles |
a ggplot object
p = plot_bulks(bulk_example)
Plot consensus CNVs
plot_consensus(segs)
segs |
dataframe Consensus segments |
ggplot object
p = plot_consensus(segs_example)
Plot single-cell smoothed expression magnitude heatmap
plot_exp_roll( gexp_roll_wide, hc, k, gtf, lim = 0.8, n_sample = 300, reverse = TRUE, plot_tree = TRUE )
gexp_roll_wide |
matrix Cell x gene smoothed expression magnitudes |
hc |
hclust Hierarchical clustering result |
k |
integer Number of clusters |
gtf |
dataframe Transcript GTF |
lim |
numeric Limit for expression magnitudes |
n_sample |
integer Number of cells to subsample |
reverse |
logical Whether to reverse the cell order |
plot_tree |
logical Whether to plot the dendrogram |
ggplot A single-cell heatmap of window-smoothed expression CNV signals
p = plot_exp_roll(gexp_roll_example, gtf = gtf_hg38, hc = hc_example, k = 3)
Plot mutational history
plot_mut_history( G, clone_post = NULL, edge_label_size = 4, node_label_size = 6, node_size = 10, arrow_size = 2, show_clone_size = TRUE, show_distance = TRUE, legend = TRUE, edge_label = TRUE, node_label = TRUE, horizontal = TRUE, pal = NULL )
G |
igraph Mutation history graph |
clone_post |
dataframe Clone assignment posteriors |
edge_label_size |
numeric Size of edge label |
node_label_size |
numeric Size of node label |
node_size |
numeric Size of nodes |
arrow_size |
numeric Size of arrows |
show_clone_size |
logical Whether to show clone size |
show_distance |
logical Whether to show evolutionary distance between clones |
legend |
logical Whether to show legend |
edge_label |
logical Whether to label edges |
node_label |
logical Whether to label nodes |
horizontal |
logical Whether to use horizontal layout |
pal |
named vector Node colors |
ggplot object
p = plot_mut_history(mut_graph_example)
Plot single-cell CNV calls along with the clonal phylogeny
plot_phylo_heatmap( gtree, joint_post, segs_consensus, clone_post = NULL, p_min = 0.9, annot = NULL, pal_annot = NULL, annot_title = "Annotation", annot_scale = NULL, clone_dict = NULL, clone_bar = TRUE, clone_stack = TRUE, pal_clone = NULL, clone_title = "Genotype", clone_legend = TRUE, line_width = 0.1, tree_height = 1, branch_width = 0.2, tip_length = 0.2, annot_bar_width = 0.25, clone_bar_width = 0.25, bar_label_size = 7, tvn_line = TRUE, clone_line = FALSE, exclude_gap = FALSE, root_edge = TRUE, raster = FALSE, show_phylo = TRUE )
gtree |
tbl_graph The single-cell phylogeny |
joint_post |
dataframe Joint single cell CNV posteriors |
segs_consensus |
dataframe Consensus segments |
clone_post |
dataframe Clone assignment posteriors |
p_min |
numeric Probability threshold to display CNV calls |
annot |
dataframe Cell annotations, dataframe with 'cell' and additional annotation columns |
pal_annot |
named vector Colors for cell annotations |
annot_title |
character Legend title for the annotation bar |
annot_scale |
ggplot scale Color scale for the annotation bar |
clone_dict |
named vector Clone annotations, mapping from cell name to clones |
clone_bar |
logical Whether to display clone bar plot |
clone_stack |
logical Whether to plot clone assignment probabilities as stacked bar |
pal_clone |
named vector Clone colors |
clone_title |
character Legend title for the clone bar |
clone_legend |
logical Whether to display the clone legend |
line_width |
numeric Line width for CNV heatmap |
tree_height |
numeric Relative height of the phylogeny plot |
branch_width |
numeric Line width in the phylogeny |
tip_length |
numeric Length of tips in the phylogeny |
annot_bar_width |
numeric Width of annotation bar |
clone_bar_width |
numeric Width of clone genotype bar |
bar_label_size |
numeric Size of sidebar text labels |
tvn_line |
logical Whether to draw line separating tumor and normal cells |
clone_line |
logical Whether to display borders for clones in the heatmap |
exclude_gap |
logical Whether to mark gap regions |
root_edge |
logical Whether to plot root edge |
raster |
logical Whether to raster images |
show_phylo |
logical Whether to display phylogeny on y axis |
ggplot panel
p = plot_phylo_heatmap( gtree = phylogeny_example, joint_post = joint_post_example, segs_consensus = segs_example)
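The optional annotation arguments take plain R objects. The sketch below shows one plausible way to build them for a handful of made-up cells. The cell names, the cell_type column name, the colors, and the clone labels are all hypothetical; real cell names would have to match those in gtree and joint_post.

```r
# Hypothetical cell barcodes.
cells <- c("cell_1", "cell_2", "cell_3", "cell_4")

# annot: a dataframe with a 'cell' column plus annotation columns.
annot <- data.frame(
  cell      = cells,
  cell_type = c("tumor", "tumor", "normal", "normal")
)

# pal_annot: a named vector mapping annotation values to colors.
pal_annot <- c(tumor = "firebrick", normal = "grey60")

# clone_dict: a named vector mapping cell names to clones.
clone_dict <- setNames(c("2", "2", "1", "1"), cells)

# Sanity checks worth running before plotting.
stopifnot(all(annot$cell_type %in% names(pal_annot)))
stopifnot(setequal(names(clone_dict), annot$cell))
```

Objects like these would then be supplied through the annot, pal_annot, and clone_dict arguments.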
Plot a pseudobulk HMM profile
plot_psbulk( bulk, use_pos = TRUE, allele_only = FALSE, min_LLR = 5, min_depth = 8, exp_limit = 2, phi_mle = TRUE, theta_roll = FALSE, dot_size = 0.8, dot_alpha = 0.5, legend = TRUE, exclude_gap = TRUE, genome = "hg38", text_size = 10, raster = FALSE )
bulk |
dataframe Pseudobulk profile |
use_pos |
logical Use marker position instead of index as x coordinate |
allele_only |
logical Only plot alleles |
min_LLR |
numeric LLR threshold for event filtering |
min_depth |
numeric Minimum coverage depth for a SNP to be plotted |
exp_limit |
numeric Expression logFC axis limit |
phi_mle |
logical Whether to plot estimates of segmental expression fold change |
theta_roll |
logical Whether to plot rolling estimates of allele imbalance |
dot_size |
numeric Size of marker dots |
dot_alpha |
numeric Transparency of the marker dots |
legend |
logical Whether to show legend |
exclude_gap |
logical Whether to mark gap regions and centromeres |
genome |
character Genome build, either 'hg38' or 'hg19' |
text_size |
numeric Size of text in the plot |
raster |
logical Whether to raster images |
ggplot Plot of pseudobulk HMM profile
p = plot_psbulk(bulk_example)
Plot the single-cell phylogeny with mutation history labeled
plot_sc_tree( gtree, label_mut = TRUE, label_size = 3, dot_size = 2, branch_width = 0.5, tip = TRUE, tip_length = 0.5, pal_clone = NULL )
gtree |
tbl_graph The single-cell phylogeny |
label_mut |
logical Whether to label mutations |
label_size |
numeric Size of mutation labels |
dot_size |
numeric Size of mutation nodes |
branch_width |
numeric Width of branches in tree |
tip |
logical Whether to plot tip point |
tip_length |
numeric Length of the tips |
pal_clone |
named vector Clone colors |
ggplot A single-cell phylogeny with mutation history labeled
p = plot_sc_tree(phylogeny_example)
HMM object for unit tests
pre_likelihood_hmm
An object of class list of length 10.
reference expression magnitudes from HCA
ref_hca
An object of class matrix (inherits from array) with 24756 rows and 12 columns.
reference expression counts from HCA
ref_hca_counts
An object of class matrix (inherits from array) with 24857 rows and 12 columns.
Run workflow to decompose tumor subclones
run_numbat( count_mat, lambdas_ref, df_allele, genome = "hg38", out_dir = tempdir(), max_iter = 2, max_nni = 100, t = 1e-05, gamma = 20, min_LLR = 5, alpha = 1e-04, eps = 1e-05, max_entropy = 0.5, init_k = 3, min_cells = 50, tau = 0.3, nu = 1, max_cost = ncol(count_mat) * tau, n_cut = 0, min_depth = 0, common_diploid = TRUE, min_overlap = 0.45, ncores = 1, ncores_nni = ncores, random_init = FALSE, segs_loh = NULL, call_clonal_loh = FALSE, verbose = TRUE, diploid_chroms = NULL, segs_consensus_fix = NULL, use_loh = NULL, min_genes = 10, skip_nj = FALSE, multi_allelic = TRUE, p_multi = 1 - alpha, plot = TRUE, check_convergence = FALSE, exclude_neu = TRUE )
count_mat |
dgCMatrix Raw count matrices where rownames are genes and column names are cells |
lambdas_ref |
matrix Either a named vector with gene names as names and normalized expression as values, or a matrix where rownames are genes and columns are pseudobulk names |
df_allele |
dataframe Allele counts per cell, produced by preprocess_allele |
genome |
character Genome version (hg38, hg19, or mm10) |
out_dir |
string Output directory |
max_iter |
integer Maximum number of iterations to run the phylogeny optimization |
max_nni |
integer Maximum number of iterations to run NNI in the ML phylogeny inference |
t |
numeric Transition probability |
gamma |
numeric Dispersion parameter for the Beta-Binomial allele model |
min_LLR |
numeric Minimum LLR to filter CNVs |
alpha |
numeric P value cutoff for diploid finding |
eps |
numeric Convergence threshold for ML tree search |
max_entropy |
numeric Entropy threshold to filter CNVs |
init_k |
integer Number of clusters in the initial clustering |
min_cells |
integer Minimum number of cells to run HMM on |
tau |
numeric Factor to determine max_cost as a function of the number of cells (0-1) |
nu |
numeric Phase switch rate |
max_cost |
numeric Likelihood threshold to collapse internal branches |
n_cut |
integer Number of cuts on the phylogeny to define subclones |
min_depth |
integer Minimum allele depth |
common_diploid |
logical Whether to find common diploid regions in a group of pseudobulks |
min_overlap |
numeric Minimum CNV overlap threshold |
ncores |
integer Number of threads to use |
ncores_nni |
integer Number of threads to use for NNI |
random_init |
logical Whether to initialize the phylogeny using a random tree (internal use only) |
segs_loh |
dataframe Segments of clonal LOH to be excluded |
call_clonal_loh |
logical Whether to call segments with clonal LOH |
verbose |
logical Verbosity |
diploid_chroms |
vector Known diploid chromosomes |
segs_consensus_fix |
dataframe Pre-determined segmentation of consensus CNVs |
use_loh |
logical Whether to include LOH regions in the expression baseline |
min_genes |
integer Minimum number of genes to call a segment |
skip_nj |
logical Whether to skip NJ tree construction and only use UPGMA |
multi_allelic |
logical Whether to call multi-allelic CNVs |
p_multi |
numeric P value cutoff for calling multi-allelic CNVs |
plot |
logical Whether to plot results |
check_convergence |
logical Whether to terminate iterations based on consensus CNV convergence |
exclude_neu |
logical Whether to exclude neutral segments from CNV retesting (internal use only) |
a status code
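Two defaults in the signature are computed from other arguments: max_cost = ncol(count_mat) * tau and p_multi = 1 - alpha. The short calculation below works these out, using the 173 cells of count_mat_example as the cell count. It is plain arithmetic and does not call run_numbat.

```r
# Number of cells (columns) in the bundled count_mat_example.
n_cells <- 173

# Documented defaults.
tau   <- 0.3
alpha <- 1e-4

# Derived defaults, as written in the function signature.
max_cost <- n_cells * tau   # about 51.9
p_multi  <- 1 - alpha       # 0.9999

print(c(max_cost = max_cost, p_multi = p_multi))
```

Because max_cost scales with the number of cells, the same tau gives a proportionally larger threshold for collapsing internal branches on larger datasets.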
example CNV segments dataframe
segs_example
An object of class data.table (inherits from data.frame) with 27 rows and 30 columns.
UPGMA and WPGMA clustering
upgma(D, method = "average", ...)
D |
A distance matrix. |
method |
The agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid". The default is "average". |
... |
Further arguments passed to or from other methods. |
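To show how the two methods in the title differ, the base-R sketch below runs stats::hclust on a tiny made-up distance matrix with "average" (UPGMA) and "mcquitty" (WPGMA) linkage. It assumes the method names carry their usual hclust meaning, and it does not call upgma() itself or claim anything about its return value.

```r
# Four hypothetical points on a line: 0, 1, 4, 9.
D <- dist(matrix(c(0, 1, 4, 9), ncol = 1))

# UPGMA: cluster distances are unweighted means over all member pairs.
hc_upgma <- hclust(D, method = "average")

# WPGMA: each merge averages the two child distances with equal weight.
hc_wpgma <- hclust(D, method = "mcquitty")

# The first two merges agree (heights 1 and 3.5); the root differs:
# UPGMA gives (9 + 8 + 5) / 3, WPGMA gives (8.5 + 5) / 2.
print(hc_upgma$height)
print(hc_wpgma$height)
```

The difference appears only at the final merge, where the cluster {0, 1, 4} is joined with the point at 9 and the two methods weight the member distances differently.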
example VCF header
vcf_meta
An object of class character of length 65.