PLO(SC)²: Plots and Scripts for scRNA-seq analysis

Markus Joppich
Institute for Informatics, LMU Munich
For correspondence: joppich{at}bio.ifi.lmu.de
bioRxiv, New Results. doi: https://doi.org/10.1101/2025.03.09.642205

ABSTRACT

Background
scRNA-seq analysis has become a standard technique for studying biological systems. As costs decrease, scRNA-seq experiments become increasingly complex. While typical scRNA-seq analysis frameworks provide basic functionality to analyze such data sets, downstream analysis and visualization become a bottleneck. Standard plots are not always suitable to provide specific insight into such complex data sets and should be extended to provide camera-ready, meaningful plots.

Results
PLO(SC)² is a collection of plotting and analysis scripts for Seurat-based scRNA-seq data analyses, which can be used from custom script-based analyses or within an R shiny app. The analysis scripts mainly provide a collection of code blocks that enable a comfortable basic analysis of scRNA-seq data, from Seurat object creation through filtering to data set integration, in fewer than 10 function calls. Subsequently, code blocks for performing differential and enrichment analyses and the corresponding visualizations are provided. Finally, several enhanced visualizations are provided, such as the enhanced Heatmap, DotPlot and comparative Box-/Violin plots. In particular, these allow the user to specify how the shown values should be scaled, which enables the accurate creation of condition-wise plots.

Conclusion
With the PLO(SC)² framework, data analysis of scRNA-seq experiments becomes more comfortable and streamlined, while visualizations are enhanced to be suitable for interpreting complex datasets. The PLO(SC)² scripts are available from GitHub and include a vignette showing how PLO(SC)² is applied within a script-based analysis, as well as an R shiny app.

INTRODUCTION

Single-cell RNA sequencing (scRNA-seq) is becoming increasingly popular. The number of publications on scRNA-seq listed in PubMed nearly doubled from 2020 to 2021 and reached a new record in 2024.
As scRNA-seq becomes more accessible and wet-lab protocols become easier to perform, it is also increasingly used to perform large comparisons of multiple scRNA-seq libraries from different disease states. Such complex experimental setups are supported by existing frameworks, but require many function calls. Furthermore, standard plots are not always suitable to provide specific insight into such complex data sets and should be extended to provide camera-ready, fully interpretable plots. This motivates the creation of a collection of scripts that streamlines first the pre-processing steps of scRNA-seq experiments, and then the downstream analysis and plotting.

While typical scRNA-seq analysis frameworks such as Seurat (Butler et al., 2018) or scanpy (Wolf et al., 2018) provide functionality for the analysis of simple data sets, the analysis of complex data sets usually requires additional administrative work by the user: first each library has to be filtered separately, then the different libraries have to be integrated, and finally differential gene expression analyses have to be performed for specific conditions. The results are then summarized in various types of plots, such as dimensional plots, dot plots, or heat maps. It is at this stage that visualizations need to be created with particular care: dot plots and heat maps often show scaled expression values, so it is important to be able to specify on which data the scaling is calculated. All of these tasks are covered by the PLO(SC)² collection of scripts.

For ease of use, the functions included in PLO(SC)² expose only the most important parameters to the user. The functions aim to reduce the amount of code the user has to write and thus make it easier to handle large data sets. For such large data sets, which often consist of multiple conditions, the default visualizations are not enough to present the data adequately.
To give an overview of conditions and time points, a 2D array of UMAP plots is useful. Violin plots are often used to compare values across conditions, and benefit from the addition of statistical tests (such as a t-test). When comparing multiple conditions or measurement points, data can be visualized side by side. However, when scaled data points are displayed instead of absolute data points, as is often the case in scRNA-seq data analysis, it is important to be able to control which values the scaling was calculated on. PLO(SC)²'s enhanced heat map and dot plot make this possible. Much of the functionality of PLO(SC)² is also available through a Shiny app, so PLO(SC)² can be used by more advanced users from scripts, or by anyone using a standalone application.

IMPLEMENTATION

PLO(SC)² builds upon the Seurat (Butler et al., 2018) scRNA-seq analysis framework and can easily be integrated into new analyses by sourcing the PLO(SC)² scripts from GitHub. It takes advantage of several publicly available R packages for data visualization or analysis, such as ggplot2 (Wickham, 2009) and clusterProfiler (Yu and He, 2016; Wu et al., 2021). In general, most functions require a Seurat object as input, as well as a group.by and split.by clause, which identify the meta-data columns by which plots or analyses are to be generated.

PLO(SC)² app

The PLO(SC)² app is a graphical user interface to most of the functionality of PLO(SC)². It is implemented using R shiny and converted to an all-in-one executable for the Windows operating system using the R package executablePackeR for creating the portable R environment, and https://github.com/wleepang/DesktopDeployR for creating the stand-alone application.

RESULTS & DISCUSSION

The PLO(SC)² scripts can be divided into three groups. The first group helps in efficient data processing and provides useful wrappers for Seurat-based (Butler et al., 2018) scRNA-seq analysis.
The second group supports differential gene expression analysis and provides wrappers for clusterProfiler-based (Yu and He, 2016; Wu et al., 2021) gene set enrichment. Finally, the third group of PLO(SC)² scripts provides camera-ready visualizations for gene expression, such as violin or dot plots.

Scripts for Efficient Data Processing

Preprocessing data

Each scRNA-seq analysis starts with loading all the data. While the functions provided by Seurat work well for single libraries, all functions must be wrapped in for-loops or apply calls if these operations are to be performed on many libraries. The wrapper functions provided by PLO(SC)² already operate on a structure suitable for processing multiple related libraries.

Read-in functions are provided for cellranger-generated count matrices, which take a list of h5 or mtx files as input argument (readH5Files or readMtxFiles). It is automatically checked whether the input matrices contain only expression counts for the RNA assay, or also additional quantification for the antibody capture (e.g. hashtagging or CITE-seq). The next relevant step is the creation of the Seurat object, which also calculates the mitochondrial and ribosomal RNA fractions (toObjList function). Quality control plots and the filtering of cells based on their mitochondrial/ribosomal count fraction, the number of features detected and the number of UMIs counted for each cell are performed by the scatterAndFilter function.

After preparing the RNA assay objects, the antibody capture (if available) can be processed. The antibody capture library contains hashtag oligonucleotides, which allow multiple samples per library to be separated, or CITE-seq antibodies. For each library, a CITE-seq assay can be added using the processCITE function. The user can also define a list of relevant hashtag oligo names for each library. These can be used to identify specific samples within a 10X Genomics library.
Processing is performed using the processHTO function. The individual libraries can then be split by sample or any other annotated feature, for element-wise integration in the next phase (splitObjListByGroup).

Integration of Multiple Data Sets

During the integration phase, multiple libraries can be integrated into one Seurat object, e.g. to correct for batch effects introduced during library preparation. Before the actual integration takes place, the list of Seurat objects must be prepared for integration (e.g. by identifying common highly variable features) using the prepareIntegration function. The integration can then be performed on the RNA assay according to the vignettes provided by Seurat, e.g. using cca or rpca based integration, or using SCTransform processed (Hafemeister and Satija, 2019) data. In addition, joint CITE-seq and RNA-based (multimodal) integration using wnn is also possible. The performIntegration function collects the most important arguments (e.g. the number of PCA dimensions to use) and passes them to the respective Seurat functions.

After integration, the user may be interested in identifying clusters within the combined Seurat object. This can be orchestrated by the preprocessIntegrated function, which also allows the user to select the dimension reduction of choice, for example pca for normal Seurat objects, igpca for the gene expression PCA from performIntegration, or wnn.umap in the case of multimodal integration. To check whether the integration was successful or whether biases remain, such as cells with few measured features being enriched in one cluster, several quality control plots can be generated and checked with the makeQCPlots function.

Cell Annotation

Annotating the cells of the Seurat object is one of the most important steps because it defines cell identity in terms of experimental knowledge such as condition, patient, or time point. Within PLO(SC)², the annotateByCellnamePattern function can be used for this task.
It takes as input parameters the Seurat object, the new column name in the metadata, and a list of patterns that describe the new annotation. Each entry in this list of patterns is itself a list defining the new annotation of a cell (e.g. disease or control) and either a set of cell names or a regular expression that can be used to select cells. This selection is also possible on existing metadata columns.

Differential Analyses

After filtering cells, performing dimensional reduction, integration, cluster identification and cell type assignment, the key question is often: what are the differences between cell types? To answer this question, marker genes can be calculated for each cluster or cell type. While such functionality is already available in Seurat with the FindAllMarkers function, its use is rather indirect: it groups the cells and calculates markers for the currently set global identity. With the PLO(SC)² wrapper, the groups for which markers are calculated can be specified by a metadata column name, together with the assay and test to be used. The differential gene expression results are annotated with expression data (mean, quartiles, fraction of expressing cells) for both the selected cluster and the background (all other clusters). This information is relevant for the interpretation of fold changes. The whole process is orchestrated by the makeDEResults function.

Another common use case is to compare two sets of cells. For this, the compareClusters function is provided, which, given two sets of cells, performs a differential gene expression analysis between the two sets and computes the same expression summary as for the marker genes. Since two sets of cells are often compared for all clusters or any other grouping, the compareCellsByCluster function conveniently performs such comparisons. One use case would be to find the difference between disease and control samples within each cluster (e.g. cluster 0 disease vs. cluster 0 control).
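The per-cluster comparison described above can be illustrated with a small, language-agnostic sketch. It is written in plain Python rather than R and is not the PLO(SC)² or Seurat implementation: the toy data, the function names `summarize` and `compare_by_cluster`, and the pseudocount-based log2 fold change are all assumptions made for illustration, and no statistical test is performed.

```python
from math import log2
from statistics import mean

# Hypothetical toy data for ONE gene: (cluster, condition, normalized expression).
cells = [
    ("0", "disease", 3.0), ("0", "disease", 1.0), ("0", "disease", 0.0), ("0", "disease", 4.0),
    ("0", "control", 1.0), ("0", "control", 0.0), ("0", "control", 0.0), ("0", "control", 1.0),
    ("1", "disease", 0.0), ("1", "disease", 0.0),
    ("1", "control", 2.0), ("1", "control", 2.0),
    ("2", "disease", 5.0),  # cluster 2 has no control cells
]


def summarize(values):
    """Cell count, mean expression and fraction of cells with non-zero expression."""
    return {
        "n": len(values),
        "mean": mean(values),
        "frac_expr": sum(v > 0 for v in values) / len(values),
    }


def compare_by_cluster(cells, group_a, group_b, pseudocount=1.0):
    """Compare group_a against group_b separately within every cluster.

    Clusters that lack cells in either group are skipped. The fold change
    uses a pseudocount (an assumption of this sketch) to avoid division by zero.
    """
    results = {}
    for cluster in sorted({c for c, _, _ in cells}):
        a = [x for c, cond, x in cells if c == cluster and cond == group_a]
        b = [x for c, cond, x in cells if c == cluster and cond == group_b]
        if not a or not b:
            continue
        stats_a, stats_b = summarize(a), summarize(b)
        results[cluster] = {
            "log2fc": log2((stats_a["mean"] + pseudocount) / (stats_b["mean"] + pseudocount)),
            group_a: stats_a,
            group_b: stats_b,
        }
    return results


results = compare_by_cluster(cells, "disease", "control")
for cluster, res in results.items():
    print(cluster, round(res["log2fc"], 3), res["disease"], res["control"])
```

Reporting the fraction of expressing cells next to the fold change shows whether a change is driven by a few highly expressing cells or by a broad shift across the group, which is the reason the text gives for annotating the results with expression data.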
The results of a differential gene expression analysis can be visualized as volcano plots using the makeVolcanos function. This function takes the list generated by the compareClusters function as input and draws a volcano plot for each comparison. The volcano plots are drawn by the EnhancedVolcano package (Blighe et al., 2019). By default, the labeled genes are selected by the EnhancedVolcano package, but the user can specify genes to be shown instead.

Enrichment Analyses

Systematic gene set enrichment analysis is done with the enrichmentAnalysis extension. To run the enrichment analysis, the user needs to specify which Seurat object the analysis should be run on, what the background/universe for set enrichment is, which organism the gene sets should be loaded for (currently human and mouse are supported), and where the gene set enrichment results should be saved (rds file). If all results are to be exported or visualized, an output folder should also be specified. Set enrichment is performed using clusterProfiler (Yu et al., 2012) and ReactomePA (Yu and He, 2016). Both overrepresentation and gene set enrichment analysis are performed on all significantly differentially regulated genes, or only on up- or down-regulated genes. Enrichment analysis on the significantly differentially expressed genes is performed for Reactome pathways (Jassal et al., 2020), GeneOntology (Ashburner et al., 2000), KEGG (Kanehisa et al., 2021) and KEGG modules. In the visualization phase, all result lists are exported to a tab-separated file and to Excel, and clusterProfiler's dotplot, barplot, cnetplot, treeplot and emapplot are also generated.

Enhanced Visualizations

After preparing all the relevant data of the scRNA-seq analysis, visualizing these results is (probably) the most time-consuming part of any scRNA-seq data analysis.
The plotting functionality provided by PLO(SC)² makes the creation of camera-ready plots more convenient. The PLO(SC)² scripts are intended to be used both in a notebook-based environment (such as R Markdown or Jupyter) and in interactive R sessions. For the latter, plots are usually exported directly to png, pdf, and svg formatted files using the save plot function. To comply with journal policies that require submission of source files containing the raw data used for the plots, the data table used by ggplot to generate the plots is also exported to data files (tab-separated tables). This way, all relevant files for publishing are created at the time of plot creation.

Dimensional Plots

The regular dimension plot (DimPlot) in Seurat is used to provide an overview of the preferred 2D embedding of all cells and often provides a first impression of the data, as it is commonly used to show the different cell populations measured in the scRNA-seq experiment. While plotting all cells in this plot provides a broad overview of the data, plotting these dimensional reductions by condition helps to identify condition-specific differences (e.g., the absence of a particular population). The makeUMAPPlot function can create dimensionality reduction plots along two axes (e.g., condition and time points, Figure 1). However, different numbers of cells per selected condition can bias this visualization. Therefore, this function provides the ability to (uniformly) down-sample all conditions so that similar numbers of cells are plotted for each condition. This reduces the bias in the displayed population prevalence.

Figure 1. Dimension-wise UMAP plot. UMAP plot split along multiple dimensions with a shared legend at the bottom. The data set is split along the time series on the x-axis, and the condition (CTRL, ASYMPTOMATIC, SYMPTOMATIC) along the y-axis.
While this plot shows all cells, a parameter lets the user down-sample all sub-plots to the minimal number of cells within any sub-plot, in order to reduce the visual bias between plots.

Similarly, the expression of certain genes may be compared across levels of a disease (e.g., across multiple time points). While Seurat provides the split.by option in its FeaturePlot, the individual feature plots have differently scaled legends by default, which is not suitable for accurate side-by-side comparisons. To work around this problem, the splitFeaturePlot function creates subplots with a common legend for all plots, making the individual subplots easily comparable.

Violin Plots

The standard violin plot is an important plot type because it shows the expression of a feature over specific groups of the Seurat object. Unlike regular box plots, violin plots also show the distribution of values (Hintze and Nelson, 1998). In general, this is very useful, but the plot is only as good as the kernel density estimate calculated for the violin. Thus, for bimodal distributions or for a small number of values, an additional box plot is beneficial. Therefore, the SplitVlnBoxPlot combines both plot types. In addition, violin plots can only be calculated for 3 or more cells. If there are fewer cells for a violin, the violin will not be plotted. This can be a problem when simply overlaying ggplot2's violin and box plots. Therefore, this implementation can filter out groups with too few cells to make sure that the box plot and the violin plot are well aligned.

Due to the nature of the violin plots described above, these plots are also well suited to visualize two conditions over a time series (Figure 2), where the two conditions are compared for each time point. While the box and violin plots already give an idea of whether there is a change between the two conditions, a t-test can be added to evaluate whether the observed change is statistically significant.
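The statistical comparison mentioned above can be sketched as follows. The text only says that a t-test is added; which variant is used is not stated, so Welch's unequal-variance t-test is an assumption here, as are the function name `welch_t` and the toy expression values. The sketch uses only the Python standard library and therefore reports the test statistic, the degrees of freedom and the sample sizes, not a p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Requires at least two values per group and non-zero total variance.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two values")
    # Squared standard errors of the two group means (sample variance / n).
    se2_a, se2_b = variance(a) / na, variance(b) / nb
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical expression values of one gene per time point: (control, disease).
by_timepoint = {
    "day1": ([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0]),
    "day7": ([2.0, 2.5, 3.0, 3.5], [2.1, 2.4, 3.2, 3.3]),
}

comparison = {}
for timepoint, (control, disease) in by_timepoint.items():
    t, df = welch_t(control, disease)
    # Keep the sample sizes next to the statistic so the result can be judged.
    comparison[timepoint] = {"t": t, "df": df, "n_control": len(control), "n_disease": len(disease)}
    print(timepoint, comparison[timepoint])
```

A p-value follows from the Student t distribution with `df` degrees of freedom; in Python, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch test and returns the p-value directly.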
This type of visualization is possible with the comparativeVioBoxPlot function.

Figure 2. Boxplot and t-test enriched Violin Plot. Violin plots are suitable to show the distribution of the observed values. However, spotting the median from the violin distribution can be hard at times. Here, a regular boxplot becomes helpful. When comparing two conditions for each violin, it might also be interesting to quantify the difference in terms of significance. Therefore, the enhanced violin plot also calculates significance values for the split violins. For this comparison, not only the significance value is given, but also the sample sizes which were compared. The user can specify the colors of the groups shown in the plot.

Enhanced Heatmap and DotPlot

Heat maps and dot plots are common visualizations in single-cell RNA-seq data analysis. Both are often used to display relative expression values as scaled expression (using z-scores). However, such visualizations must be created with care: comparing scaled expression values across plots is only valid if both plots have been scaled with the same mean and standard deviation; otherwise the plotted z-scores are not comparable (because they are drawn from different normal distributions). In Seurat, the scale.data slot provides a global z-score transformation of all expression values, which is comparable. However, genes may not be part of this slot, or the difference may be too small to be visible.

The included advanced heat map and dot plot allow for side-by-side visualizations and provide the option to scale values using different strategies. The user can choose to use the globally scaled values from the scale.data slot (scale.by="GLOBAL"), or to scale the gene expression values only for the features included in the plot (scale.by="ALL"). When plots are split by condition, it is also possible to scale the values per subplot (scale.by="GROUP").
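Why the choice of scaling matters for side-by-side plots can be seen in a small sketch. It is plain Python, not the PLO(SC)² code: the function `scale_expression`, the toy mean expression values, and the use of a per-gene z-score with the population standard deviation are assumptions for illustration. Only the "ALL" and "GROUP" strategies are recomputed; "GLOBAL" would instead read precomputed values from Seurat's scale.data slot and is therefore not reproduced here.

```python
from statistics import mean, pstdev


def zscore(values):
    """z-score a list of values; constant input maps to all zeros."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def scale_expression(expr, scale_by):
    """Scale one gene's mean expression, given as {condition: {group: value}}.

    scale_by="ALL":   one z-score over every value shown, across all conditions.
    scale_by="GROUP": a separate z-score within each condition (subplot).
    scale_by=None:    values are returned unscaled.
    """
    if scale_by is None:
        return {cond: dict(groups) for cond, groups in expr.items()}
    if scale_by == "ALL":
        keys = [(cond, grp) for cond, groups in expr.items() for grp in groups]
        scaled = zscore([expr[cond][grp] for cond, grp in keys])
        out = {cond: {} for cond in expr}
        for (cond, grp), z in zip(keys, scaled):
            out[cond][grp] = z
        return out
    if scale_by == "GROUP":
        out = {}
        for cond, groups in expr.items():
            names = list(groups)
            out[cond] = dict(zip(names, zscore([groups[n] for n in names])))
        return out
    raise ValueError(f"unsupported scale_by: {scale_by!r}")


# Hypothetical mean expression of one gene per cell type in two conditions.
expr = {
    "CTRL": {"T cell": 1.0, "B cell": 3.0},
    "STIM": {"T cell": 5.0, "B cell": 7.0},
}

all_scaled = scale_expression(expr, "ALL")
group_scaled = scale_expression(expr, "GROUP")
print("ALL:  ", all_scaled)
print("GROUP:", group_scaled)
```

In this toy example the gene is clearly higher in STIM than in CTRL. Scaling over all shown values preserves that difference, whereas scaling per subplot gives both conditions identical z-scores, so the two panels can no longer be compared with each other.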
Finally, the raw expression values can also be used for visualization (scale.by=NULL). This ability to scale the underlying expression values to the user's needs is the central element of the enhanced plots.

Heat maps are often used to visualize the expression of specific genes across multiple groups of cells, in this case cell types. While Seurat already provides a DoHeatmap function, that heat map shows the expression per cell. This has the advantage of showing cluster proportions and how many cells express a particular gene, but the visualization sometimes becomes too large due to all the information required. A simple heat map then has the advantage of directly showing the expression states per cluster. The advantage of the PLO(SC)² enhanced heat map lies in its usability: the user can specify both the order of genes and the order of clusters, allowing clinicians to visualize exactly the patterns that support their scientific claims. The extended heat map is accessible via the makeComplexExprHeatmap function (Figure 3). An extension to this function (makeComplexExprHeatmapSplit) allows heat maps to be displayed side by side to compare results from different conditions.

Figure 3. Enhanced heat map of selected genes. Heat maps are commonly used to visualize gene expression of specific genes across several groups of cells, here cell types. While Seurat already provides a DoHeatmap function, that heat map shows the expression per cell. While this has the advantage of showing cluster proportions and how many cells express a certain gene, the visualization at times becomes too large due to all the required information. A simple heat map then offers the advantage of directly showing expression states per cluster.
The advantage of the enhanced heat map is its usability: the user can specify both the order of genes and the order of clusters, which enables clinicians to visualize exactly the patterns that support their scientific claims. Data can be shown as scaled values either from the Seurat object itself, or z-scaled within all shown values.

Finally, the dot plot (enhancedDotPlot) is an interesting plot because not only the color of the dot can be interpreted, but also its size. Typically, the color encodes the expression strength, while the size of the dot shows the fraction of cells expressing a particular feature. With scRNA-seq data, gene expression is more complicated than with bulk replicates. The average expression of a gene within a group depends both on how strongly the gene is expressed and on how many cells of the group express it. Both aspects are reflected in the dot color and dot size of the DotPlot. In addition, it is also important to know how often a specific group is present within the examined condition: it may be a large group in all conditions compared, or it may be present only in certain conditions. This information is encoded in the background color of the group rows. This addition allows a full evaluation of the expression of a gene in specific groups and between multiple conditions (Figure 4).

Figure 4. Enhanced Dotplot of selected genes across two conditions. Dotplots are often used in scRNA-seq analysis because they show gene expression per gene and cluster/group, while also showing the percentage of expressing cells. However, such information is often displayed side by side to compare several conditions. It must then be ensured that the shown data is actually comparable, which is particularly an issue if scaled data is shown. With the enhanced dotplot, the user can control on which subset of data the scaled expression should be calculated (e.g. globally, on all shown expression values, etc.).
Moreover, the enhanced Dotplot combines gene expression data with group abundance per shown condition (or overall). This allows a complete interpretation of the observed gene expression data.

Shiny App

Previous work showed that graphical user interfaces can reduce the burden of using bioinformatics software (Joppich and Zimmer, 2019). To facilitate the use of, primarily, the presented plotting functions, an R shiny app has been developed for PLO(SC)². Using this app, the user can load Seurat objects from an RDS file, calculate differentially expressed genes, perform gene set enrichment analysis and visualize the contained data with the enhanced visualization techniques presented here. Additionally, it is possible to pre-process and integrate new datasets. For ease of use, a stand-alone application of this Shiny app is available for the Windows operating system.

CONCLUSIONS

The PLO(SC)² scripts are a collection of wrappers for streamlined data processing and enhanced visualizations in Seurat-based scRNA-seq analysis, helping to make scRNA-seq accessible to beginner and intermediate users. For users proficient with the R programming language, the PLO(SC)² scripts streamline their scRNA-seq analysis. Instead of writing lines of code for technical purposes (e.g. reading matrices, converting to Seurat objects, writing integration code), the user can focus on the actual tasks. Using quality control plots, it is easy to decide whether a step in the analysis worked satisfactorily or whether other parameters need to be chosen. The various plot types included in the PLO(SC)² framework allow the creation of professional, informative, and camera-ready plots. With these enhanced plots, it is easy to visualize complex datasets and analyses without having to dive into the details of creating these plots on one's own.
In particular, the enhanced HeatMap and DotPlot allow the user to specify which data source to use for scaling the expression data, improving the ability to focus on existing differences. For users who prefer graphical user interfaces, the included Shiny app provides access to most of the features of PLO(SC) 2 and makes scRNA-seq accessible especially for beginners. The PLO(SC) 2 scripts and shiny app are available online https://github.com/mjoppich/PLOSC . AVAILABILITY AND REQUIREMENTS Project name: PLO(SC) 2 Project home page: https://github.com/mjoppich/PLOSC Operating system(s): Platform independent Programming language: R/Seurat Other requirements: Seurat 5.0+ (further dependencies, see https://github.com/mjoppich/PLOSC/blob/main/DESCRIPTION ) License: Apache-2.0 license Any restrictions to use by non-academics: none AVAILABILITY OF DATA AND MATERIALS The PLO(SC) 2 scripts and Shiny app are available online https://github.com/mjoppich/PLOSC and can easily be installed into R by using devtools::install github(“mjoppich/PLOSC”) . PLO(SC) 2 is distributed under the Apache-2.0 license. The notebook of the analysis of the presented use-case and all required input files are available from Zenodo ( https://doi.org/10.5281/zenodo.8268102 ). The scRNA-seq count matrices are taken from Pekayvaz et al. ( Pekayvaz et al., 2022 ). COMPETING INTERESTS The authors declare that they have no competing interests. AUTHOR’S CONTRIBUTIONS MJ created the PLO(SC) 2 scripts and wrote the manuscript. Download figure Open in new tab Figure S1. Enhanced dot plot of selected genes in the human IFNB-Stimulated and Control PBMCs dataset taken from the Seurat data library. Values in (a) have been scaled based on all shown values in CTRL and STIM, values in (b) are globally scaled. Footnotes https://github.com/mjoppich/PLOSC REFERENCES ↵ Ashburner , M. , Ball , C. A. , Blake , J. A. , Botstein , D. , Butler , H. , Cherry , J. M. , Davis , A. P. , Dolinski , K. , Dwight , S. S. , Eppig , J. 
T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M., and Sherlock, G. (2000). Gene Ontology: tool for the unification of biology. Nat Genet, 25(1):25–29. Publisher: Nature Publishing Group.
Blighe, K., Rana, S., and Lewis, M. (2019). EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling. Pages: 1–8. Publication Title: R-Package.
Butler, A., Hoffman, P., Smibert, P., Papalexi, E., and Satija, R. (2018). Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nature Biotechnology, 36(5):411–420. Publisher: Nature Publishing Group.
Hafemeister, C. and Satija, R. (2019). Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biology, 20(1):296. Publisher: BioMed Central Ltd.
Hintze, J. L. and Nelson, R. D. (1998). Violin Plots: A Box Plot-Density Trace Synergism. The American Statistician, 52(2):181–184.
Jassal, B., Matthews, L., Viteri, G., Gong, C., Lorente, P., Fabregat, A., Sidiropoulos, K., Cook, J., Gillespie, M., Haw, R., Loney, F., May, B., Milacic, M., Rothfels, K., Sevilla, C., Shamovsky, V., Shorser, S., Varusai, T., Weiser, J., Wu, G., Stein, L., Hermjakob, H., and D’Eustachio, P. (2020). The reactome pathway knowledgebase. Nucleic Acids Research, 48(D1):D498–D503. Publisher: Oxford University Press.
Joppich, M. and Zimmer, R. (2019). From command-line bioinformatics to bioGUI. PeerJ, 2019(11):e8111. Publisher: PeerJ Inc.
Kanehisa, M., Furumichi, M., Sato, Y., Ishiguro-Watanabe, M., and Tanabe, M. (2021). KEGG: Integrating viruses and cellular organisms. Nucleic Acids Research, 49(D1):D545–D551. Publisher: Oxford University Press.
Pekayvaz, K., Leunig, A., Kaiser, R., Joppich, M., Brambs, S., Janjic, A., Popp, O., Nixdorf, D., Fumagalli, V., Schmidt, N., Polewka, V., Anjum, A., Knottenberg, V., Eivers, L., Wange, L. E., Gold, C., Kirchner, M., Muenchhoff, M., Hellmuth, J. C., Scherer, C., Rubio-Acero, R., Eser, T., Deák, F., Puchinger, K., Kuhl, N., Linder, A., Saar, K., Tomas, L., Schulz, C., Wieser, A., Enard, W., Kroidl, I., Geldmacher, C., von Bergwelt-Baildon, M., Keppler, O. T., Munschauer, M., Iannacone, M., Zimmer, R., Mertins, P., Hubner, N., Hoelscher, M., Massberg, S., Stark, K., and Nicolai, L. (2022). Protective immune trajectories in early viral containment of non-pneumonic SARS-CoV-2 infection. Nature Communications, 13(1):1018. Publisher: Nature Publishing Group.
Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer, New York, NY.
Wolf, F. A., Angerer, P., and Theis, F. J. (2018). SCANPY: Large-scale single-cell gene expression data analysis. Genome Biology, 19(1):15. Publisher: BioMed Central Ltd.
Wu, T., Hu, E., Xu, S., Chen, M., Guo, P., Dai, Z., Feng, T., Zhou, L., Tang, W., Zhan, L., Fu, X., Liu, S., Bo, X., and Yu, G. (2021). clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation, 2(3). Publisher: Elsevier.
Yu, G. and He, Q. Y. (2016). ReactomePA: An R/Bioconductor package for reactome pathway analysis and visualization. Molecular BioSystems, 12(2):477–479. Publisher: Royal Society of Chemistry.
Yu, G., Wang, L. G., Han, Y., and He, Q. Y. (2012). clusterProfiler: An R package for comparing biological themes among gene clusters. OMICS A Journal of Integrative Biology, 16(5):284–287. Publisher: Mary Ann Liebert, Inc.

Posted March 14, 2025.