RNA-Seq Differential Expression Dashboard
What you'll learn
~35 min
- Build an interactive RNA-seq differential expression dashboard from a count matrix CSV
- Understand log2 fold change, statistical testing, and Benjamini-Hochberg correction
- Generate volcano, MA, and clustered heatmap plots with Plotly
- Troubleshoot count matrix formatting and zero-count gene handling
What you’re building
Differential expression analysis is the single most requested service in university bioinformatics cores. A researcher sequences RNA from two conditions — treated vs. control, mutant vs. wild-type, tumor vs. normal — and the core question is always the same: which genes changed?
The standard tools (DESeq2, edgeR) are R packages that require writing R scripts, managing Bioconductor installations, and debugging cryptic error messages about factor levels. They are statistically rigorous and essential for publication. But for a first look at your data — “did the experiment work?” — you need something faster.
In this lesson you will build an interactive RNA-seq differential expression dashboard using Python and Streamlit. Upload a count matrix CSV, pick your conditions, and get volcano plots, MA plots, and clustered heatmaps in your browser within seconds. It includes a built-in sample data generator so you can test immediately without real data.
This is a different framework from Lesson 4’s Dash application. Streamlit is simpler — fewer callbacks, less boilerplate, faster to prototype. It is increasingly popular in bioinformatics for exactly this kind of quick-exploration tool.
Accept tabular data → run statistical analysis → generate interactive plots → export significant results. This pattern works for any hypothesis testing workflow: A/B testing in marketing, clinical trial analysis, environmental monitoring. The statistics and plots change; the architecture does not.
If you are working on a remote server or HPC cluster, use a conda environment instead of venv for easier dependency management. For the Streamlit web interface, use SSH port forwarding (ssh -L 8501:localhost:8501 user@server) to view the dashboard in your local browser.
The showcase
The finished application will provide:
- CSV upload panel: drag-and-drop a count matrix (genes as rows, samples as columns) or click to generate simulated data.
- Sample data generator: creates a realistic mouse liver RNA-seq dataset with 2 conditions (treated/control), 3 replicates each, ~15,000 genes, and ~500 truly differentially expressed genes.
- Condition assignment: select which columns belong to each condition via multiselect dropdowns.
- Library size normalization: median-of-ratios normalization (the same approach DESeq2 uses internally).
- Statistical testing: per-gene t-test with Benjamini-Hochberg FDR correction. Optionally, a negative binomial approximation for more accurate RNA-seq modeling.
- Interactive volcano plot: log2 fold change (x-axis) vs. -log10(adjusted p-value) (y-axis), with color-coded significance thresholds. Hover any point to see the gene name, fold change, and p-value.
- MA plot: mean expression (x-axis) vs. log2 fold change (y-axis). Highlights genes that are significant at the selected FDR threshold.
- Clustered heatmap: top N differentially expressed genes, with hierarchical clustering on both genes and samples.
- Results table: sortable, filterable table of all genes with log2FC, p-value, adjusted p-value, mean expression, and significance flag.
- CSV export: download the significant gene list as a CSV ready for pathway analysis tools (DAVID, Enrichr, g:Profiler).
The prompt
Open your AI CLI tool (such as Claude Code, Gemini CLI, or your preferred tool) in an empty directory and paste:
Create a Python Streamlit application for RNA-seq differential expression analysis.
Call it rnaseq-de-dashboard.

PROJECT STRUCTURE:
rnaseq-de-dashboard/
├── app.py              # main Streamlit application
├── de_analysis.py      # normalization, statistical testing, FDR correction
├── visualization.py    # volcano plot, MA plot, heatmap with Plotly
├── sample_data.py      # simulated count matrix generator
├── requirements.txt    # streamlit, pandas, numpy, scipy, plotly, scikit-learn
└── README.md
SAMPLE DATA GENERATOR (sample_data.py):
Generate a realistic simulated RNA-seq count matrix:
- 15,000 genes (named Gene_0001 through Gene_15000)
- 6 samples: Control_1, Control_2, Control_3, Treated_1, Treated_2, Treated_3
- Simulates mouse liver RNA-seq with realistic count distributions:
  - Per-gene base expression: mean drawn from log-normal (median ~200, range 5-50000), dispersion varying inversely with expression (0.005-0.035) to model realistic RNA-seq overdispersion
  - 300 genes upregulated in treated (multiply counts by 2x to 8x fold change)
  - 200 genes downregulated in treated (divide counts by 2x to 6x fold change)
  - 5% of genes have zero counts across all samples (not expressed)
  - Add biological variability: per-sample scaling factor (0.8 to 1.2) to simulate different library sizes
- Return a pandas DataFrame with gene names as index, sample names as columns
- Also return a ground truth DataFrame listing which genes are truly DE and their true fold changes (for benchmarking)
DE ANALYSIS MODULE (de_analysis.py):

1. PREPROCESSING
   - Remove genes with zero counts across all samples
   - Filter low-count genes: keep genes with at least N counts in at least M samples (default N=10, M=2, configurable via sidebar)
   - Log a summary: genes before filtering, genes after filtering, genes removed

2. NORMALIZATION
   - Implement median-of-ratios normalization (DESeq2 method):
     a) Compute the geometric mean of each gene across all samples
     b) For each sample, divide each gene's count by that gene's geometric mean (skip genes whose geometric mean is zero)
     c) Take the median of these ratios per sample = size factor
     d) Divide each sample's counts by its size factor
   - Also offer simple CPM (counts per million) as an alternative
   - Show a sidebar toggle between normalization methods

3. STATISTICAL TESTING
   - For each gene, run a two-sample t-test (scipy.stats.ttest_ind) between the two condition groups on log2(normalized_counts + 1)
   - Calculate log2 fold change: mean(log2(condition2 + 1)) - mean(log2(condition1 + 1))
   - Calculate mean expression: mean of log2(all_samples + 1)
   - Return raw p-values for all genes

4. MULTIPLE TESTING CORRECTION
   - Apply Benjamini-Hochberg FDR correction (scipy.stats.false_discovery_control or statsmodels.stats.multitest.multipletests)
   - Mark genes as significant if adjusted p-value < threshold (default 0.05) AND abs(log2FC) > threshold (default 1.0)
   - Both thresholds configurable via sidebar sliders
APP LAYOUT (app.py):
Use Streamlit with a dark theme. Layout:

1. HEADER
   - Title: "RNA-Seq Differential Expression Dashboard"
   - Subtitle with brief description

2. SIDEBAR
   - File upload widget (.csv files)
   - "Generate Sample Data" button
   - Condition assignment: two st.multiselect widgets for selecting which columns belong to Condition 1 vs Condition 2
   - Analysis parameters:
     - Min count filter (slider, 1-50, default 10)
     - Min samples filter (slider, 1-6, default 2)
     - Normalization method (radio: "Median of Ratios" / "CPM")
     - FDR threshold (slider, 0.001-0.1, default 0.05)
     - log2FC threshold (slider, 0.5-3.0, default 1.0)
   - "Run Analysis" button
3. RESULTS TABS (main area, use st.tabs)

   Tab 1: Overview
   - Summary metrics in st.metric cards: total genes tested, significant up, significant down, not significant
   - Library size bar chart showing total counts per sample (before normalization)
   - Normalization factor bar chart

   Tab 2: Volcano Plot
   - Plotly scatter plot: x = log2FC, y = -log10(adjusted p-value)
   - Color coding: red = significant up, blue = significant down, gray = NS
   - Horizontal dashed line at -log10(FDR threshold)
   - Two vertical dashed lines at +/- log2FC threshold
   - Hover: gene name, log2FC, adj p-value, mean expression
   - Top 10 significant genes labeled on the plot
   - Dark theme matching other tools

   Tab 3: MA Plot
   - Plotly scatter plot: x = mean expression (log2), y = log2FC
   - Same color coding as volcano plot
   - Horizontal dashed line at y=0
   - Hover: gene name, log2FC, adj p-value, mean expression

   Tab 4: Heatmap
   - Show top N differentially expressed genes (slider: 20-100, default 50)
   - Sorted by adjusted p-value
   - Hierarchical clustering on both genes (rows) and samples (columns) using scipy.cluster.hierarchy
   - Z-score normalize each row for display
   - Color scale: blue (low) to white (mid) to red (high)
   - Plotly heatmap with gene names on y-axis, sample names on x-axis
   - Dendrogram on both axes if possible, otherwise just clustered order

   Tab 5: Results Table
   - st.dataframe showing all tested genes with columns: Gene, log2FC, p-value, adjusted p-value, mean expression, significant (yes/no)
   - Sortable by any column
   - Filter: show all / significant only / up only / down only
   - "Download Significant Genes (CSV)" button
   - "Download Full Results (CSV)" button
DESIGN:
- Streamlit dark theme (set in .streamlit/config.toml)
- Create .streamlit/config.toml with:
  [theme]
  primaryColor = "#10b981"
  backgroundColor = "#0a0a0f"
  secondaryBackgroundColor = "#1a1a2e"
  textColor = "#e0e0e0"
- Plotly charts use plotly_dark template
- Professional, core-facility-ready layout
Generate all files with complete implementations. Include the .streamlit/config.toml.
The app should work end-to-end: streamlit run app.py opens the dashboard ready to use.

This tool uses scipy for statistical testing, pandas for data handling, and Streamlit for the web application. If you cannot install scipy, ask the LLM to implement the t-test and FDR correction from scratch using numpy only. The statistical logic is straightforward; scipy just makes it convenient.
What you get
After generation, set up the project:
cd rnaseq-de-dashboard
python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
streamlit run app.py

Streamlit will open http://localhost:8501 in your browser automatically.
Expected project structure
rnaseq-de-dashboard/
├── app.py              (~300-400 lines)
├── de_analysis.py      (~150-200 lines)
├── visualization.py    (~200-250 lines)
├── sample_data.py      (~80-120 lines)
├── .streamlit/
│   └── config.toml
├── requirements.txt
└── README.md

First run walkthrough
- Click Generate Sample Data in the sidebar. The simulated count matrix loads with 15,000 genes and 6 samples.
- The condition assignment dropdowns should auto-populate: Control_1/2/3 in Condition 1, Treated_1/2/3 in Condition 2. If not, assign them manually.
- Leave the default parameters and click Run Analysis.
- Check the Overview tab. You should see approximately 300-500 significant genes (depending on filtering and the random seed). The library size chart should show slightly different total counts per sample (simulating real library size variation).
- Switch to the Volcano Plot tab. The classic volcano shape should appear:
- A cluster of gray dots in the center (not significant).
- Red dots in the upper-right (significantly upregulated).
- Blue dots in the upper-left (significantly downregulated).
- The top 10 most significant genes labeled with their names.
- Switch to the MA Plot tab. Significant genes should appear as colored dots distributed across the expression range, while non-significant genes cluster around log2FC = 0.
- Switch to the Heatmap tab. The top 50 DE genes should show a clear pattern: one block of high expression in treated samples and low in controls, and another block with the opposite pattern. The clustering should group the three control replicates together and the three treated replicates together.
- Switch to the Results Table tab. Filter to “Significant only” and click Download Significant Genes (CSV).
Common issues and fixes
| Problem | Follow-up prompt |
|---|---|
| Streamlit shows a blank page | The app is not rendering. Make sure app.py has the Streamlit imports at the top and uses st.set_page_config as the first Streamlit command. Also check that .streamlit/config.toml exists and has valid TOML syntax. |
| Volcano plot has no colored points | All points are gray. The significance thresholds might be too strict for the simulated data. Lower the default log2FC threshold to 0.5 and raise the FDR threshold to 0.1. Also verify that the adjusted p-values are being used, not the raw p-values. |
| Heatmap is all one color | The heatmap shows no contrast. Make sure you are z-score normalizing each row before plotting: for each gene, subtract the row mean and divide by the row standard deviation. This puts all genes on the same scale. |
| Download button produces empty CSV | The CSV download is empty. Check that the filtered DataFrame is not being overwritten before the download button is created. In Streamlit, convert the filtered DataFrame to CSV (for example with to_csv) and pass that result directly as the data argument of st.download_button, not a variable that gets reassigned later in the script. |
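The row z-score fix from the table above can be sketched as follows. This is a minimal illustration, not the generated app's code: the function name `zscore_and_cluster` and the toy matrix are assumptions, and average linkage on Euclidean distance is just one reasonable choice for the clustering step.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list

def zscore_and_cluster(log_expr):
    """Z-score each gene (row), then reorder rows and columns by hierarchical clustering."""
    row_mean = log_expr.mean(axis=1)
    # Genes with zero variance would divide by zero; they are shown as flat rows (z = 0).
    row_sd = log_expr.std(axis=1, ddof=0).replace(0, np.nan)
    z = log_expr.sub(row_mean, axis=0).div(row_sd, axis=0).fillna(0.0)
    row_order = leaves_list(linkage(z.to_numpy(), method="average"))
    col_order = leaves_list(linkage(z.to_numpy().T, method="average"))
    return z.iloc[row_order, col_order]

# Hypothetical log2 expression values: two genes up in treated, two genes down.
log_expr = pd.DataFrame(
    [[5.0, 5.2, 4.9, 8.0, 8.1, 7.9],
     [6.0, 6.1, 5.9, 9.0, 9.2, 8.8],
     [9.0, 9.1, 8.9, 6.0, 6.2, 5.8],
     [7.5, 7.4, 7.6, 4.5, 4.4, 4.6]],
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
    columns=["Control_1", "Control_2", "Control_3",
             "Treated_1", "Treated_2", "Treated_3"],
)
clustered = zscore_and_cluster(log_expr)
print(clustered.round(2))
```

After z-scoring, every row has mean 0, so a highly expressed gene and a weakly expressed gene contribute equally to the color scale. The clustered column order should keep the replicates of each condition next to each other.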
Worked example: Comparing treated vs. control liver samples
Here is a practical scenario for a graduate student running a drug treatment RNA-seq experiment.
Step 1. You submitted RNA from 6 mouse liver samples to your sequencing core: 3 controls (DMSO vehicle) and 3 treated with a drug candidate. The core returned FASTQ files, which you aligned with STAR and counted with featureCounts. The output is a count matrix CSV with gene names as the first column and one column per sample.
Step 2. Your count matrix looks like this:
Gene,Control_1,Control_2,Control_3,Drug_1,Drug_2,Drug_3
Alb,245891,231420,258103,89432,95210,82104
Cyp1a2,12450,13201,11892,45230,48102,42891
Gapdh,89201,91034,87453,88921,90102,87234
...

Step 3. Upload the CSV to the dashboard. Assign Control_1/2/3 to Condition 1 and Drug_1/2/3 to Condition 2. Click Run Analysis.
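If the upload fails, it can help to check the matrix format outside the dashboard first. The sketch below is one possible check, assuming the layout shown in Step 2 (gene names in the first column, one numeric column per sample); `load_count_matrix` is a hypothetical helper, not a function from the generated project.

```python
import io
import pandas as pd

# The three example rows from Step 2, inlined so the sketch runs on its own.
CSV_TEXT = """Gene,Control_1,Control_2,Control_3,Drug_1,Drug_2,Drug_3
Alb,245891,231420,258103,89432,95210,82104
Cyp1a2,12450,13201,11892,45230,48102,42891
Gapdh,89201,91034,87453,88921,90102,87234
"""

def load_count_matrix(source):
    """Read a genes x samples CSV and raise on common formatting problems."""
    counts = pd.read_csv(source, index_col=0)
    problems = []
    if counts.index.has_duplicates:
        problems.append("duplicate gene names")
    non_numeric = [c for c in counts.columns
                   if not pd.api.types.is_numeric_dtype(counts[c])]
    if non_numeric:
        problems.append(f"non-numeric columns: {non_numeric}")
    elif (counts < 0).any().any():
        problems.append("negative counts")
    if problems:
        raise ValueError("; ".join(problems))
    return counts

counts = load_count_matrix(io.StringIO(CSV_TEXT))
print(counts.shape)
```

For a real file, pass the path instead of the `io.StringIO` object. A non-numeric sample column usually means an annotation column (gene length, chromosome) was left in the featureCounts output.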
Step 4. Examine the volcano plot. Look for:
- Albumin (Alb) should appear as a blue dot in the upper-left quadrant: it is the most abundant liver gene and your drug is suppressing it. A clearly negative log2FC (about -1.5 for the example counts above) confirms the drug effect.
- Cyp1a2 should appear as a red dot in the upper-right — this cytochrome P450 is being induced by the drug. A positive log2FC of ~2 (4-fold induction) is consistent with drug metabolism activation.
- Gapdh should be a gray dot near the center — housekeeping genes should not change significantly.
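The fold changes for these three genes can be checked by hand from the example rows in Step 2. The sketch below uses the raw counts directly, with no normalization and no +1 pseudocount, so the dashboard's values will differ slightly.

```python
import numpy as np

# (control counts, drug counts) copied from the example count matrix in Step 2.
rows = {
    "Alb":    ([245891, 231420, 258103], [89432, 95210, 82104]),
    "Cyp1a2": ([12450, 13201, 11892],    [45230, 48102, 42891]),
    "Gapdh":  ([89201, 91034, 87453],    [88921, 90102, 87234]),
}

# log2 of the ratio of group means: positive = higher in drug-treated samples.
log2fc = {
    gene: float(np.log2(np.mean(drug) / np.mean(ctrl)))
    for gene, (ctrl, drug) in rows.items()
}

for gene, value in log2fc.items():
    print(f"{gene}: log2FC = {value:+.2f}")
```

This gives roughly -1.46 for Alb, +1.86 for Cyp1a2, and -0.01 for Gapdh, which matches the expected positions on the volcano plot: left, right, and center.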
Step 5. Check the heatmap. The clustering should separate your control and treated samples cleanly. If the treated replicates do not cluster together, that is a red flag — it could indicate batch effects, sample mislabeling, or high biological variability.
Step 6. Download the significant gene list. Upload it to Enrichr (maayanlab.cloud/Enrichr) or g:Profiler (biit.cs.ut.ee/gprofiler) for pathway enrichment analysis. You should see pathways related to drug metabolism (cytochrome P450, xenobiotic metabolism) enriched in the upregulated genes.
RNA-seq is the most common next-generation sequencing application. University sequencing cores, gene expression centers, and genomics facilities process hundreds of RNA-seq samples per year. Every one of those experiments needs differential expression analysis. The tools you are building here give you a fast first look at results before investing time in a full DESeq2 or edgeR analysis.
If you are taking a bioinformatics course, this dashboard covers the same DE analysis concepts you encounter in class — but packaged as an interactive tool you can use on your own data immediately.
When Things Go Wrong
Use the Symptom → Evidence → Request pattern: describe what you see, paste the error, then ask for a fix.
Understanding the statistics
The analysis pipeline implements a simplified version of what DESeq2 does internally. Here are the key concepts:
Library size normalization: Different samples are sequenced to different depths. One sample might have 20 million reads, another 35 million. Without normalization, a gene appears “upregulated” in the deeper sample purely because of sequencing depth. Median-of-ratios normalization estimates a size factor per sample that accounts for both sequencing depth and RNA composition differences. This is the same algorithm DESeq2 uses (Anders and Huber, 2010).
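A minimal sketch of median-of-ratios size factors, assuming a genes × samples pandas DataFrame of raw counts. The function name `median_of_ratios` is illustrative, and restricting the calculation to genes with nonzero counts in every sample is a simplification chosen here so that the geometric mean is defined.

```python
import numpy as np
import pandas as pd

def median_of_ratios(counts):
    """Return (normalized counts, size factors) for a genes x samples matrix."""
    # Keep only genes with a nonzero count in every sample, so logs are finite.
    expressed = counts[(counts > 0).all(axis=1)]
    log_counts = np.log(expressed)
    log_geo_mean = log_counts.mean(axis=1)              # per-gene log geometric mean
    log_ratios = log_counts.sub(log_geo_mean, axis=0)   # log(count / geometric mean)
    size_factors = np.exp(log_ratios.median(axis=0))    # per-sample median ratio
    normalized = counts.div(size_factors, axis=1)
    return normalized, size_factors

# Toy matrix: sample S2 is sequenced exactly twice as deep as S1.
toy = pd.DataFrame(
    {"S1": [10, 100, 1000, 0], "S2": [20, 200, 2000, 5]},
    index=["g1", "g2", "g3", "g4"],
)
normalized, size_factors = median_of_ratios(toy)
print(size_factors)
print(normalized)
```

In the toy matrix the size factor of S2 is twice that of S1, and after division genes g1 to g3 have identical normalized values in both samples. Using the median makes the estimate robust to a handful of strongly changed genes, which is what accounts for RNA composition differences.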
Log2 fold change: The ratio of expression between conditions, on a log2 scale. A log2FC of 1 means 2-fold upregulation. A log2FC of -2 means 4-fold downregulation. Log2FC of 0 means no change. The log scale makes the distribution symmetric: a 2-fold increase (+1) and a 2-fold decrease (-1) are equidistant from zero.
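The fold change and per-gene t-test described in the prompt can be sketched like this. The function name, the column names of the result, and the toy counts are assumptions for illustration; `scipy.stats.ttest_ind` is used with its default equal-variance setting.

```python
import numpy as np
import pandas as pd
from scipy import stats

def log2fc_and_ttest(normalized, group1, group2):
    """Per-gene log2 fold change (group2 vs group1) and two-sample t-test."""
    logged = np.log2(normalized + 1)          # +1 avoids log2(0)
    g1 = logged[group1].to_numpy()
    g2 = logged[group2].to_numpy()
    t_stat, p_value = stats.ttest_ind(g2, g1, axis=1)
    return pd.DataFrame(
        {
            "log2FC": g2.mean(axis=1) - g1.mean(axis=1),
            "mean_expr": logged.to_numpy().mean(axis=1),
            "p_value": p_value,
        },
        index=normalized.index,
    )

# Hypothetical normalized counts: GeneA is ~4-fold up, GeneB is unchanged.
normalized = pd.DataFrame(
    [[100, 110, 90, 400, 420, 380],
     [50, 55, 45, 52, 48, 50]],
    index=["GeneA", "GeneB"],
    columns=["Control_1", "Control_2", "Control_3",
             "Treated_1", "Treated_2", "Treated_3"],
)
results = log2fc_and_ttest(
    normalized,
    ["Control_1", "Control_2", "Control_3"],
    ["Treated_1", "Treated_2", "Treated_3"],
)
print(results)
```

GeneA comes out with a log2FC close to 2 (a 4-fold increase) and a small p-value, while GeneB stays near 0. Note that a gene with identical values in every sample has zero variance and yields a NaN p-value, so real code needs to handle that case.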
Benjamini-Hochberg correction: When you test 15,000 genes simultaneously, you expect about 750 false positives at p < 0.05 by chance alone (15,000 × 0.05), even if no gene truly changed. The BH procedure controls the false discovery rate (FDR), the expected proportion of false positives among your significant results. Calling genes significant at an adjusted p-value (often called a q-value) below 0.05 means that among all genes you call significant, at most about 5% are expected to be false positives.
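For readers who want to see the arithmetic, here is a numpy-only sketch of the Benjamini-Hochberg adjustment. The function name `benjamini_hochberg` and the five example p-values are made up for illustration; in the app you would more likely call one of the library functions named in the prompt.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                        # indices of p-values, smallest first
    scaled = p[order] * n / np.arange(1, n + 1)  # p * n / rank
    # Enforce monotonicity: an adjusted value may not exceed that of a larger p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
for r, a in zip(raw, adj):
    print(f"raw p = {r:.3f} -> adjusted p = {a:.5f}")
```

In this example the genes with raw p-values of 0.039 and 0.041 both end up at an adjusted value of 0.05125, so they pass a nominal 0.05 cutoff but not an FDR cutoff of 0.05.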
Volcano plot interpretation: The name comes from the shape. Points at the top are highly significant (small p-value). Points on the far left and right have large fold changes. The most interesting genes are in the upper corners — large fold change AND high statistical significance. Points in the lower center are noise.
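The color coding behind the volcano plot comes down to two comparisons per gene. The sketch below assumes a results table with columns named `log2FC` and `padj` (hypothetical names) and uses the default thresholds from the prompt.

```python
import numpy as np
import pandas as pd

def classify_for_volcano(results, fdr=0.05, lfc=1.0):
    """Add the y-axis value and an up/down/NS label for each gene."""
    out = results.copy()
    # Clip to avoid -log10(0) = inf for p-values that underflow to zero.
    out["neg_log10_padj"] = -np.log10(out["padj"].clip(lower=1e-300))
    out["status"] = "NS"
    significant = out["padj"] < fdr
    out.loc[significant & (out["log2FC"] > lfc), "status"] = "up"
    out.loc[significant & (out["log2FC"] < -lfc), "status"] = "down"
    return out

# Hypothetical results for four genes.
results = pd.DataFrame(
    {"log2FC": [2.5, -1.8, 5.0, 0.1],
     "padj":   [1e-6, 0.003, 0.3, 0.0001]},
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
)
volcano = classify_for_volcano(results)
print(volcano)
```

GeneC (large fold change, weak evidence) and GeneD (strong evidence, tiny fold change) both stay gray, because a gene must pass both thresholds. The resulting table could then be passed to a scatter function such as `plotly.express.scatter` with `x="log2FC"`, `y="neg_log10_padj"`, and `color="status"`.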
🔍 For Researchers: When to use this tool vs. DESeq2/edgeR
Use this dashboard for:
- First-pass exploration within minutes of getting your count matrix (“did the experiment work?”)
- Lab meeting presentations where you need quick, interactive plots
- Teaching RNA-seq analysis concepts to students (the interactive sliders make parameter effects visible)
- Comparing different normalization or threshold choices interactively
- Generating a candidate gene list to discuss with your PI before investing in a full analysis
Use DESeq2 or edgeR for:
- Publication-ready statistical analysis (reviewers will ask which tool you used)
- Proper negative binomial modeling of count data (t-tests on log-counts are an approximation)
- Experiments with complex designs (multiple factors, batch effects, paired samples)
- Small sample sizes (n=2 per condition) where the t-test lacks power and DESeq2’s information borrowing across genes is critical
- Interaction effects, time series, or dose-response experiments
The key difference: this dashboard uses a t-test on log-transformed normalized counts, which is a reasonable approximation when you have 3+ replicates per condition and the counts are not too low. DESeq2 uses a negative binomial generalized linear model with empirical Bayes shrinkage, which is more statistically principled but requires R and more setup. For exploratory analysis, the t-test approach is fast and usually identifies the same top hits.
Customize it
Add gene set enrichment analysis
Add a new tab called "Enrichment" to the dashboard. After running DE analysis, take the significant gene list and perform a simple over-representation analysis against Gene Ontology (GO) terms. Include a bundled GO term database (download a slim version with ~5000 terms and their associated gene lists for mouse). For each GO term, run a Fisher's exact test comparing the overlap between the DE genes and the GO term genes. Display the top 20 enriched GO terms as a bar chart (-log10 p-value) with the number of overlapping genes as hover text. This is a basic version of what DAVID or g:Profiler does, but runs entirely offline.

Add PCA and sample correlation plots
Add a new tab called "Sample QC" that runs BEFORE the DE analysis. Include:
1. PCA plot of all samples using the top 500 most variable genes. Color points by condition. The treated and control samples should separate on PC1 or PC2. If they do not, the experiment may have failed or there may be batch effects.
2. Sample-to-sample correlation heatmap using Pearson correlation on log2(counts+1). Replicates within a condition should have correlation > 0.95. Low correlations suggest outlier samples.
3. Cook's distance bar chart per sample to identify outlier samples.
Show this tab first in the tab order so the user checks sample quality before running DE analysis.

Add batch effect visualization
Add batch effect detection to the Sample QC tab. Let the user assign a batch variable (e.g., sequencing lane, RNA extraction date) in the sidebar. Re-run the PCA and color by batch instead of condition. If samples cluster by batch rather than by condition, show a warning: "Batch effect detected -- consider using a batch-corrected model." Also offer a simple batch correction using ComBat-style adjustment (subtract the batch mean from each gene's log2 counts, preserving the condition effect).

Add interactive gene search
Add a search box at the top of the Results Table tab. When the user types a gene name (or a comma-separated list), highlight those genes on the volcano plot and MA plot with a distinct marker (larger size, star shape, labeled). This lets the user check whether their genes of interest are significant without scrolling through the full table. Also add a "Pathway Genes" text area where the user can paste a list of genes from a pathway database and see which ones are DE in their experiment.

Here is how this tool fits into a real RNA-seq analysis workflow:
- Day 1: Count matrix arrives. Upload to this dashboard. Generate volcano plot. Answer: “Did the experiment work?” Share the plot with your PI.
- Day 2-3: Run the full DESeq2/edgeR analysis in R for publication-quality statistics. Compare the gene lists — the overlap between the quick dashboard and the full analysis should be >90% for the top hits.
- Day 4-5: Run pathway enrichment (Enrichr, g:Profiler) on the DESeq2 results. Use the dashboard’s heatmap to generate figures for your paper.
- Publication: Cite DESeq2/edgeR for the statistics. Use the dashboard plots for presentations and lab meetings.
The dashboard does not replace the formal analysis — it accelerates the exploration phase so you know where to focus.
Connecting to core facility workflows
Differential expression analysis is relevant to nearly every sequencing service a core facility offers:
RNA-seq — The direct application. Bulk RNA-seq from Illumina platforms produces the count matrices this dashboard consumes. Whether you are doing polyA-selected mRNA-seq or total RNA-seq with ribosomal depletion, the downstream count matrix has the same format.
Single-cell RNA-seq — 10X Genomics Chromium data can be aggregated to pseudo-bulk counts (sum counts per gene per condition across cells) and analyzed with this dashboard for a quick bulk-level comparison. This is a legitimate analysis strategy and is sometimes preferred for between-condition comparisons.
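The pseudo-bulk aggregation can be sketched with a pandas group-by. The tiny cells × genes matrix and the cell-to-sample labels are made up for illustration. Summing per sample rather than per condition is an assumption made here, because the t-test in this dashboard needs replicate columns within each condition.

```python
import pandas as pd

# Hypothetical single-cell counts: rows are cells, columns are genes.
cell_counts = pd.DataFrame(
    {"GeneA": [1, 0, 3, 2], "GeneB": [0, 2, 1, 1]},
    index=["cell1", "cell2", "cell3", "cell4"],
)
# Hypothetical mapping from each cell to the sample it came from.
cell_sample = pd.Series(
    ["Control_1", "Control_1", "Treated_1", "Treated_1"],
    index=cell_counts.index,
    name="sample",
)

def pseudo_bulk(cell_counts, labels):
    """Sum counts over all cells that share a label; return genes x samples."""
    summed = cell_counts.groupby(labels).sum()
    return summed.T

pb = pseudo_bulk(cell_counts, cell_sample)
print(pb)
```

The transposed result has genes as rows and samples as columns, which is the input layout the dashboard expects.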
Spatial transcriptomics — Visium and MERFISH platforms produce spatially resolved expression data. Comparing expression between annotated tissue regions produces count matrices that fit this dashboard’s input format.
Gene expression arrays — While largely superseded by RNA-seq, some labs still generate microarray data. After RMA normalization, microarray expression matrices can be analyzed with the same t-test and volcano plot workflow. Ask the LLM to add a “pre-normalized” mode that skips the count normalization step.
If you are taking a genomics or bioinformatics course, this dashboard covers the same statistical concepts (fold change, multiple testing, FDR) that appear in lectures — but lets you manipulate the parameters interactively and see the effects in real time.
Key takeaways
- Normalization is not optional: raw counts cannot be compared across samples without accounting for library size differences. Median-of-ratios normalization is the standard for RNA-seq because it handles both sequencing depth and RNA composition effects.
- Multiple testing correction is the difference between 750 false positives and a reliable gene list: with 15,000 tests, a nominal p < 0.05 cutoff is meaningless. BH-corrected adjusted p-values control the false discovery rate.
- Both fold change AND statistical significance matter: a gene with log2FC = 5 but p = 0.3 might be a noisy gene with one outlier replicate. A gene with log2FC = 0.1 and p = 0.0001 is statistically significant but biologically trivial. The volcano plot captures both dimensions.
- Replicates determine power: with 2 replicates per condition, the t-test has very low power and you will miss many truly DE genes. With 3+ replicates, the dashboard’s t-test approach gives results comparable to DESeq2 for the top hits.
- This tool is for exploration, not publication: use it for quick answers and interactive plotting. Use DESeq2 or edgeR for the statistics you report in a paper.
Portfolio suggestion
The RNA-seq DE dashboard is directly relevant to anyone working in genomics or molecular biology. For your portfolio:
- Run the dashboard on the simulated data and save screenshots of the volcano plot, MA plot, and heatmap.
- If you have real data, run your own count matrix through the dashboard and include a de-identified volcano plot. A volcano plot from real data — with genes labeled, biological interpretation noted — demonstrates both technical and scientific competency.
- Compare to DESeq2 results: if you have both, show the overlap between the dashboard’s significant gene list and the DESeq2 gene list. High concordance (>90% for top 100 genes) validates the approach.
- Write a brief methods note describing when you would use this tool vs. DESeq2. This demonstrates mature scientific judgment about tool selection.
🔍 Advanced: Adding DESeq2-style dispersion estimation
The main limitation of the t-test approach is that it does not model the mean-variance relationship of count data. In RNA-seq, variance increases with mean expression (heteroscedasticity). DESeq2 handles this by fitting a dispersion parameter per gene using a negative binomial model.
You can add a simplified version:
Add a negative binomial test option to the statistical testing module. For each gene: (1) estimate the dispersion parameter using the method of moments (variance = mu + mu^2 * dispersion), (2) fit the dispersion-mean relationship across all genes using a loess curve, (3) shrink per-gene dispersions toward the fitted curve (empirical Bayes shrinkage), (4) use the shrunken dispersions in a negative binomial test. This is a simplified version of the DESeq2 algorithm. Add a sidebar toggle between "t-test (fast)" and "Negative binomial (more accurate)".

This is more statistically appropriate than the t-test, especially for low-count genes and small sample sizes. However, it is significantly more complex to implement and debug. Start with the t-test version, validate it works, then add this as an upgrade.
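Step (1) of that prompt, the method-of-moments estimate, follows from rearranging variance = mu + mu^2 * dispersion into dispersion = (variance - mu) / mu^2. The sketch below covers only that step, not the loess fit or the shrinkage; the function name `moment_dispersion` and the simulation settings are illustrative assumptions.

```python
import numpy as np

def moment_dispersion(counts):
    """Per-gene method-of-moments dispersion for a genes x samples array."""
    counts = np.asarray(counts, dtype=float)
    mu = counts.mean(axis=1)
    var = counts.var(axis=1, ddof=1)          # unbiased sample variance
    disp = np.full(mu.shape, np.nan)          # undefined for all-zero genes
    expressed = mu > 0
    disp[expressed] = (var[expressed] - mu[expressed]) / mu[expressed] ** 2
    # Negative estimates (variance below the Poisson level) are set to zero.
    return np.clip(disp, 0.0, None)

# Simulate negative binomial counts with a known dispersion to see the estimator at work.
rng = np.random.default_rng(0)
true_mu, true_disp = 200.0, 0.1
n = 1.0 / true_disp                 # numpy's "number of successes" parameter
p = n / (n + true_mu)               # chosen so the mean equals true_mu
simulated = rng.negative_binomial(n, p, size=(2000, 6))
estimates = moment_dispersion(simulated)
print("median estimated dispersion:", np.nanmedian(estimates))
```

With only six samples per gene the individual estimates are very noisy, which is the motivation for steps (2) and (3): borrowing information across genes stabilizes them.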
You generate a volcano plot from your RNA-seq experiment. You see a gene at coordinates (log2FC = 3.5, -log10(adj p-value) = 8). Your significance thresholds are log2FC > 1 and adjusted p-value < 0.05. What can you conclude about this gene?
Try it yourself
- Generate the RNA-seq DE dashboard with the prompt above.
- Click Generate Sample Data and run the analysis with default parameters.
- Examine the volcano plot. Can you identify the upregulated and downregulated gene clusters?
- Adjust the log2FC threshold slider from 1.0 to 0.5. How does the number of significant genes change?
- Switch to the heatmap tab. Does the clustering separate the control and treated samples?
- Download the significant gene list as CSV. Open it in a spreadsheet and sort by adjusted p-value.
- If you have a real RNA-seq count matrix, upload it and compare the results to your previous DESeq2/edgeR analysis.
- Pick one customization from the list above and add it with a follow-up prompt.
What’s next
In Lesson 6, you will build a reproducible RNA-seq workflow orchestrator — a Python CLI tool that generates Snakemake pipelines to chain FASTQ QC, alignment, counting, and differential expression into a single reproducible workflow. It is the capstone lesson for this module, synthesizing the tools from Lessons 3 and 5 into an end-to-end pipeline.