Species Detection Heatmap
What you'll learn
~25 min
- Build a standalone species detection heatmap tool with a single AI prompt
- Visualize presence/absence data across sampling sites with interactive sorting and filtering
- Troubleshoot common issues with heatmap rendering and CSV data formatting
- Customize the heatmap with read-count opacity, clustering, and export features
What you’re building
Imagine uploading a species detection CSV and instantly seeing a color-coded heatmap showing exactly which species were found at which sites — sortable, filterable, and click-to-highlight. No R, no ggplot, no Python seaborn. Just one HTML file you can open on any lab computer or project in a conference talk.
That is what you will build in the next 20 minutes.
Presence/absence heatmaps are one of the most common figures in eDNA survey papers. Building your own means you control the formatting, color scheme, and sorting — instead of fighting with R plot margins or asking a bioinformatician to re-run a script every time you add a site. This is the figure you include in your thesis, your monitoring report, or your agency presentation.
By the end of this lesson you will have a standalone species detection heatmap that runs entirely in the browser. It accepts a CSV of species detections by site, renders an interactive heatmap with CSS-styled cells, provides summary bar charts for species richness and detection frequency (Chart.js), and lets you click any species to highlight its distribution across all sites.
This pattern applies to any matrix dataset you want to visualize: gene expression across samples, survey responses across demographics, sensor readings across locations. The heatmap is a universal data visualization tool.
🔍 Domain Primer: Key eDNA survey terms you'll see in this lesson
Here are the terms you will encounter in this lesson:
- Presence/absence matrix — A table where rows are species, columns are sampling sites, and cells are either 1 (detected) or 0 (not detected). The simplest way to summarize a biodiversity survey.
- Species richness — The number of distinct species detected at a site. A site with 10 species has higher richness than one with 3. It is the most basic measure of biodiversity.
- Detection frequency — How many sites a species was detected at, expressed as a count or a proportion. A species detected at 7 out of 8 sites has a detection frequency of 87.5%.
- Occupancy — The proportion of sites where a species is present. In eDNA studies, “naive occupancy” is based on raw detections; “model-based occupancy” accounts for imperfect detection probability.
- Sampling site — A specific geographic location where an eDNA sample was collected. Each site usually has GPS coordinates and a name or code.
- Heatmap — A matrix visualization where cell color encodes a value. For presence/absence, it is binary (detected vs. not). For read counts, color intensity can scale with abundance.
- Biodiversity survey — A systematic effort to catalog which species are present in an area. eDNA surveys are increasingly used alongside traditional methods (electrofishing, trapping, visual surveys).
- Rare species: A species detected at very few sites. Rare detections are often the most interesting findings in a biodiversity survey, but they are also the most likely to be false positives, which is why the contamination QC from the previous lesson matters.
- Ubiquitous species — A species detected at all or nearly all sites. These are the common, widespread organisms in the ecosystem.
You do not need to memorize these — the tool makes them visible through the visualization itself.
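
To make three of these definitions concrete, here is a minimal sketch that computes species richness, detection frequency, and naive occupancy from a small presence/absence matrix. The species names, site names, and function names are placeholders invented for illustration. They are not part of the generated tool.

```javascript
// Toy presence/absence matrix: rows are species, columns are sites.
// All names below are placeholders, not data from the lesson.
const toySites = ["Site_A", "Site_B", "Site_C", "Site_D"];
const toyMatrix = {
  Species_1: [1, 1, 1, 1],
  Species_2: [1, 0, 0, 1],
  Species_3: [0, 0, 1, 0],
};

// Species richness: number of species with a detection (> 0) at each site.
function speciesRichness(matrix, sites) {
  return sites.map((_, col) =>
    Object.values(matrix).filter((row) => row[col] > 0).length
  );
}

// Detection frequency: number of sites where each species was detected.
function detectionFrequency(matrix) {
  return Object.fromEntries(
    Object.entries(matrix).map(([name, row]) => [
      name,
      row.filter((v) => v > 0).length,
    ])
  );
}

// Naive occupancy: detection frequency as a proportion of all sites.
// This uses raw detections only and does not model imperfect detection.
function naiveOccupancy(matrix, siteCount) {
  const freq = detectionFrequency(matrix);
  return Object.fromEntries(
    Object.entries(freq).map(([name, n]) => [name, n / siteCount])
  );
}

const toyRichness = speciesRichness(toyMatrix, toySites);
const toyFrequency = detectionFrequency(toyMatrix);
const toyOccupancy = naiveOccupancy(toyMatrix, toySites.length);

console.log(toyRichness);  // [ 2, 1, 2, 2 ]
console.log(toyFrequency); // { Species_1: 4, Species_2: 2, Species_3: 1 }
console.log(toyOccupancy); // { Species_1: 1, Species_2: 0.5, Species_3: 0.25 }
```

Because the functions test for values greater than 0, the same code works when cells hold read counts instead of 0/1.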
Who this is for
- eDNA survey researchers who need a fast, shareable visualization of detection results across sites.
- Natural resource managers who want to see species distributions across monitoring sites without learning R.
- Graduate students presenting eDNA survey results at lab meetings or conferences who want an interactive figure they can click through.
eDNA core labs and environmental monitoring programs often deliver results as CSV tables. PIs and agency partners need visualizations, not spreadsheets. A self-built heatmap tool that runs in any browser lets you deliver results as an interactive HTML file alongside the raw data — no software installation required on the recipient’s end.
The showcase
Here is what the finished tool looks like once you open the HTML file in a browser:
- Header with a file upload area for your detection matrix CSV (or a textarea for pasting).
- Heatmap grid — rows are species, columns are sites, cells are colored (green = detected, gray = not detected, with opacity scaling if read counts are provided instead of binary values).
- Sorting controls — sort rows by species name (alphabetical), total detections (most to fewest), or detection frequency. Sort columns by site name or species richness.
- Species richness bar chart (Chart.js) — one bar per site showing how many species were detected there.
- Detection frequency bar chart (Chart.js) — one bar per species showing at how many sites it was found.
- Click-to-highlight — click any species name to highlight all sites where it was found. Click a site header to highlight all species found there.
- Summary stats — total species, total sites, average richness, rarest species, most common species.
Everything runs client-side. Your detection data never leaves your browser.
The prompt
Open your terminal, navigate to a project folder, start your AI CLI tool (e.g., by typing claude), and paste this prompt:
Build a single self-contained HTML file called species-heatmap.html that serves
as an interactive species detection heatmap for eDNA survey data. Requirements:

1. DATA INPUT
   - File upload button accepting .csv files plus a textarea for pasting CSV data
   - CSV format: first column is species name, remaining columns are site names, cell values are either 0/1 (presence/absence) or integer read counts
   - If read counts are provided (values > 1), treat any value > 0 as "detected" for the heatmap but use the count for opacity scaling
   - Include a "Load Example" button with this embedded dataset:

Species,Devils_Lake,Mirror_Lake,Lake_Mendota,Trout_Creek,Spring_Creek,Yahara_River,Pheasant_Branch,Lake_Wingra
Salvelinus_fontinalis,1,0,0,1,1,0,0,0
Micropterus_salmoides,1,1,1,0,1,1,1,1
Oncorhynchus_mykiss,0,0,0,1,1,0,0,0
Lithobates_catesbeianus,1,1,1,1,1,1,1,1
Chelydra_serpentina,0,0,1,0,0,1,0,1
Esox_lucius,1,1,0,0,0,1,0,0
Salmo_trutta,0,0,0,1,1,0,0,0
Ambloplites_rupestris,1,0,1,0,0,1,1,0
Cyprinus_carpio,0,0,1,0,0,1,0,1
Notemigonus_crysoleucas,1,0,0,0,1,0,0,0
Ictalurus_punctatus,0,0,1,1,0,0,0,1
Notropis_hudsonius,1,0,1,0,0,1,0,0
Catostomus_commersonii,0,1,0,0,1,0,1,0
Perca_flavescens,1,0,1,1,0,0,0,1
Lepomis_macrochirus,1,1,1,0,1,1,1,1

2. HEATMAP GRID
   - Render as an HTML table with CSS-styled cells
   - Detected cells: background #10b981 (green), not detected: #374151 (dark gray)
   - When input has read counts > 1, scale the green opacity from 0.3 (low reads) to 1.0 (highest reads) so abundance differences are visible
   - Species names in the first column, site names as column headers rotated 45 degrees for readability
   - Hover tooltip on each cell showing: species name, site name, and value (detected/not detected, or read count if provided)
   - Cell size uniform, roughly 36x36 pixels

3. SORTING AND FILTERING
   - Row sort buttons: Alphabetical (A-Z), By Total Detections (descending), By Detection Frequency (descending, same as total for binary data)
   - Column sort buttons: Alphabetical (A-Z), By Species Richness (descending)
   - Text filter: type a species name to filter rows (partial match, case-insensitive)
   - Detection filter slider: show only species detected at >= N sites

4. SUMMARY CHARTS (Chart.js from CDN)
   - Vertical bar chart: Species Richness per Site (one bar per site column, y-axis = number of species detected, bar color #10b981)
   - Horizontal bar chart: Detection Frequency per Species (one bar per species row, x-axis = number of sites, bar color #38bdf8)
   - Both charts update when filtering or sorting changes the visible data

5. CLICK-TO-HIGHLIGHT
   - Click a species name (row header) to highlight that entire row plus all column headers where it was detected. Second click deselects.
   - Click a site name (column header) to highlight that entire column plus all row headers for species detected there. Second click deselects.
   - Highlight color: border #facc15 (yellow), 2px solid

6. SUMMARY STATS (top panel, updates with filters)
   - Total species shown / total in dataset
   - Total sites
   - Average species richness (mean across sites)
   - Rarest species: name and detection count (fewest sites, >0)
   - Most common species: name and detection count
   - Most species-rich site: name and richness count

7. DESIGN
   - Dark theme: background #0f172a, cards #1e293b, text #e2e8f0, accent #38bdf8
   - Clean sans-serif font (Inter from Google Fonts CDN)
   - Responsive layout: heatmap scrolls horizontally on narrow screens
   - Include a Clear button to reset everything
   - Add an "Export PNG" button that uses html2canvas (CDN) to capture the heatmap as a downloadable image

8. TECHNICAL
   - Pure HTML/CSS/JS in one file, no build step
   - Chart.js loaded from CDN (https://cdn.jsdelivr.net/npm/chart.js)
   - html2canvas loaded from CDN (https://cdn.jsdelivr.net/npm/html2canvas)
   - CSV parsing handles quoted fields
   - All processing client-side, no data uploaded anywhere

That entire block is the prompt. Paste it as-is. The sample dataset uses real Wisconsin water body names and realistic freshwater species. Lithobates catesbeianus (American bullfrog) and Micropterus salmoides (largemouth bass) are detected at nearly all sites (ubiquitous), while Oncorhynchus mykiss (rainbow trout) and Salmo trutta (brown trout) only appear at cold-water sites (rare/specialized).
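
The prompt asks for CSV parsing that handles quoted fields. As a rough idea of what that involves, here is a minimal sketch of a line splitter and a matrix parser. The function names and the returned `{ sites, rows }` shape are assumptions made for illustration, and the generated file may parse CSV differently. The quoted "Bass, largemouth" row is a made-up test case, not part of the example dataset. This sketch does not handle line breaks inside quoted fields.

```javascript
// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const fields = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields.map((f) => f.trim());
}

// Parse a detection matrix: header row = site names, first column = species.
function parseDetectionCsv(text) {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== "");
  const [header, ...body] = lines.map(splitCsvLine);
  const sites = header.slice(1);
  const rows = body.map((fields) => ({
    species: fields[0],
    // Non-numeric or missing cells are treated as 0 (not detected).
    values: sites.map((_, i) => Number(fields[i + 1]) || 0),
  }));
  return { sites, rows };
}

const sampleCsv = [
  "Species,Devils_Lake,Mirror_Lake,Lake_Mendota",
  "Salvelinus_fontinalis,1,0,0",
  '"Bass, largemouth",1,1,1', // hypothetical row with a comma in the name
].join("\n");

const parsedSample = parseDetectionCsv(sampleCsv);
console.log(parsedSample.sites);
console.log(parsedSample.rows);
```

Without quote handling, a naive `split(",")` would break the last row into five fields and shift every detection value one column to the right.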
What you get
After the LLM finishes (typically 60-90 seconds), you will have a single file: species-heatmap.html. Open it in any browser.
Expected output structure
species-heatmap.html (~500-700 lines)
Click Load Example and you should see:
- A 15 × 8 heatmap grid with green cells (detected) and gray cells (not detected).
- Three ubiquitous species with green cells across most or all columns: Lithobates catesbeianus (8/8 sites), Micropterus salmoides (7/8 sites), and Lepomis macrochirus (7/8 sites).
- Three rare species with green cells in only 2 columns: Oncorhynchus mykiss and Salmo trutta (both at Trout_Creek and Spring_Creek, the cold-water trout streams) and Notemigonus crysoleucas (Devils_Lake and Spring_Creek).
- The species richness chart showing Devils_Lake and Lake_Mendota with the highest richness (9 species each).
- The detection frequency chart showing Lithobates catesbeianus at the top with 8 sites and Oncorhynchus mykiss, Salmo trutta, and Notemigonus crysoleucas at the bottom with 2 sites each.
- Click highlighting — clicking Esox lucius should highlight Devils_Lake, Mirror_Lake, and Yahara_River.
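
If you want to check the generated tool's numbers independently, the following sketch recomputes the main summary statistics directly from the embedded example dataset. It can be run with Node.js or pasted into a browser console. The variable names are illustrative only, and the simple `split(",")` is enough here because the example data contains no quoted fields.

```javascript
// The example dataset from the prompt, embedded verbatim.
const exampleCsv = `Species,Devils_Lake,Mirror_Lake,Lake_Mendota,Trout_Creek,Spring_Creek,Yahara_River,Pheasant_Branch,Lake_Wingra
Salvelinus_fontinalis,1,0,0,1,1,0,0,0
Micropterus_salmoides,1,1,1,0,1,1,1,1
Oncorhynchus_mykiss,0,0,0,1,1,0,0,0
Lithobates_catesbeianus,1,1,1,1,1,1,1,1
Chelydra_serpentina,0,0,1,0,0,1,0,1
Esox_lucius,1,1,0,0,0,1,0,0
Salmo_trutta,0,0,0,1,1,0,0,0
Ambloplites_rupestris,1,0,1,0,0,1,1,0
Cyprinus_carpio,0,0,1,0,0,1,0,1
Notemigonus_crysoleucas,1,0,0,0,1,0,0,0
Ictalurus_punctatus,0,0,1,1,0,0,0,1
Notropis_hudsonius,1,0,1,0,0,1,0,0
Catostomus_commersonii,0,1,0,0,1,0,1,0
Perca_flavescens,1,0,1,1,0,0,0,1
Lepomis_macrochirus,1,1,1,0,1,1,1,1`;

const [headerLine, ...dataLines] = exampleCsv.trim().split("\n");
const siteNames = headerLine.split(",").slice(1);
const speciesRows = dataLines.map((line) => {
  const [species, ...cells] = line.split(",");
  return { species, values: cells.map(Number) };
});

// Row totals: number of sites where each species was detected.
const detections = speciesRows.map((r) => ({
  species: r.species,
  count: r.values.filter((v) => v > 0).length,
}));

// Column totals: species richness at each site.
const richnessBySite = siteNames.map((site, col) => ({
  site,
  richness: speciesRows.filter((r) => r.values[col] > 0).length,
}));

const maxRichness = Math.max(...richnessBySite.map((s) => s.richness));
const minCount = Math.min(...detections.map((d) => d.count));
const richestSites = richnessBySite
  .filter((s) => s.richness === maxRichness)
  .map((s) => s.site);
const rarestSpecies = detections
  .filter((d) => d.count === minCount)
  .map((d) => d.species);
const meanRichness =
  richnessBySite.reduce((sum, s) => sum + s.richness, 0) / siteNames.length;

console.log(richestSites, maxRichness); // [ 'Devils_Lake', 'Lake_Mendota' ] 9
console.log(rarestSpecies, minCount);   // three species, 2 sites each
console.log(meanRichness);              // 7.125
```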
If something is off
| Problem | Follow-up prompt |
|---|---|
| Heatmap cells are all the same color despite different read counts | The opacity scaling for read counts is not working. Make sure you are dividing each cell's count by the maximum count in the dataset to get a 0-1 fraction, mapping that fraction into the 0.3-1.0 opacity range, then using rgba(16, 185, 129, opacity) for the cell background. |
| Site headers overlap or are unreadable | The rotated column headers are overlapping. Increase the header height to 120px and add white-space: nowrap to the header cells. Also add a bottom margin to the heatmap container so the rotated text does not get clipped. |
| Charts do not update when filters change | The charts are only rendered once on initial load. Can you add a function that re-renders both charts whenever the heatmap data changes (filtering, sorting, or detection threshold slider)? |
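
For the opacity issue in the first row of the table, it helps to know roughly what a working version looks like. The sketch below maps a read count to a cell background using the colors and the 0.3 to 1.0 opacity range from the prompt. The function name `cellBackground` and the choice of linear scaling against the dataset maximum are assumptions, so the generated code may differ in detail.

```javascript
// Map one cell's value to a CSS background color.
// Assumes linear scaling against the largest count in the dataset.
function cellBackground(count, maxCount, minOpacity = 0.3, maxOpacity = 1.0) {
  if (count <= 0) return "#374151"; // not detected: dark gray
  const fraction = count / maxCount; // 0-1 share of the maximum count
  const opacity = minOpacity + fraction * (maxOpacity - minOpacity);
  // rgb(16, 185, 129) is the same green as #10b981.
  return `rgba(16, 185, 129, ${opacity.toFixed(2)})`;
}

// Hypothetical read counts for one species across four sites.
const readCounts = [0, 500, 5000, 1];
const maxReads = Math.max(...readCounts);

console.log(readCounts.map((c) => cellBackground(c, maxReads)));
```

With binary 0/1 data the maximum is 1, so every detected cell gets full opacity and the heatmap falls back to plain green and gray.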
When Things Go Wrong
Use the Symptom → Evidence → Request pattern: describe what you see, paste the error, then ask for a fix.
How it works (the 2-minute explanation)
You do not need to understand every line of the generated code, but here is the mental model:
- CSV parsing reads the first row as site headers and the first column of each subsequent row as a species name. Every other cell becomes a detection value (0 or a read count).
- The heatmap is an HTML <table> with CSS styling. Each cell gets a background color based on its value. For binary data, it is green or gray. For read counts, the green opacity scales linearly with the count.
- Sorting rearranges the table rows or columns by rebuilding the <table> from the sorted data array. JavaScript sorts the underlying data, then re-renders.
- Click-to-highlight attaches an event listener to each row header and column header. When clicked, it adds a CSS class to the relevant cells. A second click removes it.
- Chart.js renders the two bar charts. When filters change, the chart data is updated and the charts re-render.
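
The "sort the data, then re-render" idea from the list above can be sketched without any DOM code. The data shape and function names here are hypothetical, and the generated file will have its own. The values are the Devils_Lake, Mirror_Lake, and Trout_Creek columns for three species from the example dataset.

```javascript
// Hypothetical in-memory shape: site names plus one values array per species.
const heatmapData = {
  sites: ["Devils_Lake", "Mirror_Lake", "Trout_Creek"],
  rows: [
    { species: "Salvelinus_fontinalis", values: [1, 0, 1] },
    { species: "Lithobates_catesbeianus", values: [1, 1, 1] },
    { species: "Oncorhynchus_mykiss", values: [0, 0, 1] },
  ],
};

const countDetections = (values) => values.filter((v) => v > 0).length;

// Return a new rows array sorted by total detections, most to fewest.
function sortRowsByDetections(rows) {
  return [...rows].sort(
    (a, b) => countDetections(b.values) - countDetections(a.values)
  );
}

// Return a new data object with columns reordered by richness (descending).
function sortColumnsByRichness({ sites, rows }) {
  const order = sites
    .map((_, col) => ({
      col,
      richness: rows.filter((r) => r.values[col] > 0).length,
    }))
    .sort((a, b) => b.richness - a.richness)
    .map((entry) => entry.col);
  return {
    sites: order.map((col) => sites[col]),
    rows: rows.map((r) => ({
      species: r.species,
      values: order.map((col) => r.values[col]),
    })),
  };
}

// Keep rows matching a case-insensitive name fragment and a minimum site count.
function filterRows(rows, text, minSites) {
  const needle = text.toLowerCase();
  return rows.filter(
    (r) =>
      r.species.toLowerCase().includes(needle) &&
      countDetections(r.values) >= minSites
  );
}

const sortedRows = sortRowsByDetections(heatmapData.rows);
const sortedColumns = sortColumnsByRichness(heatmapData);
console.log(sortedRows.map((r) => r.species));
console.log(sortedColumns.sites);
console.log(filterRows(heatmapData.rows, "ONCO", 1).map((r) => r.species));
```

Each function returns a new array or object instead of mutating the original, so the unsorted data stays available for the Clear button or a different sort. In the browser, the result would be handed to whatever render function the generated page defines, and each Chart.js chart would be refreshed by replacing its data and calling its `update()` method.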
The heatmap is not just a pretty picture — it reveals ecological patterns. Species that cluster in the same columns (sites) may share habitat preferences. Sites that cluster with similar species compositions are ecologically similar. The two trout species (Oncorhynchus mykiss and Salmo trutta) appearing only at Trout_Creek and Spring_Creek is not a coincidence — those are cold-water species restricted to streams with suitable temperatures. The heatmap makes these patterns visible at a glance in a way that a CSV table never can.
The heatmap shows detections, not confirmed presence. A species absent from a cell may still be present at the site — eDNA detection is probabilistic. Low eDNA concentration, DNA degradation, primer mismatch, or PCR stochasticity can all cause false negatives. Occupancy models account for imperfect detection, but require replicate sampling data. Treat empty cells as “not detected,” not “not present.”
Customize it
The base heatmap is useful as-is, but here are extensions that make it more powerful for publication and analysis:
Add read-count gradient mode
Add a toggle between "Presence/Absence" mode (binary green/gray) and
"Abundance" mode (green gradient scaled by read count). In abundance mode,
show the read count number inside each cell in small white text. Add a
color scale legend showing the gradient range from the minimum to maximum
read count.

🔍 Read counts are semi-quantitative
Read counts in metabarcoding are semi-quantitative — they do not reliably reflect organism abundance due to primer binding efficiency, PCR amplification bias, and sequencing depth differences between samples. Use read-count gradients to identify strong vs. weak detections within a sample, not to compare abundance across species or sites. A species with 500 reads and another with 5,000 reads in the same sample are not necessarily present at a 1:10 ratio in the environment.
Add site metadata row
Add a row at the top of the heatmap for site metadata. Let me upload a second
CSV with columns: Site, Latitude, Longitude, Habitat_Type, Date_Sampled.
Display Habitat_Type as a colored bar above each site column (e.g., lake=blue,
river=cyan, pond=teal, stream=green). Show the other metadata in the hover
tooltip.

Add similarity clustering
Add a "Cluster" button that reorders both rows and columns by similarity
using Jaccard distance. For rows: species with similar site distributions
should be adjacent. For columns: sites with similar species compositions
should be adjacent. Use a simple hierarchical clustering approach
(nearest-neighbor linkage). Add dendrograms on the left side (species)
and top (sites) showing the clustering tree.

Start with the working heatmap. Add one feature at a time. The clustering extension turns a simple presence/absence table into a proper ecological analysis figure. But the base version is already useful for 90% of presentations and reports.
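
To give a feel for what the clustering prompt asks the model to build, here is a minimal sketch of Jaccard distance plus nearest-neighbor (single-linkage) agglomerative clustering. The function names and the returned `{ order, merges }` shape are assumptions, and dendrogram drawing is left out. The four vectors are rows from the example dataset. The loop is a brute-force version that is fine for a few dozen rows but not tuned for large matrices.

```javascript
// Jaccard distance between two presence/absence vectors of equal length:
// 1 - (sites shared by both) / (sites where at least one is detected).
function jaccardDistance(a, b) {
  let intersection = 0;
  let union = 0;
  for (let i = 0; i < a.length; i++) {
    const inA = a[i] > 0;
    const inB = b[i] > 0;
    if (inA && inB) intersection++;
    if (inA || inB) union++;
  }
  return union === 0 ? 0 : 1 - intersection / union;
}

// Single-linkage agglomerative clustering over row vectors.
// Returns the final leaf order and the merge history (heights for a dendrogram).
function clusterOrder(vectors) {
  let clusters = vectors.map((_, i) => [i]);
  const merges = [];
  while (clusters.length > 1) {
    let best = { i: 0, j: 1, dist: Infinity };
    for (let i = 0; i < clusters.length; i++) {
      for (let j = i + 1; j < clusters.length; j++) {
        // Single linkage: distance between the closest pair of members.
        let dist = Infinity;
        for (const p of clusters[i]) {
          for (const q of clusters[j]) {
            dist = Math.min(dist, jaccardDistance(vectors[p], vectors[q]));
          }
        }
        if (dist < best.dist) best = { i, j, dist };
      }
    }
    const merged = [...clusters[best.i], ...clusters[best.j]];
    merges.push({ members: merged, height: best.dist });
    clusters = clusters.filter((_, k) => k !== best.i && k !== best.j);
    clusters.push(merged);
  }
  return { order: clusters[0] ?? [], merges };
}

// Four rows from the example dataset (8 sites each).
const clusterNames = [
  "Oncorhynchus_mykiss",
  "Micropterus_salmoides",
  "Salmo_trutta",
  "Lepomis_macrochirus",
];
const clusterVectors = [
  [0, 0, 0, 1, 1, 0, 0, 0],
  [1, 1, 1, 0, 1, 1, 1, 1],
  [0, 0, 0, 1, 1, 0, 0, 0],
  [1, 1, 1, 0, 1, 1, 1, 1],
];

const clusterResult = clusterOrder(clusterVectors);
console.log(clusterResult.order.map((i) => clusterNames[i]));
// [ 'Oncorhynchus_mykiss', 'Salmo_trutta',
//   'Micropterus_salmoides', 'Lepomis_macrochirus' ]
```

The two trout species have identical site distributions (distance 0), so they end up adjacent, as do the two widespread species. To cluster sites instead of species, pass the matrix columns as the vectors.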
Try it yourself
- Open your CLI tool in an empty folder.
- Paste the main prompt from above.
- Open the generated species-heatmap.html in your browser.
- Click Load Example and verify the heatmap renders correctly.
- Try sorting by detection frequency. The three species detected at only 2 sites (Oncorhynchus mykiss, Salmo trutta, and Notemigonus crysoleucas) should drop to the bottom.
- Click Lithobates catesbeianus to see it highlighted across all 8 sites.
- If you have real eDNA detection data (or output from the contamination checker in the previous lesson), paste it in and see your own heatmap.
Key takeaways
- Presence/absence heatmaps are the standard visualization for eDNA survey data — and building your own gives you full control over formatting and interactivity.
- Sorting reveals ecological patterns: sorting by detection frequency separates ubiquitous species from rare ones; sorting sites by richness highlights biodiversity hotspots.
- Click-to-highlight makes the heatmap interactive in a way that static R or Python plots cannot match — ideal for presentations and exploratory analysis.
- The heatmap connects directly to the contamination QC checker from the previous lesson: clean your data first, then visualize the results.
- Export to PNG makes the heatmap immediately usable in papers, reports, and slide decks without screenshots.
Your heatmap shows that Site A has 12 species detected and Site B has 3 species detected. What does this difference in species richness suggest?
You notice that Oncorhynchus mykiss (rainbow trout) and Salmo trutta (brown trout) are detected at exactly the same two sites and nowhere else. What is the most likely explanation?
What’s next
In the next lesson, you will build a Single-Cell Expression Explorer — a UMAP/t-SNE viewer for pre-computed coordinate CSVs with cluster coloring. Same single-file HTML pattern, but applied to single-cell genomics data. If eDNA tells you which species are where, single-cell tells you which genes are active in which cells — a different scale of biodiversity.