Core Service Intake Validator
What you'll learn (~25 min)
- Build a drag-and-drop sample sheet validator with a single AI prompt
- Validate CSV submissions against configurable rules for required fields, billing codes, and ID formats
- Troubleshoot common issues with CSV parsing, drag-and-drop events, and validation logic
- Customize the validator with additional rules, export options, or integration with your facility's code list
What you’re building
Every core facility has the same bottleneck: a researcher emails a sample submission sheet, and someone on staff has to open it, scan every row for missing PI names, invalid billing codes, duplicate sample IDs, and dates in three different formats. It takes 10-15 minutes per sheet, and mistakes slip through anyway.
You are going to build a tool that does this check in under one second.
Core facility managers spend hours every week chasing down incomplete submissions. A validator that catches errors before samples reach the bench saves staff time, reduces re-runs, and keeps researchers happy because they fix problems once instead of getting an email three days later asking for corrections.
By the end of this lesson you will have a standalone sample sheet validator that runs entirely in the browser. Drag a CSV onto the page (or click to upload), and it instantly flags every row with a missing field, bad billing code, duplicate sample ID, or malformed date. No server, no database, no installation — just one HTML file you can bookmark on the intake workstation.
Upload → parse → validate against rules → display errors. This pattern works for any structured data intake: purchase orders, timesheets, expense reports, inventory logs. The techniques here transfer directly to non-lab contexts.
🔍 Domain Primer: Key terms you'll see in this lesson
New to core facility operations? Here are the terms you’ll encounter:
- Sample sheet / Sample manifest — A spreadsheet (usually CSV or Excel) that researchers submit to a core facility listing every sample they want processed. Each row is one sample with metadata like species, quantity, and billing information.
- PI (Principal Investigator) — The lead researcher on a project. Every sample submission must be tied to a PI for billing and accountability.
- Billing code — An alphanumeric code that maps a service to a price. Core facilities use these to charge grants for instrument time and consumables. Example: `SEQ-001` for standard Illumina sequencing.
- Grant number — The funding source that pays for the work. Federal grants follow formats like `R01-GM123456`. Every billable service must be tied to a valid grant.
- Requisition form — The formal request document for core facility services. The sample sheet is often attached to or part of the requisition.
- Sample ID — A unique identifier for each sample, often following a facility-specific format like `CF-2026-0001`. Duplicates cause tracking nightmares downstream.
You don’t need to memorize these — the tool handles the validation logic. You just need to know what the fields represent.
Who this is for
- Core facility managers who review incoming sample sheets daily and want to catch errors before processing begins.
- Lab coordinators who need a quick sanity check before forwarding submissions to the sequencing or proteomics queue.
- Researchers who want to self-validate their submission sheets before sending them to the core, avoiding the back-and-forth email cycle.
UW-Madison operates dozens of core facilities — from the Biotechnology Center’s DNA Sequencing Facility to the Mass Spectrometry Facility and the Genome Center. Each has its own submission format, but the validation problems are universal: missing fields, bad codes, duplicates. A configurable validator handles all of them.
The showcase
Here is what the finished validator looks like once you open the HTML file in a browser:
- Drag-and-drop zone at the top where you drop a CSV file (or click to browse). Visual feedback on dragover.
- Validation rules panel showing the active rules: required fields, valid billing codes, sample ID format, date format.
- Summary bar showing total samples, errors found, warnings, and clean rows.
- Color-coded error report with a table where:
- Clean rows have a green left border.
- Rows with errors have a red left border and the offending cells are highlighted.
- Rows with warnings (non-critical issues) have a yellow left border.
- Error detail panel listing every issue by row number, column, and a human-readable description.
- Export button that downloads a copy of the report as a printable HTML page.
Everything runs client-side. The CSV data never leaves the browser. You can use this on an air-gapped workstation.
The prompt
Open your terminal (Mac: Cmd+Space, type "Terminal"; Windows: open WSL/Ubuntu from the Start menu), navigate to a project folder (create one with `mkdir my-project && cd my-project`), start your AI CLI tool (Claude Code, Gemini CLI, or Codex CLI, e.g., by typing `claude`), and paste this prompt:
Build a single self-contained HTML file called intake-validator.html that validates core facility sample submission sheets. Requirements:

1. FILE INPUT
- A drag-and-drop zone (dashed border, changes color on dragover) for CSV files
- Also a click-to-browse fallback button
- Parse the CSV client-side (handle quoted fields, commas inside quotes)
- Show the filename and row count after upload
2. SAMPLE DATA (embed as a "Load Example" button)
Include this sample CSV data with deliberate errors for testing:

Sample_ID,PI_Name,Email,Grant_Number,Species,Sample_Type,Billing_Code,Date_Submitted,Quantity,Notes
CF-2026-0001,Dr. Sarah Chen,chen.lab@wisc.edu,R01-GM134522,Mus musculus,gDNA,SEQ-001,2026-03-15,12,Rush processing requested
CF-2026-0002,,johnson.k@wisc.edu,R01-GM134522,Homo sapiens,RNA,SEQ-002,2026-03-15,8,
CF-2026-0003,Dr. James Rivera,rivera.j@wisc.edu,R01-HG009876,Drosophila melanogaster,Protein,PROT-001,2026/03/16,5,Need results by Friday
CF-2026-0001,Dr. Sarah Chen,chen.lab@wisc.edu,R01-GM134522,Mus musculus,gDNA,SEQ-001,2026-03-15,12,Duplicate of row 1
CF-2026-0005,Dr. Anika Patel,patel.anika@wisc.edu,P30-CA014520,Arabidopsis thaliana,gDNA,INVALID-99,2026-03-17,3,
CF-2026-0006,Dr. James Rivera,rivera.j@wisc.edu,,Danio rerio,Total RNA,SEQ-003,March 18 2026,20,New collaboration
CF-2026-0007,Dr. Lisa Yamamoto,yamamoto.l@wisc.edu,U54-AI170856,Saccharomyces cerevisiae,Plasmid,SEQ-001,2026-03-18,0,
WRONG_FORMAT,Dr. Marcus Brown,brown.marcus,R21-NS112340,Mus musculus,FFPE,HIST-001,2026-03-19,4,Paraffin blocks
CF-2026-0009,Dr. Sarah Chen,chen.lab@wisc.edu,R01-GM134522,Caenorhabditis elegans,smRNA,SEQ-004,2026-03-19,15,Small RNA library prep
CF-2026-0010,Dr. Emily Foster,foster.e@wisc.edu,T32-GM008349,Xenopus laevis,mRNA,,2026-03-20,6,Training grant samples
CF-2026-0011,Dr. Raj Krishnan,krishnan.r@wisc.edu,R01-EB029234,Mus musculus,Crosslinked chromatin,SEQ-005,2026-03-20,24,Two conditions x 3 reps x 4 antibodies
CF-2026-0012,Dr. Anika Patel,patel.anika@wisc.edu,P30-CA014520,Homo sapiens,Isolated nuclei,SEQ-006,20260321,10,Sorted cell populations
CF-2026-0013,Dr. Lisa Yamamoto,yamamoto.l@wisc.edu,U54-AI170856,Saccharomyces cerevisiae,gDNA,SEQ-001,2026-03-21,2,Whole genome sequencing
CF-2026-0014,Dr. Marcus Brown,mbrown@medicine.wisc.edu,R21-NS112340,Rattus norvegicus,cDNA,SEQ-002,2026-03-22,-3,Negative quantity
CF-2026-0015,Dr. Emily Foster,,T32-GM008349,Drosophila melanogaster,Total RNA,PROT-002,2026-03-22,8,Wrong billing code for RNA
3. VALIDATION RULES (apply all of these)
- Required fields: Sample_ID, PI_Name, Email, Grant_Number, Billing_Code, Date_Submitted
- Email: must contain an @ sign and a valid domain format (e.g., user@wisc.edu)
- Sample_ID format: must match regex /^CF-\d{4}-\d{4}$/
- Billing_Code: must be one of SEQ-001 through SEQ-006, PROT-001, PROT-002, HIST-001
- Date_Submitted: must be in YYYY-MM-DD format (flag other formats as warnings)
- Quantity: must be a positive integer (flag zero or negative as errors)
- Duplicate detection: flag rows with the same Sample_ID
- Show which specific validation rule failed for each flagged cell

4. ERROR REPORT
- Summary bar at top: total rows, errors, warnings, clean rows (with color-coded badges)
- Full table showing all rows with color-coded left borders (green=clean, red=error, yellow=warning)
- Highlight individual cells that failed validation in red or yellow
- Below the table, a detailed error list: "Row 4: Sample_ID 'CF-2026-0001' is a duplicate of Row 1"
- Clicking an error in the list scrolls to and briefly highlights that row in the table

5. EXPORT
- "Export Report" button that opens a new window with a print-friendly version of the validation report (white background, no drop zone, includes filename and timestamp)

6. DESIGN
- Dark theme: background #0f172a, cards #1e293b, text #e2e8f0, accent #10b981
- Clean sans-serif font (Inter from Google Fonts CDN)
- Responsive layout, single column
- Drag zone should be prominent with a file icon and "Drop CSV here" text
- Green/red/yellow color coding consistent throughout
7. TECHNICAL
- Pure HTML/CSS/JS in one file, no build step, no dependencies beyond Google Fonts
- No Chart.js needed for this tool (it's a validation tool, not a charting tool)
- CSV parser must handle quoted fields correctly

That entire block is the prompt. Paste it as-is. The embedded sample data has deliberate errors in rows 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, and 15 — so you can immediately verify the validator is catching them all.
What you get
After the LLM finishes (typically 60-90 seconds), you will have a single file: intake-validator.html. Open it in any browser.
Expected output structure
intake-validator.html (~500-700 lines)

Click Load Example and you should see:
- A summary bar showing 15 total rows, approximately 8-10 errors, 2-3 warnings, and the remaining clean rows.
- Row 2 flagged red: missing PI_Name (required field).
- Row 3 flagged yellow: date in `YYYY/MM/DD` format instead of `YYYY-MM-DD`.
- Row 4 flagged red: duplicate Sample_ID (same as Row 1).
- Row 5 flagged red: `INVALID-99` is not a recognized billing code.
- Row 6 flagged red: missing Grant_Number; flagged yellow: date in non-standard format.
- Row 7 flagged red: quantity is 0 (must be positive).
- Row 8 flagged red: Sample_ID `WRONG_FORMAT` does not match the `CF-YYYY-NNNN` pattern; Email `brown.marcus` is missing the @ sign and domain.
- Row 10 flagged red: missing Billing_Code.
- Row 12 flagged yellow: date format `20260321` is non-standard.
- Row 14 flagged red: negative quantity.
- Row 15 flagged red: missing Email.
Most researchers submit Excel files (.xlsx), not CSVs. Ask them to “Save As → CSV (Comma delimited)” before uploading, or add Excel support as a customization (see the SheetJS extension prompt below in the Customize section).
If something is off
LLMs occasionally produce code with small bugs. Here are the most common issues and one-line fix prompts:
| Problem | Follow-up prompt |
|---|---|
| Drag-and-drop doesn’t work | The drag-and-drop zone isn't responding to file drops. Make sure you're calling e.preventDefault() on both dragover and drop events, and reading the file from e.dataTransfer.files[0]. |
| CSV with quoted fields breaks | My CSV has fields with commas inside quotes, like "Chen, Sarah" and the parser is splitting on those commas. Fix the CSV parser to handle quoted fields correctly. |
| All rows show as errors | Every row is flagged as an error even though some are valid. Check that the validation is comparing against the actual cell values and not the header row. |
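If the quoted-field bug bites, it helps to know what a correct fix looks like. Here is a minimal sketch of a quote-aware CSV line parser you can compare against the generated code — the function name is illustrative, not taken from the generated file:

```javascript
// Parse one CSV line, keeping commas inside quoted fields intact.
// Illustrative sketch; the LLM's generated parser may differ in structure.
function parseCSVLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === '"') {
      if (inQuotes && line[i + 1] === '"') {
        current += '"';       // doubled quote = escaped quote inside a field
        i++;
      } else {
        inQuotes = !inQuotes; // toggle quoted state
      }
    } else if (ch === ',' && !inQuotes) {
      fields.push(current);   // field boundary only outside quotes
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);       // last field has no trailing comma
  return fields;
}

// A field like "Chen, Sarah" stays one value:
// parseCSVLine('CF-2026-0001,"Chen, Sarah",12')
//   → ['CF-2026-0001', 'Chen, Sarah', '12']
```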
When Things Go Wrong
Use the Symptom → Evidence → Request pattern: describe what you see, paste the error, then ask for a fix.
How it works (the 2-minute explanation)
You do not need to read every line of the generated code, but here is the mental model:
- CSV parsing splits each line by commas, but respects quoted fields (a field like "Chen, Sarah" stays as one value). The first row becomes the header, and every subsequent row becomes a data object with named properties.
- Validation rules are a list of functions, each checking one condition. Required-field checks look for empty strings. Regex checks test the pattern. Billing code checks compare against a whitelist array. Duplicate checks use a `Set` to track seen IDs.
- Color coding assigns a severity to each row based on the worst issue found: red for errors (blocks processing), yellow for warnings (can proceed but should review), green for clean.
- Export clones the report HTML into a new window with print-friendly styles. The data never goes to a server — it stays in your browser.
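The rules-as-a-list pattern can be sketched like this. This is an assumption about how such code is typically structured, not a copy of the generated file; the field names, regex, and billing codes match the prompt above:

```javascript
// Billing codes from the prompt's whitelist.
const VALID_CODES = ['SEQ-001', 'SEQ-002', 'SEQ-003', 'SEQ-004', 'SEQ-005',
                     'SEQ-006', 'PROT-001', 'PROT-002', 'HIST-001'];
const seenIds = new Set();  // reset (clear) before each validation run

// Each rule targets one field and returns true (pass) or a message (fail);
// severity decides whether the row turns red or yellow.
const rules = [
  { field: 'PI_Name', severity: 'error',
    check: (v) => v.trim() !== '' || 'PI_Name is required' },
  { field: 'Sample_ID', severity: 'error',
    check: (v) => /^CF-\d{4}-\d{4}$/.test(v) || 'Sample_ID must match CF-YYYY-NNNN' },
  { field: 'Billing_Code', severity: 'error',
    check: (v) => VALID_CODES.includes(v) || `Unknown billing code "${v}"` },
  { field: 'Date_Submitted', severity: 'warning',
    check: (v) => /^\d{4}-\d{2}-\d{2}$/.test(v) || 'Date should be YYYY-MM-DD' },
  { field: 'Sample_ID', severity: 'error',
    check: (v) => {
      if (seenIds.has(v)) return `Duplicate Sample_ID "${v}"`;
      seenIds.add(v);   // Set membership makes duplicate detection O(1)
      return true;
    } },
];

// Run every rule against one row object and collect the failures.
function validateRow(row) {
  const issues = [];
  for (const rule of rules) {
    const result = rule.check(row[rule.field] ?? '');
    if (result !== true) {
      issues.push({ field: rule.field, severity: rule.severity, message: result });
    }
  }
  return issues;
}
```

Adding a facility-specific rule is then just one more entry in the `rules` array, which is why the customization prompts below stay so short.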
Client-side validation is the first line of defense, not the only one. Your LIMS or billing system should still enforce rules on its end. But catching errors before data enters the system saves everyone time. A researcher who gets instant feedback at submission fixes their sheet in two minutes. A researcher who gets an email three days later has to context-switch back to a project they have already mentally moved on from. Front-loading validation is a service improvement that costs nothing to deploy.
Customize it
The base validator is useful as-is, but every facility has unique requirements. Each of these is a single follow-up prompt:
Add facility-specific billing codes
Update the billing code validation to use this complete list from our rate schedule: SEQ-001 (Standard Illumina), SEQ-002 (Low-input), SEQ-003 (Single-cell), SEQ-004 (Small RNA), SEQ-005 (ChIP-seq), SEQ-006 (ATAC-seq), PROT-001 (LC-MS/MS), PROT-002 (TMT labeling), HIST-001 (H&E staining), HIST-002 (IHC), FLOW-001 (Cell sorting), FLOW-002 (Analysis only). Show the full service name next to each billing code in the validation output.

Add species whitelist

Add a species validation rule. Valid species are: Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Danio rerio, Xenopus laevis, Arabidopsis thaliana. Flag any other species as a warning (not an error) with the message "Uncommon species — verify with facility staff."

Add Excel (.xlsx) support

Add support for uploading Excel files (.xlsx) in addition to CSV. Use SheetJS from CDN (https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/xlsx.full.min.js) to parse the workbook. Read the first sheet, convert it to an array of arrays, and feed it into the same validation pipeline. If the file has multiple sheets, show a dropdown to select which sheet to validate. Keep CSV support working as before.

Add batch summary email draft

Add a "Generate Email" button that creates a pre-formatted email summary of the validation results. Include: filename, date, total samples, error count, and a bulleted list of all errors. Format it so I can copy-paste it into Outlook as a reply to the researcher who submitted the sheet. Keep the tone professional and helpful.

Start with the working validator, then add your facility's specific rules one prompt at a time. Each prompt builds on what exists. You never need to plan the entire tool upfront — iterate from a solid foundation.
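To give a sense of what the email-draft customization produces, here is a sketch of the text-building part. The function name, wording, and result shape are hypothetical, shown only to illustrate the idea:

```javascript
// Build a plain-text email body from validation results.
// `results` is assumed to be an array of {row, severity, message} objects;
// this shape is an illustration, not the generated tool's actual API.
function draftEmail(filename, results) {
  const errors = results.filter((r) => r.severity === 'error');
  const lines = [
    `Subject: Sample sheet validation: ${filename}`,
    '',
    'Hi,',
    '',
    `Thanks for your submission. Our validator found ${errors.length} issue(s)`,
    `in ${filename}. Please correct the rows below and resubmit:`,
    '',
    ...errors.map((e) => `  - Row ${e.row}: ${e.message}`),
    '',
    'Best regards,',
    'Core Facility Staff',
  ];
  return lines.join('\n');  // ready to copy-paste into Outlook
}
```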
Try it yourself
- Open your CLI tool in an empty folder.
- Paste the main prompt from above.
- Open the generated intake-validator.html in your browser.
- Click Load Example to see the validation in action on the embedded test data.
- Export a real sample sheet from your facility’s LIMS or shared drive (as CSV) and drop it on the validator.
- Pick one customization from the list above and add it.
If you manage a core facility, put this HTML file on the shared drive next to the submission template. Link to it in your intake instructions. You just eliminated the most tedious part of the intake process.
Key takeaways
- One prompt, one tool: a detailed prompt with embedded sample data produces a working intake validator in under 2 minutes.
- Client-side validation catches errors before they enter your pipeline — researchers get instant feedback instead of a correction email days later.
- Embedding test data with deliberate errors in the prompt guarantees you can verify the tool works immediately, without needing a separate test file.
- Configurable rules (billing codes, ID formats, required fields) mean one validator pattern serves every core facility — just update the whitelists.
- Drag-and-drop file handling has browser quirks (you must preventDefault on both dragover and drop) — specifying this in the prompt prevents the most common bug.
A researcher submits a sample sheet with 50 rows and your validator flags 3 errors. Why should you validate before processing rather than fixing errors as you encounter them during the run?
Row 4 in the test data has the same Sample_ID as Row 1. What downstream problem does a duplicate Sample_ID cause?
What’s next
In the next lesson, you will build a Usage & Billing Summary Dashboard that takes instrument usage logs and automatically groups them by PI and grant number, producing bar charts and billing totals — the other half of core facility administration.