Testing Strategy for "The Sys." Blog

Context: Terminal-only development environment (SSH to EC2, no GUI). Need to verify frontend changes work correctly without breaking existing functionality.

Problem: How do we test web frontend changes when we can't view a browser during development?

Requirements

Terminal-only execution - Must run via SSH, no GUI required
Fast feedback - Quick smoke tests for rapid iteration
Regression detection - Catch when changes break existing functionality
Visual confidence - Some way to verify UI didn't break (even if delayed)
Minimal dependencies - Keep it simple, avoid over-engineering

Considered Options

Option 1: Curl-Based Smoke Tests

What it is: Shell script using curl to make HTTP requests and validate responses with grep.

Pros:

Extremely fast (< 5 seconds)
No dependencies (curl already installed)
Easy to write and understand
Catches obvious breakage immediately

Cons:

Can't test JavaScript functionality
Can't verify CSS/layout
Can't catch visual regressions
Only validates HTML structure and text content

Example:

curl -s http://localhost:3000 | grep -q "The Sys" || echo "FAIL"

Verdict: ✅ Use as first line of defense

Option 2: Headless Browser Tests (Puppeteer/Playwright)

What it is: Node.js libraries that drive a real browser (Chrome/Firefox) without a GUI, so they run entirely from the terminal.

Pros:

Full browser environment (JavaScript, CSS, rendering)
Can click, navigate, fill forms
Can take screenshots for later review
Catches real browser issues
Still runs in terminal

Cons:

Slower than curl (5-30 seconds depending on tests)
Requires installing Chrome/Chromium
More complex to write
Uses more memory

Example:

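const assert = require('node:assert');
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();  // headless by default
  const page = await browser.newPage();
  await page.goto('http://localhost:3000');
  assert.strictEqual(await page.title(), 'The Sys.');
  await page.screenshot({ path: 'homepage.png' });  // saved for later review
  await browser.close();
})();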

Verdict: ✅ Use as comprehensive validation before committing changes

Option 3: Visual Regression Testing (Percy/BackstopJS)

What it is: Tools that take screenshots and compare them pixel by pixel against a baseline from a previous version.

Pros:

Catches visual regressions automatically
Generates diff reports
Perfect for CSS changes

Cons:

Overkill for simple blog
Requires baseline maintenance
Adds complexity
Pixel-perfect comparisons can be brittle

Example:

backstop test  # Compares current vs baseline

Verdict: ❌ Too heavy for this use case. Manual screenshot review is sufficient.

Option 4: Integration Tests with Jest + Puppeteer

What it is: Full test suite using Jest framework + Puppeteer for browser automation.

Pros:

Industry standard approach
Great test organization
Easy to extend
Clear pass/fail reporting

Cons:

More setup overhead
Another dependency (Jest)
Might be overkill for 7 markdown files

Example:

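const puppeteer = require('puppeteer');
let browser, page;
beforeAll(async () => { browser = await puppeteer.launch(); page = await browser.newPage(); });
afterAll(() => browser.close());
test('homepage loads', async () => {
  await page.goto('http://localhost:3000');
  expect(await page.title()).toBe('The Sys.');
});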

Verdict: 🤔 Good for future if complexity grows. Start simpler for now.

Recommended Approach

Layered testing strategy:

Layer 1: Smoke Tests (Always Run)

Tool: Bash script with curl
When: After every change, before anything else
Runtime: ~5 seconds
Purpose: Catch obvious breakage immediately

Tests:

Server responds (200 OK)
Homepage has correct title
All 7 docs are listed
Markdown files render as HTML
Access logging works

Why:

Fastest feedback loop
No dependencies
Catches most obvious breakage (roughly 80% of issues, as an estimate rather than a measurement)
Can run continuously during development
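
The Layer 1 checks could be sketched as a small set of helpers for smoke-test.sh. This is a minimal sketch, not the blog's actual script: the helper names (fail, check_contains, check_count, run_smoke_tests), the BASE_URL variable, and the assumption that each doc appears on the homepage as a link whose href ends in .md are all illustrative. The markdown-rendering and access-logging checks are left out because they depend on paths and log locations this document does not specify.

```shell
#!/usr/bin/env bash
# Sketch of the Layer 1 smoke checks. Every name and pattern here is an assumption.

BASE_URL="${BASE_URL:-http://localhost:3000}"
FAILURES=0

# Record a failure but keep going, so one run reports every broken check.
fail() {
  echo "FAIL: $1"
  FAILURES=$((FAILURES + 1))
}

# check_contains BODY NEEDLE LABEL
# Passes if BODY contains NEEDLE as a fixed string (the dot in "The Sys." stays literal).
check_contains() {
  if printf '%s\n' "$1" | grep -qF -- "$2"; then
    echo "PASS: $3"
  else
    fail "$3"
  fi
}

# check_count BODY PATTERN EXPECTED LABEL
# Passes if the extended regex PATTERN matches exactly EXPECTED times in BODY.
check_count() {
  count=$(printf '%s\n' "$1" | grep -oE -- "$2" | wc -l | tr -d ' ')
  if [ "$count" -eq "$3" ]; then
    echo "PASS: $4"
  else
    fail "$4 (expected $3, found $count)"
  fi
}

# Check the status code, then fetch the homepage and run the content checks.
run_smoke_tests() {
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$BASE_URL/")
  if [ "$status" = "200" ]; then
    echo "PASS: server responds with 200"
  else
    fail "server responds with 200 (got $status)"
  fi

  body=$(curl -s --max-time 5 "$BASE_URL/")
  check_contains "$body" "The Sys." "homepage shows the title"
  # Assumption: each doc is listed as a link whose href ends in .md
  check_count "$body" 'href="[^"]*\.md"' 7 "all 7 docs are listed"

  # Return non-zero if any check failed.
  [ "$FAILURES" -eq 0 ]
}
```

To use this as smoke-test.sh, end the file with a call to run_smoke_tests. Its return status then becomes the script's exit code, so npm run test:smoke fails whenever any check fails.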

Layer 2: Browser Tests (Before Committing)

Tool: Puppeteer (headless Chrome)
When: Before committing changes or when making UI changes
Runtime: ~15-30 seconds
Purpose: Comprehensive validation including JavaScript and CSS

Tests:

Navigation works (clicking links)
Back button functionality
Markdown rendering with proper styling
CSS loads correctly
No console errors
Take screenshot for manual review later

Why:

Real browser environment
Catches JavaScript errors
Validates full user flow
Screenshots provide visual confirmation (can review on work Mac)

Layer 3: Manual Review (As Needed)

Tool: Work Mac browser
When: After deploying, or for significant UI changes
Purpose: Human visual validation

Process:

SSH to EC2, make changes, run tests
If tests pass, access from work Mac browser
Visually verify everything looks right
If broken, revert and iterate

Why:

Ultimate source of truth
Catches subtle visual issues
No tooling complexity
Browser is already available on work Mac

Implementation Plan

File Structure

/home/ubuntu/yap/
├── server.js
├── smoke-test.sh          # Layer 1: Fast curl tests
├── browser-test.js        # Layer 2: Puppeteer tests
├── package.json           # npm scripts for easy running
├── screenshots/           # Generated screenshots for review
└── design-doc.md          # This file

Scripts to Add

package.json:

{
  "scripts": {
    "test": "npm run test:smoke && npm run test:browser",
    "test:smoke": "./smoke-test.sh",
    "test:browser": "node browser-test.js"
  }
}

Usage:

# Quick check during development
npm run test:smoke

# Full validation before commit
npm test

# Just browser tests
npm run test:browser

Workflow

During Development:

# 1. Make changes
vim server.js

# 2. Restart server
sudo systemctl restart blog

# 3. Quick validation (5 sec)
npm run test:smoke

# 4. Keep iterating if smoke tests fail
# Repeat 1-3 until smoke tests pass
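
One wrinkle in this loop: right after a restart the server may not be accepting connections yet, so an immediate smoke test can fail for the wrong reason. A minimal sketch of a wait-then-test helper follows. The names retry, server_up, RETRY_DELAY, and BASE_URL are hypothetical and not part of the existing setup.

```shell
#!/usr/bin/env bash
# Sketch: wait for the server to come back before running smoke tests.
# All function and variable names are assumptions for illustration.

# retry ATTEMPTS COMMAND...
# Runs COMMAND until it succeeds, at most ATTEMPTS times,
# sleeping RETRY_DELAY seconds (default 1) between tries.
retry() {
  attempts=$1
  shift
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "${RETRY_DELAY:-1}"
  done
}

# server_up: succeeds once the homepage answers with a non-error HTTP status.
# curl -f turns HTTP 4xx/5xx responses into a non-zero exit code.
server_up() {
  curl -sf -o /dev/null --max-time 2 "${BASE_URL:-http://localhost:3000}/"
}
```

With these defined, steps 2 and 3 can be chained as: sudo systemctl restart blog && retry 10 server_up && npm run test:smoke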

Before Committing:

# 1. Run full test suite
npm test

# 2. Review any screenshots generated
ls screenshots/

# 3. If tests pass, commit
git add .
git commit -m "Update styling"

After Deploying:

# 1. From work Mac, visit http://[EC2-IP]:3000
# 2. Visual review
# 3. If broken, SSH back in and revert

Why This Works for "The Sys."

Terminal-only compatible - Everything runs via SSH
Fast iteration - Smoke tests give instant feedback
Confidence before commit - Browser tests catch real issues
No unnecessary complexity - No S3, no CI/CD, no over-engineering
Fits the philosophy - Simple, automatic, removes guesswork

Trade-offs We're Accepting

Not doing:

Visual regression pixel diffing (overkill)
CI/CD pipeline (just one developer)
Screenshot upload to S3 (can view on work Mac)
Cross-browser testing (only headless Chrome and Safari on the work Mac are used)
Mobile testing (not a priority)
Performance testing (traffic is minimal)

Why: These add complexity without proportional value for a personal blog with 7 markdown files.

Keep it simple. Make it automatic. Remove options.

Future Considerations

If the blog grows significantly:

Add Jest for better test organization
Add visual regression testing for critical pages
Add API tests if we add dynamic features
Consider GitHub Actions for automated testing

For now: Two automated layers (smoke + browser), backed by manual review as needed, are sufficient.

Success Metrics

We'll know this is working if:

Smoke tests run in < 5 seconds and catch obvious errors
Browser tests run in < 30 seconds and provide confidence
We catch regressions before they reach the work Mac browser
Testing doesn't slow down development iteration
We actually run the tests (not too complex/annoying)
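
The two runtime targets can be checked mechanically rather than by feel. As a rough sketch (the within_budget helper is hypothetical, and it measures whole seconds only, so it is coarse near the limit), a wrapper can run a command and fail if the command errors or does not finish in under its budget:

```shell
#!/usr/bin/env bash
# Sketch: enforce a runtime budget on a test command.
# within_budget is an illustrative name, not an existing tool.

# within_budget SECONDS COMMAND...
# Runs COMMAND and succeeds only if it exits 0 in fewer than SECONDS whole seconds.
within_budget() {
  budget=$1
  shift
  start=$(date +%s)
  if "$@"; then rc=0; else rc=$?; fi
  elapsed=$(( $(date +%s) - start ))
  echo "ran '$*' in ${elapsed}s (budget ${budget}s, exit ${rc})"
  [ "$rc" -eq 0 ] && [ "$elapsed" -lt "$budget" ]
}
```

For example, within_budget 5 npm run test:smoke and within_budget 30 npm run test:browser would flag either suite drifting past the targets listed here.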

Last updated: October 2025