Testing Strategy for Repository Chat App

Context: iMessage-like chat interface for executing bash commands in git repositories. Terminal-only development environment (SSH to EC2, no GUI during dev).

Problem: How do we test interactive chat functionality (click contact → type command → see output) when developing via SSH without a GUI?

Requirements

Terminal-only execution - Must run via SSH, no GUI required
Fast feedback - Quick smoke tests for rapid iteration
Interactive validation - Verify chat UI works (send message, receive response)
Regression detection - Catch when changes break existing flows
API validation - Ensure backend correctly executes commands in repos
Minimal dependencies - Keep it simple, vanilla approach

Current State

What we have:

✅ Layer 1: Smoke tests (smoke-test.sh) - 23 tests
- HTTP endpoints respond
- Files exist and load
- Backend API returns data
- Service worker caching strategy

What we're missing:

❌ Layer 2: Interactive flow tests
- Click contact → Opens chat
- Type command → Sends to backend
- Receive output → Displays in UI
- Navigation (back button)

Gap: Our smoke tests validate structure but not user interactions.

Layered Testing Strategy

Layer 1: Smoke Tests (Current - Fast Validation)

Tool: Bash script with curl
Runtime: ~5 seconds
When: After every change, in pre-commit hook
Coverage: HTTP, HTML structure, API endpoints

What it catches:

Server not running
Files missing or moved
API endpoints broken
Invalid JSON responses
Missing CSS/JS files

What it misses:

JavaScript not executing
UI elements not clickable
Forms not submitting
Navigation broken
Command execution broken in UI

File: smoke-test.sh (23 tests)

Layer 2: Browser Integration Tests (Needed)

Tool: Puppeteer (headless Chrome)
Runtime: ~15-30 seconds
When: Before committing feature changes
Coverage: Full user flows, JavaScript execution, UI interactions

Critical flows to test:

Flow 1: Contact List → Chat

test('navigate from contacts to chat', async () => {
  await page.goto('http://localhost:8000/contacts.html');

  // Wait for repos to load
  await page.waitForSelector('.contact-item');

  // Click first contact
  await page.click('.contact-item');

  // Should navigate to chat
  await page.waitForSelector('.message-input');

  // Should show repo name in header
  const repoName = await page.$eval('.contact-name', el => el.textContent);
  expect(repoName).toBeTruthy();
});

Flow 2: Execute Command → See Output

test('execute bash command and see output', async () => {
  // Navigate to a specific repo chat
  const contactId = 'L2hvbWUvdWJ1bnR1L3dvcmtwbGFjZS9BaGFpYUFwcC9pZGU=';
  await page.goto(`http://localhost:8000/index.html?contact=${contactId}`);

  // Wait for chat to load
  await page.waitForSelector('.message-input');

  // Type a command
  await page.type('.message-input', 'ls');

  // Press Enter
  await page.keyboard.press('Enter');

  // Should show typing indicator
  await page.waitForSelector('.typing-indicator[style*="flex"]', { timeout: 1000 });

  // Wait for response message
  await page.waitForSelector('.message-group.received .message-bubble', { timeout: 5000 });

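  // Response should contain output. Read the last received bubble explicitly:
  // :last-of-type matches by tag name, not class, so it can miss the newest
  // message when another element of the same tag follows it.
  const output = await page.$$eval('.message-group.received .message-bubble', els => els[els.length - 1].textContent);
  expect(output).toContain('package.json'); // Should list files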
});
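
The contactId hard-coded in these tests is the base64 encoding of the repo path /home/ubuntu/workplace/AhaiaApp/ide. A small helper could derive it instead of repeating the opaque string in each test. This is a sketch that assumes every contact ID is standard base64 of the absolute repo path; the function names contactIdFor, repoPathFor, and chatUrl are hypothetical.

```javascript
// Assumption: a contact ID is the standard base64 encoding of the repo's
// absolute path. The ID used in this document decodes to exactly that, but
// the app's actual encoding rule is not stated here.
function contactIdFor(repoPath) {
  return Buffer.from(repoPath, 'utf8').toString('base64');
}

// Inverse mapping, handy when a failing test only shows the opaque ID.
function repoPathFor(contactId) {
  return Buffer.from(contactId, 'base64').toString('utf8');
}

// Builds the chat URL in the same form the flows in this section use.
function chatUrl(repoPath, baseUrl = 'http://localhost:8000') {
  return `${baseUrl}/index.html?contact=${contactIdFor(repoPath)}`;
}
```

With this in place, a test could call page.goto(chatUrl('/home/ubuntu/workplace/AhaiaApp/ide')) rather than pasting the encoded ID.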

Flow 3: Back Button Navigation

test('back button returns to contacts list', async () => {
  const contactId = 'L2hvbWUvdWJ1bnR1L3dvcmtwbGFjZS9BaGFpYUFwcC9pZGU=';
  await page.goto(`http://localhost:8000/index.html?contact=${contactId}`);

  // Click back button
  await page.click('.back-button');

  // Should navigate to contacts list
  await page.waitForSelector('.contacts-list');
  expect(page.url()).toContain('contacts.html');
});

Flow 4: Error Handling

test('show stderr in red bubble', async () => {
  const contactId = 'L2hvbWUvdWJ1bnR1L3dvcmtwbGFjZS9BaGFpYUFwcC9pZGU=';
  await page.goto(`http://localhost:8000/index.html?contact=${contactId}`);

  await page.waitForSelector('.message-input');

  // Type invalid command
  await page.type('.message-input', 'nonexistentcommand123');
  await page.keyboard.press('Enter');

  // Wait for error response
  await page.waitForSelector('.error-bubble', { timeout: 5000 });

  // Error bubble should be red
  const bgColor = await page.$eval('.error-bubble', el =>
    window.getComputedStyle(el).backgroundColor
  );
  expect(bgColor).toContain('255, 59, 48'); // iOS red
});

Flow 5: Multiple Commands

test('execute multiple commands in sequence', async () => {
  const contactId = 'L2hvbWUvdWJ1bnR1L3dvcmtwbGFjZS9BaGFpYUFwcC9pZGU=';
  await page.goto(`http://localhost:8000/index.html?contact=${contactId}`);

  await page.waitForSelector('.message-input');

  // First command
  await page.type('.message-input', 'pwd');
  await page.keyboard.press('Enter');
  await page.waitForSelector('.message-group.received', { timeout: 5000 });

  // Second command
  await page.type('.message-input', 'git status');
  await page.keyboard.press('Enter');
  await page.waitForFunction(() =>
    document.querySelectorAll('.message-group.received').length >= 2,
    { timeout: 5000 }
  );

  // Should have both responses
  const messageCount = await page.$$eval('.message-group.received', els => els.length);
  expect(messageCount).toBeGreaterThanOrEqual(2);
});
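
Flows 2, 4, and 5 all repeat the same type, press Enter, wait sequence. One way to factor it out is a helper that waits for the number of received messages to grow, so a reply left over from an earlier command cannot satisfy the wait. This is a sketch only: sendCommand is a hypothetical name, the selectors are the ones used in the flows above, and it assumes every reply is rendered as a .message-bubble.

```javascript
// Sends one command through the chat input and resolves to the text of the
// newest received bubble. `page` is a Puppeteer Page.
async function sendCommand(page, command, timeout = 5000) {
  // Count replies before sending so we can wait for a new one.
  const before = await page.$$eval('.message-group.received', els => els.length);

  await page.type('.message-input', command);
  await page.keyboard.press('Enter');

  // Runs in the browser; `count` is passed in from Node as an argument.
  await page.waitForFunction(
    count => document.querySelectorAll('.message-group.received').length > count,
    { timeout },
    before
  );

  // Read the last bubble explicitly rather than relying on CSS position selectors.
  return page.$$eval(
    '.message-group.received .message-bubble',
    els => els[els.length - 1].textContent
  );
}
```

Flow 5 would then reduce to two calls, for example const first = await sendCommand(page, 'pwd') followed by const second = await sendCommand(page, 'git status').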

File: browser-test.js (to be created)
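
The flows above call test(), expect(), and a shared page object, none of which exist when the file is run with plain node browser-test.js as the planned npm script does. One option that avoids adding a test framework is a tiny hand-rolled harness. The sketch below is an assumption about how browser-test.js could be structured, not an existing file; it implements only the three matchers the flows use.

```javascript
// Minimal stand-ins for the test()/expect() globals used by the flows,
// so the file runs under plain `node browser-test.js`.
const tests = [];
let page; // assigned in main(); the flow tests close over this variable

function test(name, fn) {
  tests.push({ name, fn });
}

function expect(actual) {
  const fail = (msg) => { throw new Error(msg); };
  return {
    toBeTruthy() {
      if (!actual) fail(`expected ${JSON.stringify(actual)} to be truthy`);
    },
    toContain(expected) {
      if (!actual.includes(expected)) {
        fail(`expected ${JSON.stringify(actual)} to contain ${JSON.stringify(expected)}`);
      }
    },
    toBeGreaterThanOrEqual(expected) {
      if (!(actual >= expected)) fail(`expected ${actual} >= ${expected}`);
    },
  };
}

// Runs registered tests one at a time; resolves to the number of failures.
async function runTests() {
  let failed = 0;
  for (const { name, fn } of tests) {
    try {
      await fn();
      console.log(`PASS ${name}`);
    } catch (err) {
      failed += 1;
      console.log(`FAIL ${name}: ${err.message}`);
    }
  }
  return failed;
}

// Launches headless Chrome, runs every test, and sets a non-zero exit code
// on failure so `npm test` stops.
async function main() {
  const puppeteer = require('puppeteer'); // loaded lazily; needs npm install
  const browser = await puppeteer.launch();
  try {
    page = await browser.newPage();
    const failed = await runTests();
    process.exitCode = failed === 0 ? 0 : 1;
  } finally {
    await browser.close();
  }
}
```

In this layout the test(...) definitions would sit below the helpers and the file would end with a call to main(). Because all tests share one page, each flow should start with its own page.goto, as the flows above already do.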

Layer 3: Manual Testing

Tool: iPhone Safari / Desktop browser
When: After deploying or for major UI changes
Purpose: Visual validation, native feel, iOS-specific issues

Checklist:

Contacts list loads and shows all repos
Tap contact opens chat smoothly
Keyboard appears when tapping input
Enter key sends message
Output displays correctly
Code blocks formatted properly
Back button works
Dark mode looks good
PWA installs correctly
Offline mode works (service worker)

Implementation Plan

File Structure

/home/ubuntu/workplace/AhaiaApp/ide/
├── smoke-test.sh          # Layer 1: Curl tests (23 tests) ✅
├── browser-test.js        # Layer 2: Puppeteer tests (5 flows) ❌ TO ADD
├── package.json           # npm scripts
├── screenshots/           # Generated screenshots for review
└── TESTING.md            # Testing guide for developers
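
The screenshots/ directory is listed as generated output, but nothing in the plan says when screenshots are taken. One plausible approach, sketched here as an assumption, is to capture a full-page screenshot only when a flow fails, named after the test. The helper names screenshotPath and withScreenshotOnFailure are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Turns a test name into a file-system-safe screenshot path.
function screenshotPath(testName, dir = 'screenshots') {
  const slug = testName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric
    .replace(/^-+|-+$/g, '');    // trim leading/trailing dashes
  return path.join(dir, `${slug}.png`);
}

// Runs one flow; if it throws, saves a full-page screenshot and rethrows
// so the failure is still reported. `page` is a Puppeteer Page.
async function withScreenshotOnFailure(page, testName, fn, dir = 'screenshots') {
  try {
    await fn();
  } catch (err) {
    fs.mkdirSync(dir, { recursive: true });
    await page.screenshot({ path: screenshotPath(testName, dir), fullPage: true });
    throw err;
  }
}
```

Since development happens over SSH, the saved PNGs can be copied to a local machine (for example with scp) for review.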

Scripts to Add

package.json:

{
  "scripts": {
    "test": "npm run test:smoke && npm run test:browser",
    "test:smoke": "./smoke-test.sh",
    "test:browser": "node browser-test.js",
    "test:watch": "nodemon --exec npm run test:browser"
  },
  "devDependencies": {
    "nodemon": "^3.0.0",
    "puppeteer": "^21.0.0"
  }
}

Comparison: Chat App vs Yap Blog

Aspect	Yap Blog	Chat App	Difference
Complexity	Static markdown rendering	Interactive chat UI + backend	Higher
JavaScript	Minimal (mostly server-side)	Significant (chat logic, API calls)	More critical
User Flows	View doc, click link, go back	Click contact, type command, see output	More interactive
Backend	Simple Express file server	API with command execution	More complex
Testing Gap	Smoke tests catch most issues	Smoke tests miss UI interactions	Needs Layer 2

Conclusion: The chat app needs Layer 2 (browser tests) more than the Yap blog because:

More JavaScript-dependent
Interactive forms (message input)
Real-time API calls
State management (current contact, messages)
Navigation between views

Recommended Approach for Chat App

Phase 1: Add Browser Tests (Now)

Priority flows:

✅ Contact list loads
✅ Click contact → Opens chat
✅ Type command + Enter → See output
✅ Back button navigation
✅ Error handling (stderr in red)

Why now:

Chat interactivity is core feature
Can't validate with curl alone
Prevents shipping broken UX

Phase 2: Expand Coverage (Later)

Additional tests:

Multiple repos (test directory isolation)
Long-running commands (timeout handling)
Command history (if we add it)
PWA offline mode
Service worker updates

Workflow

During Development:

# 1. Make changes
vim app-repo.js

# 2. Quick validation (5 sec)
npm run test:smoke

# 3. If UI changes, run browser tests (30 sec)
npm run test:browser

# 4. Iterate until tests pass

Before Committing:

# 1. Run full test suite
npm test

# 2. Review screenshots if generated
ls screenshots/

# 3. Commit if tests pass
git add .
git commit -m "Add feature"
# → Pre-commit hook runs smoke tests automatically

Before Deploying to iPhone:

# 1. Ensure all tests pass
npm test

# 2. Deploy (systemd restart happens automatically)

# 3. Test on iPhone
# - Open Safari to http://YOUR_IP:8000/contacts.html
# - Click through flows manually
# - Check dark mode, PWA install

Trade-offs

What we're doing:

✅ Smoke tests (curl) - Fast, catches structure issues
✅ Browser tests (Puppeteer) - Validates interactions
✅ Manual iPhone testing - Final UX validation

What we're NOT doing (and why):

❌ Visual regression pixel diffing - Overkill, manual screenshots sufficient
❌ CI/CD pipeline - Single developer, local testing is fine
❌ Cross-browser testing - Only targeting iOS Safari + modern browsers
❌ Performance testing - Not critical yet
❌ Unit tests for individual functions - Integration tests cover them
❌ E2E tests with real Claude Code - Too complex for now; use mock responses instead
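
Mocked responses can be served from inside the browser tests with Puppeteer's request interception. The sketch below is illustrative only: this document does not name the backend route or its JSON fields, so the /api/execute path, the command request field, and the stdout/stderr response fields are all assumptions to replace with the real ones.

```javascript
// Hypothetical endpoint and payload shape; adjust to match the real API.
const EXECUTE_PATH = '/api/execute';

// Canned replies keyed by the command the UI sends.
const cannedReplies = {
  ls: { stdout: 'package.json\nsmoke-test.sh\n', stderr: '' },
  nonexistentcommand123: { stdout: '', stderr: 'command not found' },
};

// Intercepts POSTs to the execute endpoint and answers them locally;
// every other request is passed through untouched.
async function mockBackend(page, replies = cannedReplies) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    const { pathname } = new URL(request.url());
    if (pathname !== EXECUTE_PATH || request.method() !== 'POST') {
      request.continue();
      return;
    }
    const { command } = JSON.parse(request.postData() || '{}');
    const known = Object.prototype.hasOwnProperty.call(replies, command);
    const reply = known
      ? replies[command]
      : { stdout: '', stderr: `no canned reply for: ${command}` };
    request.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(reply),
    });
  });
}
```

Calling await mockBackend(page) before page.goto would make a flow's expected output independent of what is actually in the repo on disk.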

Philosophy: Vanilla approach extends to testing. Use simple, proven tools (bash + Puppeteer) rather than heavy frameworks.

Success Metrics

We'll know this works if:

✅ Smoke tests run in < 5 seconds
✅ Browser tests run in < 30 seconds
✅ We catch broken interactions before pushing
✅ Tests don't slow down iteration
✅ We actually run them (not too annoying/complex)
✅ Confidence to refactor without breaking things

Next Steps

Install Puppeteer: npm install --save-dev puppeteer
Create browser-test.js with 5 critical flows
Update package.json with test scripts
Run tests and fix any failures
Update pre-commit hook to run both layers
Document in TESTING.md for future reference

Why This Matters

Current risk without Layer 2:

Can merge code that passes smoke tests but has broken UI
Input field might not send commands
Navigation might be broken
Error handling might fail silently
Only discover issues when testing on iPhone (too late)

With Layer 2:

Catch interaction bugs immediately
Confidence to refactor UI code
Faster iteration (no waiting for iPhone test)
Better commit history (working code only)

Last updated: November 2025
Approach based on proven strategy from ~/yap/design-doc.md