# 🏗️ Architecture Deep-Dive

This document walks through every technical decision in `speed-reader.html` with annotated code excerpts. It is written for engineers and hiring managers who want to understand the *why* behind each choice.

---

## Table of Contents

1. [Design Philosophy](#1-design-philosophy)
2. [State Management](#2-state-management)
3. [The ORP Algorithm](#3-the-orp-algorithm)
4. [The Word Renderer](#4-the-word-renderer)
5. [The Playback Scheduler](#5-the-playback-scheduler)
6. [File Parsing Pipeline](#6-file-parsing-pipeline)
7. [Event Architecture](#7-event-architecture)
8. [CSS Architecture](#8-css-architecture)
9. [Performance Decisions](#9-performance-decisions)
10. [What I'd Do Differently at Scale](#10-what-id-do-differently-at-scale)

---

## 1. Design Philosophy

### Single-file architecture
The app is one `.html` file. This is a **deliberate trade-off**:

| Pro | Con |
|---|---|
| Zero install friction | CSS/JS not separately cacheable |
| Share by emailing one file | No tree-shaking |
| Works from file:// protocol | Harder to unit test in isolation |
| No build step, no toolchain rot | — |

For a productivity tool that users might download and run locally, zero-install wins. A deployed SaaS version would split into proper modules.

### Zero runtime dependencies
PDF.js and Mammoth are loaded from a CDN but are **optional enhancers** — the core read/display loop has no dependencies. This means:
- The app works offline for plain text
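- The core logic pulls in no third-party code, so the read/display loop has no dependencies to audit or update (the CDN scripts, once loaded, still run with full page privileges)
- No dependency bytes sit on the critical path: the read/display loop is plain inline JavaScript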

---

## 2. State Management

```javascript
// ── Global state (intentionally minimal) ─────────────
let words   = [];    // Tokenized word array from parsed input
let idx     = 0;     // Current position in words[]
let playing = false; // Playback flag
let timer   = null;  // setTimeout handle (for cancellation)
let wpm     = 300;   // Words per minute (mirrors slider value)
```

**Why flat variables instead of an object?**  
For an app this size, a state object adds indirection without benefit. No component tree means no need for reactive state. Each tick rebuilds the word display from scratch anyway, so there is nothing to diff.

**Why `let timer = null` as a handle?**  
`setTimeout` returns an ID that must be passed to `clearTimeout`. Holding it in top-level scope lets any function (`stopPlayback`, the `wpmSlider` `input` handler) cancel the in-flight timeout without a global event bus.

**State transitions:**
```
IDLE ──[loadText]──► READY ──[play]──► PLAYING
                       ▲                  │
                       └──────[pause]─────┘
                       ▲
                       └──[restart]
```
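
The diagram can also be read as a small transition table. The sketch below is illustrative only: `TRANSITIONS` and `nextState` are hypothetical names that do not appear in `speed-reader.html`, and it assumes that `restart` always lands in `READY` and that an event with no arrow leaves the state unchanged.

```javascript
// Hypothetical transition table mirroring the diagram above.
const TRANSITIONS = {
  IDLE:    { loadText: 'READY' },
  READY:   { play: 'PLAYING', loadText: 'READY', restart: 'READY' },
  PLAYING: { pause: 'READY', restart: 'READY' },
};

// Returns the next state, or the current state when the event is not
// valid from here (assumption: invalid events are silently ignored).
function nextState(state, event) {
  const row = TRANSITIONS[state];
  if (row && Object.prototype.hasOwnProperty.call(row, event)) {
    return row[event];
  }
  return state;
}

console.log(nextState('IDLE', 'loadText'));  // READY
console.log(nextState('IDLE', 'play'));      // IDLE (nothing loaded yet)
console.log(nextState('PLAYING', 'pause'));  // READY
```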

---

## 3. The ORP Algorithm

```javascript
function getFocalIndex(word) {
  // Strip non-alphanumeric to measure "real" word length
  const clean = word.replace(/[^a-zA-Z0-9]/g, '');
  const len = clean.length;

  // ORP lookup table (research-derived breakpoints)
  if (len <= 1) return 0;
  if (len <= 5) return 1;
  if (len <= 9) return 2;
  if (len <= 13) return 3;
  return 4;
}
```

**Why strip non-alphanumeric first?**  
Words from real documents arrive with punctuation attached: `"running,"` or `"end."`. Measuring `.length` on the raw string would shift the focal point right for no reason — the comma isn't a letter the eye needs to decode.

**Why these specific breakpoints?**  
They derive from the [Spritz patent](https://patents.google.com/patent/US20140270580) and subsequent research on saccadic fixation. The ORP typically falls at ~30% of word length for short words, drifting toward ~25% for long words, because peripheral vision covers more letters as the fixation point moves right.

**The rendering consequence:**

```javascript
function renderWord(word) {
  const focalIdx = getFocalIndex(word);
  let alphaCount = -1;
  let focalPos = 0;

  // Map the nth alphanumeric character back to its raw string position.
  // This matters when punctuation precedes the focal letter: in "(hello)"
  // the focal 'e' is alphanumeric #1 but sits at raw index 2.
  for (let i = 0; i < word.length; i++) {
    if (/[a-zA-Z0-9]/.test(word[i])) alphaCount++;
    if (alphaCount === focalIdx) { focalPos = i; break; }
  }
  // focalPos is now the index in the ORIGINAL string
  // Build spans: focal letter gets class="letter focal", others get "letter"
}
```

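This two-pass approach (pick the focal letter by alphanumeric count, then map it back to a raw string position) keeps the highlight on a letter even when punctuation precedes it, as in `"e.g."` or `"(hello)"`. Words such as `"it's"` and `"42nd"` pass through unchanged. Because the character class is ASCII-only, accented and non-Latin letters are counted as punctuation.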
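
To make the mapping concrete, here is a minimal self-contained sketch that runs both passes and returns the raw index. `getFocalIndex` is copied from the excerpt above; `getFocalPos` and `ALNUM` are hypothetical names introduced for illustration, and the fallback to index 0 for words with no alphanumeric characters is inferred from the `focalPos = 0` initializer.

```javascript
const ALNUM = /[a-zA-Z0-9]/;

// Pass 1: focal index measured in alphanumeric characters only.
function getFocalIndex(word) {
  const len = word.replace(/[^a-zA-Z0-9]/g, '').length;
  if (len <= 1) return 0;
  if (len <= 5) return 1;
  if (len <= 9) return 2;
  if (len <= 13) return 3;
  return 4;
}

// Pass 2 (hypothetical helper): raw string index of that character.
function getFocalPos(word) {
  const focalIdx = getFocalIndex(word);
  let alphaCount = -1;
  for (let i = 0; i < word.length; i++) {
    if (ALNUM.test(word[i])) alphaCount++;
    if (alphaCount === focalIdx) return i;
  }
  return 0; // no alphanumeric characters at all, e.g. "--"
}

for (const w of ["it's", '42nd', 'e.g.', '(hello)', '--']) {
  console.log(w, '->', getFocalPos(w));
}
```

The loop prints raw indices 1, 1, 2, 2, and 0: the first two words are unaffected by punctuation, while `"e.g."` and `"(hello)"` show the focal letter shifting one position right in the raw string.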

---

## 4. The Word Renderer

```javascript
function renderWord(word) {
  // ... focalPos calculation above ...

  let html = '';
  for (let i = 0; i < word.length; i++) {
    if (i === focalPos) {
      html += `<span class="letter focal">${word[i]}</span>`;
    } else {
      html += `<span class="letter">${word[i]}</span>`;
    }
  }

  wordDisplay.innerHTML = html; // Single DOM write

  // Flash animation: remove class, force reflow, re-add
  if (optFlash.checked) {
    wordDisplay.classList.remove('flash');
    void wordDisplay.offsetWidth;  // ← Forces browser reflow
    wordDisplay.classList.add('flash');
  }
}
```

**The `void wordDisplay.offsetWidth` trick:**  
CSS animations only replay if the element leaves and re-enters the animated state. Removing a class and re-adding it within the same synchronous block of JavaScript doesn't restart the animation, because the browser recalculates styles once afterwards and sees no net change. Reading `offsetWidth` forces a **synchronous style and layout flush**, which commits the class removal, so the subsequent `classList.add` starts a fresh animation.

**Why `innerHTML` instead of individual `textContent` writes?**  
Building the string first and writing once is faster than N individual DOM manipulations. At 1,000 WPM the browser gets ~60ms per word; a single innerHTML write costs <1ms.
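
One caveat with the string-building approach: the excerpt interpolates each character into the markup unescaped. Because every character is wrapped on its own this rarely breaks the markup, but escaping is cheap insurance. The sketch below is one way to do it; `escapeHtml` and `buildWordHtml` are hypothetical names, not functions from the source.

```javascript
// Characters that are significant inside HTML text and attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: escape a string for safe use with innerHTML.
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, c => HTML_ESCAPES[c]);
}

// Builds the same span markup as renderWord, as a pure function so it can
// be tested without a DOM. focalPos is a raw string index, as in the source.
function buildWordHtml(word, focalPos) {
  const parts = [];
  for (let i = 0; i < word.length; i++) {
    const cls = i === focalPos ? 'letter focal' : 'letter';
    parts.push(`<span class="${cls}">${escapeHtml(word[i])}</span>`);
  }
  return parts.join('');
}

console.log(buildWordHtml('a<b', 1));
```

In the renderer this would still be a single DOM write: `wordDisplay.innerHTML = buildWordHtml(word, focalPos);`.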

**Focal guide positioning:**
```javascript
requestAnimationFrame(() => {
  const focal = wordDisplay.querySelector('.focal');
  const stage = wordDisplay.parentElement;
  if (focal && stage) {
    const stageRect = stage.getBoundingClientRect();
    const focalRect = focal.getBoundingClientRect();
    const relX = focalRect.left - stageRect.left + focalRect.width / 2;
    focalGuide.style.left = relX + 'px';
  }
});
```

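The vertical guide line must be positioned *after* the new word is in the DOM. `requestAnimationFrame` runs its callback just before the browser's next paint, after the `innerHTML` write has landed. The `getBoundingClientRect()` calls then force layout, so the measurement reflects the new word and the guide moves in the same frame in which the word appears. Measuring before the write would leave the guide one word behind.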

---

## 5. The Playback Scheduler

```javascript
function showWord() {
  if (idx >= words.length) {
    stopPlayback();
    return;
  }

  renderWord(words[idx]);
  renderContext();
  updateProgress();

  const delay = getDelay(words[idx]);
  idx++;
  timer = setTimeout(showWord, delay);  // Schedule next word
}

function getDelay(word) {
  const base = 60000 / wpm;  // ms per word at current WPM

  // Punctuation pause: ending punctuation → 1.8× longer display
  if (optPausePunct.checked && /[.!?;:,—–]$/.test(word)) {
    return base * 1.8;
  }

  // Length bonus: each character over 8 adds 4% of base delay
  const extra = Math.max(0, word.length - 8) * (base * 0.04);
  return base + extra;
}
```

**Why recursive `setTimeout` instead of `setInterval`?**  
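`setInterval` fires at one fixed period, but `getDelay` returns a different delay for every word, so a single interval cannot express the schedule. It also keeps firing regardless of how long the callback takes: if `renderWord` + `renderContext` takes 12ms and the interval is 10ms, callbacks run back-to-back with no idle time. Recursive `setTimeout` schedules the *next* call only after the *current* one completes, so each word gets at least its full delay.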

**Why increment `idx` before scheduling?**  
If `stopPlayback()` is called mid-flight (e.g., user hits pause), clearing `timer` prevents the next `showWord()` from firing. But `idx` should already reflect "I showed word N" so that pause/resume continues at N+1, not re-shows N. The increment-then-schedule order guarantees this.
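
A worked example of the delay arithmetic may help. The sketch below restates `getDelay` as a pure function so it can run outside the page; passing `wpm` and `pausePunct` as arguments is an assumption made for testability (the source reads the `wpm` variable and the `optPausePunct` checkbox), and `estimateDurationMs` is a hypothetical helper.

```javascript
// Same rules as getDelay above, with its inputs passed in explicitly.
function delayFor(word, wpm, pausePunct) {
  const base = 60000 / wpm; // ms per word at this WPM
  if (pausePunct && /[.!?;:,—–]$/.test(word)) {
    return base * 1.8; // punctuation pause replaces the length bonus
  }
  const extra = Math.max(0, word.length - 8) * (base * 0.04);
  return base + extra;
}

// Hypothetical helper: estimated total playback time in milliseconds.
function estimateDurationMs(words, wpm, pausePunct) {
  return words.reduce((sum, w) => sum + delayFor(w, wpm, pausePunct), 0);
}

console.log(delayFor('reading', 300, true));       // plain word
console.log(delayFor('end.', 300, true));          // punctuation pause
console.log(delayFor('extraordinary', 300, true)); // 13 chars: 5 over the limit
```

At 300 WPM the base delay is 200ms, so the three calls come out at 200ms, roughly 360ms (200 × 1.8), and roughly 240ms (200 + 5 × 8). Note that the early return means a long word that also ends in punctuation gets the 1.8× pause but not the length bonus.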

**WPM change during playback:**
```javascript
wpmSlider.addEventListener('input', () => {
  wpm = parseInt(wpmSlider.value, 10);  // explicit radix
  if (playing) {
    clearTimeout(timer);
    timer = setTimeout(showWord, 60000 / wpm);  // Reschedule with new rate
  }
});
```
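Changing speed mid-playback cancels the pending timeout and schedules a fresh one at the new base rate. The timer restarts from the moment of the slider event, so the word on screen gets a full new-rate interval on top of the time it has already been visible, and the punctuation and length adjustments from `getDelay` are skipped for that one word. Because `input` fires continuously during a drag, playback effectively holds on the current word until the slider settles.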
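
If the timing around a speed change ever needed to be tighter, one option is to subtract the time the current word has already been on screen. This is a sketch of an alternative, not what the source does: it assumes the scheduler records a `shownAt` timestamp when each word is rendered, which `speed-reader.html` does not currently do, and `remainingDelay` is a hypothetical name.

```javascript
// Hypothetical helper: how much longer the current word should stay on
// screen after a speed change.
//   shownAt  - timestamp (ms) when the current word was rendered
//   now      - current timestamp (ms)
//   newDelay - full delay for this word at the new WPM
function remainingDelay(shownAt, now, newDelay) {
  const elapsed = Math.max(0, now - shownAt);
  return Math.max(0, newDelay - elapsed);
}

console.log(remainingDelay(1000, 1050, 200)); // 150: 50ms already spent
console.log(remainingDelay(1000, 1500, 200)); // 0: overdue, advance at once
```

Inside the slider handler this would replace the fixed delay, for example `timer = setTimeout(showWord, remainingDelay(shownAt, performance.now(), 60000 / wpm));`.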

---

## 6. File Parsing Pipeline

```
User drops file
      │
      ▼
  handleFile(file)
      │
      ├─ .pdf  ──► pdfjsLib.getDocument() ──► page.getTextContent() ──► text
      │
      ├─ .docx ──► mammoth.extractRawText() ──────────────────────────► text
      │
      └─ .txt/.md ► file.text() ─────────────────────────────────────► text
                                                                          │
                                                                          ▼
                                                                    loadText(text)
                                                                          │
                                                                    .replace(/\s+/g,' ')
                                                                          │
                                                                    .split(' ')
                                                                          │
                                                                    .filter(w => w.length)
                                                                          │
                                                                    words[] ← ready
```
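
The branch at the top of the pipeline can be sketched as a small pure function. `pickParser` is a hypothetical name, and treating unrecognized extensions as plain text is an assumption; the diagram only shows `.pdf`, `.docx`, `.txt`, and `.md`.

```javascript
// Hypothetical helper: choose a parser key from a file name.
function pickParser(fileName) {
  const dot = fileName.lastIndexOf('.');
  const ext = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  if (ext === 'pdf') return 'pdf';
  if (ext === 'docx') return 'docx';
  return 'text'; // .txt, .md, and (by assumption) anything unrecognized
}

console.log(pickParser('Report.PDF'));        // pdf
console.log(pickParser('thesis.final.docx')); // docx
console.log(pickParser('notes.md'));          // text
```

Lowercasing the extension matters in practice, since files exported from some tools arrive as `.PDF` or `.Docx`.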

**PDF parsing detail:**
```javascript
const pdf = await pdfjsLib.getDocument({ data: buf }).promise;
let text = '';
for (let i = 1; i <= pdf.numPages; i++) {
  const page = await pdf.getPage(i);
  const tc = await page.getTextContent();
  // tc.items is an array of text spans; join with spaces
  text += tc.items.map(s => s.str).join(' ') + ' ';
}
```
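PDF.js returns text as positioned runs in `items[].str`. Joining with spaces keeps adjacent runs from fusing into a single token. The trade-off is that a PDF which emits one word as several runs (some typeset documents store individual glyphs) will have that word split apart, because every item boundary becomes a space.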

**Text normalization:**
```javascript
const cleaned = text.replace(/\s+/g, ' ').trim();
words = cleaned.split(' ').filter(w => w.length > 0);
```
Collapsing all whitespace (newlines, tabs, multiple spaces) before splitting eliminates empty tokens from headers, line breaks, and indentation in the source document.
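
The two normalization lines are easy to exercise in isolation. The sketch below wraps them in a function named `tokenize` (a hypothetical name; the source assigns to the `words` variable directly) and runs it on messy input.

```javascript
// Collapse whitespace, trim, split, and drop empty tokens.
function tokenize(text) {
  const cleaned = text.replace(/\s+/g, ' ').trim();
  return cleaned.split(' ').filter(w => w.length > 0);
}

console.log(tokenize('  Chapter 1\n\n\tIt was   a dark night.  '));
// [ 'Chapter', '1', 'It', 'was', 'a', 'dark', 'night.' ]

console.log(tokenize('   '));
// []
```

The second call shows why the `filter` stays even after `trim()`: splitting an empty string yields `['']`, which would otherwise load a single blank "word".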

---

## 7. Event Architecture

The app uses direct DOM `addEventListener` — no framework event system. Events fall into four categories:

```
Input Events           → trigger loadText()
  fileInput.change
  dropZone.drop
  textInput.input (gated by a 10-char length threshold)

Playback Controls      → mutate playing/idx state
  btnPlay.click
  btnPause.click
  btnRestart.click
  btnClear.click

Configuration Events   → update wpm / re-render context
  wpmSlider.input
  optContext.change
  optPausePunct.change
  optFlash.change

Keyboard Shortcuts     → map to playback/config actions
  document.keydown
```

**Keyboard input guard:**
```javascript
document.addEventListener('keydown', e => {
  const tag = document.activeElement.tagName;
  if (tag === 'TEXTAREA' || tag === 'INPUT') return;
  // ... handle shortcuts
});
```
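Without this guard, pressing `Space` while typing in the textarea would both insert a space and toggle playback. Checking `activeElement` prevents shortcut capture when the user is in a text field. The `INPUT` check is broad, though: assuming the WPM slider is an `<input type="range">`, shortcuts also stay disabled while the slider has focus.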
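
A narrower guard could look at the input's `type` so that only fields that accept typed text suppress shortcuts. This is a sketch under assumptions: `isTypingTarget` and `TEXT_INPUT_TYPES` are hypothetical names, and the list of types is a reasonable starting set rather than something taken from the source.

```javascript
// Input types where the user is typing text (assumed list).
const TEXT_INPUT_TYPES = new Set([
  'text', 'search', 'url', 'tel', 'email', 'password', 'number',
]);

// Hypothetical predicate: should keyboard shortcuts be ignored while
// this element has focus?
function isTypingTarget(el) {
  if (!el) return false;
  if (el.isContentEditable) return true;
  if (el.tagName === 'TEXTAREA') return true;
  if (el.tagName === 'INPUT') {
    // An <input> without a type attribute behaves as type="text".
    return TEXT_INPUT_TYPES.has((el.type || 'text').toLowerCase());
  }
  return false;
}

console.log(isTypingTarget({ tagName: 'TEXTAREA' }));             // true
console.log(isTypingTarget({ tagName: 'INPUT', type: 'range' })); // false
```

In the handler this would read `if (isTypingTarget(document.activeElement)) return;`, which keeps `Space` working as play/pause right after the user adjusts the slider.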

---

## 8. CSS Architecture

**Custom properties as design tokens:**
```css
:root {
  --bg:       #0a0a0f;   /* Page background */
  --surface:  #111118;   /* Card/panel background */
  --surface2: #1a1a24;   /* Elevated surface */
  --border:   #2a2a38;   /* All borders */
  --accent:   #e8c547;   /* Primary action color */
  --accent2:  #ff6b6b;   /* Destructive action color */
  --text:     #e8e8f0;   /* Body text */
  --muted:    #5a5a72;   /* Labels, secondary text */
  --highlight:#e8c547;   /* Focal letter color (same as accent) */
}
```

Every color in the app references a variable, with zero hardcoded hex values in component styles, so retheming means editing only the `:root` block above.

**Responsive font scaling:**
```css
.word-display {
  font-size: clamp(2.5rem, 6vw, 4rem);
}
```
`clamp(min, preferred, max)` means the word display scales smoothly from mobile (2.5rem) through tablets (6vw proportional) to desktop (capped at 4rem). No media query needed.
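
To see where the three regimes begin and end, the same arithmetic can be written out in JavaScript. `wordFontSizePx` is a hypothetical helper for illustration, and the 16px root font size is an assumption (the common browser default).

```javascript
// Mirrors clamp(2.5rem, 6vw, 4rem), with the result in pixels.
function wordFontSizePx(viewportWidthPx, rootPx = 16) {
  const min = 2.5 * rootPx;                  // 2.5rem
  const preferred = 0.06 * viewportWidthPx;  // 6vw
  const max = 4 * rootPx;                    // 4rem
  return Math.min(max, Math.max(min, preferred));
}

console.log(wordFontSizePx(375));  // phone: pinned to the 40px minimum
console.log(wordFontSizePx(800));  // tablet: fluid, about 48px
console.log(wordFontSizePx(1440)); // desktop: capped at 64px
```

With a 16px root, the fluid range runs from a viewport width of roughly 667px (40 / 0.06) to roughly 1067px (64 / 0.06); outside that range the size is constant.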

**CSS animation, restarted with the reflow trick:**
```css
@keyframes wordFlash {
  0%   { opacity: 0; transform: scale(0.97); }
  100% { opacity: 1; transform: scale(1); }
}
.word-display.flash {
  animation: wordFlash 0.08s ease-out;
}
```
The JS `void el.offsetWidth` forces the browser to reflow (see Section 4), which allows the animation to restart on every word.

**Toggle component (pure CSS):**
```css
.option-toggle input[type=checkbox] { display: none; }
.toggle-track { ... }
.toggle-thumb { ... transition: all 0.2s; }
.option-toggle input:checked + .toggle-track { background: ... }
.option-toggle input:checked + .toggle-track .toggle-thumb { left: 17px; }
```
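The visible toggle is a `<div>` adjacent to a hidden checkbox. The adjacent sibling combinator in `input:checked + .toggle-track` drives the visual state, so the toggle's appearance needs zero JavaScript. One cost: `display: none` also removes the checkbox from the tab order, so the toggles cannot be reached by keyboard.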

---

## 9. Performance Decisions

| Decision | Why |
|---|---|
| Single `innerHTML` write per word | Avoid N individual DOM writes |
| `requestAnimationFrame` for guide position | Measure and move the guide once per frame, right before paint |
| Recursive `setTimeout` over `setInterval` | Supports per-word delays; callbacks never run back-to-back |
| Tokenize once at load, not per-display | O(1) word access at playback time |
| No virtual DOM | Overkill for a single-element update loop |
| CDN libraries, not bundled | PDF.js is 400KB; only needed for PDFs |

**Theoretical throughput:**  
At 1,000 WPM (1 word per 60ms), the scheduler must complete `renderWord + renderContext + updateProgress` within 60ms. In practice this runs in <5ms on mid-range hardware, leaving 55ms of headroom.

---

## 10. What I'd Do Differently at Scale

This section exists because production engineering is about knowing trade-offs, not just making things work.

**If this were a deployed web app:**

1. **Module system** — Split into `state.js`, `renderer.js`, `scheduler.js`, `parsers/pdf.js` etc. Enables unit testing each module in isolation.

2. **TypeScript** — The ORP function has implicit contracts (`word: string → number`). Types make these explicit and catch refactoring bugs.

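3. **Web Workers for parsing** — PDF.js parses in its own worker when `GlobalWorkerOptions.workerSrc` is configured, but Mammoth runs on the calling thread and can block the UI on large documents (>100 pages). Moving `.docx` extraction to a Worker keeps the UI responsive.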

4. **Lazy-load CDN libraries** — Import PDF.js only when a `.pdf` file is detected, not on page load.

5. **`localStorage` bookmarks** — Remember position in long documents across sessions.

6. **Service Worker** — Cache the app shell for true offline support, including CDN assets.

7. **Accessibility** — Add `aria-live="polite"` to the word display so screen readers announce words (for users who want audio + visual reading).

8. **Testing** — Jest unit tests for `getFocalIndex`, `getDelay`, `loadText`. Playwright E2E for file upload and playback flows.
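
As an illustration of item 5, a bookmark layer can be very small. Everything here is hypothetical: the function names, the key format, and the choice to pass the storage object in as a parameter (which lets the same code run against `window.localStorage` in the browser or a stub in tests).

```javascript
// Hypothetical key format for per-document bookmarks.
function bookmarkKey(docId) {
  return `speed-reader:bookmark:${docId}`;
}

// `storage` is anything with Web Storage style getItem/setItem methods.
function saveBookmark(storage, docId, idx) {
  storage.setItem(bookmarkKey(docId), String(idx));
}

function loadBookmark(storage, docId, wordCount) {
  const raw = storage.getItem(bookmarkKey(docId));
  const idx = Number.parseInt(raw, 10);
  // Fall back to the start if nothing is stored or the value is out of range
  // (for example, the document changed since the bookmark was saved).
  if (!Number.isInteger(idx) || idx < 0 || idx >= wordCount) return 0;
  return idx;
}
```

In the app, `saveBookmark(localStorage, docId, idx)` could run on pause and `idx = loadBookmark(localStorage, docId, words.length)` after `loadText`. How `docId` is derived (file name, content hash) is left open.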

---

*Written as a transparent record of engineering decisions — the kind of documentation that makes code maintainable by future-you and new teammates.*