Typography Testing and QA for Design Systems

Updated February 24, 2026
How to test typography across browsers, devices, and screen sizes: truncation, overflow, multi-language text, and visual regression testing.

Typography bugs are among the most commonly overlooked issues in design system QA. A component can pass all its functional tests — state transitions, keyboard navigation, ARIA attributes — while still shipping with text that overflows its container on Android Chrome, renders with a fallback font on Windows Firefox, or breaks into awkward single-word lines in German. These are not edge cases; they are predictable failure modes that structured testing can catch before they reach production.

This guide covers the specific testing scenarios that typography in a design system requires, and the tooling and processes to address each one systematically.


Cross-Browser Font Rendering Differences

The same CSS and the same font file render noticeably differently across browsers and operating systems. Understanding why helps you test for the right things.

macOS vs. Windows font rendering

On macOS, browsers render text through Apple's Core Text. Since macOS 10.14 (Mojave), the default is greyscale anti-aliasing, combined with a font-smoothing step that slightly thickens glyph strokes; the extra weight is most visible at 1x display density (non-Retina). On Windows, Chrome and Edge use DirectWrite with ClearType sub-pixel anti-aliasing, which fits strokes more tightly to the pixel grid, tends to look thinner and sharper, and may show color fringing on some displays.

The practical difference: a design validated exclusively on macOS Retina will tend to look lighter on Windows and heavier on macOS 1x. Test your primary font at your most common sizes on both platforms before finalizing the type scale.

The -webkit-font-smoothing difference

Without explicit font smoothing settings, macOS applies its default font smoothing, which renders text slightly heavier (and, on versions before macOS 10.14, with sub-pixel anti-aliasing). Adding:

body {
  -webkit-font-smoothing: antialiased;
  -moz-osx-font-smoothing: grayscale;
}

switches to greyscale anti-aliasing on macOS, which most designers prefer for UI text — it looks lighter and crisper at medium sizes. This CSS has no effect on Windows or Linux. Include this in your baseline styles and test with and without it on macOS.

Font loading and fallback rendering

When a custom font loads after the initial render, browsers handle the transition according to the font-display descriptor. The most common issue: font-display: swap causes a flash where fallback text (in the system font) reflows when the custom font arrives. If your fallback font is significantly wider or taller than your custom font, this reflow can cause layout shifts large enough to affect Core Web Vitals (Cumulative Layout Shift).

Test this by throttling your network in DevTools to "Slow 3G" and observing the font load transition. While the custom font is loading under font-display: swap, the browser renders text in the first available font from the rest of the font-family list, and falls back to its default serif or sans-serif only if the list names nothing it can use. Verify that the stack declares fallbacks with metrics close to your custom font:

body {
  font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI',
               Roboto, 'Helvetica Neue', Arial, sans-serif;
}

Browser-specific testing matrix

For a design system, test typography rendering at minimum across:

Browser   OS           Display type
Chrome    Windows 11   1x / 1.5x DPI
Chrome    macOS        2x Retina
Firefox   Windows 11   1x DPI
Safari    macOS        2x Retina
Safari    iOS          3x OLED
Chrome    Android      2x / 3x

Firefox on Windows is historically the most revealing test. Its rendering engine and ClearType implementation can expose hinting issues and weight inconsistencies that Chrome and Safari both hide.


Responsive Typography Testing

Responsive typography testing is not just checking that text is readable at mobile breakpoints. It requires verifying that the entire hierarchy works correctly at every point in the responsive range.

Breakpoint verification

For each defined breakpoint, verify:

  1. Font sizes match the defined token values (use DevTools Computed styles)
  2. Line-heights are appropriate for the sizes at this breakpoint
  3. Heading hierarchy is visually clear (heading levels look distinct from each other and from body text)
  4. No text overflows its container horizontally
  5. No unintended word breaks or hyphenation on single words

Fluid typography range testing

If your system uses clamp() for fluid type, test across the full viewport range rather than just at breakpoints. The minimum and maximum are tested at narrow and wide viewports, but the intermediate range needs spot-checking. Create a browser test that resizes the viewport from minimum to maximum in 50px increments and captures screenshots of the key typography components.

/* Test this declaration visually from 375px to 1440px */
:root {
  --font-size-display: clamp(2.25rem, 5vw + 1rem, 3.815rem);
}

Assuming a root font size of 16px (1rem = 16px):

  • At 375px: 5vw = 18.75px, so preferred = 18.75 + 16 = 34.75px. The minimum is 2.25rem = 36px, so the display size is 36px. Correct.
  • At 800px: 5vw = 40px, so preferred = 40 + 16 = 56px. The maximum is 3.815rem = 61.04px, so the preferred value wins: 56px. Verify this is acceptable mid-range.
  • At 1440px: 5vw = 72px, so preferred = 72 + 16 = 88px. The maximum applies: 61.04px. Correct.

This arithmetic exercise, done for each fluid token, catches cases where the clamp values produce unexpected intermediate sizes.
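
The same arithmetic can be scripted so that every fluid token is checked across the whole viewport range. The following is a minimal sketch, assuming a 16px root font size and a token described by four numbers; the function names and the token object shape are illustrative, not part of any library.

```javascript
// Assumed root font size; adjust if your system sets a different one.
const ROOT_PX = 16;

// Resolve clamp(minRem, vwCoeff * 1vw + offsetRem, maxRem) to pixels
// for a given viewport width.
function resolveClampPx({ minRem, vwCoeff, offsetRem, maxRem }, viewportPx) {
  const preferred = (vwCoeff / 100) * viewportPx + offsetRem * ROOT_PX;
  const min = minRem * ROOT_PX;
  const max = maxRem * ROOT_PX;
  return Math.min(Math.max(preferred, min), max);
}

// Viewport widths from minPx to maxPx in fixed steps (50px by default).
function viewportSweep(minPx, maxPx, stepPx = 50) {
  const widths = [];
  for (let w = minPx; w < maxPx; w += stepPx) widths.push(w);
  widths.push(maxPx); // always include the upper bound
  return widths;
}

// The display token from the declaration above.
const displayToken = { minRem: 2.25, vwCoeff: 5, offsetRem: 1, maxRem: 3.815 };

for (const width of viewportSweep(375, 1440)) {
  const size = resolveClampPx(displayToken, width);
  console.log(`${width}px viewport: ${size.toFixed(2)}px`);
}
```

The widths returned by viewportSweep can also drive the screenshot loop described earlier, so the computed size and the captured render can be compared side by side.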

The line length test

Optimal reading measure is 55–75 characters per line for body text. Test this at each breakpoint by counting characters per line in a standard paragraph. Outline a sample paragraph so the container edge is visible:

<p class="measure-test" style="outline: 1px solid red">
  The quick brown fox jumps over the lazy dog. The five boxing wizards jump quickly.
  Pack my box with five dozen liquor jugs.
</p>

Count the characters visible on one line in the rendered output at each viewport. If you are consistently seeing fewer than 45 or more than 90 characters per line, the font size or container width needs adjustment.
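
Before counting by hand, a rough estimate can flag breakpoints that are likely to fall outside the range. The sketch below assumes an average glyph width of about 0.5em, which is only a rule of thumb for Latin sans-serif text; measure your own font for a real value. Both function names are hypothetical.

```javascript
// Rough estimate of characters per line.
// avgCharEm is an ASSUMED average glyph width in ems, not a measured value.
function estimateCharsPerLine(containerPx, fontSizePx, avgCharEm = 0.5) {
  return Math.floor(containerPx / (fontSizePx * avgCharEm));
}

// Thresholds from the guidance above: 55-75 is the target range,
// fewer than 45 or more than 90 means something needs adjusting.
function classifyMeasure(charsPerLine) {
  if (charsPerLine < 45 || charsPerLine > 90) return 'adjust';
  if (charsPerLine >= 55 && charsPerLine <= 75) return 'optimal';
  return 'acceptable';
}

// Example: a 560px text column set at 16px.
const chars = estimateCharsPerLine(560, 16);
console.log(chars, classifyMeasure(chars));
```

An estimate like this does not replace the rendered count, because real text varies in width; it only narrows down which viewports deserve a manual check.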


Multi-Language and RTL Text Testing

Typography that works correctly in English often fails in other languages due to text expansion, character set coverage, and directionality.

Text expansion

UI text translated to other languages is typically longer than the English source. Design systems that contain components with fixed-size text containers or white-space: nowrap will break when labels or button text is translated.

Typical expansion rates:

  • German: +25–35% longer than English
  • French: +15–25%
  • Finnish: +25–50%
  • Japanese: -20% (shorter, but uses different metrics)
  • Arabic: -15% (shorter, but requires RTL layout)

Test all text-containing components with German or Finnish strings, which produce the most expansion. If a button label designed for "Submit" (6 characters) breaks at "Bestätigen" (10 characters), the component needs a fix — either wrapping permitted, container resizing, or a max-width that accommodates the longest expected translation.
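
When real translations are not yet available, expansion can be simulated by padding the English string. This is a minimal sketch that uses the upper end of the rates listed above; the EXPANSION table, the function name, and the '~' padding character are all illustrative choices.

```javascript
// Upper-bound expansion rates from the list above, keyed by language code.
const EXPANSION = { de: 0.35, fr: 0.25, fi: 0.5 };

// Pad an English string to the length a translation might reach.
// Unknown locales are returned unchanged.
function pseudoExpand(text, locale) {
  const rate = EXPANSION[locale] ?? 0;
  const targetLength = Math.ceil(text.length * (1 + rate));
  return text.padEnd(targetLength, '~');
}

console.log(pseudoExpand('Submit', 'fi'));
```

Note that the "Submit" to "Bestätigen" example grows from 6 to 10 characters, which is more than any rate in the list. Short labels often expand more than the averages suggest, so padded strings are a first pass and real translations remain the definitive test.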

Right-to-left (RTL) layout

Arabic and Hebrew are written right-to-left. RTL support requires not just flipping text alignment but mirroring the entire layout, icons, and directional elements. From a typography-specific perspective, test:

  1. text-align is not hard-coded to left. Use text-align: start (which adapts to text direction) or logical properties.
  2. Letter-spacing does not apply to Arabic text (Arabic script uses contextual connections between letters; tracking destroys readability).
  3. Line-height is sufficient. Arabic has tall diacritic marks above and below letters that require more vertical space than Latin text.

/* Use logical properties for directional-neutral CSS */
.text-block {
  text-align: start;           /* 'left' in LTR, 'right' in RTL */
  padding-inline-start: 1rem;  /* 'padding-left' in LTR */
  margin-inline-end: 0.5rem;   /* 'margin-right' in LTR */
}

/* Disable letter-spacing for RTL Arabic */
[dir="rtl"] .text-block {
  letter-spacing: normal;
}

Test RTL by adding dir="rtl" to the <html> element and inspecting the layout. For Arabic specifically, use actual Arabic text rather than reversed Latin — the shaping rules (how letters connect to each other based on position) are critical to rendering and cannot be tested with Latin placeholders.

Character set gaps

Test your primary font with characters from each language your product supports. Missing characters produce tofu — the empty rectangle that browsers render when a glyph is not found in the font. Run a character coverage check against your supported language list:

  • Extended Latin: ě ą ő ș ñ ü ø å — most modern system fonts cover these
  • Vietnamese: ắ ậ ồ ứ — specifically check these diacritic stacks
  • Greek: α β γ δ — verify if your font includes Greek
  • Cyrillic: а б в г д — verify if your font includes Cyrillic

A font coverage test can be automated: render a string containing known characters from each script in your target language list, and check that no tofu rectangles appear in the screenshot.
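
Screenshot inspection can be complemented by a check against the font's character map. The sketch below assumes you already have the set of code points the font supports (how that set is extracted from the font file is outside this example); the SAMPLES table and the function name are hypothetical.

```javascript
// Sample characters per script, taken from the checklist above.
const SAMPLES = {
  extendedLatin: 'ěąőșñüøå',
  vietnamese: 'ắậồứ',
  greek: 'αβγδ',
  cyrillic: 'абвгд',
};

// Return, per script, the sample characters the font does not cover.
// supportedCodePoints is a Set of numbers, ASSUMED to come from the
// font's character map.
function findMissingGlyphs(samples, supportedCodePoints) {
  const missing = {};
  for (const [script, text] of Object.entries(samples)) {
    // NFC keeps stacked diacritics as single precomposed code points.
    const gaps = [...text.normalize('NFC')].filter(
      (ch) => !supportedCodePoints.has(ch.codePointAt(0))
    );
    if (gaps.length > 0) missing[script] = gaps;
  }
  return missing;
}

// Example: a made-up font that covers U+0020 to U+024F only.
const latinOnly = new Set();
for (let cp = 0x20; cp <= 0x24f; cp++) latinOnly.add(cp);
console.log(findMissingGlyphs(SAMPLES, latinOnly));
```

A script that appears in the result is a candidate for tofu, so it should either get a fallback font in the stack or be confirmed visually in a screenshot.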


Truncation and Overflow Edge Cases

Text truncation is one of the most common sources of typography bugs. The combination of variable content length, fixed container dimensions, and CSS truncation rules creates many failure modes.

Single-line truncation

.truncate {
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;
}

Test cases for single-line truncation:

  1. Very short text (1–2 characters): should not truncate prematurely
  2. Text that exactly fills the container: no ellipsis should appear
  3. Text one character over the limit: ellipsis appears and hides the overflow character
  4. Very long text (URL, unbroken string): truncates correctly at any position
  5. RTL text: ellipsis should appear on the left side in RTL contexts

Multi-line truncation (line clamp)

.clamp-3-lines {
  display: -webkit-box;
  -webkit-line-clamp: 3;
  -webkit-box-orient: vertical;
  overflow: hidden;
}

Test cases:

  1. Text with exactly the clamped number of lines: no truncation
  2. Text with one more line than clamped: truncation appears on the last visible line
  3. Text with very long words that do not break: may cause horizontal overflow before line clamp triggers
  4. RTL text: line clamp should still function correctly
  5. After font size change (responsive): line count changes, so verify clamping still works

Overflow in flex and grid containers

Text in flex children is a common overflow source. A flex child without min-width: 0 will not shrink below its minimum content size, causing overflow:

/* Problem: flex child does not shrink */
.flex-child {
  flex: 1;
  /* missing min-width: 0 */
}

/* Fix: allow flex child to shrink below content size */
.flex-child {
  flex: 1;
  min-width: 0; /* critical for text truncation in flex containers */
  overflow: hidden;
  text-overflow: ellipsis;
  white-space: nowrap;
}

Test all components where text lives inside a flex container by feeding them progressively longer text until overflow behavior can be observed.
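
Generating the progressively longer inputs is easy to script. This is a small sketch with hypothetical helper names; the resulting strings would be passed to whatever story or fixture renders the component.

```javascript
// Build test strings of increasing length by repeating a seed phrase.
// Step 1 is the seed itself, step 2 is the seed twice, and so on.
function progressiveStrings(seed, steps) {
  return Array.from({ length: steps }, (_, i) =>
    Array(i + 1).fill(seed).join(' ')
  );
}

// A string with no spaces exercises the case where the text has no
// natural wrap opportunity (long URLs, identifiers, hashes).
function unbrokenString(length) {
  return 'x'.repeat(length);
}

for (const label of progressiveStrings('Quarterly report', 4)) {
  console.log(label.length, label);
}
console.log(unbrokenString(40));
```

Including both kinds of input matters: repeated phrases show where wrapping or truncation begins, while the unbroken string shows whether the flex child can actually shrink.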


Visual Regression Testing for Typography

Manual inspection catches many typography bugs, but it does not scale across a large component library. Visual regression testing automates the comparison between known-good screenshots and current renders.

Tools

Chromatic integrates with Storybook and captures pixel-accurate screenshots for every component story. When a token changes — say, --font-size-body shifts from 1rem to 0.9375rem — Chromatic shows a diff for every story that renders body text. The team can approve expected changes and flag unexpected ones.

Percy offers similar capabilities and integrates with a broader range of testing frameworks.

BackstopJS is an open-source alternative that captures and compares screenshots of web pages without requiring Storybook.

Setting up typography-specific stories

For visual regression testing of the type scale itself, create a dedicated "Typography" story that renders every type style in your system:

// Typography.stories.js
export const TypeScale = () => `
  <div class="type-specimen">
    <p class="display">Display — The quick brown fox</p>
    <p class="heading-lg">Heading Large — The quick brown fox</p>
    <p class="heading-md">Heading Medium — The quick brown fox</p>
    <p class="heading-sm">Heading Small — The quick brown fox</p>
    <p class="body-lg">Body Large — The quick brown fox jumps over the lazy dog.</p>
    <p class="body">Body — The quick brown fox jumps over the lazy dog.</p>
    <p class="caption">Caption — The quick brown fox</p>
    <p class="label">Label — The quick brown fox</p>
  </div>
`;

This story becomes the canary for type scale changes. Any token update that changes font sizes, weights, or line-heights will show as a diff in this story before it propagates to individual component stories.

Font loading in visual regression tests

Visual regression tools must wait for custom fonts to load before capturing screenshots, or tests will be inconsistent. In Chromatic, fonts served from Google Fonts or from your own host may not have finished loading when the snapshot is taken. The most reliable approach: include fonts directly in the Storybook configuration as preloaded assets and set font-display: block for test environments to ensure the font is always present when the screenshot is taken.
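
Independently of the tool, the page itself can confirm that a font has loaded by using the CSS Font Loading API. The following is a minimal sketch; waitForFont is a hypothetical helper, and it takes the font set as a parameter (normally document.fonts) so that it can be exercised outside a browser.

```javascript
// Resolve once the requested font has loaded and the font set is ready.
// fontSet is a FontFaceSet such as document.fonts; fontSpec is a CSS font
// shorthand such as '16px Inter'.
async function waitForFont(fontSet, fontSpec) {
  // load() resolves with the FontFace objects that match fontSpec.
  const faces = await fontSet.load(fontSpec);
  await fontSet.ready;
  if (faces.length === 0) {
    // No matching @font-face rule: a fallback font would be captured.
    throw new Error(`No loaded face matches "${fontSpec}"`);
  }
  return faces.length;
}
```

In a browser this would be called before the capture, for example `await waitForFont(document.fonts, '16px Inter')`. Where exactly that call belongs depends on your visual regression tool, so treat the placement as something to verify against its documentation.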

What to snapshot

Beyond the type scale story, configure visual regression snapshots for:

  • All typographic component variants (button in all sizes, input in all states, badge variants)
  • Long text edge cases (truncated card titles, multi-line badges)
  • RTL variants of components containing text
  • Responsive viewports at each defined breakpoint (375px, 768px, 1280px minimum)
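
The combinations above multiply quickly, so it helps to generate the snapshot list rather than maintain it by hand. This is a sketch with hypothetical story IDs and a made-up naming scheme; it produces plain objects and does not call any particular tool's API.

```javascript
// Minimum breakpoints and text directions named in the list above.
const VIEWPORTS = [375, 768, 1280];
const DIRECTIONS = ['ltr', 'rtl'];

// Expand story IDs into one snapshot entry per viewport and direction.
function snapshotMatrix(storyIds, viewports = VIEWPORTS, directions = DIRECTIONS) {
  return storyIds.flatMap((storyId) =>
    viewports.flatMap((width) =>
      directions.map((dir) => ({
        storyId,
        width,
        dir,
        name: `${storyId}--${width}--${dir}`,
      }))
    )
  );
}

// Example with two hypothetical story IDs.
const matrix = snapshotMatrix(['button-sizes', 'card-title-long']);
console.log(matrix.length, matrix[0].name);
```

Each entry can then be mapped onto whatever configuration your visual regression tool expects for viewport width and document direction.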

Typography QA is ongoing work, not a one-time setup. As the design system grows and tokens evolve, the regression tests catch what manual review misses — and over time they build a documented record of every typographic change the system has made.
