Font Subsetting: Cut Font File Sizes by 90%

A full-featured font file contains glyphs for every character the type designer included: Latin letters, Greek and Cyrillic alphabets, mathematical symbols, currency signs, ligatures, historical letterforms, and often thousands of glyphs that will never appear in your content. When you self-host the full Inter release without any subsetting, every visitor downloads support for characters they will never see.

Font subsetting is the process of stripping out all the glyphs you don't need, keeping only the characters your content actually uses. Done well, it can shrink a 300KB font to roughly 20KB, a reduction of more than 90% that has an outsized impact on page load performance.

Why Font Files Are So Large

To understand subsetting, you first need to understand why modern font files are so heavy.

A professionally designed typeface like Inter contains approximately 2,500 glyphs in its full release. That includes:

Basic Latin (A–Z, a–z, digits, punctuation) — roughly 95 glyphs
Extended Latin for European languages — several hundred additional glyphs
Cyrillic characters — 200+ glyphs
Greek characters — 70+ glyphs
Currency symbols from around the world
Mathematical and technical symbols
Arrows and dingbats
Ligatures (fi, fl, ffi, etc.)
Discretionary alternates and stylistic sets
OpenType feature glyphs for swashes, small caps, and more

An English-language website typically needs none of the Cyrillic or Greek glyphs and few, if any, of the mathematical or extended Latin ones. Yet without subsetting, you download all of them.

Variable fonts compound the problem. A variable font encodes the entire design space — every point along every axis — into a single file. The full Inter variable font WOFF2 file is around 330KB. The equivalent subsetted Latin-only version is about 75KB. A further subsetting pass targeting only the specific characters used on your site can bring this below 20KB.
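To make the arithmetic behind these figures explicit, here is a minimal Python sketch. The function name reduction_percent is hypothetical, and the sizes are the approximate numbers quoted above rather than fresh measurements.

```python
def reduction_percent(original_kb: float, subset_kb: float) -> float:
    """Return the size reduction as a percentage of the original size."""
    return (1 - subset_kb / original_kb) * 100


# Approximate figures quoted above for the Inter variable font (WOFF2).
full_kb, latin_kb, custom_kb = 330, 75, 20

print(f"Latin subset:  {reduction_percent(full_kb, latin_kb):.1f}% smaller")
print(f"Custom subset: {reduction_percent(full_kb, custom_kb):.1f}% smaller")
```

With those inputs the Latin subset works out to roughly a 77% reduction and the content-specific subset to roughly 94%.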

Glyph Count vs. File Size

The relationship between glyph count and file size is roughly linear for static fonts: fewer glyphs means proportionally smaller files. For variable fonts, it's somewhat different — the axis data takes a fixed overhead regardless of glyph count — but subsetting still produces dramatic reductions.

The OpenType specification allows fonts to contain up to 65,535 glyphs. Most professional typefaces use a small fraction of this capacity, but even 2,000–3,000 glyphs represents a significant payload when you only need 200.

unicode-range: Browser-Level Subsetting

The unicode-range descriptor in @font-face declarations is a built-in CSS mechanism for subsetting at the browser level. It tells the browser which Unicode code points a particular font file covers, allowing the browser to download the file only when it encounters matching characters in the page content.

@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-latin.woff2') format('woff2');
  font-weight: 400;
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC,
                 U+02C6, U+02DA, U+02DC, U+2000-206F, U+2074,
                 U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215,
                 U+FEFF, U+FFFD;
}

@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-latin-ext.woff2') format('woff2');
  font-weight: 400;
  unicode-range: U+0100-024F, U+0259, U+1E00-1EFF, U+2020,
                 U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F,
                 U+A720-A7FF;
}

@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-cyrillic.woff2') format('woff2');
  font-weight: 400;
  unicode-range: U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
}

With this setup, a browser rendering an English-only page downloads only inter-latin.woff2. If a Russian word appears in the content, the browser also downloads inter-cyrillic.woff2. The Cyrillic file is never requested for an English-only page.
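The selection logic can be approximated in a few lines of Python. This is only a rough model of what a browser does (real font matching also considers weight, style, and whether the text actually uses the family), and the ranges below are an abbreviated copy of the three rules above. The names FONT_FILES and files_needed are hypothetical.

```python
# Abbreviated unicode-range data from the three @font-face rules above.
# Each entry is an inclusive (start, end) pair of code points.
FONT_FILES = {
    "inter-latin.woff2": [
        (0x0000, 0x00FF), (0x0131, 0x0131), (0x0152, 0x0153),
        (0x2000, 0x206F), (0x20AC, 0x20AC),
    ],
    "inter-latin-ext.woff2": [(0x0100, 0x024F), (0x1E00, 0x1EFF)],
    "inter-cyrillic.woff2": [
        (0x0400, 0x045F), (0x0490, 0x0491), (0x04B0, 0x04B1), (0x2116, 0x2116),
    ],
}


def files_needed(text: str) -> list[str]:
    """Return the font files whose ranges match at least one character in text."""
    needed = []
    for name, ranges in FONT_FILES.items():
        if any(lo <= ord(ch) <= hi for ch in text for lo, hi in ranges):
            needed.append(name)
    return needed


print(files_needed("Hello, world"))
print(files_needed("Hello, мир"))
```

The first call reports only the Latin file; the second adds the Cyrillic file because the Russian word falls inside U+0400-045F.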

This is the same mechanism Google Fonts uses to serve Inter, Roboto, and its other hosted typefaces. A single fonts.googleapis.com CSS URL returns multiple @font-face declarations with unicode-range descriptors, and the browser downloads only what it needs.

Reading Unicode Range Values

The U+ prefix denotes a Unicode code point in hexadecimal. Ranges use a hyphen: U+0000-00FF covers the first 256 Unicode code points, which are the Basic Latin and Latin-1 Supplement blocks (the core characters for English and Western European languages). Wildcards are also valid: each ? stands for any hexadecimal digit, so U+26?? is equivalent to U+2600-26FF, the Miscellaneous Symbols block.
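As an illustration of the three token forms (single code point, hyphenated range, and wildcard), the following Python sketch parses a unicode-range value into numeric pairs. The function parse_unicode_range is a hypothetical helper written for this article, not part of any library, and it does only light validation.

```python
def parse_unicode_range(value: str) -> list[tuple[int, int]]:
    """Parse a CSS unicode-range value into inclusive (start, end) pairs."""
    ranges = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        if not token.upper().startswith("U+"):
            raise ValueError(f"not a unicode-range token: {token!r}")
        body = token[2:]
        if "-" in body:
            # Explicit range such as U+0000-00FF.
            start_hex, end_hex = body.split("-", 1)
            start, end = int(start_hex, 16), int(end_hex, 16)
        elif "?" in body:
            # Each '?' matches any hex digit: lowest is 0, highest is F.
            start = int(body.replace("?", "0"), 16)
            end = int(body.replace("?", "F"), 16)
        else:
            # Single code point such as U+0131.
            start = end = int(body, 16)
        ranges.append((start, end))
    return ranges


for start, end in parse_unicode_range("U+0000-00FF, U+0131, U+26??"):
    print(f"U+{start:04X} to U+{end:04X} ({end - start + 1} code points)")
```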

Limitations

unicode-range is conditional loading, not file-level subsetting. The font file itself still contains all its glyphs — the browser simply decides whether to download it based on what characters appear in the DOM. For true size reduction, you need to physically remove glyphs from the font file, which requires manual subsetting tools.

Manual Subsetting with pyftsubset

pyftsubset is a command-line tool from the fonttools Python library. It physically removes glyphs from a font file, producing a smaller output that contains only the characters you specify. This is the most powerful subsetting approach available.

Installation

pip install fonttools brotli
# brotli enables WOFF2 output

Basic Usage

Subset to the Latin range (Basic Latin, Latin-1 Supplement, and common punctuation):

pyftsubset Inter-Regular.ttf \
  --output-file=inter-regular-latin.woff2 \
  --flavor=woff2 \
  --unicodes="U+0000-00FF,U+0131,U+0152-0153,U+02BB-02BC,U+02C6,U+02DA,U+02DC,U+2000-206F,U+20AC,U+2122,U+FEFF,U+FFFD"

Subset to exactly the characters used in your content:

pyftsubset Inter-Regular.ttf \
  --output-file=inter-regular-custom.woff2 \
  --flavor=woff2 \
  --text-file=all-page-text.txt

The --text-file option accepts a plain text file containing every character that appears anywhere in your site. pyftsubset extracts only the glyphs needed to render those specific characters.
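One way to produce such a file is to collect the visible text from your built HTML pages. The sketch below uses only the Python standard library. The class and function names are hypothetical, and it assumes that visible text lives in element content, so characters that appear only in attributes or CSS-generated content would need to be added separately.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect characters from text nodes, skipping <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &eacute; for us.
        super().__init__(convert_charrefs=True)
        self.chars = set()
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chars.update(data)


def unique_chars(html_documents):
    """Return the sorted unique characters found in an iterable of HTML strings."""
    chars = set()
    for document in html_documents:
        collector = TextCollector()
        collector.feed(document)
        collector.close()  # flush any buffered trailing text
        chars |= collector.chars
    return "".join(sorted(chars))


# Usage sketch (the dist/ path is an assumption about your build output):
# from pathlib import Path
# pages = (p.read_text(encoding="utf-8") for p in Path("dist").rglob("*.html"))
# Path("all-page-text.txt").write_text(unique_chars(pages), encoding="utf-8")
```

Parsing the HTML, rather than stripping tags with a regular expression, keeps script and style source out of the character set and turns entities into the characters they stand for.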

OpenType Feature Flags

By default, pyftsubset keeps a built-in set of common layout features (including kern, liga, and calt), drops the rest, and prunes GSUB and GPOS lookups that no longer apply to the retained glyphs. You can set the feature list explicitly:

pyftsubset Inter-Regular.ttf \
  --output-file=inter-regular-latin.woff2 \
  --flavor=woff2 \
  --layout-features="kern,liga,calt,rlig" \
  --unicodes="U+0000-00FF"

kern is kerning pairs (important for quality typography), liga is standard ligatures (fi, fl), calt is contextual alternates, and rlig is required ligatures. For most web use cases, including kern and liga while dropping decorative OpenType features is the right balance between quality and file size.
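If you drive pyftsubset from a script, assembling the arguments as a list avoids shell-quoting mistakes. The sketch below is a hypothetical helper that uses only the flags already shown in this article; it builds and prints the command but does not run it.

```python
import shlex


def pyftsubset_command(font_path, output_path, unicodes, layout_features):
    """Build a pyftsubset argument list from the flags shown in this article."""
    return [
        "pyftsubset",
        font_path,
        f"--output-file={output_path}",
        "--flavor=woff2",
        f"--layout-features={','.join(layout_features)}",
        f"--unicodes={','.join(unicodes)}",
    ]


cmd = pyftsubset_command(
    "Inter-Regular.ttf",
    "inter-regular-latin.woff2",
    unicodes=["U+0000-00FF"],
    layout_features=["kern", "liga"],  # quality features only, no decorative ones
)
print(shlex.join(cmd))  # a copy-pasteable shell line

# To execute it (requires fonttools to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```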

Subsetting Variable Fonts

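Variable fonts are subset the same way, usually with a few extra flags. Quote the file name so the shell does not treat the square brackets as a glob pattern:

pyftsubset "Inter[wght].ttf" \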
  --output-file=inter-variable-latin.woff2 \
  --flavor=woff2 \
  --layout-features="kern,liga,calt" \
  --unicodes="U+0000-00FF,U+0131,U+0152-0153" \
  --no-hinting \
  --desubroutinize

--no-hinting removes hinting data, which matters little at modern screen resolutions but can add significant file size. --desubroutinize expands CFF subroutines into plain outlines. It does not change how glyphs render, and it can sometimes make the compressed WOFF2 output smaller, but it only applies to fonts with CFF (PostScript) outlines; for a TrueType-outline .ttf file it has no effect.

Google Fonts' Automatic Subsetting

Google Fonts handles subsetting automatically based on the subset parameter and unicode-range CSS. When you request a font via the Google Fonts API, the returned CSS contains multiple @font-face declarations, typically one per script subset (latin, latin-ext, cyrillic, and so on), each pointing to a pre-generated subset file.

<!-- This request returns subsetted fonts automatically -->
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;700&display=swap" rel="stylesheet">

The returned CSS looks something like:

/* cyrillic-ext */
@font-face {
  font-family: 'Inter';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url(https://fonts.gstatic.com/s/inter/v18/UcCO3FwrK3iLTeHuS_...cyrillic-ext.woff2) format('woff2');
  unicode-range: U+0460-052F, U+1C80-1C88, ...;
}

/* latin */
@font-face {
  font-family: 'Inter';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url(https://fonts.gstatic.com/s/inter/v18/UcCO3FwrK3iLTeHuS_...latin.woff2) format('woff2');
  unicode-range: U+0000-00FF, U+0131, ...;
}

An English-only page downloads only the latin.woff2 file — typically 15–25KB per weight. The Cyrillic and extended Latin files are never fetched.

The `text` Parameter for Extreme Optimization

Google Fonts supports a text parameter that subsets the font to exactly the characters you specify — ideal for display text that uses only a few glyphs:

<!-- Subset to just the characters in "Hello World" -->
<link href="https://fonts.googleapis.com/css2?family=Playfair+Display&text=HeloWrd" rel="stylesheet">

This produces a font file containing only the glyphs H, e, l, o, W, r, d — a file that might be 3–5KB instead of 30KB. It's a powerful optimization for heading-only fonts where the character set is predictable and small.

The limitation: the font is served with Cache-Control: max-age=31536000 but bound to the exact character set. If your headings ever use a character not in the text parameter, that character will fall back to the system font.
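Because the text value has to list every character the heading uses, it is safer to generate the URL from the heading itself than to type it by hand. The following Python sketch does that; google_fonts_text_url is a hypothetical helper, and dropping whitespace simply mirrors the HeloWrd example above rather than any documented API rule.

```python
from urllib.parse import quote, quote_plus


def google_fonts_text_url(family: str, text: str) -> str:
    """Build a css2 URL whose text parameter lists each needed character once."""
    # Deduplicate while keeping first-seen order. Whitespace is dropped to
    # mirror the "HeloWrd" example (an assumption, not an API requirement).
    unique = "".join(dict.fromkeys(ch for ch in text if not ch.isspace()))
    return (
        "https://fonts.googleapis.com/css2"
        f"?family={quote_plus(family)}&text={quote(unique, safe='')}"
    )


print(google_fonts_text_url("Playfair Display", "Hello World"))
```

Non-ASCII characters are percent-encoded as UTF-8, so a heading such as "Café" becomes text=Caf%C3%A9.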

Subsetting Strategies by Use Case

Different types of websites have different optimal subsetting approaches.

English-Only Marketing Sites

Use Google Fonts with the default unicode-range subsetting, or self-host fonts subsetted to Basic Latin + Latin-1 Supplement. The target is a single WOFF2 file per weight under 20KB.

pyftsubset Inter-Regular.ttf \
  --output-file=inter-400-latin.woff2 \
  --flavor=woff2 \
  --layout-features="kern,liga" \
  --unicodes="U+0020-007E,U+00A0-00FF,U+0131,U+0152-0153,U+02BB-02BC,U+20AC,U+2122,U+2014,U+2013,U+201C,U+201D,U+2018,U+2019"

This covers standard English text including smart quotes, em/en dashes, the euro sign, and the trademark symbol — essentially everything a marketing copywriter will put in content.

Multilingual Applications

Use unicode-range splitting with separate font files per script. Load the base Latin file eagerly; load other script files conditionally based on unicode-range. This ensures that a Japanese user's browser downloads the CJK font file, while an English user's browser never requests it.

@font-face {
  font-family: 'Noto Sans';
  src: url('/fonts/noto-sans-latin.woff2') format('woff2');
  font-weight: 400;
  unicode-range: U+0000-00FF;
}

@font-face {
  font-family: 'Noto Sans';
  src: url('/fonts/noto-sans-cjk.woff2') format('woff2');
  font-weight: 400;
  unicode-range: U+4E00-9FFF, U+3400-4DBF, U+20000-2A6DF;
}

Display Headings with Decorative Fonts

When using an expressive display face for headings only, subset aggressively using the text parameter or by generating a font file containing only the characters that appear in your actual headings.

# Extract unique characters from your heading content
CHARS=$(echo "The Quick Brown Fox Jumps Over The Lazy Dog" | \
  python3 -c "import sys; print(''.join(sorted(set(sys.stdin.read().strip()))))")

# Subset to exactly those characters
pyftsubset PlayfairDisplay-Bold.ttf \
  --output-file=playfair-headings.woff2 \
  --flavor=woff2 \
  --text="$CHARS"

E-commerce Product Pages

Product pages often include user-generated content (product names, reviews, descriptions) that may contain characters you can't fully predict. A safe default for European markets is Basic Latin through Latin Extended-B (U+0000-024F), plus Latin Extended Additional (U+1E00-1EFF) and common punctuation:

pyftsubset Roboto-Regular.ttf \
  --output-file=roboto-400-latin-ext.woff2 \
  --flavor=woff2 \
  --layout-features="kern,liga,calt" \
  --unicodes="U+0000-024F,U+0259,U+1E00-1EFF,U+20AC,U+2122,U+2014,U+2013,U+201C-201D,U+2018-2019"

This covers English, French, German, Spanish, Portuguese, Italian, Polish, and most other European languages written in the Latin script, which makes it a reasonable default for international e-commerce without the weight of full multilingual coverage.
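Since user-generated content is unpredictable, it can help to check incoming text against the subset and log anything that would fall back to another font. Here is a minimal Python sketch, assuming the ranges from the --unicodes value in the command above; SUBSET_RANGES and uncovered_chars are hypothetical names.

```python
# Inclusive code point ranges taken from the --unicodes value above.
SUBSET_RANGES = [
    (0x0000, 0x024F), (0x0259, 0x0259), (0x1E00, 0x1EFF),
    (0x20AC, 0x20AC), (0x2122, 0x2122), (0x2014, 0x2014),
    (0x2013, 0x2013), (0x201C, 0x201D), (0x2018, 0x2019),
]


def uncovered_chars(text: str, ranges=SUBSET_RANGES) -> list[str]:
    """Return the sorted unique characters in text that fall outside ranges."""
    return sorted(
        {ch for ch in text if not any(lo <= ord(ch) <= hi for lo, hi in ranges)}
    )


print(uncovered_chars("Łódź crème brûlée"))
print(uncovered_chars("naïve 日本"))
```

The first review is fully covered, so the list is empty; the second reports the two CJK characters, which would render in a fallback font.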

Automating Subsetting in Your Build Pipeline

Manual subsetting works for static sites, but dynamic applications benefit from automated subsetting that runs during the build process.

A simple shell script can extract unique characters from rendered HTML and generate tailored font subsets:

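# 1. Crawl your site and save every page locally
#    (--adjust-extension appends .html to pages whose URLs lack it,
#    so the grep in step 2 can find them)
wget --recursive --level=3 --quiet --output-file=/dev/null \
     --adjust-extension --execute robots=off https://yoursite.com \
     --directory-prefix=./crawl

# 2. Extract unique characters (markup is included, so the set is
#    slightly larger than the visible text)
grep -r --include="*.html" -oh "." ./crawl | sort -u | tr -d '\n' > chars.txt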

# 3. Generate subset
pyftsubset Inter-Regular.ttf \
  --output-file=inter-400-custom.woff2 \
  --flavor=woff2 \
  --text-file=chars.txt \
  --layout-features="kern,liga"

# 4. Check file size
ls -lh inter-400-custom.woff2

This approach produces the smallest possible font file — one containing only the glyphs that actually appear in your content. For a typical English marketing site, the result is often under 15KB per font file, compared to 50–75KB for a standard Latin subset and 300KB+ for a full font file.
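To keep these sizes from creeping back up as content changes, the final size check can be turned into an automated budget. The Python sketch below is one possible approach: over_budget is a hypothetical helper, and the 20KB default is taken from the per-file target mentioned earlier in this article.

```python
import os


def over_budget(paths, max_bytes=20 * 1024):
    """Return (path, size) pairs for font files larger than max_bytes."""
    sizes = ((path, os.path.getsize(path)) for path in paths)
    return [(path, size) for path, size in sizes if size > max_bytes]


# Usage sketch for a CI step (the file name is the one generated above):
# offenders = over_budget(["inter-400-custom.woff2"])
# for path, size in offenders:
#     print(f"{path} is {size} bytes, over the budget")
# raise SystemExit(1 if offenders else 0)
```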

Subsetting is often the highest-impact single optimization available for web font performance. Combined with WOFF2 format and proper font-display settings, it makes high-quality typography genuinely compatible with fast page loads.

Validating Your Subsets

After generating a subset font, verify that it actually contains the characters you need. When a subsetted font is missing a character, the browser renders that character in the next font in the stack (or as an empty box if nothing covers it), producing mismatched letters that can be extremely hard to debug in production.

Using fonttools to Inspect Glyphs

python3 -c "
from fontTools.ttLib import TTFont
font = TTFont('inter-400-latin.woff2')
cmap = font.getBestCmap()
# Check if specific characters are present
test_chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
missing = [c for c in test_chars if ord(c) not in cmap]
print(f'Missing characters: {missing if missing else \"None\"}')
print(f'Total glyphs: {len(font.getGlyphOrder())}')
"

Browser Rendering Test

Create a test HTML page that renders every character in your expected character set using the subsetted font. View it in Chrome DevTools with the Network tab open to confirm only the expected font file is requested:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <style>
    @font-face {
      font-family: 'Inter-Subset';
      src: url('/fonts/inter-400-latin.woff2') format('woff2');
      font-display: block;
    }
    body { font-family: 'Inter-Subset', monospace; font-size: 24px; }
  </style>
</head>
<body>
  <p>AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz</p>
  <p>0123456789!@#$%^&*().,;:'"/?-_+=</p>
  <p>€£¥©®™—–“”‘’</p>
</body>
</html>

Inspect the rendered output for any characters rendering in a fallback font (they'll look visually different from the surrounding Inter text). Address missing characters by adding their Unicode code points to your subset and regenerating.
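Typing out every test character by hand is error-prone, so the paragraphs for the test page can be generated from the same code point ranges you passed to the subsetter. A minimal Python sketch follows, using only the standard library; printable_chars and sample_paragraph are hypothetical names.

```python
import html
import unicodedata


def printable_chars(ranges):
    """Return every assigned, non-control character in the inclusive ranges."""
    chars = []
    for start, end in ranges:
        for code_point in range(start, end + 1):
            ch = chr(code_point)
            # Categories starting with "C" are control, format, surrogate,
            # private-use, and unassigned code points; skip them all.
            if not unicodedata.category(ch).startswith("C"):
                chars.append(ch)
    return "".join(chars)


def sample_paragraph(ranges):
    """Return an HTML <p> element containing the characters, safely escaped."""
    return f"<p>{html.escape(printable_chars(ranges))}</p>"


# Basic Latin printable characters and the Latin-1 Supplement.
print(sample_paragraph([(0x20, 0x7E)]))
print(sample_paragraph([(0xA0, 0xFF)]))
```

Escaping matters here because the Basic Latin range includes &, <, and >, which would otherwise be parsed as markup.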

Subsetting in JavaScript Toolchains

For teams working in JavaScript/TypeScript environments, subsetting can be integrated directly into the build pipeline without requiring Python.

Using `subset-font` npm package

npm install subset-font

// subset-fonts.mjs
import subsetFont from 'subset-font'; // the package exports the function as its default export
import { readFileSync, writeFileSync } from 'fs';

const fontBuffer = readFileSync('./fonts/Inter-Regular.ttf');

// Subset to a specific text string
const subsetBuffer = await subsetFont(fontBuffer, 'AaBbCcDdEe', {
  targetFormat: 'woff2',
});

writeFileSync('./fonts/inter-subset.woff2', subsetBuffer);
console.log(`Original: ${fontBuffer.length} bytes`);
console.log(`Subset: ${subsetBuffer.length} bytes`);
console.log(`Reduction: ${((1 - subsetBuffer.length / fontBuffer.length) * 100).toFixed(1)}%`);

Vite Plugin Integration

For Vite-based projects, font subsetting can run as a build hook:

// vite.config.js
import subsetFont from 'subset-font';
import { readFileSync, writeFileSync } from 'fs';
import { glob } from 'glob';

function fontSubsetPlugin() {
  return {
    name: 'font-subset',
    // closeBundle runs after the output files have been written to dist/
    async closeBundle() {
      // Collect all text content from built HTML files
      const htmlFiles = await glob('dist/**/*.html');
      let allText = '';
      for (const file of htmlFiles) {
        const content = readFileSync(file, 'utf-8');
        // Strip HTML tags, keep text content
        allText += content.replace(/<[^>]+>/g, '');
      }
      const uniqueChars = [...new Set(allText)].join('');

      // Subset each font
      const fontFiles = await glob('dist/fonts/*.woff2');
      for (const fontFile of fontFiles) {
        const buffer = readFileSync(fontFile);
        const subsetted = await subsetFont(buffer, uniqueChars, {
          targetFormat: 'woff2'
        });
        writeFileSync(fontFile, subsetted);
        console.log(`Subsetted ${fontFile}: ${buffer.length} → ${subsetted.length} bytes`);
      }
    }
  };
}

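This approach tailors the fonts to each build, automatically adapting as content changes. Register fontSubsetPlugin() in the plugins array of your Vite config to enable it. The font files in the production build then contain only the glyphs for characters found in the HTML output, with no manual character enumeration. Note that the tag-stripping regular expression also keeps inline script and style text and does not decode HTML entities, so characters written as entities need to be added to the set by other means.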
