Technical SEO Engineering English: Structured Data, Core Web Vitals, and Crawl Budget Vocabulary

Learn the precise English vocabulary engineers use when discussing technical SEO: structured data, Core Web Vitals, crawl budget, and Search Console findings.

Technical SEO sits at the intersection of engineering and search marketing. If you work on a web platform, you will eventually sit in a meeting where someone says “our LCP is killing us” or “we need to fix canonicalization before the migration.” This post builds the English vocabulary you need to participate confidently in those conversations.

Structured Data Terms

Structured data — machine-readable markup added to a webpage to help search engines understand its content. The most common format is JSON-LD, following schema.org vocabulary.

“The engineering team added structured data to all product pages last sprint — we should start seeing rich results in Search Console within a few weeks.”

JSON-LD (JavaScript Object Notation for Linked Data) — the Google-recommended format for embedding structured data in a <script> tag, separate from the visible HTML.

“We chose JSON-LD over microdata because it’s easier to maintain — you don’t have to touch the HTML markup itself.”
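To make the maintenance point concrete, here is a minimal sketch that builds JSON-LD from a plain dictionary and wraps it in a script tag. The helper name `to_jsonld_script` and the product values are invented for illustration; only the `@context` and `@type` keys and the `application/ld+json` type come from the JSON-LD and schema.org conventions.

```python
import json


def to_jsonld_script(data: dict) -> str:
    """Serialize a dict as a JSON-LD <script> block (hypothetical helper)."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Illustrative values only; not taken from a real catalog.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
    },
}

print(to_jsonld_script(product))
```

Because the markup lives in one generated block, a template can emit it without changing any of the visible HTML around it.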

Rich results — enhanced search result listings that Google may show when a page has valid structured data. Valid markup makes a page eligible for a rich result but does not guarantee one. Examples include star ratings, FAQ accordions, and event dates.

“After implementing FAQPage schema, our click-through rate increased because our listing now shows an FAQ accordion in the SERP.”

Schema.org — a collaborative vocabulary created by Google, Microsoft, Yahoo, and Yandex that defines the types and properties used in structured data.

“Before adding markup, check schema.org to make sure you’re using the correct property names — ‘price’ and ‘offers’ are not interchangeable.”
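The ‘price’ versus ‘offers’ mix-up can be caught with a small check. In schema.org, `offers` is a property of Product and `price` is a property of the nested Offer. The sketch below assumes markup has already been parsed into a dictionary; the function names are made up for this example.

```python
def misplaced_price(product: dict) -> bool:
    """Return True if 'price' sits directly on a Product instead of an Offer."""
    return product.get("@type") == "Product" and "price" in product


def offer_prices(product: dict) -> list[str]:
    """Collect prices from 'offers', which may be one Offer or a list of them."""
    offers = product.get("offers", [])
    if isinstance(offers, dict):
        offers = [offers]
    return [offer["price"] for offer in offers if "price" in offer]


# Illustrative markup: the first puts the price in the wrong place.
wrong = {"@type": "Product", "name": "Example Widget", "price": "19.99"}
right = {
    "@type": "Product",
    "name": "Example Widget",
    "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"},
}

print(misplaced_price(wrong), offer_prices(wrong))  # True []
print(misplaced_price(right), offer_prices(right))  # False ['19.99']
```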

Core Web Vitals Terms

Core Web Vitals — a set of three user-experience metrics that Google uses as ranking signals: LCP, CLS, and INP.

“Our Core Web Vitals are failing in field data even though our lab scores look fine — the real-user network conditions are much slower.”

LCP (Largest Contentful Paint) — measures how long the largest element visible in the viewport (usually an image or a block of text) takes to render. A good LCP is 2.5 seconds or less.

“The hero image was our LCP element. Switching to a CDN-served WebP with preload dropped it from 4.1s to 1.8s.”

CLS (Cumulative Layout Shift) — measures how much page elements shift unexpectedly while the page is loading and in use. A score of 0.1 or less is good; a score above 0.25 is considered poor.

“The cookie banner was injecting into the DOM without reserved space, causing a massive CLS score. We fixed it with a min-height placeholder.”

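INP (Interaction to Next Paint) — measures how quickly the page responds visually to user interactions. It observes the latency of every click, tap, and key press during a visit and reports one of the slowest. A good INP is 200 milliseconds or less. It replaced FID as a Core Web Vital in March 2024.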

“Our INP is failing because the filter component runs a synchronous loop on click — we need to move that to a web worker.”
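When someone says a metric is "passing" or "failing", they usually mean where it falls against Google's published thresholds, assessed at the 75th percentile of real-user visits. A minimal sketch of that classification follows; the `rate` function and the `THRESHOLDS` table are names chosen for this example, not part of any official tool.

```python
# (upper bound for "good", lower bound for "poor") per metric,
# using the thresholds Google publishes for Core Web Vitals.
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "CLS": (0.1, 0.25),  # unitless score
    "INP": (200, 500),   # milliseconds
}


def rate(metric: str, value: float) -> str:
    """Classify a 75th-percentile value as good, needs improvement, or poor."""
    good_max, poor_min = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value > poor_min:
        return "poor"
    return "needs improvement"


print(rate("LCP", 4.1))   # poor
print(rate("LCP", 1.8))   # good
print(rate("CLS", 0.18))  # needs improvement
```

The middle band matters in conversation: a CLS of 0.18 is not "good", but it is also not yet "poor".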

Crawling and Indexing Terms

Crawl budget — the number of URLs Googlebot will crawl on a site within a given timeframe. Large sites need to manage this carefully to ensure important pages are discovered.

“After removing 50,000 thin pages, our crawl budget improved — Googlebot now reaches our deep category pages much faster.”
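One common way to see where crawl budget goes is to count Googlebot requests per site section in the server logs. The sketch below assumes logs in the combined log format and groups by the first path segment; the regular expression, function name, and sample lines are illustrative. Note that user-agent strings can be spoofed, so a real audit would also verify the requester.

```python
import re
from collections import Counter

# Pulls the request path and user agent out of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits_by_section(lines):
    """Count requests whose user agent mentions Googlebot, per top-level section."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match["agent"]:
            continue
        path = match["path"].split("?", 1)[0]
        section = "/" + path.strip("/").split("/", 1)[0]
        counts[section] += 1
    return counts


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Made-up sample lines for illustration.
sample = [
    f'66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /products/shoes?color=red HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2024:13:55:40 +0000] "GET /products/shoes?color=blue HTTP/1.1" 200 5098 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2024:13:56:02 +0000] "GET /category/boots HTTP/1.1" 200 7300 "-" "{GOOGLEBOT}"',
    '203.0.113.9 - - [10/Oct/2024:13:56:10 +0000] "GET /products/shoes HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

print(googlebot_hits_by_section(sample))
```

If most hits land on parameterized URLs like the first two, that is the kind of evidence behind a sentence such as "faceted navigation is exhausting our crawl budget."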

Canonicalization — the process of specifying the preferred URL when duplicate or near-duplicate content exists at multiple URLs. It is usually signaled with a rel="canonical" link element in the HTML or a Link HTTP header.

“We have the same product available at three URL patterns. Without proper canonicalization, we’re splitting our ranking signals across all three.”
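As a rough sketch of what "proper canonicalization" can look like in code, the example below maps several URL variants to one preferred form and emits the matching link element. The normalization rules (force https, lowercase the host, drop trailing slashes and tracking parameters) and the parameter list are assumptions for illustration; each site has to decide its own rules.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change the page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}


def canonical_url(url: str) -> str:
    """Normalize a URL variant to one preferred form (assumed site rules)."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    path = parts.path.rstrip("/") or "/"
    # Fragment is dropped; remaining parameters are sorted for a stable result.
    return urlunsplit(("https", parts.netloc.lower(), path, urlencode(sorted(query)), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


variants = [
    "http://Shop.example.com/products/widget/",
    "https://shop.example.com/products/widget?utm_source=newsletter",
    "https://shop.example.com/products/widget?ref=home#reviews",
]
for variant in variants:
    print(canonical_link_tag(variant))
```

All three variants print the same tag, which is the point: every duplicate URL names a single preferred version.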

hreflang — a link attribute (set in the HTML head, in HTTP headers, or in the sitemap) that tells search engines which language or regional version of a page to show to users in each locale.

“Our hreflang implementation was incomplete: the UK and US pages pointed to each other, but neither included an x-default entry.”
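The kind of audit behind that sentence can be sketched in a few lines. The example assumes each page's hreflang annotations have already been extracted into a dictionary; the function name and URLs are hypothetical. It treats a missing x-default as a problem, as the team in the quote did, although Google describes x-default as recommended rather than required.

```python
def hreflang_problems(pages: dict[str, dict[str, str]]) -> list[str]:
    """Check hreflang sets, where pages maps each URL to its {lang: url} entries."""
    problems = []
    for url, alternates in pages.items():
        if "x-default" not in alternates:
            problems.append(f"{url}: no x-default entry")
        if url not in alternates.values():
            problems.append(f"{url}: missing self-reference")
        for lang, target in alternates.items():
            if target == url or lang == "x-default":
                continue
            # Every alternate must list this page in return.
            if url not in pages.get(target, {}).values():
                problems.append(f"{url}: {target} ({lang}) does not link back")
    return problems


# Illustrative data: reciprocal links are present, x-default is not.
site = {
    "https://example.com/uk/": {
        "en-gb": "https://example.com/uk/",
        "en-us": "https://example.com/us/",
    },
    "https://example.com/us/": {
        "en-us": "https://example.com/us/",
        "en-gb": "https://example.com/uk/",
    },
}

for problem in hreflang_problems(site):
    print(problem)
```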

Robots.txt directives — instructions in a text file at the root of a domain that tell crawlers which paths they may or may not access.

“We meant to block the /api/ path in robots.txt, which was fine, but the rule was written as /api*, so it also matched /apiaries/, a real content section.”
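This kind of over-matching is easy to reproduce, because robots.txt rules match by path prefix. The sketch below uses Python's standard `urllib.robotparser`, which does not interpret Google-style wildcards, so the broad rule is written as the plain prefix `/api` (Google treats a trailing `*` as equivalent). The helper name and URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


too_broad = "User-agent: *\nDisallow: /api"
scoped = "User-agent: *\nDisallow: /api/"
content_page = "https://example.com/apiaries/hive-care"

print(is_allowed(too_broad, content_page))                    # False
print(is_allowed(scoped, content_page))                       # True
print(is_allowed(scoped, "https://example.com/api/v1/items"))  # False
```

The trailing slash is the whole fix: `/api/` still blocks the API, but no longer swallows every path that merely starts with the same letters.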

Index coverage — a Google Search Console report (now labeled “Page indexing”) that shows which pages are indexed, which are not, and why. It is the main tool for diagnosing discovery and indexing problems.

“The index coverage report showed 3,000 pages in ‘Crawled - currently not indexed’ status. That’s often a thin-content signal, so we need to review those pages.”

Real IT Context Phrases

These phrases come from engineering-SEO team discussions, Slack threads, and PR reviews:

  • “We need to implement canonical tags before we launch the pagination.” — pre-launch checklist phrasing
  • “Two hundred URLs in the sitemap return 404: those pages were deleted but never removed from the sitemap.” — audit finding
  • “Search Console is showing a spike in crawl errors after the deploy on Thursday.” — incident correlation phrasing
  • “Let’s add the BreadcrumbList schema to the category templates, not just the product pages.” — scope decision in a planning meeting
  • “The INP regression is coming from the new analytics script loading synchronously in the head.” — root cause statement in a postmortem
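The sitemap finding in the list above is the sort of thing a small script can surface before anyone opens Search Console. The sketch below assumes you already have the sitemap XML as a string and a set of URLs known to be live (for example, from the CMS); the function names and sample data are hypothetical, and no network requests are made.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Extract every <loc> value from a urlset sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


def stale_entries(sitemap_xml: str, live_urls: set[str]) -> list[str]:
    """Return sitemap URLs that are not in the set of known live pages."""
    return [url for url in sitemap_urls(sitemap_xml) if url not in live_urls]


# Made-up sitemap: the second page has been deleted from the site.
sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/products/widget</loc></url>
  <url><loc>https://example.com/products/discontinued</loc></url>
</urlset>"""
live = {"https://example.com/products/widget"}

print(stale_entries(sitemap, live))  # ['https://example.com/products/discontinued']
```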

Key Collocations

| Collocation | Example |
| --- | --- |
| implement structured data | “We’ll implement structured data for the FAQ section this sprint.” |
| pass Core Web Vitals | “We can’t launch until all pages pass Core Web Vitals in field data.” |
| exhaust crawl budget | “Faceted navigation is exhausting our crawl budget on unnecessary URLs.” |
| serve the canonical version | “Make sure the CDN always serves the canonical version, not the www variant.” |
| submit a sitemap | “Submit the updated sitemap in Search Console after the redirect migration.” |
| block with robots.txt | “If we block staging with robots.txt, Googlebot can’t see the noindex tag on those pages.” |
| trigger a rich result | “You need at least one review to trigger the star rating rich result.” |

Practice

Open Google Search Console for a project you work on (or use a public demo account). Find one page in the “Page indexing” report (formerly “Index Coverage”) that is listed as not indexed. Write a three-sentence English explanation of what the status means, what likely caused it, and what engineering change would fix it. Use at least three vocabulary terms from this post. Share your explanation with a teammate and ask if it’s clear without additional context.