Vocabulary for Augmented Reality Development: 24 Terms Every AR Developer Should Know
Learn the essential English vocabulary of augmented reality development — anchors, plane detection, occlusion, tracking, and more for AR and spatial computing engineers.
Augmented reality development sits at the intersection of computer vision, 3D graphics, and mobile or wearable engineering, and it comes with a vocabulary that is easy to get almost right — which is worse than getting it clearly wrong. Saying “the model is floating” instead of “the anchor lost tracking” can send a teammate looking at the wrong subsystem entirely. This guide covers the 24 most important AR development terms with clear definitions and usage examples for standups, design reviews, and bug reports.
Core Tracking Concepts
1. Tracking
The continuous process by which an AR system estimates the device’s position and orientation in the real world, frame by frame, so that virtual content stays aligned with the physical scene.
Usage: “Tracking is stable in well-lit rooms but degrades quickly when the user points the camera at a blank white wall.”
2. Anchor
A fixed point in the real world, tied to a specific position and orientation, that virtual content is attached to so it appears to stay in place as the user moves.
Usage: “We placed an anchor on the detected table surface, and the virtual lamp stays locked to it even as the user walks around.”
3. Plane detection
The process of identifying flat surfaces — floors, tables, walls — in the camera feed so virtual objects can be placed realistically on top of them.
Usage: “Plane detection is picking up the coffee table but not the glass surface next to it, which is expected since reflective surfaces are hard to detect.”
4. Six degrees of freedom (6DOF)
Tracking that captures both position (moving forward/back, left/right, up/down) and orientation (pitch, yaw, roll), as opposed to 3DOF, which tracks orientation only.
Usage: “This headset only supports 3DOF, so users can look around but can’t physically walk toward the virtual object.”
5. Drift
Gradual accumulation of small tracking errors over time, causing virtual content to slowly slide out of its correct real-world position.
Usage: “After about two minutes of walking, we’re seeing noticeable drift — the anchored object has shifted about 15 centimeters from its original spot.”
6. Re-localization
The process of recovering accurate tracking after it has been lost, often by matching the current camera view against previously recorded visual features.
Usage: “When the user re-enters the room, re-localization kicks in and the anchors snap back into their original positions.”
7. Visual-inertial odometry (VIO)
A tracking technique that fuses camera images with inertial sensor data (accelerometer and gyroscope) to estimate motion more robustly than either source alone.
Usage: “VIO handles quick head turns well, but it still struggles in low-texture environments like empty hallways.”
Rendering and Scene Understanding
8. Occlusion
The effect where a real-world object correctly blocks the view of virtual content behind it, making the augmentation feel physically grounded.
Usage: “Without occlusion, the virtual character walks in front of the real couch instead of behind it, which breaks the illusion immediately.”
9. Depth map
A per-pixel estimate of distance from the camera to real-world surfaces, used to enable realistic occlusion and physics interactions.
Usage: “The depth map is noisy at the edges of objects, so we’re seeing flickering occlusion around the character’s silhouette.”
10. Mesh reconstruction
Building a 3D geometric representation of the physical environment from sensor data, allowing virtual objects to collide with or rest on real surfaces.
Usage: “The mesh reconstruction of the room updates in real time as the user scans more of the space with the device.”
11. Light estimation
Analyzing the real-world lighting conditions from the camera feed to render virtual objects with matching shadows, highlights, and color temperature.
Usage: “Light estimation lets the virtual chair cast a shadow in the same direction as the real light source in the room.”
12. World space vs. screen space
World space refers to coordinates fixed relative to the real environment; screen space refers to coordinates fixed relative to the device’s display, regardless of where the camera points.
Usage: “That UI panel should stay in screen space so it’s always visible, not pinned to world space where the user could turn away from it.”
13. Marker-based tracking
An AR approach that relies on recognizing a specific, predefined visual pattern (like a QR-style image marker) to anchor content, as opposed to markerless tracking of arbitrary surfaces.
Usage: “We fell back to marker-based tracking for this kiosk demo since lighting conditions there are too inconsistent for markerless tracking.”
14. Field of view (FOV)
The angular extent of the observable world that is visible through the AR device’s camera or display at any given moment.
Usage: “The headset’s field of view is narrower than a phone’s camera, so users notice virtual objects clipping at the edges more often.”
Interaction and Input
15. Hand tracking
Estimating the position and pose of a user’s hands and fingers in real time, enabling gesture-based interaction with virtual content without physical controllers.
Usage: “Hand tracking loses fidelity when fingers overlap, so the pinch gesture sometimes misfires during fast movements.”
16. Gaze tracking
Estimating where a user’s eyes are pointed, often used as an input method for selecting or highlighting virtual objects.
Usage: “We use gaze tracking to highlight the object the user is looking at, then confirm the selection with a pinch gesture.”
17. Raycasting
Projecting an invisible line from a point (such as the camera or a controller) into the 3D scene to determine what it intersects with — commonly used for placing objects or selecting targets.
Usage: “We raycast from the center of the screen to find the nearest detected plane when the user taps ‘place object.‘“
18. Haptic feedback
Physical, tactile feedback — usually vibration — provided through a controller or device to reinforce a virtual interaction, such as touching a virtual button.
Usage: “We added a short haptic pulse when the user’s virtual hand collides with an object, to make the interaction feel more physical.”
Platform and Pipeline Terms
19. AR Core / AR Kit
The two dominant platform SDKs (Google’s ARCore for Android, Apple’s ARKit for iOS) that provide the underlying tracking, plane detection, and light estimation primitives.
Usage: “We’re building this feature on top of ARKit’s scene reconstruction API, so it’s currently iOS-only.”
20. Persistent anchor / cloud anchor
An anchor whose position is saved to a server and can be resolved again later, or by a different device, enabling shared or multi-session AR experiences.
Usage: “We use cloud anchors so a second device in the same room can see the exact same virtual furniture placement.”
21. Foveated rendering
A rendering optimization that reduces detail in the peripheral areas of the display (outside where the user is looking) to save GPU performance, based on gaze tracking.
Usage: “Foveated rendering bought us back nearly 20% of our GPU budget on this headset.”
22. Passthrough
A mode where a headset shows a live camera feed of the real world (rather than a fully opaque virtual environment), forming the basis for many AR experiences on VR-capable hardware.
Usage: “Passthrough quality varies a lot between devices — this one has noticeable latency that causes mild discomfort during fast head movement.”
23. Latency (motion-to-photon)
The delay between a user’s physical movement and the corresponding update appearing on the display — a critical metric for comfort and immersion in AR/VR.
Usage: “Our motion-to-photon latency is around 18 milliseconds, which is within the comfortable range, but any regression here needs to be caught immediately.”
24. Digital content creation (DCC) pipeline
The workflow and tools (such as Blender or Maya) used to author 3D models, textures, and animations before importing them into an AR engine like Unity or Unreal.
Usage: “The DCC pipeline exports the model at too high a polygon count for mobile — we need to add a decimation step before import.”
Key Takeaways
- Tracking is the foundational concept — most AR bugs trace back to tracking loss, drift, or re-localization, so name the specific failure precisely.
- Occlusion, depth maps, and mesh reconstruction are what make virtual content feel physically grounded in a real space.
- Distinguish world space from screen space clearly when discussing where UI or objects should live.
- Hand tracking, gaze tracking, and raycasting are the core interaction vocabulary for input without physical controllers.
- Platform-specific terms (ARCore, ARKit, cloud anchors, passthrough) come up constantly in cross-platform discussions — know which platform each term belongs to.
Mastering this vocabulary lets you describe AR bugs and design decisions with the same precision your teammates expect, whether you’re debugging drift in a standup or discussing occlusion quality in a design review.