EYE-TRACKING SEAT BUCK
Rudimentary cabin-scale eye tracking for driver-distraction research. Mercedes R&D.
A research seat buck (a partial car cockpit on a rig) used to study where drivers actually look while operating UI inside the cabin. The hard problem wasn't the rig; it was that consumer eye-tracking hardware doesn't cover cabin scale, and research-grade trackers couldn't be sourced on the project's timeline. Solved it with a bootstrap rig: a Unity scene, Arduino-driven peripherals, and a calibration step that fingerprinted each user's eye geometry into a compact hash, so gaze direction could be inferred without dedicated eye-tracking hardware.
Stack
- Unity
- Scene matched the seat buck geometry — every dashboard surface, head-up display position, and rear-view target had a known world-space coordinate so a gaze ray could be hit-tested.
- Arduino
- Drove the physical peripherals around the buck: indicator LEDs, calibration target lights, button responses. The Unity build talked to the Arduino over serial (a host-side protocol sketch follows this list).
- Tobii eye trackers (disqualified)
- First attempt. Tobii's consumer trackers only operate inside a small frustum a fixed distance in front of a screen: fine for desktop research, useless when the "screen" is a whole cockpit and the driver looks at the windshield, instrument cluster, center stack, and over their shoulder. The tracker dropped lock the moment the driver turned their head.
- Calibration eye hashes
- Per-subject calibration step computed a compact fingerprint of each user's eye geometry (pupil offset, inter-ocular distance, head-anchored landmarks), hashed and stored; an extraction sketch also follows this list. At runtime, that hash plus head pose drove the gaze ray. Crude compared to research-grade hardware; sufficient to discriminate which UI region a driver was looking at, which was the actual experimental question.
- OpenCV / camera array
- Off-the-shelf cameras placed around the buck handled the calibration capture and the runtime head/eye detection.
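
A minimal sketch of the host side of that serial link, shown in Python with pyserial for illustration; the production host was the Unity build, and the one-byte command scheme and port name here are assumptions:

```python
import time
import serial  # pyserial

# Hypothetical one-byte command scheme: 'T' + target index lights one
# calibration LED, 'X' clears them all. Port name is illustrative.
arduino = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)
time.sleep(2)  # most Arduinos reset when the port opens; let it boot

def light_target(index: int) -> None:
    arduino.write(b"T" + bytes([index]))

def clear_targets() -> None:
    arduino.write(b"X")

# Calibration sweep: step through the known targets while the camera
# array captures the subject fixating each one.
for i in range(9):
    light_target(i)
    time.sleep(1.5)  # fixation window; frame capture happens elsewhere
    clear_targets()
```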
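And a sketch of the calibration fingerprint itself, assuming Haar-cascade eye detection and a darkest-point pupil estimate; the feature set, quantization step, and names are illustrative, not the production values:

```python
import hashlib
import cv2
import numpy as np

# Bundled Haar cascade; a crude but real eye detector.
EYE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def eye_features(gray_frame):
    """One frame -> small geometry vector: per-eye pupil offset plus
    inter-ocular distance. Head-anchored landmarks are omitted here."""
    eyes = EYE_CASCADE.detectMultiScale(gray_frame, 1.1, 5)
    if len(eyes) < 2:
        return None
    # Two largest detections, ordered left-to-right.
    eyes = sorted(eyes, key=lambda e: e[2] * e[3], reverse=True)[:2]
    eyes = sorted(eyes, key=lambda e: e[0])
    feats, centers = [], []
    for (x, y, w, h) in eyes:
        roi = cv2.GaussianBlur(gray_frame[y:y + h, x:x + w], (7, 7), 0)
        _, _, dark, _ = cv2.minMaxLoc(roi)        # darkest point ~ pupil
        feats.extend([dark[0] / w, dark[1] / h])  # normalized offset
        centers.append((x + w / 2.0, y + h / 2.0))
    feats.append(np.hypot(centers[1][0] - centers[0][0],
                          centers[1][1] - centers[0][1]))
    return np.asarray(feats, dtype=np.float32)

def eye_hash(feature_vectors):
    """Average the per-target vectors, quantize, hash. The hash keys
    the stored vector; the vector itself drives the runtime gaze ray."""
    mean = np.mean(feature_vectors, axis=0)
    return hashlib.sha1(np.round(mean, 2).tobytes()).hexdigest()[:16]
```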
Process
The constraint was geometric, not algorithmic. Tobii's screen-mounted trackers work because they assume your face stays inside a roughly 60×40 cm volume ~60 cm from a screen; that's their tracking frustum. A seat buck breaks that assumption immediately: a driver's head moves laterally, leans forward to inspect a button, twists for shoulder checks. The moment the head leaves the frustum, the tracker has no signal. Research-grade head-worn or scene-mounted trackers that would have worked do exist (Pupil Labs, SR Research, Smart Eye), but procurement plus seat-buck-scale calibration didn't fit the project timeline. The research couldn't wait.
The bootstrap. What we had: cameras, an Arduino, Unity, and the experimental question ("did the driver look at the center stack while the secondary task was running?"), which doesn't require pixel-accurate gaze, only region-level discrimination (windshield / cluster / center stack / mirror). So we built a coarser pipeline. Each subject ran a one-time calibration: look at a sequence of known targets while the camera array captured their face. We extracted a per-subject "eye hash" (a small feature vector of pupil offset, eye-corner geometry, and head landmarks) and stored it. At runtime, the same camera array tracked the subject's head pose; the eye hash plus pose produced a gaze ray cast into the Unity scene. Region hit-tests gave us the answer the protocol needed.
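A sketch of that runtime step, kept in Python for continuity (the production hit-test ran against the Unity scene in C#; the region boxes and the fixed-angular-offset gaze model are assumptions):

```python
import numpy as np

# Cabin regions as axis-aligned boxes in buck world space. Coordinates
# here are made up; in production they came from the Unity scene.
REGIONS = {
    "windshield":   (np.array([-0.8, 0.9, 1.2]), np.array([0.8, 1.5, 1.3])),
    "cluster":      (np.array([-0.35, 0.6, 0.9]), np.array([-0.05, 0.8, 1.0])),
    "center_stack": (np.array([0.0, 0.3, 0.8]), np.array([0.3, 0.7, 0.9])),
    "mirror":       (np.array([0.1, 1.1, 1.1]), np.array([0.3, 1.2, 1.2])),
}

def gaze_ray(head_pos, head_rot, calib_vec):
    """Head pose plus stored calibration -> world-space ray. The model
    is deliberately crude: the first two calibration components are
    read as a fixed per-subject pitch/yaw offset from head-forward."""
    pitch, yaw = float(calib_vec[0]), float(calib_vec[1])
    local = np.array([np.sin(yaw) * np.cos(pitch),
                      np.sin(pitch),
                      np.cos(yaw) * np.cos(pitch)])
    return head_pos, head_rot @ local

def ray_hits_box(origin, direction, lo, hi):
    """Standard slab test for a ray against an axis-aligned box."""
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = (lo - origin) / direction
        t2 = (hi - origin) / direction
    tmin = np.max(np.minimum(t1, t2))
    tmax = np.min(np.maximum(t1, t2))
    return tmax >= max(tmin, 0.0)

def gaze_region(head_pos, head_rot, calib_vec):
    origin, direction = gaze_ray(head_pos, head_rot, calib_vec)
    # Regions don't overlap along a ray here, so first hit is enough.
    for name, (lo, hi) in REGIONS.items():
        if ray_hits_box(origin, direction, lo, hi):
            return name
    return "off_target"
```

Region-level output is the whole point: the dependent variable was which region, so the crude angular model only has to land the ray in the right box.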
Validation. Hash-driven gaze was clearly noisier than commercial trackers. We validated by running a subset of trials in which researchers followed a known fixation protocol, then confirming the system identified the correct region at a rate above the noise floor the protocol could tolerate. It wasn't a hardware replacement; it was an experimental instrument scoped to the experimental question.
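The comparison behind that check is plain arithmetic; a sketch with hypothetical trial data:

```python
def region_accuracy(predicted, ground_truth):
    """Per-region hit rate for fixation trials with scripted targets."""
    rates = {}
    for region in set(ground_truth):
        idx = [i for i, g in enumerate(ground_truth) if g == region]
        hits = sum(predicted[i] == region for i in idx)
        rates[region] = hits / len(idx)
    return rates

# Hypothetical trials: researchers fixate scripted regions, and the
# system's region calls are compared against the script.
script = ["cluster", "center_stack", "windshield", "center_stack", "mirror"]
calls  = ["cluster", "center_stack", "windshield", "cluster", "mirror"]
print(region_accuracy(calls, script))
# e.g. {'cluster': 1.0, 'center_stack': 0.5, 'windshield': 1.0, 'mirror': 1.0}
```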
What this is really about. The project is a small case study in matching tool fidelity to the question being asked. Tobii is excellent at what it does and disqualified itself by geometry, not by accuracy. The replacement was deliberately lower-fidelity and that was fine: the research output didn't depend on the fidelity we gave up.