The First Drives Were Rougher Than the Demo Videos

Experimental settings, optimistic expectations, and the quick realization that a good research car still needs a disciplined baseline.

The first serious drives were humbling. I did what many curious people do with a new system: I tried the interesting settings early. Experimental mode, dynamic end-to-end control, model choices, different personalities, and the whole menu of knobs that promise some glimpse of the future.

Some moments were impressive. Others were rough. The car could feel smooth on one stretch, then hesitate, overreact, or make a poor low-speed decision a few minutes later. Stop-and-go behavior was especially revealing. In slow traffic or near stops, the system could slow, creep, re-accelerate too early, or fail to commit to the stop soon enough. At higher approach speeds into lights or stopped cars, the driver sometimes had to brake hard because the system did not appear to begin slowing in time.

Feel is useful, then it runs out

A drive can feel bad for three different reasons. The model may misunderstand intent. The planner may choose an awkward trajectory. The controller may be trying to execute a plan with the wrong vehicle assumptions. Those failures can look similar from the driver's seat, especially when the driver is also managing safety and traffic.

That made the first lesson obvious: I needed baselines. Same roads when possible. Same settings when possible. Bookmarks for moments that felt wrong. Notes about traffic, battery state, experimental settings, and driver interventions. Otherwise every conclusion would be vibes wearing a lab coat.

Bookmark any moment that felt uncomfortable, or too unsafe to let continue.
Separate steering issues from longitudinal issues instead of treating the drive as one score.
Compare settings against similar routes before declaring a win.
Keep safety language boring and strict: supervised testing only, driver ready the entire time.
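
To make the habits above concrete, here is a minimal sketch of what a per-drive record could look like in Python. Everything in it is an assumption for illustration: the class names, the field names, the route and settings labels, and the sample entries are all made up, and none of it reflects a real logging format or any driving software's API. The point is only that each bookmark is tagged as steering (lateral) or speed and braking (longitudinal), and that drives are grouped by route and settings before anything gets compared.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Bookmark:
    """One flagged moment in a drive. All field names are hypothetical."""
    time_s: float                    # seconds since the start of the drive
    axis: str                        # "lateral" (steering) or "longitudinal" (speed/braking)
    note: str                        # what felt wrong, in plain words
    driver_intervened: bool = False  # True if the driver had to take over


@dataclass
class DriveLog:
    """Notes for one supervised drive. All field names are hypothetical."""
    route: str                       # label for a repeatable stretch of road
    settings: str                    # label for the configuration under test
    traffic: str = ""
    battery_pct: Optional[float] = None
    bookmarks: List[Bookmark] = field(default_factory=list)


def summarize(drives):
    """Count drives, bookmarks, and interventions per (route, settings, axis)."""
    summary = defaultdict(lambda: {"drives": 0, "bookmarks": 0, "interventions": 0})
    for drive in drives:
        # Keep steering and longitudinal issues in separate rows.
        for axis in ("lateral", "longitudinal"):
            row = summary[(drive.route, drive.settings, axis)]
            row["drives"] += 1
            for mark in drive.bookmarks:
                if mark.axis == axis:
                    row["bookmarks"] += 1
                    row["interventions"] += int(mark.driver_intervened)
    return dict(summary)


# Made-up sample entries, only to show the shape of the data.
drives = [
    DriveLog("commute-a", "baseline", traffic="light", bookmarks=[
        Bookmark(312.0, "longitudinal", "late slowing into a red light", True),
    ]),
    DriveLog("commute-a", "experimental", traffic="light", bookmarks=[
        Bookmark(120.5, "longitudinal", "re-accelerated early in stop-and-go"),
        Bookmark(305.0, "longitudinal", "hard brake needed at a light", True),
        Bookmark(410.0, "lateral", "hesitant steering on a gentle curve"),
    ]),
]

for key, row in sorted(summarize(drives).items()):
    print(key, row)
```

Grouping by route first keeps the comparison honest: a setting only gets credit when it is measured against the same stretch of road, and a steering complaint never hides inside a braking count. The counts are still just counts from a supervised drive, not a safety score.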

The bad drives were valuable

The rough early drives gave the project shape. They showed that the Tucson PHEV could be a useful test platform, but only if I stopped treating each drive like a verdict. Each route became another sample in a growing catalog. Each failure became a sharper question for the next patch or setting change.

That was the moment the project started to feel less like tuning and more like research.