Netflix invented a new role for their data team: the Media ML Data Engineer.

Unstructured data is fundamentally different from structured data. It’s multimodal & contains derived fields like embeddings, captions, & transcriptions. It’s also at least 80% of the world’s data & essential for the field of AI.
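
To make "derived fields" concrete, here is a minimal sketch of what one record in a multimodal table might look like. The `MediaAsset` class, its field names, & the `missing_derived_fields` helper are hypothetical illustrations, not Netflix’s or LanceDB’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MediaAsset:
    """Hypothetical record: a pointer to raw media plus fields derived from it."""
    asset_id: str
    media_uri: str                        # location of the raw video, audio, or image
    caption: Optional[str] = None         # derived, e.g. by a captioning model
    transcription: Optional[str] = None   # derived, e.g. by a speech-to-text model
    embedding: List[float] = field(default_factory=list)  # derived vector


def missing_derived_fields(asset: MediaAsset) -> List[str]:
    """Return the names of derived fields that have not been computed yet."""
    missing = []
    if asset.caption is None:
        missing.append("caption")
    if asset.transcription is None:
        missing.append("transcription")
    if not asset.embedding:
        missing.append("embedding")
    return missing


# Example: a clip with a caption but no transcription or embedding yet.
clip = MediaAsset("a1", "s3://example-bucket/clip.mp4", caption="A car chase")
print(missing_derived_fields(clip))
```

The raw asset & its derived fields live side by side in one record, so the derived fields can be filled in or recomputed as models improve.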

This new role highlights how one of the most important companies in the data ecosystem has made multimodal data a core concern. Software engineering & data engineering are fusing.


Netflix’s Team Multimodal Architecture

Different data producers send their data to a media machine learning data engineer who then supplies it for analytics, data science, & applied AI.

At the core of this role is a technology: the media data lake. In addition to access, metadata management, & data preparation, the new media data lake becomes an essential component of AI. Powering all of this is LanceDB, a portfolio company.


We wrote about this type of architecture in 2022 in “9 Predictions for Data in 2023” & it’s thrilling to see it come to life at Netflix.

The demand for engineers who understand both traditional data infrastructure & multimodal AI will only grow.

Companies like LanceDB are building the next generation of data platforms to support this evolution. If you’re ready to work at this intersection, check out their open positions.