Data is the lifeblood of modern AI, but people are increasingly wary of sharing their information with model builders. A new architecture could get around the problem by letting data owners control how training data is used even after a model has been built.

The impressive capabilities of today’s leading AI models are the result of an enormous data-scraping operation that hoovered up vast amounts of publicly available information. This has raised thorny questions around consent and whether people have been properly compensated for the use of their data. And data owners are increasingly looking for ways to protect their data from AI companies.

A new architecture called FlexOlmo, from researchers at the Allen Institute for AI (Ai2), could offer a workaround. FlexOlmo allows models to be trained on private datasets without owners ever having to share the raw data. It also lets owners remove their data, or limit its use, after training has finished.

“FlexOlmo opens the door to a new paradigm of collaborative AI development,” the Ai2 researchers wrote in a blog post describing the new approach. “Data owners who want to contribute to the open, shared language model ecosystem but are hesitant to share raw data or commit permanently can now participate on their own terms.”

The team developed the new architecture to solve several problems with the existing approach to model training. Currently, data owners must make a one-time and essentially irreversible decision about whether or not to include their information in a training dataset. Once this data has been publicly shared, there’s little prospect of controlling who uses it. And once a model has been trained on certain data, there’s no way to remove that data’s influence later on, short of completely retraining the model. Given the cost of cutting-edge training runs, few model developers are likely to agree to this.

FlexOlmo gets around this by allowing each data owner to train a separate model on their own data. These models are then merged to create a shared model, building on a popular approach called “mixture of experts” (MoE), in which multiple smaller expert models are trained on specific tasks. A routing model is then trained to decide which experts to engage to solve specific problems.
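
To make the routing idea concrete, here is a minimal toy sketch in Python of a mixture of experts in which each expert can be added or dropped independently. It is not Ai2's implementation: the `ToyMoE` class, the expert names, and the per-expert routing vectors are hypothetical, and the "experts" are simple functions rather than neural networks. The router scores each expert against the input and turns the scores into weights with a softmax.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ToyMoE:
    """Hypothetical toy mixture of experts, for illustration only."""

    def __init__(self):
        self.experts = {}  # expert name -> function of the input
        self.router = {}   # expert name -> routing vector (assumed design)

    def add_expert(self, name, fn, routing_vector):
        self.experts[name] = fn
        self.router[name] = routing_vector

    def remove_expert(self, name):
        # Dropping an expert removes its contribution without retraining the rest.
        del self.experts[name]
        del self.router[name]

    def weights(self, x):
        names = list(self.experts)
        probs = softmax([dot(self.router[n], x) for n in names])
        return dict(zip(names, probs))

    def forward(self, x):
        w = self.weights(x)
        return sum(w[n] * self.experts[n](x) for n in w)


moe = ToyMoE()
moe.add_expert("public", lambda x: sum(x), [0.0, 0.0])
moe.add_expert("news", lambda x: 2 * sum(x), [1.0, 0.0])
moe.add_expert("code", lambda x: -sum(x), [0.0, 1.0])

x = [1.0, 0.0]
before = moe.weights(x)    # the "news" expert gets the largest weight for this input
moe.remove_expert("news")  # a data owner opts out after training
after = moe.weights(x)     # the remaining experts share the routing weight
```

In this sketch, removing an expert simply deletes its function and its routing vector, so the remaining experts are renormalized automatically the next time the router runs.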

Training expert models on very different datasets is tricky, though, because the resulting models diverge too far to effectively merge with each other. To solve this, FlexOlmo provides a shared public model pre-trained on publicly available data. Each data owner that wants to contribute to a project creates two copies of this model and trains them side-by-side on their private dataset, effectively creating a two-expert MoE model.
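
The per-owner step can be sketched with a toy example, again in Python. This is a loose illustration under stated assumptions, not the FlexOlmo training code: the "model" is a one-variable linear function, the two copies are mixed with fixed equal weights, and the sketch assumes one copy stays frozen as a reference while the other is updated, a detail the description above does not specify. The function name `train_private_expert` and all parameters are hypothetical.

```python
import copy


def predict(frozen, expert, x):
    """Two-expert output: an equal-weight mix of both copies (assumed gating)."""
    return 0.5 * (frozen["w"] * x + frozen["b"]) + 0.5 * (expert["w"] * x + expert["b"])


def train_private_expert(public_weights, private_data, lr=0.05, steps=500):
    """Runs on the data owner's side; only the trained weights are returned."""
    frozen = copy.deepcopy(public_weights)  # reference copy, never updated (assumption)
    expert = copy.deepcopy(public_weights)  # trainable copy, starts from the same point
    for _ in range(steps):
        for x, y in private_data:
            err = predict(frozen, expert, x) - y
            # Gradient of 0.5 * err**2 with respect to the trainable copy only.
            expert["w"] -= lr * err * 0.5 * x
            expert["b"] -= lr * err * 0.5
    return expert


public = {"w": 1.0, "b": 0.0}  # stand-in for the shared public model
private_data = [(-1.0, -2.0), (0.0, 1.0), (1.0, 4.0), (2.0, 7.0)]  # never leaves the owner
expert = train_private_expert(public, private_data)
```

Because every owner starts from the same public weights, the resulting experts share a common starting point, which is the property the shared public model is meant to provide. The raw `private_data` is only read inside the training function; what would be contributed to the shared model is the `expert` weights.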