Researchers At Stanford Have Developed An Artificial Intelligence (AI) Approach Called ‘MEND’ For Fast Model Editing At Scale
... A large language model trained in 2019 might assign a higher probability to Theresa May than to Boris Johnson when prompted with “Who is the Prime Minister of the United Kingdom?” ...
... fine-tuning on a single sample tends to overfit, even when the distance between the pre- and post-fine-tuning parameters is limited. ...
Earlier work presents a bi-level meta-learning objective for finding a model initialization from which standard fine-tuning on a single edit example yields useful edits.
While effective, this approach is computationally expensive: learning such an editable representation is hard to scale to large models, where fast, effective edits are most needed. Other researchers describe a computationally efficient learning-based alternative, but their experiments do not succeed in editing very large models. The MEND authors therefore devise a method that produces reliable, local, and general edits while scaling efficiently to models with over 10 billion parameters. Their approach trains lightweight model editor networks that take the standard fine-tuning gradient of a given correction as input and produce edits to a pre-trained model’s weights, using the gradient as an information-rich starting point for editing. ...
This work’s main contribution is a scalable algorithm for fast model editing that can edit very large pre-trained language models by leveraging the low-rank structure of fine-tuning gradients. The authors conduct empirical evaluations on various language-related tasks and transformer models, reporting that MEND is the only algorithm that can consistently edit the largest GPT-style and T5 language models. Finally, their ablation experiments illustrate the impact of MEND’s key components and suggest that MEND variants are likely to scale to models with hundreds of billions of parameters. The code implementation is freely available on GitHub.
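The “low-rank structure of fine-tuning gradients” can be made concrete with a toy example. For a single linear layer and a single training example, the gradient of the loss with respect to the weight matrix is the outer product of two vectors, so it has rank 1. The sketch below is not MEND’s implementation. It assumes a squared-error loss, and every name and number in it (`gradient_factors`, `is_rank_one`, the 2 x 3 matrix) is hypothetical and chosen only for illustration.

```python
# Toy illustration (not MEND itself): for a linear layer y = W x and ONE
# example, dL/dW is the outer product of dL/dy and x, so it has rank 1.
# All names and numbers here are hypothetical; squared-error loss is assumed.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[ai * bj for bj in b] for ai in a]

def loss(W, x, target):
    """Squared-error loss 0.5 * ||W x - target||^2."""
    return 0.5 * sum((y - t) ** 2 for y, t in zip(matvec(W, x), target))

def gradient_factors(W, x, target):
    """Return (delta, x), the two vectors whose outer product is dL/dW."""
    delta = [y - t for y, t in zip(matvec(W, x), target)]  # dL/dy
    return delta, list(x)

def numerical_gradient(W, x, target, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grad = []
    for i in range(len(W)):
        row = []
        for j in range(len(W[i])):
            plus = [r[:] for r in W]
            minus = [r[:] for r in W]
            plus[i][j] += eps
            minus[i][j] -= eps
            row.append((loss(plus, x, target) - loss(minus, x, target)) / (2 * eps))
        grad.append(row)
    return grad

def is_rank_one(M, tol=1e-9):
    """True if M is nonzero and every 2x2 minor vanishes (so its rank is 1)."""
    if all(abs(v) <= tol for row in M for v in row):
        return False
    rows, cols = len(M), len(M[0])
    return all(
        abs(M[i][j] * M[k][l] - M[i][l] * M[k][j]) <= tol
        for i in range(rows) for k in range(i + 1, rows)
        for j in range(cols) for l in range(j + 1, cols)
    )

W = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]  # 2 x 3 weight matrix
x = [1.0, 2.0, -1.0]                       # single input
target = [0.0, 1.0]                        # desired output for the edit

delta, u = gradient_factors(W, x, target)
grad = outer(delta, u)                     # full 2 x 3 gradient
print("rank one:", is_rank_one(grad))
print("numbers in factors:", len(delta) + len(u),
      "vs full gradient:", len(W) * len(W[0]))
```

In this toy case the saving is negligible (5 numbers instead of 6), but for a square layer of width d the two factors hold 2d numbers while the full gradient holds d². An editor that works on the factors rather than on the full gradient therefore has a far smaller input to process, which is the kind of saving the low-rank structure makes possible.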
See the full article here: https://www.marktechpost.com/2022/11/09/researchers-at-stanford-have-developed-an-artificial-intelligence-ai-approach-called-mend-for-fast-model-editing-at-scale/