ELECTRA: Pre-Training Text Encoders as Discriminators rather than Generators. What is the difference between ELECTRA and BERT?
BERT (Devlin et al., 2018) is the baseline of NLP tasks recently. There are a lot of new models released based on BERT architecture such as RoBERTA (Liu et al. 2019) and ALBERT (Lan et al., 2019). Clark et al. released ELECTRA (Clark et al., 2020) which target to reduce computation time and resource while maintaining high-quality performance. The trick is introducing the generator for Masked Langauge Model (MLM) prediction and forwarding the generator result to the discriminator
.MLM is one of the training objectives in BERT (Devlin et al., 2018). However, it is being criticized because of misaligned between the training phase and the fine-tuning phase. In short, the MLM mask token by [MASK] and model will predict the real world in order to learn the word representation. On the other hand, ELECTRA (Clark et al., 2020) contains two models which are generator and discriminator. The masked token will be sent to the generator and generating alternative inputs for discriminator (i.e. ELECTRA model). After the training phase, the generator will be thrown away while we only keep the discriminator for fine-tuning and inference.
Clark et al. named this method as replaced token detection. In the following sections, we will cover how does ELECTRA (Clark et al., 2020) works.
Keeping up in the new silicon-based survival of the fittest
Innovations in Artificial Intelligence - Various sectors in which AI is remarkably used & has brought changes in humanity - Education, Healthcare,automobile
Explore to understand how AI artificial intelligence has advanced and presently serves as a roadmap to augment your business in 2020.
Lemonade is one of this year’s hottest IPOs and a key reason for this is the company’s heavy investments in AI (Artificial Intelligence). The company has used this technology to develop bots to handle the purchase of policies and the managing of claims. In this post, you'll see How to Create an Artificial Intelligence (AI) Model
Every week we bring to you the best AI research papers, articles and videos that we have found interesting, cool or simply weird that week. Have fun!