To address this issue, numerous methods have been proposed recently, such as sparse attention matrices (Zaheer et al., 2020; Beltagy et al., 2020; Tay et al., 2020a; Kitaev et al., 2019; Child et al., 2019), low-rank representations (Wang et al., 2020), and kernel-based methods (Peng et al., 2020; Choromanski et al., 2020; Katharopoulos et al., 2020), among many others. These methods achieve reduced computational complexity with performance comparable to the vanilla attention architecture on several selected tasks or corpora.
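To make the complexity reduction concrete, below is a minimal sketch (not from the cited papers' code) of the kernel-based linear attention idea from Katharopoulos et al. (2020): replacing the softmax with a feature map `phi` lets the matrix products be reordered so the cost grows linearly in sequence length instead of quadratically. The feature map `elu(x) + 1`, the tensor shapes, and the toy sizes are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def vanilla_attention(q, k, v):
    # O(N^2) in sequence length: the full N x N attention matrix is materialized.
    scores = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return scores @ v

def linear_attention(q, k, v, eps=1e-6):
    # O(N) in sequence length: computing phi(Q) (phi(K)^T V) avoids the N x N matrix.
    # phi(x) = elu(x) + 1 is the feature map suggested by Katharopoulos et al. (2020).
    phi_q = F.elu(q) + 1
    phi_k = F.elu(k) + 1
    kv = phi_k.transpose(-2, -1) @ v  # (d x d) summary of all keys and values
    # Normalizer: phi(q_i) dotted with the sum of all phi(k_j), per query position.
    normalizer = phi_q @ phi_k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps
    return (phi_q @ kv) / normalizer

# Toy usage with hypothetical shapes (batch, seq_len, head_dim):
q = torch.randn(2, 1024, 64)
k = torch.randn(2, 1024, 64)
v = torch.randn(2, 1024, 64)
out = linear_attention(q, k, v)  # same output shape as vanilla_attention(q, k, v)
```

The key design point is associativity: `(phi(Q) phi(K)^T) V` and `phi(Q) (phi(K)^T V)` give the same result, but the second grouping never forms the N x N matrix, which is why memory and time scale linearly with document length.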