- Atomic Sparse Attention
- Compound Sparse Attention (built by combining the Atomic patterns above; see the window + global mask sketch after this list)
- Star Transformer (NAACL 2019, 123 citations)
- Longformer: The Long-Document Transformer (2020, 708 citations / AllenAI)
- ETC: Encoding Long and Structured Inputs in Transformers (EMNLP 2020, 87 citations / Google Research)
- BigBird (NeurIPS 2020, 358 citations / Google Research)
- Sparse Transformer (2019, 353 citations / OpenAI)
- Extended Sparse Attention (Non-Text Dataset)
- BP-Transformer (2019, 35 citations / AWS AI Lab)
- Image Transformer (ICML 2018, 712 citations)
- Axial Transformer (2019, 119 citations / Google Brain)
- Content-Based Sparse Attention (attention restricted by query-key similarity; see the bucketing sketch after this list)
- Reformer (ICLR 2020, 641 citations / Google Research)
- Routing Transformer (TACL 2020, 137 citations / Google Research)
- Sparse Adaptive Connection (90 citations)
- Sparse Sinkhorn Attention (85 citations)
- Attention is not Explanation
- What Does BERT Look At? An Analysis of BERT's Attention
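
The compound patterns above (Longformer, ETC, BigBird) are built by stacking atomic patterns such as a sliding window plus a few global tokens. Below is a minimal NumPy sketch of that idea, assuming a dense boolean mask; it is not any paper's actual implementation, and `compound_sparse_mask`, `window`, and `global_idx` are illustrative names.

```python
import numpy as np

def compound_sparse_mask(seq_len, window=2, global_idx=(0,)):
    """Boolean mask combining two atomic patterns: a sliding (band)
    window and a handful of global tokens (illustrative sketch)."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    # Atomic pattern 1: sliding window, each token sees +/- `window` neighbors.
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True
    # Atomic pattern 2: global tokens attend to all positions and vice versa.
    for g in global_idx:
        mask[g, :] = True
        mask[:, g] = True
    return mask

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention with disallowed positions set to -inf."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, d = 16, 8
    q, k, v = (rng.normal(size=(L, d)) for _ in range(3))
    print(masked_attention(q, k, v, compound_sparse_mask(L)).shape)  # (16, 8)
```

Note that a dense mask like this only illustrates the sparsity pattern; the papers obtain their linear-in-length cost by computing only the banded and global blocks with specialized kernels.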
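Content-based sparse attention restricts each query to keys judged similar to it rather than to fixed positions. The sketch below uses random-hyperplane hashing in the spirit of Reformer's LSH attention (Routing Transformer instead clusters queries and keys); it is a simplified assumption-laden example, and `lsh_buckets` / `content_based_sparse_attention` are illustrative names, not library APIs.

```python
import numpy as np

def lsh_buckets(x, n_bits=3, rng=None):
    """Assign each vector a bucket via random hyperplane hashing, so that
    vectors with high cosine similarity tend to share a bucket
    (a simplification of Reformer-style LSH)."""
    rng = rng or np.random.default_rng(0)
    planes = rng.normal(size=(x.shape[-1], n_bits))
    bits = (x @ planes) > 0
    return bits @ (1 << np.arange(n_bits))  # integer bucket id per token

def content_based_sparse_attention(q, k, v, buckets):
    """Each query attends only to the keys that fall in the same bucket."""
    d = q.shape[-1]
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]
        scores = q[idx] @ k[idx].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[idx] = w @ v[idx]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    L, d = 16, 8
    q, k, v = (rng.normal(size=(L, d)) for _ in range(3))
    b = lsh_buckets(q, rng=rng)  # Reformer ties q and k so both hash identically
    print(content_based_sparse_attention(q, k, v, b).shape)  # (16, 8)
```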