(NLP Research) The Long-Document Transformer 03.17

Performance check
Full attention
- Accuracy on the 25,000 test samples: 0.8719
Sparse_attention (window only)
- window 16 → accuracy on the 25,000 test samples: 0.8591
- window 32 → accuracy on the 25,000 test samples: 0.8609
- window 64 → accuracy on the 25,000 test samples: 0.8154
- window 128 → accuracy on the 25,000 test samples: 0.8591
Sparse_attention (window + global)
- window 32 + global 1 : accuracy on the 25,000 test samples: 0.8561
- window 32 + global 32 : accuracy on the 25,000 test samples: 0.8633
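
The window and global settings compared above can be pictured as an attention mask. The following is only a minimal sketch, not the code used for these experiments. It assumes that a window of w lets each token attend to w // 2 neighbors on each side, and that the global tokens are the first num_global positions; the function name sparse_attention_mask and its parameters are hypothetical.

```python
import numpy as np

def sparse_attention_mask(seq_len, window, num_global=0):
    """Boolean (seq_len, seq_len) mask; True means query i may attend to key j.

    Assumptions (not taken from the experiments above):
    - `window` is the total window size, so each side sees window // 2 tokens.
    - The global tokens are the first `num_global` positions.
    """
    idx = np.arange(seq_len)
    half = window // 2
    # Sliding window: keep pairs whose distance is at most `half`.
    mask = np.abs(idx[:, None] - idx[None, :]) <= half
    if num_global > 0:
        mask[:num_global, :] = True   # global tokens attend to every position
        mask[:, :num_global] = True   # every position attends to global tokens
    return mask

# Example: the "window 32 + global 1" setting on a 256-token sequence.
mask = sparse_attention_mask(seq_len=256, window=32, num_global=1)
density = mask.mean()  # fraction of query-key pairs that are kept
print(f"kept pairs: {density:.3f}")
```

Under these assumptions, a window of at least twice the sequence length keeps every pair, so the mask reduces to Full attention.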
num_epochs = 10
- window 32 → window 64 → window 128 → window 128 + global 1 → window 128 + global 2 → window 128 + global 4 : accuracy on the 25,000 test samples: 0.8609
- window 32 + global 32 → window 64 + global 32 → window 128 + global 32 → window 128 + global 32 → window 64 + global 32 → window 32 + global 32 : accuracy on the 25,000 test samples: 0.8672
- window 32 + global 32 → window 64 + global 32 → window 128 + global 32 → window 128 + global 32 → Full_attention → Full_attention : accuracy on the 25,000 test samples: 0.8632
- Full_attention → window 256 + global 32 → window 128 + global 32 → window 64 + global 32 → window 32 + global 32 → window 32 + global 32
