FlashAttention-T: Towards Tensorized Attention

72 points | by matt_d 5 hours ago

33 comments