Line 178 in 9755682
E.g., compared with the "Positional Encoding" section of the following article:
http://jalammar.github.io/illustrated-transformer/
So, did I miss something?
Is this an oversight or a simplification? And how does it affect the training results?
Can anyone help explain?
Thanks.