How to Interpret GPT2-Small

Mechanistic Interpretability on prediction of repeated tokens

The development of large-scale language models, especially ChatGPT, has left those who have experimented with it, myself included, astonished by its remarkable linguistic prowess and its ability to accomplish diverse tasks. However, many researchers, while marveling at its capabilities, also find themselves perplexed. Despite knowing the model’s architecture and the specific values of its weights, we still struggle to comprehend why a…