Language Model Scaling Laws and GPT-3 | by Cameron Wolfe | Dec, 2022
Understanding why LLMs like GPT-3 work so well…

Language models (LMs) are incredibly generic: they take text as input and produce text as output. Recent research has revealed that this generic text-to-text structure can be exploited to solve a variety of tasks without task-specific adaptation (i.e., no fine-tuning or architectural modifications) by using prompting techniques to perform accurate zero- and few-shot inference. Put simply, we can pre-train the LM over a large, unlabeled text…