When Should You Fine-Tune LLMs? | by Skanda Vivek | May 2023
The problem of supplying the model with all the information it needs to answer a question is thus offloaded from the model architecture to a database of document chunks. The relevant documents can then be found by computing similarities between the question and the document chunks. Typically, the chunks and the question are each converted into embedding vectors, cosine similarities between the question and each chunk are computed, and only the chunks whose similarity exceeds a chosen threshold are kept as relevant context.
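The retrieval step described above can be sketched as follows. This is a minimal illustration, not a production setup: the `embed` function here is a toy bag-of-words encoder (an assumption for self-containment), where a real system would use a learned embedding model, and the `0.2` threshold is arbitrary.

```python
import numpy as np

def embed(text, vocab):
    # Toy bag-of-words embedding for illustration only; a real pipeline
    # would call a learned embedding model instead (assumption here).
    vec = np.zeros(len(vocab))
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec

def cosine(a, b):
    # Cosine similarity between two vectors; 0 if either is all-zero.
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def relevant_chunks(question, chunks, threshold=0.2):
    # Build a shared vocabulary over the question and all chunks.
    words = set()
    for text in [question] + chunks:
        words.update(text.lower().split())
    vocab = {w: i for i, w in enumerate(sorted(words))}
    q_vec = embed(question, vocab)
    scored = [(c, cosine(q_vec, embed(c, vocab))) for c in chunks]
    # Keep only chunks whose similarity exceeds the threshold.
    return [c for c, s in scored if s >= threshold]

chunks = [
    "Fine-tuning updates the model weights on new data.",
    "The capital of France is Paris.",
    "Retrieval finds document chunks similar to the question.",
]
print(relevant_chunks("How does retrieval find similar chunks?", chunks))
```

Only the third chunk shares enough vocabulary with the question to clear the threshold, so it alone would be passed to the model as context. In practice, the same thresholding logic applies unchanged when the toy encoder is swapped for a real embedding model.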