Posted January 31, 2024

Quantization is a technique for making machine learning models smaller and faster. We quantize Llama2-70B-Chat, producing an equivalent-quality model that generates 2.2x more...
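For readers new to the idea, here is a minimal sketch of one common form of quantization: symmetric 8-bit rounding of floating-point weights with a single shared scale. The excerpt does not say which method was applied to Llama2-70B-Chat, so this is an illustration of the general technique only. The function names and sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] plus one shared scale.

    Illustrative symmetric per-tensor scheme; not taken from the article.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to show the round trip.
weights = [0.82, -1.27, 0.003, 0.5]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

for w, c, r in zip(weights, codes, restored):
    print(f"original={w:+.4f}  code={c:+4d}  restored={r:+.4f}")
```

Each code fits in one byte instead of the four bytes of a 32-bit float, which is where the size reduction comes from. The cost is a rounding error of at most half the scale per weight.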