Great read! Thanks for doing this writeup. My company Neural Magic is working on making open-source LLMs even more efficient with sparsity, so they can be deployed on ordinary CPUs without GPUs. We'll make sure to share our progress with you!
Definitely familiar with Neural Magic. Keep up the awesome work!
It's impressive to see how LLaMA and other models have improved the performance of these language models and paved the way for more open-source research. Looking forward to your next installment on fine-tuning applications.