Quantized Llama models with increased speed and a reduced memory footprint

📅 2024-10-24    ⚓ Hacker News