Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference
📅 2024-11-19 ⚓ Hacker News