Serving 70B-scale LLMs efficiently on low-resource edge devices [pdf]
📅 2024-10-03 ⚓ Hacker News 🌐 Source