Serving 70B-scale LLMs efficiently on low-resource edge devices [pdf]

📅 2024-10-03    ⚓ Hacker News