BioMistral-7B Q2_K vs Q3_K_M vs Q8 Comparison

Patrick Devaney · Intermediate · 🧠 Large Language Models · 2y ago
Demo comparing performance on a medical scenario prompt with BioMistral-7B at the Q2_K, Q3_K_M, and Q8 quantization levels. Q3_K_M gives the best balance of generation time and output quality in this test. In a production environment, a hospital might use GPT-4, BLOOM, or a larger-parameter Mistral model. In the near future, text generation, computer vision, and multi-modal models will approach near-perfect accuracy and near-instantaneous response times, so speed and accuracy won't be the limiting factors. Local hardware will be adequate for text generation, whereas cloud models will be necessary for digital twinning and other spatial use cases.
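The comparison in the video boils down to timing each quantized variant on the same prompt and inspecting the output. A minimal sketch of that harness is below; in a real run each callable would wrap a llama-cpp-python `Llama` instance loaded from the corresponding GGUF file, but here hypothetical stub functions stand in for the models, and the medical prompt is an invented placeholder:

```python
import time

def benchmark(variants, prompt):
    """Time each quantized variant on one prompt.

    variants: mapping of quant name -> callable(prompt) -> generated text.
    Returns per-variant wall-clock seconds and output length.
    """
    results = {}
    for name, generate in variants.items():
        start = time.perf_counter()
        text = generate(prompt)
        elapsed = time.perf_counter() - start
        results[name] = {"seconds": elapsed, "chars": len(text)}
    return results

# Hypothetical stubs standing in for BioMistral-7B GGUF models;
# a real run would use e.g. llama_cpp.Llama(model_path="...gguf").
stubs = {
    "Q2_K": lambda p: "brief, lower-fidelity answer",
    "Q3_K_M": lambda p: "fuller, clinically detailed answer",
    "Q8": lambda p: "fuller, clinically detailed answer",
}

report = benchmark(stubs, "A 54-year-old presents with chest pain ...")
for name, stats in report.items():
    print(name, stats["seconds"], stats["chars"])
```

Picking the winner then comes down to the time/quality trade-off the video demonstrates: Q3_K_M ran faster than Q8 while producing comparable output in this test.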
Watch on YouTube ↗
Next Up
5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems
Dave Ebbelaar (LLM Eng)