Snowflake Arctic, quantized to FP8 by FriendliAI, is now available on Friendli!
We’re excited to announce that Snowflake Arctic Instruct, quantized to FP8 by FriendliAI, is now available on Friendli! Even without advanced optimizations, Friendli runs Arctic at about 100 tokens per second with a batch size of 1. Friendli also lets Arctic run with a large batch size on a single node. 🚀
Try now 👉 https://suite.friendli.ai
#FriendliAI #Friendli #Arctic #Snowflake #LLM #Inference #Serving