Wolfram Ravenwolf: I've converted Qwen QVQ to EXL2 format and uploaded the 4.65bpw version. 32K context with 4-bit cache in less than 48 GB VRAM. Benchmarks are still running. Looking forward to finding out how it compares to QwQ, which was the best local model in my recent mass benchmark. huggingface.co/wolfram/QVQ-...


Wolfram Ravenwolf wolfram.ravenwolf.ai
wolfram/QVQ-72B-Preview-4.65bpw-h6-exl2 · Hugging Face


huggingface.co
Dec 26, 2024 00:10
