Gemma 3 270M (4-bit) generates text at over 650 (!) tok/sec on an M4 Max with mlx-lm and uses < 200MB of memory. Not sped up:
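For anyone who wants to check a tok/sec figure like this on their own machine, here is a minimal sketch of the arithmetic: count the tokens a generator yields and divide by wall-clock time. The function name `measure_tokens_per_second` and its `clock` parameter are hypothetical, not part of mlx-lm; the token stream is assumed to be any iterable that yields one item per generated token.

```python
import time


def measure_tokens_per_second(token_stream, clock=time.perf_counter):
    """Consume an iterable of tokens and return (count, seconds, tok/sec).

    Hypothetical helper for illustration, not an mlx-lm API.
    `clock` is injectable so the timing can be replaced in tests.
    """
    start = clock()
    count = 0
    for _ in token_stream:  # each item is assumed to be one generated token
        count += 1
    elapsed = clock() - start
    # Guard against a zero interval on very short streams.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


if __name__ == "__main__":
    # Stand-in stream; swap in a real token iterator from your runtime.
    count, elapsed, rate = measure_tokens_per_second(iter(range(1000)))
    print(f"{count} tokens in {elapsed:.6f}s ({rate:.1f} tok/sec)")
```

Note that this times the whole stream, so prompt processing and model load are included only if they happen inside the iterator.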