Local inference performance analysis
System: macOS 26.2
CPU: Apple M4 Pro
RAM: 48 GB
GPU: Apple M4 Pro GPU (unified memory)
Total Runs: 39
Best Avg F1: 98.8% (Ollama - 30B)
Fastest Avg: 6.9 s (MLX - 30B)
Lowest Avg Memory: 4.5 GB (MLX - 8B)
Bubble size = memory usage
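The summary figures above (best average F1, fastest average time, lowest average memory) could plausibly be derived from the per-run records with a group-and-average step. The following is a minimal sketch, not the tool's actual implementation. It assumes averages are taken per backend and model combination. The field names (`time_s`, `memory_gb`, `f1`), the function `summarize`, and the sample rows are all hypothetical placeholders, not values from this benchmark.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows for illustration only; these are NOT results from the benchmark.
runs = [
    {"model": "model-a", "backend": "backend-x", "time_s": 10.0, "memory_gb": 6.0, "f1": 0.90},
    {"model": "model-a", "backend": "backend-x", "time_s": 12.0, "memory_gb": 6.0, "f1": 0.94},
    {"model": "model-b", "backend": "backend-y", "time_s": 8.0, "memory_gb": 9.0, "f1": 0.80},
    {"model": "model-b", "backend": "backend-y", "time_s": 6.0, "memory_gb": 11.0, "f1": 0.84},
]

METRICS = ("time_s", "memory_gb", "f1")


def summarize(runs):
    """Average each metric per (backend, model) pair, then pick the best pair per metric."""
    # Assumption: one group per backend + model combination.
    groups = defaultdict(list)
    for run in runs:
        groups[(run["backend"], run["model"])].append(run)

    averages = {
        key: {metric: mean(row[metric] for row in rows) for metric in METRICS}
        for key, rows in groups.items()
    }

    def best(metric, pick):
        # Returns ((backend, model), average value) for the winning group.
        key = pick(averages, key=lambda k: averages[k][metric])
        return key, averages[key][metric]

    return {
        "total_runs": len(runs),
        "best_avg_f1": best("f1", max),              # higher is better
        "fastest_avg": best("time_s", min),          # lower is better
        "lowest_avg_memory": best("memory_gb", min), # lower is better
    }


summary = summarize(runs)
for name, value in summary.items():
    print(name, value)
```

Grouping before averaging matters here: picking the single fastest run would reward a one-off outlier, whereas the dashboard labels each figure as an average.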
| Timestamp | Model | Backend | Time (s) | Memory (GB) | Tokens | F1 | Precision | Recall | Prompt | Status |
|---|---|---|---|---|---|---|---|---|---|---|