Local LLM Benchmark Report

Local inference performance analysis (39 runs)

System

OS:  macOS 26.2
CPU: Apple M4 Pro
RAM: 48 GB
GPU: Apple M4 Pro GPU (unified memory)

Last run: 2026-01-30 22:41
Models tested: 7

Summary

Total runs:        39
Best avg F1:       98.8% (Ollama - 30B)
Fastest avg time:  6.9 s (MLX - 30B)
Lowest avg memory: 4.5 GB (MLX - 8B)
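The F1 figures in the summary and the results table are the harmonic mean of each run's precision and recall. A minimal sketch of that calculation (the function name `f1_score` is illustrative, not something defined by this report):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# When precision and recall are equal, F1 equals that common value,
# so an F1 of 0.988 implies precision and recall near 0.988 as well.
best_f1 = f1_score(0.988, 0.988)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two inputs, which is why it is reported alongside precision and recall rather than in place of them.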

Charts

[Chart: Metrics Over Time]
[Chart: Model Comparison (bubble size = memory usage)]

All Results

Columns: Timestamp | Model | Backend | Time (s) | Memory (GB) | Tokens | F1 | Precision | Recall | Prompt | Status
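The summary cards above ("Fastest Avg", "Lowest Avg Memory") are per-backend-and-model averages over rows shaped like this table. A minimal sketch of that aggregation, assuming each row is a dict keyed by the table's columns; the row values and field names below are illustrative, not taken from the actual results:

```python
from statistics import mean

# Illustrative rows only -- the real report has 39 runs.
runs = [
    {"model": "30B", "backend": "MLX", "time_s": 6.9, "memory_gb": 17.2},
    {"model": "30B", "backend": "Ollama", "time_s": 9.4, "memory_gb": 18.0},
    {"model": "8B", "backend": "MLX", "time_s": 3.1, "memory_gb": 4.5},
]

def averages_by(rows, metric, key_fields=("backend", "model")):
    """Group rows by (backend, model) and average the chosen metric."""
    groups = {}
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        groups.setdefault(key, []).append(row[metric])
    return {key: mean(vals) for key, vals in groups.items()}

avg_times = averages_by(runs, "time_s")
```

Taking the minimum over `avg_times` would then yield the "Fastest Avg" card; the same grouping over `memory_gb` yields "Lowest Avg Memory".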