Case: find-file
Success rate 81.7% across all recorded iterations.
- Total: 60
- Passed: 49
- Failed: 11
- Errors: 0
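
The success rate above follows from the pass and total counts. A minimal sketch of that arithmetic, assuming the rate is passed / total rounded to one decimal place and that errored iterations count toward the total (the report does not state either rule explicitly; `success_rate` is a hypothetical helper name):

```python
def success_rate(passed: int, total: int) -> float:
    """Return passed / total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total, 1)


# Counts taken from the summary above.
counts = {"total": 60, "passed": 49, "failed": 11, "errors": 0}

# Sanity check (assumed invariant): every iteration is passed, failed, or errored.
assert counts["passed"] + counts["failed"] + counts["errors"] == counts["total"]

print(f"{success_rate(counts['passed'], counts['total'])}%")  # 81.7%
```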
Model Comparison
| # | Model | Score | Basic File Reading | Basic Skills | Avg turns | Avg tools | Avg tokens | Context avg / max |
|---|---|---|---|---|---|---|---|---|
| 1 | qwen/qwen3.5-9b | 100.0% | 100.0% | n/a | 4.9 | 3.9 | 7,600 | 5.2% / 5.5% |
| 2 | qwen/qwen3.6-35b-a3b | 100.0% | 100.0% | n/a | 4.4 | 4.3 | 6,771 | 5.2% / 5.5% |
| 3 | google/gemma-4-e4b | 100.0% | 100.0% | n/a | 4.7 | 3.7 | 6,386 | 2.5% / 2.7% |
| 4 | granite-4.1-8b | 100.0% | 100.0% | n/a | 5.3 | 4.3 | 6,854 | 4.2% / 4.3% |
| 5 | google/gemma-4-e2b | 90.0% | 90.0% | n/a | 4.9 | 3.9 | 7,616 | 5.5% / 6.2% |
| 6 | lfm2.5-350m | 0.0% | 0.0% | n/a | 2.1 | 1.3 | 2,485 | 3.7% / 4.0% |
Iterations
60 matching iterations
| Iteration | Model | Category | Variant | Status | Duration | Tools | Tokens | Context |
|---|---|---|---|---|---|---|---|---|
| find-file / 001 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 392ms | 1 | 2,374 | 3.8% |
| find-file / 002 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 232ms | 1 | 2,323 | 3.6% |
| find-file / 003 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 215ms | 1 | 2,318 | 3.6% |
| find-file / 004 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 313ms | 1 | 2,345 | 3.7% |
| find-file / 005 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 310ms | 3 | 2,420 | 3.8% |
| find-file / 006 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 428ms | 2 | 3,663 | 4.0% |
| find-file / 007 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 338ms | 1 | 2,397 | 3.8% |
| find-file / 008 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 252ms | 1 | 2,324 | 3.6% |
| find-file / 009 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 307ms | 1 | 2,353 | 3.7% |
| find-file / 010 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 273ms | 1 | 2,335 | 3.6% |
| find-file / 001 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 2,280ms | 5 | 7,892 | 4.3% |
| find-file / 002 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,649ms | 4 | 6,419 | 4.2% |
| find-file / 003 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,605ms | 4 | 6,403 | 4.2% |
| find-file / 004 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,644ms | 4 | 6,415 | 4.2% |
| find-file / 005 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,885ms | 4 | 6,485 | 4.2% |
| find-file / 006 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,728ms | 4 | 6,434 | 4.2% |
| find-file / 007 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,751ms | 4 | 6,435 | 4.2% |
| find-file / 008 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 2,125ms | 5 | 7,837 | 4.3% |
| find-file / 009 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 2,022ms | 5 | 7,807 | 4.3% |
| find-file / 010 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,734ms | 4 | 6,413 | 4.2% |
| find-file / 001 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 5,412ms | 4 | 7,860 | 5.5% |
| find-file / 002 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 3,846ms | 4 | 6,992 | 4.8% |
| find-file / 003 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | failed | 6,927ms | 3 | 6,962 | 6.2% |
| find-file / 004 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 4,873ms | 4 | 7,582 | 5.3% |
| find-file / 005 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 6,685ms | 4 | 8,597 | 6.1% |
| find-file / 006 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 4,571ms | 4 | 7,412 | 5.2% |
| find-file / 007 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 5,274ms | 4 | 7,828 | 5.5% |
| find-file / 008 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 4,659ms | 4 | 7,264 | 5.2% |
| find-file / 009 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 2,842ms | 4 | 7,098 | 5.0% |
| find-file / 010 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 6,404ms | 4 | 8,562 | 5.9% |
| find-file / 001 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 7,241ms | 4 | 7,219 | 2.7% |
| find-file / 002 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 6,600ms | 4 | 7,010 | 2.6% |
| find-file / 003 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 4,123ms | 3 | 5,105 | 2.2% |
| find-file / 004 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 4,696ms | 4 | 6,971 | 2.5% |
| find-file / 005 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 6,328ms | 4 | 6,743 | 2.5% |
| find-file / 006 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 5,453ms | 4 | 6,638 | 2.4% |
| find-file / 007 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 7,401ms | 4 | 7,252 | 2.7% |
| find-file / 008 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,716ms | 3 | 5,063 | 2.2% |
| find-file / 009 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 4,480ms | 3 | 5,137 | 2.3% |
| find-file / 010 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 5,792ms | 4 | 6,725 | 2.5% |
| find-file / 001 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 18,885ms | 4 | 7,631 | 5.2% |
| find-file / 002 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 13,681ms | 3 | 5,953 | 4.9% |
| find-file / 003 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 13,257ms | 4 | 6,170 | 5.1% |
| find-file / 004 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 14,456ms | 5 | 7,785 | 5.3% |
| find-file / 005 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 12,194ms | 4 | 6,029 | 5.0% |
| find-file / 006 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 14,390ms | 6 | 7,953 | 5.5% |
| find-file / 007 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,838ms | 4 | 6,064 | 5.0% |
| find-file / 008 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 12,688ms | 4 | 6,220 | 5.2% |
| find-file / 009 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,479ms | 4 | 6,063 | 5.1% |
| find-file / 010 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 15,038ms | 5 | 7,844 | 5.4% |
| find-file / 001 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 5,592ms | 4 | 7,990 | 5.4% |
| find-file / 002 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 5,736ms | 4 | 8,095 | 5.5% |
| find-file / 003 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,749ms | 4 | 8,061 | 5.5% |
| find-file / 004 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,955ms | 4 | 7,674 | 5.2% |
| find-file / 005 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,779ms | 4 | 7,607 | 5.1% |
| find-file / 006 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,054ms | 4 | 7,651 | 5.2% |
| find-file / 007 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,607ms | 3 | 5,982 | 5.0% |
| find-file / 008 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,077ms | 4 | 7,579 | 5.1% |
| find-file / 009 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,837ms | 4 | 7,651 | 5.2% |
| find-file / 010 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,069ms | 4 | 7,706 | 5.3% |
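
The per-model figures in the Model Comparison table can be reproduced from the iteration rows. The sketch below does this for the ten google/gemma-4-e2b rows, assuming plain arithmetic means and the rounding shown (the dashboard's actual aggregation rules are not documented here, and `summarize` is a hypothetical helper). Average turns cannot be recomputed this way because the iteration table has no turns column.

```python
from statistics import mean

# (status, tool calls, tokens, context %) copied from the
# google/gemma-4-e2b rows of the iteration table.
rows = [
    ("passed", 4, 7860, 5.5),
    ("passed", 4, 6992, 4.8),
    ("failed", 3, 6962, 6.2),
    ("passed", 4, 7582, 5.3),
    ("passed", 4, 8597, 6.1),
    ("passed", 4, 7412, 5.2),
    ("passed", 4, 7828, 5.5),
    ("passed", 4, 7264, 5.2),
    ("passed", 4, 7098, 5.0),
    ("passed", 4, 8562, 5.9),
]


def summarize(rows):
    """Aggregate one model's iteration rows into comparison-table metrics."""
    passed = sum(1 for status, *_ in rows if status == "passed")
    return {
        "score_pct": round(100 * passed / len(rows), 1),
        "avg_tools": round(mean(r[1] for r in rows), 1),
        "avg_tokens": round(mean(r[2] for r in rows)),
        "context_avg_pct": round(mean(r[3] for r in rows), 1),
        "context_max_pct": max(r[3] for r in rows),
    }


print(summarize(rows))
```

Under these assumptions the result matches the google/gemma-4-e2b row: 90.0% score, 3.9 average tools, 7,616 average tokens, and 5.5% / 6.2% context.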