Case: read-exact-file-with-at-reference
Success rate 86.7% across all recorded iterations.
- Total: 60
- Passed: 52
- Failed: 8
- Errors: 0
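The headline success rate can be reproduced from the counts above. The snippet below is a minimal sketch: the function name `success_rate` is hypothetical, and it assumes the rate is passed iterations divided by total iterations, rounded to one decimal place. Whether errored iterations count toward the denominator is not stated in the report; with zero errors it makes no difference here.

```python
def success_rate(passed: int, total: int) -> float:
    """Return the pass percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total, 1)


# Counts taken from the summary above.
total, passed, failed, errors = 60, 52, 8, 0

# Sanity check: the three outcome buckets should add up to the total.
assert passed + failed + errors == total

print(f"{success_rate(passed, total)}%")  # prints "86.7%"
```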
Model Comparison
| # | Model | Score | Basic File Reading | Basic Skills | Avg turns | Avg tools | Avg tokens | Context avg / max |
|---|---|---|---|---|---|---|---|---|
| 1 | qwen/qwen3.5-9b | 100.0% | 100.0% | n/a | 3.1 | 2.1 | 4,641 | 4.8% / 5.0% |
| 2 | qwen/qwen3.6-35b-a3b | 100.0% | 100.0% | n/a | 3.0 | 2.0 | 4,375 | 4.7% / 4.8% |
| 3 | google/gemma-4-e4b | 100.0% | 100.0% | n/a | 3.2 | 2.2 | 3,873 | 2.0% / 2.3% |
| 4 | google/gemma-4-e2b | 100.0% | 100.0% | n/a | 3.0 | 2.0 | 3,653 | 3.9% / 4.3% |
| 5 | granite-4.1-8b | 100.0% | 100.0% | n/a | 3.0 | 2.0 | 3,761 | 4.0% / 4.0% |
| 6 | lfm2.5-350m | 20.0% | 20.0% | n/a | 2.4 | 1.4 | 2,813 | 3.7% / 3.7% |
Iterations
60 matching iterations
| Iteration | Model | Category | Variant | Status | Duration | Tools | Tokens | Context |
|---|---|---|---|---|---|---|---|---|
| read-exact-file-with-at-reference / 001 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 601ms | 1 | 2,325 | 3.6% |
| read-exact-file-with-at-reference / 002 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 208ms | 1 | 2,320 | 3.6% |
| read-exact-file-with-at-reference / 003 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 259ms | 1 | 2,332 | 3.6% |
| read-exact-file-with-at-reference / 004 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 313ms | 2 | 3,518 | 3.7% |
| read-exact-file-with-at-reference / 005 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 238ms | 1 | 2,329 | 3.6% |
| read-exact-file-with-at-reference / 006 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 301ms | 1 | 2,353 | 3.7% |
| read-exact-file-with-at-reference / 007 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 290ms | 2 | 3,531 | 3.7% |
| read-exact-file-with-at-reference / 008 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | passed | 318ms | 2 | 3,529 | 3.7% |
| read-exact-file-with-at-reference / 009 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | passed | 362ms | 2 | 3,530 | 3.7% |
| read-exact-file-with-at-reference / 010 | lm-studio / lfm2.5-350m | Basic File Reading | Baseline (/skills) | failed | 296ms | 1 | 2,358 | 3.7% |
| read-exact-file-with-at-reference / 001 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,759ms | 2 | 3,775 | 4.0% |
| read-exact-file-with-at-reference / 002 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,079ms | 2 | 3,753 | 3.9% |
| read-exact-file-with-at-reference / 003 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,098ms | 2 | 3,753 | 3.9% |
| read-exact-file-with-at-reference / 004 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,111ms | 2 | 3,753 | 3.9% |
| read-exact-file-with-at-reference / 005 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,268ms | 2 | 3,766 | 4.0% |
| read-exact-file-with-at-reference / 006 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,374ms | 2 | 3,773 | 4.0% |
| read-exact-file-with-at-reference / 007 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,077ms | 2 | 3,753 | 3.9% |
| read-exact-file-with-at-reference / 008 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,211ms | 2 | 3,762 | 4.0% |
| read-exact-file-with-at-reference / 009 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,285ms | 2 | 3,766 | 4.0% |
| read-exact-file-with-at-reference / 010 | lm-studio / granite-4.1-8b | Basic File Reading | Baseline (/skills) | passed | 1,092ms | 2 | 3,753 | 3.9% |
| read-exact-file-with-at-reference / 001 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 2,614ms | 2 | 3,936 | 4.3% |
| read-exact-file-with-at-reference / 002 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,828ms | 2 | 3,658 | 4.0% |
| read-exact-file-with-at-reference / 003 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,819ms | 2 | 3,625 | 3.9% |
| read-exact-file-with-at-reference / 004 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,979ms | 2 | 3,747 | 4.0% |
| read-exact-file-with-at-reference / 005 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,499ms | 2 | 3,618 | 3.8% |
| read-exact-file-with-at-reference / 006 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,679ms | 2 | 3,578 | 3.9% |
| read-exact-file-with-at-reference / 007 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 32ms | 2 | 3,598 | 3.7% |
| read-exact-file-with-at-reference / 008 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,212ms | 2 | 3,545 | 3.7% |
| read-exact-file-with-at-reference / 009 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,825ms | 2 | 3,641 | 4.0% |
| read-exact-file-with-at-reference / 010 | lm-studio / google/gemma-4-e2b | Basic File Reading | Baseline (/skills) | passed | 1,301ms | 2 | 3,584 | 3.7% |
| read-exact-file-with-at-reference / 001 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 3,152ms | 3 | 4,810 | 2.0% |
| read-exact-file-with-at-reference / 002 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 4,850ms | 3 | 5,076 | 2.3% |
| read-exact-file-with-at-reference / 003 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 1,952ms | 2 | 3,660 | 2.1% |
| read-exact-file-with-at-reference / 004 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,264ms | 2 | 3,587 | 2.0% |
| read-exact-file-with-at-reference / 005 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,979ms | 2 | 3,606 | 2.1% |
| read-exact-file-with-at-reference / 006 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,499ms | 2 | 3,572 | 2.0% |
| read-exact-file-with-at-reference / 007 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,201ms | 2 | 3,549 | 1.9% |
| read-exact-file-with-at-reference / 008 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,962ms | 2 | 3,657 | 2.0% |
| read-exact-file-with-at-reference / 009 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,534ms | 2 | 3,573 | 2.0% |
| read-exact-file-with-at-reference / 010 | lm-studio / google/gemma-4-e4b | Basic File Reading | Baseline (/skills) | passed | 2,758ms | 2 | 3,635 | 2.0% |
| read-exact-file-with-at-reference / 001 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 12,859ms | 2 | 4,359 | 4.7% |
| read-exact-file-with-at-reference / 002 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,772ms | 2 | 4,356 | 4.7% |
| read-exact-file-with-at-reference / 003 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,733ms | 2 | 4,343 | 4.7% |
| read-exact-file-with-at-reference / 004 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 12,463ms | 2 | 4,438 | 4.8% |
| read-exact-file-with-at-reference / 005 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,876ms | 2 | 4,379 | 4.7% |
| read-exact-file-with-at-reference / 006 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 12,225ms | 2 | 4,373 | 4.7% |
| read-exact-file-with-at-reference / 007 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,379ms | 2 | 4,444 | 4.8% |
| read-exact-file-with-at-reference / 008 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,178ms | 2 | 4,346 | 4.7% |
| read-exact-file-with-at-reference / 009 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 12,529ms | 2 | 4,350 | 4.7% |
| read-exact-file-with-at-reference / 010 | lm-studio / qwen/qwen3.6-35b-a3b | Basic File Reading | Baseline (/skills) | passed | 11,505ms | 2 | 4,363 | 4.7% |
| read-exact-file-with-at-reference / 001 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 7,378ms | 2 | 4,443 | 4.8% |
| read-exact-file-with-at-reference / 002 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 12,447ms | 2 | 4,492 | 4.9% |
| read-exact-file-with-at-reference / 003 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 15,484ms | 3 | 6,036 | 4.9% |
| read-exact-file-with-at-reference / 004 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 11,164ms | 2 | 4,452 | 4.8% |
| read-exact-file-with-at-reference / 005 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 8,983ms | 2 | 4,492 | 4.8% |
| read-exact-file-with-at-reference / 006 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 8,935ms | 2 | 4,462 | 4.8% |
| read-exact-file-with-at-reference / 007 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 9,969ms | 2 | 4,638 | 5.0% |
| read-exact-file-with-at-reference / 008 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 9,858ms | 2 | 4,468 | 4.8% |
| read-exact-file-with-at-reference / 009 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 8,293ms | 2 | 4,484 | 4.9% |
| read-exact-file-with-at-reference / 010 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 9,733ms | 2 | 4,445 | 4.8% |
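The per-model figures in the Model Comparison table appear to be plain aggregates of these iteration rows. As a hedged illustration, the sketch below recomputes the lfm2.5-350m row from its ten iterations. The helper `summarize` and the tuple layout are assumptions made for this example, not part of the benchmark tooling.

```python
from statistics import mean

# (status, tools, tokens, context %) for the ten lfm2.5-350m iterations,
# copied from the Iterations table.
lfm_rows = [
    ("failed", 1, 2325, 3.6),
    ("failed", 1, 2320, 3.6),
    ("failed", 1, 2332, 3.6),
    ("failed", 2, 3518, 3.7),
    ("failed", 1, 2329, 3.6),
    ("failed", 1, 2353, 3.7),
    ("failed", 2, 3531, 3.7),
    ("passed", 2, 3529, 3.7),
    ("passed", 2, 3530, 3.7),
    ("failed", 1, 2358, 3.7),
]


def summarize(rows):
    """Aggregate iteration rows into the per-model summary fields."""
    n = len(rows)
    passed = sum(1 for status, *_ in rows if status == "passed")
    return {
        "score_pct": round(100 * passed / n, 1),
        "avg_tools": round(mean(tools for _, tools, _, _ in rows), 1),
        # Left unrounded: the table displays this value as 2,813.
        "avg_tokens": mean(tokens for _, _, tokens, _ in rows),
        "context_max": max(ctx for *_, ctx in rows),
    }


summary = summarize(lfm_rows)
print(summary)
```

Under these assumptions the result matches the table row for lfm2.5-350m: a 20.0% score, 1.4 average tool calls, 2,812.5 average tokens (shown as 2,813), and a 3.7% maximum context. The context average is omitted because the per-iteration percentages are already rounded, so recomputing it from them may differ from the table in the last digit.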