lm-studio

qwen/qwen3.5-9b

The run scored 92.9% across 70 completed iterations.

Score: 92.9%
Passed: 65
Failed: 5
Errors: 0
Started: 5/11/2026, 1:46:42 PM
Ended: 5/11/2026, 2:03:24 PM
Duration: 1002s
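
As a quick consistency check, the headline figures can be recomputed from the counts and timestamps above. The following is a minimal sketch, assuming the score is simply passed iterations divided by all completed iterations and that the timestamps use a US-style 12-hour format. The function names `pass_rate` and `duration_seconds` are illustrative and are not part of the benchmark tool.

```python
from datetime import datetime


def pass_rate(passed: int, failed: int, errors: int = 0) -> float:
    """Return the percentage of passed iterations, rounded to one decimal."""
    total = passed + failed + errors
    if total == 0:
        raise ValueError("no completed iterations")
    return round(100.0 * passed / total, 1)


def duration_seconds(started: str, ended: str) -> int:
    """Return whole seconds between two 'M/D/YYYY, H:MM:SS AM/PM' timestamps."""
    fmt = "%m/%d/%Y, %I:%M:%S %p"  # assumed format, inferred from the report
    delta = datetime.strptime(ended, fmt) - datetime.strptime(started, fmt)
    return int(delta.total_seconds())


# Values copied from the summary above.
print(pass_rate(passed=65, failed=5, errors=0))
print(duration_seconds("5/11/2026, 1:46:42 PM", "5/11/2026, 2:03:24 PM"))
```

Under these assumptions, both values agree with the reported score of 92.9% and the reported duration of 1002s.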

Category Breakdown

Category Total Passed Failed Errors Score
Basic File Reading 40 40 0 0 100.0%
Basic Skills 30 25 5 0 83.3%

Cases

Case Iterations Passed Failed Errors Score
find-file 10 total 10 0 0 100.0%
read-exact-file 10 total 10 0 0 100.0%
read-exact-file-with-at-reference 10 total 10 0 0 100.0%
read-file 10 total 10 0 0 100.0%
use-skill 10 total 7 3 0 70.0%
use-skill-with-refs 10 total 8 2 0 80.0%
use-skill-with-scripts 10 total 10 0 0 100.0%
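
The category breakdown can be derived by summing the per-case rows. The sketch below is illustrative only: the case-to-category mapping is taken from the iterations listed in this report, while `CASES` and `by_category` are hypothetical names, and the score is assumed to be passed divided by total.

```python
from collections import defaultdict

# (case, category, passed, failed) copied from the Cases table;
# every case in this run has 0 errors.
CASES = [
    ("find-file", "Basic File Reading", 10, 0),
    ("read-exact-file", "Basic File Reading", 10, 0),
    ("read-exact-file-with-at-reference", "Basic File Reading", 10, 0),
    ("read-file", "Basic File Reading", 10, 0),
    ("use-skill", "Basic Skills", 7, 3),
    ("use-skill-with-refs", "Basic Skills", 8, 2),
    ("use-skill-with-scripts", "Basic Skills", 10, 0),
]


def by_category(cases):
    """Aggregate per-case counts into {category: (total, passed, failed, score)}."""
    sums = defaultdict(lambda: [0, 0])  # category -> [passed, failed]
    for _case, category, passed, failed in cases:
        sums[category][0] += passed
        sums[category][1] += failed
    result = {}
    for category, (passed, failed) in sums.items():
        total = passed + failed
        result[category] = (total, passed, failed, round(100.0 * passed / total, 1))
    return result


for category, row in by_category(CASES).items():
    print(category, *row)
```

Summed this way, the per-case rows reproduce the Category Breakdown figures: 40 of 40 for Basic File Reading and 25 of 30 for Basic Skills.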

Iterations

70 matching iterations

Iteration | Model | Category | Variant | Status | Duration | Tools | Tokens | Context
find-file / 001 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 5,592ms | 4 | 7,990 | 5.4%
find-file / 002 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 5,736ms | 4 | 8,095 | 5.5%
find-file / 003 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,749ms | 4 | 8,061 | 5.5%
find-file / 004 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,955ms | 4 | 7,674 | 5.2%
find-file / 005 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,779ms | 4 | 7,607 | 5.1%
find-file / 006 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,054ms | 4 | 7,651 | 5.2%
find-file / 007 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,607ms | 3 | 5,982 | 5.0%
find-file / 008 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,077ms | 4 | 7,579 | 5.1%
find-file / 009 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,837ms | 4 | 7,651 | 5.2%
find-file / 010 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,069ms | 4 | 7,706 | 5.3%
read-exact-file / 001 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,850ms | 2 | 4,663 | 5.0%
read-exact-file / 002 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,613ms | 4 | 8,261 | 5.5%
read-exact-file / 003 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 1,973ms | 2 | 4,620 | 4.9%
read-exact-file / 004 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,994ms | 2 | 4,631 | 5.0%
read-exact-file / 005 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,280ms | 2 | 4,651 | 5.0%
read-exact-file / 006 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,217ms | 2 | 4,657 | 5.0%
read-exact-file / 007 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,202ms | 2 | 4,646 | 5.0%
read-exact-file / 008 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,356ms | 2 | 4,649 | 5.0%
read-exact-file / 009 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,058ms | 3 | 6,409 | 5.3%
read-exact-file / 010 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,136ms | 2 | 4,630 | 4.9%
read-exact-file-with-at-reference / 001 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 7,378ms | 2 | 4,443 | 4.8%
read-exact-file-with-at-reference / 002 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 12,447ms | 2 | 4,492 | 4.9%
read-exact-file-with-at-reference / 003 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 15,484ms | 3 | 6,036 | 4.9%
read-exact-file-with-at-reference / 004 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 11,164ms | 2 | 4,452 | 4.8%
read-exact-file-with-at-reference / 005 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 8,983ms | 2 | 4,492 | 4.8%
read-exact-file-with-at-reference / 006 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 8,935ms | 2 | 4,462 | 4.8%
read-exact-file-with-at-reference / 007 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 9,969ms | 2 | 4,638 | 5.0%
read-exact-file-with-at-reference / 008 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 9,858ms | 2 | 4,468 | 4.8%
read-exact-file-with-at-reference / 009 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 8,293ms | 2 | 4,484 | 4.9%
read-exact-file-with-at-reference / 010 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 9,733ms | 2 | 4,445 | 4.8%
read-file / 001 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 17,162ms | 4 | 7,704 | 5.3%
read-file / 002 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 14,207ms | 4 | 7,642 | 5.2%
read-file / 003 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 5,792ms | 4 | 7,651 | 5.1%
read-file / 004 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,813ms | 4 | 7,634 | 5.2%
read-file / 005 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,944ms | 4 | 7,618 | 5.2%
read-file / 006 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 3,382ms | 4 | 7,568 | 5.1%
read-file / 007 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,640ms | 4 | 7,920 | 5.4%
read-file / 008 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 2,947ms | 3 | 5,931 | 4.9%
read-file / 009 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,398ms | 4 | 7,720 | 5.3%
read-file / 010 | lm-studio / qwen/qwen3.5-9b | Basic File Reading | Baseline (/skills) | passed | 4,175ms | 4 | 7,627 | 5.2%
use-skill / 001 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 2,866ms | 2 | 4,724 | 5.0%
use-skill / 002 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | failed | 399,910ms | 190 | 3,229,064 | 52.6%
use-skill / 003 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 2,607ms | 2 | 4,725 | 5.0%
use-skill / 004 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 1,979ms | 2 | 4,674 | 5.0%
use-skill / 005 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | failed | 22,314ms | 12 | 31,396 | 10.9%
use-skill / 006 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 2,683ms | 2 | 4,789 | 5.1%
use-skill / 007 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | failed | 9,388ms | 8 | 17,111 | 7.2%
use-skill / 008 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 2,045ms | 2 | 4,673 | 5.0%
use-skill / 009 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 2,566ms | 2 | 4,671 | 5.0%
use-skill / 010 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 3,055ms | 2 | 4,696 | 5.0%
use-skill-with-refs / 001 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 11,022ms | 5 | 11,835 | 7.1%
use-skill-with-refs / 002 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 3,945ms | 3 | 6,924 | 5.7%
use-skill-with-refs / 003 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 3,319ms | 3 | 6,742 | 5.5%
use-skill-with-refs / 004 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 9,394ms | 8 | 17,069 | 7.1%
use-skill-with-refs / 005 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 13,156ms | 13 | 26,554 | 8.3%
use-skill-with-refs / 006 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | failed | 2,839ms | 2 | 4,826 | 5.2%
use-skill-with-refs / 007 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | failed | 6,726ms | 2 | 5,127 | 6.2%
use-skill-with-refs / 008 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 7,218ms | 5 | 11,407 | 6.6%
use-skill-with-refs / 009 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 31,931ms | 29 | 80,413 | 13.6%
use-skill-with-refs / 010 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 10,006ms | 6 | 12,536 | 6.5%
use-skill-with-scripts / 001 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 14,763ms | 5 | 11,319 | 6.5%
use-skill-with-scripts / 002 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 13,114ms | 5 | 9,569 | 6.4%
use-skill-with-scripts / 003 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 20,223ms | 7 | 16,199 | 7.2%
use-skill-with-scripts / 004 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 10,893ms | 3 | 7,209 | 5.9%
use-skill-with-scripts / 005 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 12,794ms | 4 | 9,215 | 6.2%
use-skill-with-scripts / 006 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 11,070ms | 3 | 7,085 | 5.8%
use-skill-with-scripts / 007 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 14,124ms | 3 | 7,087 | 5.8%
use-skill-with-scripts / 008 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 15,098ms | 3 | 7,093 | 5.8%
use-skill-with-scripts / 009 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 14,012ms | 3 | 7,057 | 5.8%
use-skill-with-scripts / 010 | lm-studio / qwen/qwen3.5-9b | Basic Skills | Baseline (/skills) | passed | 15,047ms | 3 | 7,043 | 5.8%