Dense 27B sanity check through the proxy — 4/4

What was run

The dense sibling model qwen3.6-27b (not the MoE) was run on the 4-task smoke set, 1 attempt each, through the reasoning-cap proxy (port 8021, cap ~4000), with thinking ON.

Results

4/4.

Interpretation

The harness and proxy mechanism works unchanged across the two Qwen variants, and the dense 27B passes every task in the smoke set. The dense model decodes more slowly than the MoE (~3B active parameters), however, so the MoE remains the primary benchmark target for full-suite runs on this single 3090.

Next steps

Back to the MoE for a bigger-K statistical run of the two signal tasks.

Run details

model: llama-local/qwen3.6-27b
agent: harnesses.minimal_pi:MinimalPi
thinking: on
reasoning budget: proxy · cap 4000 (see journal for the authoritative mechanism)
maxTokens / contextWindow: 32768 / 131072
agent timeout: ×2.0
trials: 4 of 4 (4 pass · 0 fail)
mean reward: 1.00
tokens (job total): 399,991 in / 22,645 out
started / finished: 2026-07-02T19:03 / 2026-07-02T19:16
wall clock: 13m06s
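
As a rough cross-check of the figures above, the sketch below derives the mean reward, the per-trial token averages, and an aggregate output-token rate from the reported totals. The helper name `parse_wall_clock` and the variable names are hypothetical and not part of the harness. The rate divides job-total output tokens by wall clock, which also covers prompt processing and tool execution, so it should not be read as the model's decode speed.

```python
def parse_wall_clock(text: str) -> int:
    """Convert a wall-clock string such as '13m06s' to whole seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)


# Values copied from the run details of this job.
trials_passed, trials_total = 4, 4
tokens_in, tokens_out = 399_991, 22_645
wall_seconds = parse_wall_clock("13m06s")

# Derived figures (assumption: every trial counts equally toward the mean).
mean_reward = trials_passed / trials_total
in_tokens_per_trial = tokens_in / trials_total
out_tokens_per_trial = tokens_out / trials_total
# Aggregate rate over the whole job, not a pure decode rate.
out_tokens_per_second = tokens_out / wall_seconds

print(f"mean reward: {mean_reward:.2f}")
print(f"input tokens per trial: {in_tokens_per_trial:.0f}")
print(f"output tokens per trial: {out_tokens_per_trial:.0f}")
print(f"aggregate output rate: {out_tokens_per_second:.1f} tok/s")
```

Running the same calculation on the MoE job's totals would give a like-for-like comparison of job-level throughput between the two models.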

Tasks

fix-git — 1/1 passed

nginx-request-logging — 1/1 passed

openssl-selfsigned-cert — 1/1 passed

regex-log — 1/1 passed

smoke__qwen3.6-27b__20260702-190304