The assistant axis: situating and stabilizing the character of large language models
…AI. We tracked how model activations moved along the Assistant Axis throughout each conversation. The pattern was consistent across the models we tested. While coding conversations kept models firmly in Assistant territory…
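
One way to picture this kind of tracking is as a projection: take one activation vector per conversation turn and measure its component along a fixed direction. The sketch below is illustrative only and makes several assumptions not stated in the excerpt: the axis is given as a plain vector, each turn is summarized by a single activation vector, and the names `unit`, `project`, and `track_along_axis` are hypothetical. How the Assistant Axis is actually derived, and which layers or token positions are read, is not specified here.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def unit(v: Vector) -> List[float]:
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("axis must be a non-zero vector")
    return [x / norm for x in v]


def project(activation: Vector, axis_unit: Vector) -> float:
    """Scalar projection of one activation vector onto a unit-length axis."""
    if len(activation) != len(axis_unit):
        raise ValueError("activation and axis must have the same dimension")
    return sum(a * u for a, u in zip(activation, axis_unit))


def track_along_axis(turn_activations: Sequence[Vector], axis: Vector) -> List[float]:
    """Project each turn's activation onto the axis, giving one score per turn."""
    axis_unit = unit(axis)
    return [project(a, axis_unit) for a in turn_activations]


# Toy 2-D data (made up for illustration, not real model activations).
axis = [3.0, 4.0]          # hypothetical axis direction
turns = [
    [3.0, 4.0],            # turn 1: aligned with the axis
    [1.5, 2.0],            # turn 2: same direction, half the magnitude
    [-4.0, 3.0],           # turn 3: orthogonal to the axis
]
trajectory = track_along_axis(turns, axis)
```

Plotting `trajectory` against the turn index would give a per-conversation curve: in this toy setup, higher values mean the turn's activation points further along the hypothetical axis, and a value near zero means it has no component along it.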
