Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers
… I created a PR to add Flash Attention support for GPT-OSS: https://github.com/huggingface/transformers/pull/42345 If you can't wait for the PR to be merged and released on PyPI, here's a patch: https://gist.github.com/markrogersjr/ebada9ad3a31381d8d4e0d956c852569
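
For anyone wondering how you would turn this on once the change is in your install, here's a rough sketch, not the PR's own code. It assumes the standard `attn_implementation` argument of `from_pretrained` is what selects Flash Attention, that the `flash_attn` package is installed, and that `openai/gpt-oss-20b` is the checkpoint you want. The helper names `pick_attn_implementation` and `load_gpt_oss` are made up for illustration. Whether `"flash_attention_2"` is actually accepted for GPT-OSS depends on your transformers version including the PR or the patch above.

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return "flash_attention_2" if flash_attn is importable, else "eager"."""
    # Hypothetical helper: it only checks that the package is installed,
    # not that the GPU or the model architecture supports it.
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "eager"


def load_gpt_oss(model_id: str = "openai/gpt-oss-20b"):
    """Load the model with the chosen attention backend (downloads weights)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        # Assumption: this value is only valid for GPT-OSS with the PR/patch applied.
        attn_implementation=pick_attn_implementation(),
        # Flash Attention needs half precision; "auto" keeps the checkpoint dtype.
        torch_dtype="auto",
    )
```

Calling `load_gpt_oss()` pulls the full checkpoint, so try `pick_attn_implementation()` on its own first to see which backend you would get.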