The thing that makes a significant difference is BLAS, and it's easy to substitute. There are some old numbers at https://loveshack.fedorapeople.org/blas-subversion.html#_add...
Most of it is unlikely to benefit much from -mavx and vectorization, but I have no numbers. -fno-semantic-interposition is probably a better candidate, though I've not got round to trying it.
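For fresher numbers on the BLAS point, something like the following rough sketch could be used. It is not taken from the page linked above. It assumes a shared library that the loader finds under the name "blas", that the library exports the Fortran-interface symbol dgemm_, and that it uses 32-bit integers; the helper names (load_blas, gflops, time_dgemm) are made up for illustration. It times one matrix multiply through whichever BLAS gets resolved, so running it before and after substituting the library gives a crude comparison.

```python
import ctypes
import ctypes.util
import time
from ctypes import byref, c_char, c_double, c_int


def load_blas(name="blas"):
    """Return a handle to the BLAS the loader resolves for `name`, or None."""
    path = ctypes.util.find_library(name)
    return ctypes.CDLL(path) if path else None


def gflops(n, seconds):
    """An n x n matrix multiply costs roughly 2*n**3 floating-point operations."""
    return 2.0 * n**3 / seconds / 1e9


def time_dgemm(lib, n):
    """Time one C = A*B via dgemm_ (assumption: LP64 build, 32-bit integers).

    A and B are filled with ones, so every entry of C should equal n.
    Returns (elapsed seconds, C[0]) so the caller can sanity-check the result.
    """
    size = n * n
    a = (c_double * size)(*([1.0] * size))
    b = (c_double * size)(*([1.0] * size))
    c = (c_double * size)()
    dim = c_int(n)
    trans = c_char(b"N")  # no transpose for either operand
    one, zero = c_double(1.0), c_double(0.0)
    start = time.perf_counter()
    lib.dgemm_(byref(trans), byref(trans), byref(dim), byref(dim), byref(dim),
               byref(one), a, byref(dim), b, byref(dim),
               byref(zero), c, byref(dim))
    elapsed = time.perf_counter() - start
    return elapsed, c[0]


if __name__ == "__main__":
    lib = load_blas()
    if lib is None:
        print("no shared BLAS found under the name 'blas'")
    else:
        n = 500
        seconds, c00 = time_dgemm(lib, n)
        print(f"n={n}: {seconds:.4f} s, about {gflops(n, seconds):.2f} GFLOP/s "
              f"(C[0]={c00}, expected {float(n)})")
```

A single call at one size is only indicative; repeating the call and taking the best time, over a few sizes, would give steadier figures.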