I think it's still easier to ensure performant code, since reference counting is more deterministic than GC. I assume you can turn off ARC for a compilation unit if you want, so performance-critical code can be manually managed.
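To make the "more deterministic" point concrete, here is a rough sketch. It does not use ARC itself: it assumes CPython, whose reference counting is used as a stand-in because it is easy to run. The `Resource` class, the `events` list, and `use_resource` are hypothetical names made up for the illustration. The idea is that the object is released at the exact statement where the last reference goes away, rather than at some later collection pass.

```python
# Sketch only: CPython's reference counting stands in for ARC here.
# Other Python implementations (e.g. PyPy) may release objects later.

events = []  # records the order in which things happen


class Resource:
    """Hypothetical resource that logs the moment it is released."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Called by CPython as soon as the reference count reaches zero.
        events.append(f"released {self.name}")


def use_resource():
    r = Resource("buffer")   # reference count: 1
    alias = r                # reference count: 2
    events.append("working")
    del r                    # reference count: 1, object still alive
    events.append("one reference left")
    del alias                # reference count: 0, released right here
    events.append("after last release")


use_resource()
print(events)
```

The release shows up between "one reference left" and "after last release" on every run, which is the predictability being argued for; a tracing GC makes no such promise about when the object goes away.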
95% of code can run an order of magnitude slower than "optimal" with no discernible degradation of the user experience.
If a bus lock degrades your performance too much, either change your algorithm or isolate that section of code and write it in another language that doesn't suffer the same performance issues.