You could disable Turbo and EIST in the BIOS - most will have a setting for these, and that will result in the cores running at their nominal frequency.
I don't know about a nice high-level Linux interface (there may well be one), but if you need to access these settings from a running machine, there are MSRs you can poke.
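For what it's worth, here is a rough sketch of the MSR route on Linux. It assumes the `msr` kernel module is loaded (so `/dev/cpu/N/msr` exists), that you are root, and that your CPU follows the documented IA32_MISC_ENABLE layout (register 0x1A0, bit 38 = turbo disable). The function names are mine, and I haven't verified this on any particular part, so check the SDM for your CPU first. On systems using the intel_pstate driver, writing 1 to `/sys/devices/system/cpu/intel_pstate/no_turbo` is the higher-level way to get the same effect.

```python
import os
import struct

IA32_MISC_ENABLE = 0x1A0   # MSR address, per the Intel SDM
TURBO_DISABLE_BIT = 38     # when set, turbo is disabled


def read_msr(cpu: int, reg: int) -> int:
    """Read a 64-bit MSR via the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The file offset selects the MSR address; each MSR is 8 bytes.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def write_msr(cpu: int, reg: int, value: int) -> None:
    """Write a 64-bit MSR via the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), reg)
    finally:
        os.close(fd)


def with_turbo_disabled(value: int) -> int:
    """Return the register value with the turbo-disable bit set."""
    return value | (1 << TURBO_DISABLE_BIT)


def turbo_is_disabled(value: int) -> bool:
    """Check the turbo-disable bit in an IA32_MISC_ENABLE value."""
    return bool((value >> TURBO_DISABLE_BIT) & 1)


def disable_turbo(cpu: int) -> None:
    """Read-modify-write so the other bits in the register are preserved."""
    current = read_msr(cpu, IA32_MISC_ENABLE)
    write_msr(cpu, IA32_MISC_ENABLE, with_turbo_disabled(current))
```

You would call `disable_turbo(cpu)` once per logical CPU you care about. The read-modify-write matters because the register holds a number of unrelated enable bits.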
You’re surely right about the incentives at play, and why no intelligence agency is going to spend millions doing x company’s validation work for them. But how in the world would selective patching work? How do you hide that?
Is the laptop using the HPET for the QPC timer (10 MHz)? If so, could the added latency[0] of calling into it be messing up the reported timings in the profiler?
I haven't checked. My workstation has a 2.53 MHz QPC that is expensive to call and (after stepping into the function) I found that:
1) The 2.53 MHz QPC is just the rdtsc frequency divided by 1,024
2) The slowdown comes from calling ReadTimeStampCounterFromEmulator, whose results it seems to discard
My laptop doesn't run at 10.24 GHz, so it is probably using HPET rather than rdtsc/1024. This makes QPC more expensive, but I don't think it is messing up the reported timings - events are just outright being lost, as far as I can tell.
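To spell out the arithmetic behind that guess: if QPC is rdtsc divided by 1,024, then the QPC frequency times 1,024 should be a plausible CPU clock. A quick sketch (the function name is mine; the two frequencies are the ones mentioned in this thread):

```python
TSC_DIVISOR = 1024  # divisor observed on the workstation


def implied_tsc_hz(qpc_hz: float) -> float:
    """TSC frequency implied by a QPC frequency, if QPC = rdtsc / 1024."""
    return qpc_hz * TSC_DIVISOR


# Workstation: 2.53 MHz QPC implies a believable CPU clock.
print(f"{implied_tsc_hz(2.53e6) / 1e9:.2f} GHz")  # 2.59 GHz

# Laptop: 10 MHz QPC would imply a clock no CPU runs at.
print(f"{implied_tsc_hz(10e6) / 1e9:.2f} GHz")    # 10.24 GHz
```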
It may seem that way, but at least here in the UK homelessness is a huge problem in the immigrant community. It is virtually invisible though. You can get kicked out of the UK as a migrant (even from the EU) for being homeless, so keeping a low profile is essential.
I'm not even sure they're going to let you program the FPGA directly (with HDL).
I expect the main workflow will be using OpenCL to offload arbitrary work to the coprocessor along with a few Intel-provided modules capable of common tasks.
The great thing about having the FPGA on-die via UPI is that the cache coherency, decreased latency and massive bandwidth will allow much more granular offloading of work. Compare this with a PCIe coprocessor, where it only makes sense to offload larger chunks of work and minimise the communication and data passing between the two.
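A back-of-the-envelope way to see why latency sets the granularity: offloading only wins when the accelerator's compute time plus the round trip is less than just doing the work on the CPU. The sketch below is a toy model with made-up parameter names, not anything measured on real hardware; it ignores bandwidth, batching and overlap entirely.

```python
def offload_pays_off(cpu_time_s: float, accel_time_s: float,
                     round_trip_s: float) -> bool:
    """True if offloading finishes sooner than running on the CPU."""
    return accel_time_s + round_trip_s < cpu_time_s


def break_even_cpu_time(speedup: float, round_trip_s: float) -> float:
    """Smallest task (in CPU seconds) for which offloading breaks even.

    Assumes accel_time = cpu_time / speedup, so we need
    cpu_time / speedup + round_trip < cpu_time, which rearranges to
    cpu_time > round_trip * speedup / (speedup - 1).
    """
    if speedup <= 1:
        raise ValueError("accelerator must be faster than the CPU")
    return round_trip_s * speedup / (speedup - 1)
```

In this model the break-even task size scales linearly with the round-trip cost, so a link with a tenth of the round-trip latency lets you profitably offload tasks a tenth of the size.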
The greater the granularity of work that we can offload, the more viable the OpenCL/high-level synthesis/heterogeneous computing type stuff will be, as it will integrate more seamlessly into existing software development methods. This is the holy grail at the moment for FPGA vendors: to get to the point where software developers can program them on their own.
As to your point, though, I guess we'll find out soon what the dev tools for this will actually look like.
Using OpenCL to program an FPGA is a significantly more difficult task than programming in HDL, at least for what would be relevant to a Xeon co-processor. The OpenCL flow is just terrible at getting to the performance levels you need to realistically offload anything from a Xeon. Intel are certainly working on it, but it's not a realistic proposition for the next 5 years, and if the Xeon+FPGA isn't already successful in that time frame, it'll be canned long before OpenCL is a solution.
From what I've seen the only applications for this will be pre-canned FPGA images that were written in-house by Intel for things like encryption or FEC.
I'm not so sure - it's a fantastic fit for GPUs. FPGAs make sense for stream processing such as video encoding, but that's more of a discrete-device play where you can plug the video feeds directly into the FPGA daughter board.
https://www.youtube.com/watch?v=ylkbjjykgG4&list=RDylkbjjykg...