This is expected: one HT processor is not a full core. HT (Hyper-Threading) lets two hardware threads share the execution resources of a single core. Without it, if one instruction in a series takes a long time, the code makes no further progress until that instruction completes. With HT, instructions from the second thread can run during that time, which usually raises throughput by about 5% on average and about 15% at most.
However, should the thread scheduler use the 2 HT units of the same core for 2 different threads, any synchronization between the 2 threads effectively renders HT useless. Furthermore, because the two units share the core’s caches, you can see other side effects that reduce throughput. This is why, when you’re implementing a work task scheduler (see Intel TBB for example), you have to pay attention to this and ‘pin’ your scheduler threads to the “main” logical processor of each core only (i.e. CPU0, CPU2, CPU4, CPU6, etc.) for better results.
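To make the pinning idea concrete, here is a minimal sketch of how such an affinity mask could be computed. The function names `primary_cpu_mask` and `cpus_in_mask` are hypothetical, and the sketch assumes that the two HT units of a core are numbered consecutively (CPU0/CPU1, CPU2/CPU3, and so on), which is common but not guaranteed on every system.

```python
from typing import List


def primary_cpu_mask(logical_count: int, threads_per_core: int = 2) -> int:
    """Build an affinity bitmask that selects the first logical CPU of each core.

    ASSUMPTION: sibling logical CPUs are numbered consecutively, so with two
    threads per core the "main" units are CPU0, CPU2, CPU4, ...
    """
    if logical_count <= 0 or threads_per_core <= 0:
        raise ValueError("logical_count and threads_per_core must be positive")
    mask = 0
    # Step over the logical CPUs one core at a time and set that core's first bit.
    for cpu in range(0, logical_count, threads_per_core):
        mask |= 1 << cpu
    return mask


def cpus_in_mask(mask: int) -> List[int]:
    """List the logical CPU indices whose bits are set in an affinity mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# 8 logical CPUs, 2 per core: selects CPU0, CPU2, CPU4 and CPU6.
main_units = cpus_in_mask(primary_cpu_mask(8))
```

A mask like this is the kind of value the Win32 `SetThreadAffinityMask` call expects, while on Linux the list form could be handed to `os.sched_setaffinity`. Whether the even-numbered units really are one per core should be checked against the actual topology before relying on it.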
The issue is that the Win32 API presents these HT units as if they were 2 real, independent CPU cores, which is not true, and it takes special care to manage them efficiently.
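As a rough illustration of that "special care", the sketch below works from per-core processor masks, the kind of data that `GetLogicalProcessorInformation` reports for its `RelationProcessorCore` entries. The function name `summarize_cores` and the sample masks are hypothetical; actually querying the API (for example through `ctypes`) is left out, so this only shows what could be done with the result.

```python
from typing import List, Tuple


def summarize_cores(core_masks: List[int]) -> Tuple[int, int, int]:
    """Return (physical_cores, logical_processors, one_unit_per_core_mask).

    core_masks holds one bitmask per physical core; each set bit is a logical
    processor (HT unit) that belongs to that core.
    """
    logical = 0
    one_per_core = 0
    for mask in core_masks:
        if mask <= 0:
            raise ValueError("each core mask must have at least one bit set")
        # Every set bit is one logical processor on this core.
        logical += bin(mask).count("1")
        # mask & -mask isolates the lowest set bit: the first HT unit of the core.
        one_per_core |= mask & -mask
    return len(core_masks), logical, one_per_core


# Hypothetical topology: 2 physical cores, each exposing 2 logical processors.
sample_masks = [0b0011, 0b1100]
cores, logical, pin_mask = summarize_cores(sample_masks)
```

Comparing the first two values shows how many of the "cores" the system advertises are really HT siblings, and the third value selects one logical processor per core without assuming any particular numbering.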
This is why, in general, CPUs without HT are better for gaming (or you can disable HT in the BIOS), and why for Flight Simulator (not flight simulation in general) single-core performance is key to choosing a CPU brand/model.