Sounds like either a bad thermal paste application or improperly torqued heat sink screws. The hotspot reading from the GPU is simply the “temperature of the hottest sensor”. You wouldn’t expect such QC issues from the factory, but you’d be surprised.
Heck, when I re-padded and re-pasted my Strix 3090, one of the memory thermal pads was barely making contact with the chips and backplate because the pad next to it was too thick, which stopped the backplate from seating on the memory pad even when properly tightened. After replacing all the pads and using 0.5 mm thinner pads in the adjacent positions, my memory temps dropped by 20°C.
This pic was taken immediately after removing the backplate. You can see the memory chips to the left weren’t making very good contact with their thermal pad.
I’d at least take the card out and make sure that the 4 screws on the backplate which surround the GPU chip are properly tightened. You don’t need to take anything apart to check those screws.
Better to RMA it. Someone I know had a hell of a time trying to return an expensive piece of kit (I forget what exactly) after the manufacturer noticed the screws had been tampered with (and I don’t mean a damaged label).
Good point. I RMA’d my 3090, which I had bought from eBay (yeah, I know, but I had no choice), after it died and took my motherboard’s slot with it. I got it sorted in the end when the person I bought it from confirmed I was the new owner, but I wouldn’t have wanted the stress of a £2000+ card dying on me and then finding I had no legal recourse for a replacement.
If it’s out of warranty, do what you like, but if not, don’t touch.
It definitely matters where in the world you reside. There are vastly different consumer protection laws around the world. In the US we have “right to repair”, and re-pasting and re-padding a GPU fall in this category and will not void the warranty.
That’s interesting. I once tossed a failed GPU because I had to remove the large heatsink to fit the supplied low-profile bracket, thus breaking the seal. Unnoticed by me, an errant piece of very thin copper thread had somehow stuck to a heatpad in the process, and a short rendered the card useless… I guess in the US I could have RMA’d it.
Well, I don’t claim to be a consumer law expert. But, in this example, I think technically you were at fault. If you had sent it in, and they determined that the thread had caused the failure, they may not have RMA’d it.
I don’t claim to fully understand it and there’s a lot of grey area. But if you damaged the GPU while replacing the pads, or if the failure were directly related to the pad replacement, it wouldn’t be covered. But, if you replaced the pads and the card failed for a different reason not related to the pad application, they would be obligated to replace it under warranty.
** Edit ** - I will add that I primarily replaced my pads and thermal paste because I mine with my card 98% of the time when I am not flying the sim. I was able to drop the memory temps a good 20°C by replacing the stock pads. If it fails, I will definitely try to RMA it, but I have already recovered most of the cost of the card in mining.
If I wasn’t interested in mining with the card, I likely would have left it alone. The temps when gaming were well within spec.
I’m hesitant to suggest an RMA over the high hotspot temperature. OP doesn’t seem to have any stability issues, and strong clock boosting is apparent from the provided screenshots. It’s not exactly the best time to RMA, either. Lots of folks on Reddit have posted nightmares with the process. Some people had their card replaced by a second-hand repaired card, and others had theirs completely lost during shipping.
I’m just saying I’d be cautious jumping to early conclusions.
This is my feeling so far. I certainly don’t feel competent to take the thing apart, and aside from the occasional MSFS crash which is obviously a stability issue with the sim, I’ve not had any crashes or performance issues since owning the machine.
I wouldn’t have known I had a problem until I started watching YouTube videos about VRAM temps.
I think I’ll just monitor the situation for now and see how it goes. I definitely don’t intend to RMA the card and be without a PC for several days/weeks, and I’m loath to void my warranty and open up my card to attempt a fix on something that isn’t broken.
After weeks of my GPU being super loud and my hotspot peaking at 105 degrees, I finally got MSI Afterburner and learnt how to undervolt. The YouTube video I watched recommended setting a 200 MHz underclock and then raising the curve point at 825 mV to, I believe, 1800 MHz.
Anyway, it has definitely helped. My hotspot now peaks at 100, which is when the fans kick in. They cool it to 98 and then quieten down again for a few minutes, coming back on for a few seconds whenever it goes over 100 again. It’s a huge improvement in fan noise, even if it’s only a 3 or 4 degree drop in hotspot temp.
Weirdly, my 3DMark benchmark scores have actually improved slightly, which I guess means I must have been thermal throttling before.
This is definitely a massive win for me. It seems stable, MSFS performance is great and everything is so much quieter. I definitely recommend undervolting if you are having hotspot temp issues like me.
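For anyone wondering why the fans cycle like that, here is a rough Python sketch of the on-at-100, off-at-98 behaviour described above. It is only a toy model: the function name `fan_states` and the sample readings are made up for illustration, and the real fan curve lives in the card's firmware or in Afterburner, not in anything like this.

```python
def fan_states(temps, on_at=100.0, off_at=98.0):
    """Return one boolean per reading: True while the fans are ramped up.

    Simple hysteresis: ramp up at or above `on_at`, ramp down at or
    below `off_at`, and otherwise keep doing whatever we were doing.
    The 100 / 98 thresholds come from the post above; everything else
    is an assumption for illustration.
    """
    ramped = False
    states = []
    for t in temps:
        if t >= on_at:
            ramped = True
        elif t <= off_at:
            ramped = False
        states.append(ramped)
    return states


# Hypothetical hotspot readings in degrees C, not real log data.
readings = [97, 99, 100, 99, 98, 99, 101, 98]
print(fan_states(readings))
```

The gap between the two thresholds is what stops the fans from flapping on and off every second when the hotspot sits right at the limit.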
Can you post a screenshot of your stats? I’m in a fairly graphically intensive place with current weather, and I’m not seeing a hotspot higher than 90.0. GPU is hovering around 76 with a fan speed of 2800 rpm. I’m running my card stock though (Strix 3090).
Once I got out of the weather, my temps dropped to 82 hot spot and 69 GPU. Fan speed dropped to 2350. Where are you testing at?
I’ll take a screenshot next time I fly. It sounds like yours has better thermal paste than mine. My GPU and memory temps are both really good: the GPU rarely goes above the low 70s and memory always scores well. It’s just the hotspot, which is always 20 to 30 degrees higher. I expect I have a bad thermal connection in one place.
I’m not going to risk my warranty by opening it up, and I don’t want to RMA as the problem seems to be widespread and unlikely to change with another card (all MSI Ventus cards seem to suffer from bad hotspot temps), so this undervolt will have to do.
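To put numbers on the gap being compared in the last few posts, here is a small Python sketch that works out the hotspot-minus-GPU delta. The Strix figures are the ones quoted above; the Ventus GPU temp of 72 is my assumption standing in for "low 70s", and the 20 degree threshold is just taken from the "20 to 30 degrees" remark, not from any manufacturer spec. The function names are made up for illustration.

```python
def hotspot_delta(gpu_temp, hotspot_temp):
    """Gap in degrees C between the hotspot and the reported GPU temp."""
    return hotspot_temp - gpu_temp


def worth_a_closer_look(gpu_temp, hotspot_temp, threshold=20):
    """True if the gap reaches the threshold.

    The default of 20 is an assumption borrowed from the thread,
    not an official limit.
    """
    return hotspot_delta(gpu_temp, hotspot_temp) >= threshold


# (GPU temp, hotspot temp) pairs in degrees C.
readings = {
    "Strix 3090, in weather": (76, 90),
    "Strix 3090, clear skies": (69, 82),
    "Ventus, undervolted (GPU temp assumed)": (72, 100),
}

for label, (gpu, hot) in readings.items():
    print(label, hotspot_delta(gpu, hot), worth_a_closer_look(gpu, hot))
```

A large delta on its own doesn't prove a bad paste job, but it is a quick way to compare cards on the same footing.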
My temps are not the highest here on my 3090 (memory junction peaks at 98, card peaks at 78 in MSFS), but it is interesting that temps are way higher in MSFS than any other game I play.
Is there room for temperature control from the devs on the software side?
Not that high. On my EVGA 3090 XC3 Ultra Hybrid I get the same memory temps on warmer days, even with an additional backplate cooler. MSFS is a killer on that side; it drives the GPU harder than any mining ever could.
To illustrate this: for the Port Royal benchmark I run this GPU at 1950 MHz/+1200 MHz RAM with ease, but in MSFS I now have to stay around 1750 MHz to avoid crashes that point to overload.