I've had my new PC for about 3 or 4 months. Today I was using it as usual when suddenly everything stopped responding. Rebooting just drops me into the UEFI interface, which is very concerning.
Then I booted a live USB to look into what's happening. smartctl shows that my SSD has 0% spare capacity left, despite only 15 TB having been written to it.
Until now, I only knew that Samsung's EVO 980 and 990 SSDs have a firmware bug that can cause this. This is the first time I've heard of a 970 Pro having this issue.
I know there are a lot of servers using consumer drives for their system. Be careful and check whether you are using a 970. If so, check the spare capacity RIGHT NOW and decide whether to upgrade the firmware or RMA the product.
Interesting, I am dealing with a similar issue right now. I have a Samsung 980 Pro 2 TB NVMe drive on which files have become corrupted over the last few months (it took me a while to notice). I am trying to get it replaced under warranty, but the company where I bought my laptop wants to run all sorts of tests and is suggesting that I reformat the SSD :S even though smartmontools lists a ton of errors (the SMART overall-health self-assessment test did say 'PASSED' though). I have only written 8.20 TB to it.
Can you check your spare capacity? Run:
smartctl -a /dev/nvme<id>
Also, which FS are you using? I have switched to a BTRFS root partition to catch these errors as early as I can.
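If you want to script the spare capacity check rather than read the smartctl output by eye, here is a rough Python sketch. It assumes smartmontools 7.0 or newer (for the `-j` JSON output) and that the NVMe health data appears under the `nvme_smart_health_information_log` key with the field names used below, which is my understanding of the JSON layout; verify against your own output. The function names and the "at or below threshold" rule are my own choices, not anything official.

```python
import json
import subprocess


def parse_nvme_health(smartctl_json):
    """Extract spare capacity and write volume from `smartctl -a -j` output.

    Assumption: field names follow smartctl's JSON output for NVMe drives.
    Missing fields are returned as None instead of raising.
    """
    report = json.loads(smartctl_json)
    log = report.get("nvme_smart_health_information_log", {})
    spare = log.get("available_spare")
    threshold = log.get("available_spare_threshold")
    units = log.get("data_units_written")
    return {
        "available_spare_pct": spare,
        "spare_threshold_pct": threshold,
        "percentage_used": log.get("percentage_used"),
        # One NVMe data unit is 1000 * 512 bytes = 512,000 bytes.
        "tb_written": None if units is None else units * 512_000 / 1e12,
        # Conservative rule (my assumption): flag at or below the threshold.
        "spare_low": (
            spare is not None and threshold is not None and spare <= threshold
        ),
    }


def read_nvme_health(device):
    """Run smartctl on `device` and parse the result (usually needs root)."""
    result = subprocess.run(
        ["smartctl", "-a", "-j", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to report warnings too
    )
    return parse_nvme_health(result.stdout)
```

Calling `read_nvme_health("/dev/nvme0")` as root (the device name is just an example) returns a dict. If `spare_low` is true, or `available_spare_pct` is dropping quickly while `tb_written` is still small, that matches the symptom described in this thread.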