My machine behaved very strangely last night after an intense gaming session. The system froze, except for the mouse and access to the ttys. Moving the mouse worked, but clicking opened no menu, and an attempt to log in on a tty hung just before opening a shell.
Luckily, I had glances open at the same time, and it kept running, showing a very high CPU_IOWAIT ratio. Could the hard disk be dead? One important detail: / is on an NVMe drive and /home is on a hard disk (the one that seems dead), both formatted as btrfs.
I rebooted the machine hoping to regain control, but at login KDE threw a whole bunch of errors at me, and /home was mounted read-only. At least I can open a console and see what is going on. dmesg returns some not very reassuring lines:
Code:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Code:
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
To make sure the problem really came from the disk and not from the motherboard or a cable, I connected it over USB to the same machine. Same errors, and it was still mounted read-only on /home. Then today I plugged that same hard disk over USB into another machine: I have read/write access to it, and smartctl is happy and even reports:
Code:
SMART overall-health self-assessment test result: PASSED
Code:
SMART Error Log Version: 1
ATA Error Count: 26 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 26 occurred at disk power-on lifetime: 13182 hours (549 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 20 ff ff ff 4f 00 00:09:08.574 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 00:09:08.565 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:08.564 READ FPDMA QUEUED
ef 10 03 00 00 00 a0 00 00:09:08.555 SET FEATURES [Enable SATA feature]
ef 10 02 00 00 00 a0 00 00:09:08.545 SET FEATURES [Enable SATA feature]
Error 25 occurred at disk power-on lifetime: 13182 hours (549 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 20 ff ff ff 4f 00 00:09:08.083 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:08.080 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:08.078 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:08.068 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 00:09:08.045 READ FPDMA QUEUED
Error 24 occurred at disk power-on lifetime: 13182 hours (549 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 20 ff ff ff 4f 00 00:09:07.216 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 00:09:07.196 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:07.195 READ FPDMA QUEUED
ef 10 03 00 00 00 a0 00 00:09:07.185 SET FEATURES [Enable SATA feature]
ef 10 02 00 00 00 a0 00 00:09:07.176 SET FEATURES [Enable SATA feature]
Error 23 occurred at disk power-on lifetime: 13182 hours (549 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 20 ff ff ff 4f 00 00:09:05.966 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:05.964 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:05.962 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 00:09:05.951 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 00:09:05.951 READ FPDMA QUEUED
Error 22 occurred at disk power-on lifetime: 13182 hours (549 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 20 ff ff ff 4f 00 00:09:03.606 READ FPDMA QUEUED
60 00 20 ff ff ff 4f 00 00:09:03.573 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 00:09:03.562 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 00:09:03.562 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 00:09:03.562 READ FPDMA QUEUED
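For anyone reading the log above, here is a rough sketch of how the register line of each error entry maps to the reported LBA and to the "549 days + 6 hours" figure. The function names `parse_register_line` and `decode_lba28` are my own, purely for illustration. The decoded address, 0x0fffffff, is the largest value a 28-bit LBA can hold, so it may be a placeholder rather than a real sector address; that reading is an assumption on my part, not something the log itself states.

```python
def parse_register_line(line: str) -> dict:
    """Parse the seven hex registers (ER ST SC SN CL CH DH) from an error entry."""
    names = ["ER", "ST", "SC", "SN", "CL", "CH", "DH"]
    values = [int(tok, 16) for tok in line.split()[:7]]
    return dict(zip(names, values))


def decode_lba28(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine the registers into a 28-bit LBA (low nibble of DH holds the top 4 bits)."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register line copied from the error entries in the log above.
regs = parse_register_line(
    "40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455"
)
lba = decode_lba28(regs["SN"], regs["CL"], regs["CH"], regs["DH"])

# Bit 6 (0x40) of the error register is the UNC (uncorrectable data) flag.
unc = bool(regs["ER"] & 0x40)

# Power-on lifetime from the log, split into whole days and remaining hours.
days, hours = divmod(13182, 24)

print(f"LBA = {lba:#010x} = {lba}, UNC = {unc}, lifetime = {days} days + {hours} hours")
```

All five logged errors carry the same register values, so they decode to the same address.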