
Ubuntu 20.04 - shutdown after overheating

I use a ThinkPad L13. I now have thermal problems, especially under full load. When I run my Python program, which uses all cores, my laptop shuts down shortly afterwards.

What have I tried so far? I installed TLP and thermald on my machine. I also changed the Intel settings in the BIOS to "balanced".

Two things happened recently:

  1. I installed Ubuntu 20.04.

  2. Because of graphics problems with my ThinkPad, the motherboard was replaced recently. Maybe it is a hardware problem, such as the cooler not being fitted correctly?

Before that, no problems occurred.

The command grep -i -e temp -e therm /var/log/syslog* produced the following output on this occasion:

Apr 29 09:20:50 omikron systemd[1]: Started Daily Cleanup of Temporary Directories.
Apr 29 09:20:50 omikron systemd[1]: Starting Thermal Daemon Service...
Apr 29 09:20:50 omikron kernel: [    0.221560] mce: CPU0: Thermal monitoring enabled (TM1)
Apr 29 09:20:50 omikron kernel: [    0.376125] ACPI: \_SB_.PR00: _OSC native thermal LVT Acked
Apr 29 09:20:50 omikron kernel: [    0.539054] thermal_sys: Registered thermal governor 'fair_share'
Apr 29 09:20:50 omikron kernel: [    0.539055] thermal_sys: Registered thermal governor 'bang_bang'
Apr 29 09:20:50 omikron kernel: [    0.539056] thermal_sys: Registered thermal governor 'step_wise'
Apr 29 09:20:50 omikron kernel: [    0.539056] thermal_sys: Registered thermal governor 'user_space'
Apr 29 09:20:50 omikron kernel: [    0.539057] thermal_sys: Registered thermal governor 'power_allocator'
Apr 29 09:20:50 omikron kernel: [    0.725855] thermal LNXTHERM:00: registered as thermal_zone0
Apr 29 09:20:50 omikron kernel: [    0.725856] ACPI: Thermal Zone [THM0] (31 C)
Apr 29 09:20:50 omikron kernel: [    2.056100] proc_thermal 0000:00:04.0: enabling device (0000 -> 0002)
Apr 29 09:20:50 omikron kernel: [    2.147392] proc_thermal 0000:00:04.0: Creating sysfs group for PROC_THERMAL_PCI
Apr 29 09:20:50 omikron kernel: [    2.412750] thermal thermal_zone5: failed to read out thermal zone (-61)
Apr 29 09:20:50 omikron sensors[826]: temp1:            N/A
Apr 29 09:20:50 omikron sensors[826]: coretemp-isa-0000
Apr 29 09:20:50 omikron sensors[826]: temp1:         +1.0°C
Apr 29 09:20:50 omikron sensors[826]: temp2:         +1.0°C
Apr 29 09:20:50 omikron sensors[826]: temp3:         +4.0°C
Apr 29 09:20:50 omikron sensors[826]: temp4:         +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp5:       +121.0°C
Apr 29 09:20:50 omikron sensors[826]: temp6:       +121.0°C
Apr 29 09:20:50 omikron sensors[826]: temp7:         +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp8:         +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp9:        +64.0°C
Apr 29 09:20:50 omikron sensors[826]: temp10:        +3.0°C
Apr 29 09:20:50 omikron sensors[826]: temp11:       -80.0°C
Apr 29 09:20:50 omikron sensors[826]: temp12:        +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp13:        +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp14:        +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp15:        +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp16:        +0.0°C
Apr 29 09:20:50 omikron sensors[826]: temp1:        +48.0°C  (crit = +98.0°C)
Apr 29 09:20:50 omikron thermald[822]: [WARN]22 CPUID levels; family:model:stepping 0x6:8e:c (6:142:12)
Apr 29 09:20:50 omikron thermald[822]: [WARN]Polling mode is enabled: 4
Apr 29 09:20:50 omikron thermald[822]: [WARN]sensor id 10 : No temp sysfs for reading raw temp
Apr 29 09:20:50 omikron thermald[822]: message repeated 2 times: [ [WARN]sensor id 10 : No temp sysfs for reading raw temp]
Apr 29 09:20:50 omikron thermald[822]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 29 09:20:50 omikron thermald[822]: [WARN]error: could not parse file /etc/thermald/thermal-conf.xml
Apr 29 09:20:50 omikron thermald[822]: [WARN]sysfs open failed
Apr 29 09:20:50 omikron thermald[822]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 29 09:20:50 omikron thermald[822]: [WARN]error: could not parse file /etc/thermald/thermal-conf.xml
Apr 29 09:20:50 omikron systemd[1]: Started Thermal Daemon Service.
Apr 29 09:20:50 omikron thermald[822]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 29 09:20:50 omikron thermald[822]: [WARN]error: could not parse file /etc/thermald/thermal-conf.xml
Apr 29 09:21:04 omikron gsd-print-notif[1262]: Source ID 3 was not found when attempting to remove it
Apr 29 09:29:01 omikron kernel: [  493.759292] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759293] mce: CPU4: Core temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759295] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759296] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759298] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759299] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759300] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759302] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759326] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.759327] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 09:29:01 omikron kernel: [  493.760277] mce: CPU4: Core temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760278] mce: CPU0: Core temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760279] mce: CPU5: Package temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760280] mce: CPU1: Package temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760281] mce: CPU6: Package temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760282] mce: CPU2: Package temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760283] mce: CPU0: Package temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760284] mce: CPU4: Package temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760317] mce: CPU7: Package temperature/speed normal
Apr 29 09:29:01 omikron kernel: [  493.760318] mce: CPU3: Package temperature/speed normal
Apr 29 09:35:50 omikron systemd[1]: Starting Cleanup of Temporary Directories...
Apr 29 09:35:50 omikron systemd[1]: Finished Cleanup of Temporary Directories.
Apr 29 10:14:58 omikron kernel: [ 3250.661431] mce: CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 10:14:58 omikron kernel: [ 3250.661431] mce: CPU7: Core temperature above threshold, cpu clock throttled (total events = 1)
Apr 29 10:14:58 omikron kernel: [ 3250.661433] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.661434] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.661435] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.661436] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.661437] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.661438] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.661438] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.661440] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 196)
Apr 29 10:14:58 omikron kernel: [ 3250.665320] mce: CPU3: Core temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665321] mce: CPU7: Core temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665322] mce: CPU2: Package temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665323] mce: CPU0: Package temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665324] mce: CPU4: Package temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665325] mce: CPU5: Package temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665325] mce: CPU6: Package temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665326] mce: CPU1: Package temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665327] mce: CPU7: Package temperature/speed normal
Apr 29 10:14:58 omikron kernel: [ 3250.665328] mce: CPU3: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.746988] mce: CPU4: Core temperature above threshold, cpu clock throttled (total events = 323)
Apr 29 10:20:05 omikron kernel: [ 3557.746989] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 323)
Apr 29 10:20:05 omikron kernel: [ 3557.746991] mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.746992] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.746993] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.746994] mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.747022] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.747023] mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.747025] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.747026] mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 650)
Apr 29 10:20:05 omikron kernel: [ 3557.749589] mce: CPU4: Core temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749590] mce: CPU0: Core temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749591] mce: CPU7: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749591] mce: CPU3: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749592] mce: CPU0: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749593] mce: CPU4: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749625] mce: CPU5: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749626] mce: CPU1: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749627] mce: CPU6: Package temperature/speed normal
Apr 29 10:20:05 omikron kernel: [ 3557.749628] mce: CPU2: Package temperature/speed normal
Apr 29 10:23:09 omikron kernel: [ 3741.654959] thermal thermal_zone0: critical temperature reached (100 C), shutting down
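The shutdown line above names thermal_zone0. As a rough sketch (not part of the original post), the zones can be read directly from sysfs to see which one is climbing. The function name print_zone_temps and its optional base-directory argument are assumptions introduced here for illustration; the values under /sys/class/thermal are reported in millidegrees Celsius.

```shell
# Print each thermal zone's name, type and temperature in whole degrees Celsius.
# print_zone_temps and its base-directory parameter are illustrative names,
# not part of any existing tool; the default path is the kernel's sysfs tree.
print_zone_temps() {
    local base="${1:-/sys/class/thermal}"
    local zone raw type
    for zone in "$base"/thermal_zone*; do
        # Skip zones without a readable temp file
        [ -r "$zone/temp" ] || continue
        # Some zones fail on read (like thermal_zone5 in the log); skip those too
        raw=$(cat "$zone/temp" 2>/dev/null) || continue
        [ -n "$raw" ] || continue
        type=$(cat "$zone/type" 2>/dev/null || echo unknown)
        # sysfs reports millidegrees Celsius; integer division gives whole degrees
        printf '%s (%s): %d C\n' "${zone##*/}" "$type" "$((raw / 1000))"
    done
}
```

Calling print_zone_temps with no argument reads the live system; running it in a loop with sleep while the load is active shows how quickly the zones approach the critical value.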

Edit (05/01/2020):

Today I had a Zoom meeting and the laptop got so hot that it shut down during the meeting. That is not supposed to happen, is it? What is going on here? I was not running any complicated computation. Maybe it has something to do with the power supply, since I had it plugged in?


Edit (05/09/2020):

I set the performance settings to the maximum level and ran the same stress test that is used in various temperature reviews of my laptop. Under Windows, I get values similar to theirs. Therefore, I think it must be a problem with the new Ubuntu 20.04. Somehow, Ubuntu does not lower the frequency enough for the temperature to come down.


Edit (07/19/2020):

I contacted Lenovo support and they repaired my notebook (whatever it was they did). For a few weeks it worked fine. Now I have the same problem again.

I updated my BIOS version, which helps but comes with another problem: the CPU drops to 400 MHz as soon as the temperature gets close to overheating. As a result, my notebook is barely usable for demanding tasks.
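To confirm a drop to 400 MHz like the one described, the per-core frequency can be read from the cpufreq files in sysfs. This is a minimal sketch, assuming the cpufreq interface is exposed under /sys/devices/system/cpu; the function name print_cpu_freqs and its base-directory argument are made up for this example. scaling_cur_freq is reported in kHz.

```shell
# Print the current frequency of every core in MHz.
# print_cpu_freqs and its base-directory parameter are illustrative names.
print_cpu_freqs() {
    local base="${1:-/sys/devices/system/cpu}"
    local f khz cpu
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        khz=$(cat "$f" 2>/dev/null) || continue
        [ -n "$khz" ] || continue
        cpu=${f#"$base"/}   # e.g. cpu0/cpufreq/scaling_cur_freq
        cpu=${cpu%%/*}      # e.g. cpu0
        # scaling_cur_freq is in kHz; integer division gives MHz
        printf '%s: %d MHz\n' "$cpu" "$((khz / 1000))"
    done
}
```

Running print_cpu_freqs repeatedly under load should show whether the cores sit at their minimum frequency once the temperature is high.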

As a possible solution, I disabled Intel Turbo Boost. Temperatures are now within tolerable ranges and everything works quite well. It is a compromise I am willing to accept.
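The post does not say how Turbo Boost was disabled (it may have been a BIOS setting). On systems that use the intel_pstate driver, one way to do it from Linux is the no_turbo file in sysfs, sketched below. The function name set_no_turbo and its optional path argument are assumptions for illustration; writing to the real file requires root.

```shell
# Write 1 (turbo off) or 0 (turbo on) to the intel_pstate no_turbo knob.
# set_no_turbo and its optional path parameter are illustrative names;
# the default path only exists when the intel_pstate driver is active.
set_no_turbo() {
    local value="$1"
    local knob="${2:-/sys/devices/system/cpu/intel_pstate/no_turbo}"
    case "$value" in
        0|1) ;;
        *) echo "usage: set_no_turbo 0|1 [path]" >&2; return 2 ;;
    esac
    if [ ! -e "$knob" ]; then
        echo "no_turbo not found at $knob (is intel_pstate in use?)" >&2
        return 1
    fi
    echo "$value" > "$knob"
}
```

From a root shell, set_no_turbo 1 turns turbo off and set_no_turbo 0 turns it back on. The value is not kept across reboots, so it would have to be reapplied at startup.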

11
YoungMath

A full hardware + software diagnosis is difficult to carry out via Ask Ubuntu in your case. Hardware problems are particularly difficult.

As a first diagnostic step, you could install another operating system side by side with your Ubuntu 20.04 and run intensive tests there as well.

You can run the same Python program (if you can configure it to use all cores). Even so, it might not run under exactly the same conditions in which you see the shutdowns. There are quite a few applications for testing performance, and they should be good enough (or even stricter than your program). It would also be free of any "contamination" from your possibly misconfigured Ubuntu 20.04 setup.
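If no dedicated test application is at hand, a crude all-core load can be produced with plain shell loops, as in the sketch below. This is only an illustration: the function name load_all_cores and its arguments are made up here, it relies on nproc and timeout from coreutils, and a busy loop is likely a lighter load than a purpose-built stress tool.

```shell
# Keep every core busy for a fixed number of seconds.
# load_all_cores is an illustrative name; arguments are the duration in
# seconds (default 60) and the number of workers (default: all cores).
load_all_cores() {
    local seconds="${1:-60}"
    local workers="${2:-$(nproc)}"
    local i=0
    while [ "$i" -lt "$workers" ]; do
        # timeout (coreutils) stops the busy loop after $seconds
        timeout "$seconds" sh -c 'while :; do :; done' &
        i=$((i + 1))
    done
    # Block until timeout has stopped every background worker
    wait
}
```

For example, load_all_cores 120 loads all cores for two minutes. Running it on both operating systems while watching the temperatures gives a like-for-like comparison.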

Later, when the full diagnosis is done, you can get rid of that operating system and reclaim the space for your Ubuntu.

Try this:

mkdir -p ~/helper

curl https://raw.githubusercontent.com/Sepero/temp-throttle/stable/temp_throttle.sh -o ~/helper/temp_throttle.sh
chmod +x ~/helper/temp_throttle.sh

# $HOME is expanded when the file is written, so the path still points at
# your home directory once the command runs as root.
cat <<EOF > ~/helper/temp_down.sh
#!/bin/bash
/usr/bin/sudo -H -S <<< "yourpassword" -p GNOME_SUDO_PASS -u root bash -c '$HOME/helper/temp_throttle.sh 65'
EOF

chmod +x ~/helper/temp_down.sh

Test it with:

  bash ~/helper/temp_down.sh

This is only to test whether it works; I do not recommend putting your password in easily accessible text files.

You can add it to the startup applications.
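On a GNOME desktop, a startup application is a .desktop file in ~/.config/autostart. The sketch below writes such an entry by hand as an alternative to the Startup Applications dialog. The function name add_autostart_entry, the file name temp-down.desktop and the Name value are assumptions chosen for this example, and the Exec line assumes the script path contains no spaces.

```shell
# Create an autostart entry that runs the wrapper script at login.
# add_autostart_entry, temp-down.desktop and the Name value are illustrative;
# arguments are the script path and the autostart directory.
add_autostart_entry() {
    local script="${1:-$HOME/helper/temp_down.sh}"
    local dir="${2:-$HOME/.config/autostart}"
    mkdir -p "$dir"
    # The heredoc is unquoted so $script is expanded into the Exec line
    cat > "$dir/temp-down.desktop" <<EOF
[Desktop Entry]
Type=Application
Name=Temperature throttle
Exec=bash $script
EOF
}
```

Calling add_autostart_entry with no arguments uses the paths from the commands above; deleting the .desktop file removes the entry again.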

0
kenn