Question regarding hdparm (or something else)

Hi all, :wave:

I've no idea whether the hdparm command is the correct topic here. So I'll give it a shot.

Usually, when I boot up my system (Linux Lite 6.2) and I get past the login screen the first thing I do is open a terminal and issue the command:

set serial 57584C314135364855375855; and echo (lsblk -o name,serial | grep $serial | awk '{print $1}'); and sleep 2; and sudo hdparm -B /dev/(lsblk -o name,serial | grep $serial | awk '{print $1}'); and sleep 2; and sudo hdparm -B 254 /dev/(lsblk -o name,serial | grep $serial | awk '{print $1}'); and sudo hdparm -B /dev/(lsblk -o name,serial | grep $serial | awk '{print $1}'); and sleep 3; and sudo smartctl -A -d sat /dev/(lsblk -o name,serial | grep $serial | awk '{print $1}'); and echo $status .

(fish syntax: “; and” is the same as “&&” in bash)

This command provides an output like this:

/dev/sdb:
 APM_level	= 128

/dev/sdb:
 setting Advanced Power Management level to 0xfe (254)
SG_IO: bad/missing sense data, sb[]:  f0 00 01 00 50 40 fe 0a 00 00 00 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 APM_level	= 254

/dev/sdb:
 APM_level	= 254
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   126   108   021    Pre-fail  Always       -       4691
  4 Start_Stop_Count        0x0032   097   097   000    Old_age   Always       -       3320
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11588
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2305
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       163
193 Load_Cycle_Count        0x0032   178   178   000    Old_age   Always       -       66066
194 Temperature_Celsius     0x0022   111   104   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

0

What it does is: it shows the current APM_level setting for my (external) HDD (128), sets it to 254 and in addition to that shows the HDD's health status.

As I said: normally that's the first thing I do after login. And it worked every time without fail. I.e.: it worked immediately.

Today, however, at some point during the execution of the various commands it seems to have failed. As this is a multi-command expression I don't know at which point. :slightly_frowning_face:
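
A side note on why the failing step is hard to spot: in a chain joined with `&&` in bash (or `; and` in fish), each command runs only if the previous one returned exit status 0, so the chain silently stops at the first failure. A minimal sketch of that behaviour, where `step` is a made-up helper used only for illustration:

```shell
#!/bin/bash

# Hypothetical helper: print a label, then return the given exit status.
step() {
    echo "step $1"
    return "$2"
}

# The second step returns 1, so the third step is never started.
step 1 0 && step 2 1 && step 3 0
chain_status=$?

echo "chain exit status: $chain_status"
```

Here `step 3` never runs because `step 2` returned 1. It also means that a trailing `echo $?` at the end of such a chain is only reached when everything before it succeeded, so it can only ever print 0.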

I tried it right after the failure for the second time and again it failed. :slightly_frowning_face:

I tried it a third time and suddenly it was executed perfectly, as always. :+1:

Hmm, :thinking:
I've never seen this behaviour before, so I shut down the system and performed a fresh (cold) start…
… with the same result. Only the third attempt was successful.

Once again I shut down the system and once again I started afresh. This time the very first attempt of the command sequence was successful, like it used to be. :smiling_face:

The only difference between the first two (failed) attempts and the last one was:
After the first and the second boot I issued the htop command before executing my “special” command.

On the third boot I omitted “htop” and issued my command first. Now it worked immediately.

I'm not saying that issuing the htop command first was the cause; I'm just pointing out the circumstances.

I also took a look at the log files with lnav, but couldn't see anything pointing towards any problems regarding the matter.

Does anyone have a clue what might have been going on :question:

Thanks a lot in advance.

Many greetings
Rosika :slightly_smiling_face:

Hi Rosika

The SMART output looks OK.
The old-fashioned way to debug that is to separate all the lines of that startup command and put some print statements in between, so you can see where it stops.

Regards
Neville


Hi Neville, :wave:

thanks for your evaluation.

O.K., could you provide an example for that? I'm afraid I'm not completely following you. Sorry, Neville. :neutral_face:

Cheers from Rosika :slightly_smiling_face:

You must have that long command stored as a file.
Edit the file and break it up; get rid of the “; and”s.
Turn it into a script.
Add some echoes in between the lines so you can see where it stops.
Run it as a script.
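
A rough sketch of what those steps could look like in bash. `run_step` is a hypothetical helper, and `true`/`false` are only stand-ins for the real hdparm and smartctl lines; the idea is just that every step announces itself and reports its exit status:

```shell
#!/bin/bash

# Hypothetical helper: announce a step, run it, and report its exit status.
run_step() {
    local label=$1
    shift
    echo ">>> $label"
    "$@"
    local rc=$?
    echo "<<< $label finished with exit status $rc"
    return "$rc"
}

# true and false are placeholders for the real commands.
run_step "step 1" true
run_step "step 2" false || echo "!!! step 2 is where it went wrong"
run_step "step 3" true
```

With one command per line, the last ">>>" marker printed before any error shows which step was running when things went wrong.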


Yes, I get it now.

Thanks a lot for your kind help, Neville. :heart:

Many greetings from Rosika :slightly_smiling_face:


UPDATE:

Hi Neville, :wave:

I did what you suggested.

Boy, it's been a long time since I last wrote a script :wink: .
I am so accustomed to the fish syntax that I really needed a few attempts to get the script right with the bash syntax.

Well, it looks like this now:

batcat hdparm_script.sh
#!/bin/bash

serial=57584C314135364855375855
echo $(lsblk -o name,serial | grep $serial | awk '{print $1}')
sleep 2
sudo hdparm -B /dev/$(lsblk -o name,serial | grep $serial | awk '{print $1}')
sleep 2
sudo hdparm -B 254 /dev/$(lsblk -o name,serial | grep $serial | awk '{print $1}')
sleep 2
sudo hdparm -B /dev/$(lsblk -o name,serial | grep $serial | awk '{print $1}')
sleep 2
sudo smartctl -A -d sat /dev/$(lsblk -o name,serial | grep $serial | awk '{print $1}')
echo $?
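
One possible refinement, offered only as a sketch: the script looks the device name up again for every command and would hand a bare `/dev/` to hdparm if the lookup ever came back empty. The variant below resolves the name once and checks that something was found. `name_for_serial` and `resolve_disk` are hypothetical helper names, and the real hdparm/smartctl calls are left as a comment:

```shell
#!/bin/bash

# Hypothetical helper: reads "NAME SERIAL" lines on stdin and prints the
# name of the first line whose serial column equals $1.
# Appending "" to both sides forces a string comparison in awk.
name_for_serial() {
    awk -v want="$1" '($2 "") == (want "") { print $1; exit }'
}

# Hypothetical helper: prints /dev/<name> for the serial, or fails if
# no matching disk is listed.
resolve_disk() {
    local name
    # -d: whole disks only, -n: no header line
    name=$(lsblk -dn -o NAME,SERIAL 2>/dev/null | name_for_serial "$1")
    [ -n "$name" ] && echo "/dev/$name"
}

serial=57584C314135364855375855

if dev=$(resolve_disk "$serial"); then
    echo "using $dev"
    # The hdparm and smartctl calls from the script would go here,
    # each with "$dev" instead of repeating the lsblk pipeline.
else
    echo "no disk with serial $serial found, nothing done" >&2
fi
```

Matching the serial column exactly (instead of `grep`) also avoids picking up a line that merely contains the serial as part of a longer string.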

As I didn't want to shut the system down for doing the experiment and then boot it up again, I decided to manually set the APM_level back to 128 again (fish syntax):

set serial 57584C314135364855375855
sudo hdparm -B 128 /dev/(lsblk -o name,serial | grep $serial | awk '{print $1}')

Then I let the script run in “control” mode (-x option):

-x: Print commands and their arguments as they are executed.

(from bash man pages).

bash -x hdparm_script.sh
+ serial=57584C314135364855375855
++ lsblk -o name,serial
++ grep 57584C314135364855375855
++ awk '{print $1}'
+ echo sdb
sdb
+ sleep 2
++ lsblk -o name,serial
++ grep 57584C314135364855375855
++ awk '{print $1}'
+ sudo hdparm -B /dev/sdb
[sudo] password for [...]
/dev/sdb:
 APM_level	= 128
+ sleep 2
++ lsblk -o name,serial
++ grep 57584C314135364855375855
++ awk '{print $1}'
+ sudo hdparm -B 254 /dev/sdb

/dev/sdb:
 setting Advanced Power Management level to 0xfe (254)
SG_IO: bad/missing sense data, sb[]:  f0 00 01 00 50 40 fe 0a 00 00 00 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 APM_level	= 254
+ sleep 2
++ lsblk -o name,serial
++ grep 57584C314135364855375855
++ awk '{print $1}'
+ sudo hdparm -B /dev/sdb

/dev/sdb:
 APM_level	= 254
+ sleep 2
++ lsblk -o name,serial
++ grep 57584C314135364855375855
++ awk '{print $1}'
+ sudo smartctl -A -d sat /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-76-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   126   108   021    Pre-fail  Always       -       4691
  4 Start_Stop_Count        0x0032   097   097   000    Old_age   Always       -       3320
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11591
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2305
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       163
193 Load_Cycle_Count        0x0032   178   178   000    Old_age   Always       -       66066
194 Temperature_Celsius     0x0022   110   104   000    Old_age   Always       -       37
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

+ echo 0
0

O.K., it ran through without any errors. :+1:

No idea why the commands seemed to have produced some hiccups when I first started the system today… :thinking:

It almost seems as if some component/process took longer than expected to load after boot and thus wasn't available at the time when I first issued the command(s).
But that's just a guess of course…
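
If the disk really is just slow to become available after boot (which is only a guess), one way to make the script tolerant of that would be to retry a readiness check a few times before giving up. A minimal sketch; `retry` and `disk_ready` are hypothetical helpers, and `/dev/sdb` is simply the name the disk had in the output above:

```shell
#!/bin/bash

# Hypothetical helper: run a command until it succeeds or the attempts
# are used up. Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    local attempts=$1 delay=$2
    shift 2
    local n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}

# Stand-in readiness check: succeeds once the path exists as a block device.
disk_ready() {
    [ -b "$1" ]
}

# /dev/sdb is only an example name taken from the output in this thread.
if retry 3 1 disk_ready /dev/sdb; then
    echo "/dev/sdb is available"
else
    echo "/dev/sdb did not show up" >&2
fi
```

The messages on stderr would also show how many attempts were needed, which might help to confirm or rule out the "not ready yet" theory.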

The only thing that's new is that I have had zRAM enabled for two days.

I've done that via “Lite Tweaks”. Whether or not that would cause such a potential “delay” is something I really don't know…

Well, I'll have to wait for tomorrow and then run the script again directly after the login process.

Thanks a lot for your help, Neville. :heart:

Many greetings from Rosika :slightly_smiling_face:


@nevj :

Hi again, Neville, :wave:

no problems today, it seems :smiley: .

After booting, when I got past the login screen, the first thing I did was run the script (again as bash -x [...]).
It went through smoothly. Nothing to complain about.

I guess I'll be doing it this way for at least the next few days.
Looks good so far.

Thanks for the suggestion to run a script, Neville. That surely was the wisest course.

Many greetings from Rosika :slightly_smiling_face:
