CCB request completed with an error

FreeNAS server was randomly crashing:

(da0:umass-sim0:0:0:0): Retrying command
(da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 00 24 26 9d 00 00 10 00
(da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da0:umass-sim0:0:0:0): Error 5, Retries exhausted
root@freenas:/mnt/data #

The USB drive is causing this.
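
For reference, the failing command in the log above can be decoded by hand. This is a minimal sketch, assuming the standard SCSI WRITE(10) layout (opcode 0x2a, a big-endian 4-byte LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8); decode_write10 is a hypothetical helper name, not part of any FreeBSD tool.

```python
def decode_write10(cdb_hex):
    """Decode the LBA and block count from a WRITE(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)          # whitespace between bytes is allowed
    if len(cdb) != 10 or cdb[0] != 0x2A:  # 0x2a is the WRITE(10) opcode
        raise ValueError("not a WRITE(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # logical block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }


info = decode_write10("2a 00 00 24 26 9d 00 00 10 00")
print(info)
```

For the CDB in the log this gives a 16-block write starting at LBA 2369181 (0x24269d).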

root@wine:~ # usbconfig
ugen0.1: <Intel EHCI root HUB> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.1: <Intel EHCI root HUB> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.2: <vendor 0x8087 product 0x0024> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.2: <vendor 0x8087 product 0x0024> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.3: <SanDisk Cruzer Blade> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (224mA)

root@wine:~ # usbconfig -u 0 -a 3 dump_device_desc
ugen0.3: <SanDisk Cruzer Blade> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (224mA)

  bLength = 0x0012
  bDescriptorType = 0x0001
  bcdUSB = 0x0210
  bDeviceClass = 0x0000  <Probed by interface class>
  bDeviceSubClass = 0x0000
  bDeviceProtocol = 0x0000
  bMaxPacketSize0 = 0x0040
  idVendor = 0x0781
  idProduct = 0x5567
  bcdDevice = 0x0100
  iManufacturer = 0x0001  <SanDisk>
  iProduct = 0x0002  <Cruzer Blade>
  iSerialNumber = 0x0003  <4C531001561109121142>
  bNumConfigurations = 0x0001

Switched to a different USB stick and the problem went away.

Lesson: not all USB sticks are created equal.

2018/05/28 14:43

PEBS disabled due to CPU errata

I noticed this in /var/log/messages on my freshly installed CentOS 7.5 system.

May 24 09:46:43 localhost kernel: smpboot: CPU0: Intel(R) Xeon(R) CPU E31220 @ 3.10GHz (fam: 06, model: 2a, stepping: 07)
May 24 09:46:43 localhost kernel: Performance Events: PEBS fmt1+, 16-deep LBR, SandyBridge events, full-width counters, Intel PMU driver.
May 24 09:46:43 localhost kernel: core: PEBS disabled due to CPU errata, please upgrade microcode
May 24 09:46:43 localhost kernel: ... version:                3
May 24 09:46:43 localhost kernel: ... bit width:              48
May 24 09:46:43 localhost kernel: ... generic registers:      8
May 24 09:46:43 localhost kernel: ... value mask:             0000ffffffffffff
May 24 09:46:43 localhost kernel: ... max period:             00007fffffffffff
May 24 09:46:43 localhost kernel: ... fixed-purpose events:   3
May 24 09:46:43 localhost kernel: ... event mask:             00000007000000ff

Red Hat has this to say about it (ref: https://access.redhat.com/solutions/634443):

Root Cause

  • Clovertown and SandyBridge processors have errata regarding PEBS functionality.

Diagnostic Steps

  • Look at /proc/cpuinfo for model number 15, 42 or 45

So I checked /proc/cpuinfo:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Xeon(R) CPU E31220 @ 3.10GHz
stepping        : 7
microcode       : 0x29
cpu MHz         : 1599.951
cache size      : 8192 KB
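
The same check can be scripted. This is a minimal sketch that parses /proc/cpuinfo-style text and reports any of the model numbers named in the Red Hat note (15, 42, 45); the function names are hypothetical, and the model list is only what the note above quotes.

```python
# Model numbers listed in the Red Hat note quoted above.
AFFECTED_MODELS = {15, 42, 45}


def parse_cpuinfo(text):
    """Return one dict of fields per processor block in cpuinfo-style text."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            cpus.append(fields)
    return cpus


def pebs_errata_models(text):
    """Return the sorted affected model numbers found in the text."""
    found = set()
    for cpu in parse_cpuinfo(text):
        model = cpu.get("model", "")
        if model.isdigit() and int(model) in AFFECTED_MODELS:
            found.add(int(model))
    return sorted(found)


sample = """processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Xeon(R) CPU E31220 @ 3.10GHz
stepping        : 7
microcode       : 0x29
"""
print(pebs_errata_models(sample))
```

On a live system the text would come from reading /proc/cpuinfo; the microcode field of each parsed block can be compared before and after the update in the same way.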

Install the microcode packages and reboot:

# yum install microcode_ctl.x86_64
# yum install iucode-tool
# reboot

After the reboot, /var/log/messages shows:

May 26 18:55:50 wine kernel: microcode: microcode updated early to revision 0x2d, date = 2018-02-07

/proc/cpuinfo confirms the new microcode revision (0x29 before, 0x2d now):

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Xeon(R) CPU E31220 @ 3.10GHz
stepping        : 7
microcode       : 0x2d

The microcode is being updated as the system loads.

2018/05/26 19:04

Laptop Hard Reboot

Model: HP EliteBook 745 G3

For a number of months my laptop has been doing what I can only describe as a hard reboot. For no apparent reason it power cycles: no Windows shutdown message, no warning. It's as if somebody has come along and pulled the power cord or held down a reset button.

As this is the 2nd time it's happened in less than 2 weeks, it is really starting to cause me problems, as I'm losing work. You can see the last event, 4/24/2017, on the reliability history report below.

Clicking on "View technical details" reveals this snippet of information:

The computer has rebooted from a bugcheck.  
The bugcheck was: 0x00000124 (0x0000000000000000, 0xffffe000f3f6c838, 0x0000000000000000, 0x0000000000000000). 
A dump was saved in: C:\WINDOWS\Minidump\050517-14718-01.dmp. Report Id: e55192e8-8d9d-45c1-8c03-e14e66640510.

Well, at least Windows agrees with me that it was not shut down correctly. Let's see if we can find out why.

We are going to need some additional Windows tools to read that minidump.

I've been keeping a log, but the minidump directory already has each occurrence timestamped for me. As you can see, this is the 8th time it has happened.

After installing the WDK, we need to fire up WinDbg, which can be found here.

Pulling the 0505 minidump into WinDbg, this is what it tells us:

Microsoft (R) Windows Debugger Version 10.0.15063.0 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\Minidump\050517-14718-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: srv*
Executable search path is: 
Windows 10 Kernel Version 10586 MP (4 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 10586.839.amd64fre.th2_release.170303-1605
Machine Name:
Kernel base = 0xfffff803`d0218000 PsLoadedModuleList = 0xfffff803`d04f5c90
Debug session time: Fri May  5 13:51:37.032 2017 (UTC - 4:00)
System Uptime: 0 days 0:00:02.748
Loading Kernel Symbols
..

Press ctrl-c (cdb, kd, ntsd) or ctrl-break (windbg) to abort symbol loads that take too long.
Run !sym noisy before .reload to track down problems loading symbols.

.............................................................
..
Loading User Symbols
Mini Kernel Dump does not contain unloaded driver list
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.
BugCheck 124, {0, ffffe000f3f6c838, 0, 0}
Probably caused by : AuthenticAMD
Followup:     MachineOwner
---------
2: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: ffffe000f3f6c838, Address of the WHEA_ERROR_RECORD structure.
Arg3: 0000000000000000, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000000000000, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------

DUMP_CLASS: 1
DUMP_QUALIFIER: 400
BUILD_VERSION_STRING:  10.0.10586.839 (th2_release.170303-1605)
DUMP_TYPE:  2
BUGCHECK_P1: 0
BUGCHECK_P2: ffffe000f3f6c838
BUGCHECK_P3: 0
BUGCHECK_P4: 0
BUGCHECK_STR:  0x124_AuthenticAMD
CPU_COUNT: 4
CPU_MHZ: 705
CPU_VENDOR:  AuthenticAMD
CPU_FAMILY: 15
CPU_MODEL: 60
CPU_STEPPING: 1
CUSTOMER_CRASH_COUNT:  1
DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT
PROCESS_NAME:  System
CURRENT_IRQL:  0
ANALYSIS_SESSION_HOST:  ENGLAND1
ANALYSIS_SESSION_TIME:  05-05-2017 14:34:37.0456
ANALYSIS_VERSION: 10.0.15063.0 amd64fre

STACK_TEXT:  
ffffd000`ab3245b0 fffff803`d05c77cd : 00000000`00000000 ffffe000`f3f6c810 fffff803`d04e96a0 fffff803`d05aa340 : nt!WheapCreateLiveTriageDump+0x81
ffffd000`ab324ae0 fffff803`d0428c94 : ffffe000`f3f6c810 ffffe000`f3f73030 ffffd000`ab324af8 00000000`00000000 : nt!WheapCreateTriageDumpFromPreviousSession+0x2d
ffffd000`ab324b10 fffff803`d0429dd9 : fffff803`d04e9640 fffff803`d04e9640 fffff803`d04e96a0 fffff803`d028d710 : nt!WheapProcessWorkQueueItem+0x48
ffffd000`ab324b50 fffff803`d025dcf9 : fffff803`d05aa200 ffffe000`f3bb8040 fffff803`00000000 ffffe000`f42fca48 : nt!WheapWorkQueueWorkerRoutine+0x25
ffffd000`ab324b80 fffff803`d02cd9b5 : 00000205`b4bbbdff 00000000`00000080 ffffe000`f2427680 ffffe000`f3bb8040 : nt!ExpWorkerThread+0xe9
ffffd000`ab324c10 fffff803`d035fae6 : fffff803`d0534180 ffffe000`f3bb8040 fffff803`d02cd974 00000000`00000000 : nt!PspSystemThreadStartup+0x41
ffffd000`ab324c60 00000000`00000000 : ffffd000`ab325000 ffffd000`ab31f000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16


STACK_COMMAND:  kb
THREAD_SHA1_HASH_MOD_FUNC:  26acd050bd9f055d0a04825d57b9e0e6be9c1a07
THREAD_SHA1_HASH_MOD_FUNC_OFFSET:  5e1e1a155874296ef3d407b143c830e84a016e94
THREAD_SHA1_HASH_MOD:  30a3e915496deaace47137d5b90c3ecc03746bf6
FOLLOWUP_NAME:  MachineOwner
MODULE_NAME: AuthenticAMD
IMAGE_NAME:  AuthenticAMD
DEBUG_FLR_IMAGE_TIMESTAMP:  0
FAILURE_BUCKET_ID:  0x124_AuthenticAMD_PROCESSOR_BUS_PRV
BUCKET_ID:  0x124_AuthenticAMD_PROCESSOR_BUS_PRV
PRIMARY_PROBLEM_CLASS:  0x124_AuthenticAMD_PROCESSOR_BUS_PRV
TARGET_TIME:  2017-05-05T17:51:37.000Z
OSBUILD:  10586
OSSERVICEPACK:  839
SERVICEPACK_NUMBER: 0
OS_REVISION: 0
SUITE_MASK:  272
PRODUCT_TYPE:  1
OSPLATFORM_TYPE:  x64
OSNAME:  Windows 10
OSEDITION:  Windows 10 WinNt TerminalServer SingleUserTS
OS_LOCALE:  
USER_LCID:  0
OSBUILD_TIMESTAMP:  2017-03-03 23:13:02
BUILDDATESTAMP_STR:  170303-1605
BUILDLAB_STR:  th2_release
BUILDOSVER_STR:  10.0.10586.839
ANALYSIS_SESSION_ELAPSED_TIME:  b84
ANALYSIS_SOURCE:  KM
FAILURE_ID_HASH_STRING:  km:0x124_authenticamd_processor_bus_prv
FAILURE_ID_HASH:  {6fd7875b-9a1b-9e09-d6d6-816026a875c8}

Followup:     MachineOwner
---------

Decoding Arg2 (the address of the WHEA_ERROR_RECORD) from the WHEA_UNCORRECTABLE_ERROR (124):

2: kd> !errrec ffffe000f3f6c838
===============================================================================
Common Platform Error Record @ ffffe000f3f6c838
-------------------------------------------------------------------------------
Record Id     : 01d2c5c83ca54560
Severity      : Fatal (1)
Length        : 928
Creator       : Microsoft
Notify Type   : Machine Check Exception
Timestamp     : 5/5/2017 17:51:37 (UTC)
Flags         : 0x00000002 PreviousError

===============================================================================
Section 0     : Processor Generic
-------------------------------------------------------------------------------
Descriptor    @ ffffe000f3f6c8b8
Section       @ ffffe000f3f6c990
Offset        : 344
Length        : 192
Flags         : 0x00000001 Primary
Severity      : Fatal

Proc. Type    : x86/x64
Instr. Set    : x64
Error Type    : BUS error
Operation     : Generic
Flags         : 0x00
Level         : 3
CPU Version   : 0x0000000000660f01
Processor ID  : 0x0000000000000000

===============================================================================
Section 1     : x86/x64 Processor Specific
-------------------------------------------------------------------------------
Descriptor    @ ffffe000f3f6c900
Section       @ ffffe000f3f6ca50
Offset        : 536
Length        : 128
Flags         : 0x00000000
Severity      : Fatal

Local APIC Id : 0x0000000000000000
CPU Id        : 01 0f 66 00 00 08 04 00 - 0b 32 d8 7e ff fb 8b 17
                00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
                00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

Proc. Info 0  @ ffffe000f3f6ca50

===============================================================================
Section 2     : x86/x64 MCA
-------------------------------------------------------------------------------
Descriptor    @ ffffe000f3f6c948
Section       @ ffffe000f3f6cad0
Offset        : 664
Length        : 264
Flags         : 0x00000000
Severity      : Fatal

Error         : BUSLG_OBS_ERR_*_NOTIMEOUT_ERR (Proc 0 Bank 4)
  Status      : 0xfa000010000b0c0f

REF

https://answers.microsoft.com/en-us/windows/forum/windows_7-performance/help-windows-7-bsod-system-service-exception/7f165f52-d13b-4c1f-8160-f8483727c874?page=2

According to that thread, my options are either:

  • Your RAM is faulty (Bank 4 = 4th DIMM slot). Run Memtest for NO LESS than ~8 passes (several hours).
  • Your motherboard is faulty, and will need to be replaced.

Either way it's a hardware problem. Others are reporting the same.

Digging deeper: ref https://davidcmoisan.wordpress.com/2010/07/01/bad-hardware-day-more-on-hardware-bluescreens/

2: kd> .formats 0xfa000010000b0c0f
Evaluate expression:
  Hex:     fa000010`000b0c0f
  Decimal: -432345495507366897
  Octal:   1750000001000002606017
  Binary:  11111010 00000000 00000000 00010000 00000000 00001011 00001100 00001111
  Chars:   ........
  Time:    ***** Invalid FILETIME
  Float:   low 1.01452e-039 high -1.66154e+035
  Double:  -4.53808e+279
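
The binary view is easier to read once split into fields. This is a minimal sketch, assuming the architectural x86 MCi_STATUS layout (flag bits 63..57, MCA error code in bits 15..0) and the compound bus/interconnect error code pattern 0000 1PPT RRRR IILL; the table and function names are my own, and the "*" for II=11 simply mirrors what !errrec printed above.

```python
# Architectural flag bits in the top byte of an x86 MCi_STATUS register.
FLAG_BITS = [(63, "VAL"), (62, "OVER"), (61, "UC"), (60, "EN"),
             (59, "MISCV"), (58, "ADDRV"), (57, "PCC")]

# Field tables for a bus/interconnect error code: 0000 1PPT RRRR IILL.
LL = ["L0", "L1", "L2", "LG"]       # memory hierarchy level
PP = ["SRC", "RES", "OBS", "GEN"]   # participation (GEN = generic)
II = ["M", "reserved", "IO", "*"]   # memory, reserved, I/O, other
RRRR = ["ERR", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"]


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and the MCA error code."""
    decoded = {
        "flags": [name for bit, name in FLAG_BITS if (status >> bit) & 1],
        "mca_code": status & 0xFFFF,
    }
    code = decoded["mca_code"]
    if code >> 11 == 1:  # bits 15..11 are 00001: bus/interconnect error
        rrrr = (code >> 4) & 0xF
        decoded["mnemonic"] = "BUS{}_{}_{}_{}_{}_ERR".format(
            LL[code & 3],
            PP[(code >> 9) & 3],
            RRRR[rrrr] if rrrr < len(RRRR) else "reserved",
            II[(code >> 2) & 3],
            "TIMEOUT" if (code >> 8) & 1 else "NOTIMEOUT",
        )
    return decoded


print(decode_mci_status(0xFA000010000B0C0F))
```

For this status the set flags are VAL, OVER, UC, EN, MISCV and PCC, with ADDRV clear, and the mnemonic comes out as the same BUSLG_OBS_ERR_*_NOTIMEOUT_ERR string that !errrec reported. In other words: an uncorrected error with the processor context corrupt and no address recorded, which on its own does not single out a particular DIMM.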

I wonder if this is heat related, as I had the cooling policy set to Passive because the fan was noisy. Probably not, since that would imply this laptop could never run on battery! I will change it back to Active and see if that helps, and put up with the fan spinning for a while.

That did not help. I'm still suffering random reboots; the latest happened on 16-Jun-2017.

This machine is being returned. I cannot tolerate a computer that randomly reboots.

2017/06/22 12:06

Windows patches and SHA1

Windows update download URLs contain a SHA1 checksum as part of the filename:

http://www.download.windowsupdate.com/msdownload/update/v3-19990518/cabpool/windowsserver2003-kb824141-x86-enu_90853a52ea80f7da3c5460ef102ade3.exe

You can download the file and then use the SHA1 checksum from the URL itself to validate that the file downloaded correctly. Sounds like a good idea. It is, until MS screw up the SHA1 in the URL.

# openssl sha1 windowsserver2003-kb824141-x86-enu_90853a52ea80f7da3c5460ef102ade3.exe
SHA1(windowsserver2003-kb824141-x86-enu_90853a52ea80f7da3c5460ef102ade3.exe)= bfa8072aa29dbe552f952cdb42b1f635072ae081
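
The comparison can be automated. This is a minimal sketch, assuming the checksum is the run of hex digits after the last underscore in the filename; embedded_digest and sha1_matches are hypothetical helper names.

```python
import hashlib
import os
import re


def embedded_digest(filename):
    """Return the hex string after the last underscore of the name, or None."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    match = re.search(r"_([0-9a-f]+)$", stem)
    return match.group(1) if match else None


def sha1_matches(filename, data):
    """True if the SHA1 of data equals the digest embedded in filename."""
    expected = embedded_digest(filename)
    return expected is not None and hashlib.sha1(data).hexdigest() == expected


name = "windowsserver2003-kb824141-x86-enu_90853a52ea80f7da3c5460ef102ade3.exe"
print(embedded_digest(name), len(embedded_digest(name)))
```

To check a download, read the file in binary mode and pass its bytes as data. Note that the string embedded in this particular name is 31 hex digits, shorter than the 40 digits of a full SHA1, so a strict equality test cannot succeed for it.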

This is the list of filenames I've discovered where the SHA1 in the URL does not match the one computed from the file:

['windowsserver2003-kb824141-x86-enu_90853a52ea80f7da3c5460ef102ade3.exe',
 'msjavwu_8073687b82d41db93f4c2a04af2b34d.exe',
 'windowsserver2003-kb835732-x86-enu_9c2348f833ade0cca439ec6b2a92179.exe',
 'windowsmedia9-kb819639-x86-enu_57af369562f19dc35e69681660521fb.exe',
 'windowsserver2003-kb828741-x86-enu_1e3156bf5ec0354f542c38f309bab49.exe',
 'windowsserver2003-kb819696-x86-enu_41cdc8619ebb756106ea383c055530d.exe',
 'windowsserver2003-kb825119-x86-enu_329e94ea193be4c2d2f8d9bfc4daf23.exe',
 'windowsserver2003-kb840374-x86-enu_eeafbc20c2402b1c951d155d3d2cb9c.exe',
 'windowsserver2003-kb837001-x86-enu_0a248bb59a71c52a288c837779ac98e.exe',
 'windowsserver2003-kb823980-x86-enu_7f97e0d2355f670acb9384ad0933515.exe',
 'windowsserver2003-kb824146-x86-enu_f759bdcfdc906b0b35ad697a29ed1a1.exe',
 'windowsserver2003-kb823559-x86-enu_d8d3b25c5678c692e29cf971a6c38fa.exe',
 'windowsserver2003-kb824105-x86-enu_c7fd830ee6b1c3bb594be4f7a61f43c.exe',
 'windowsserver2003-kb828028-x86-enu_52dce385c001ce81c2514c3fb1cac7e.exe',
 'windowsserver2003-kb828035-x86-enu_d1df77e311740d6c012bcda5a7f821f.exe',
 'directx9-kb819696-x86-enu_977f8cc86c1e151a0168d1296210913.exe',
 'windowsserver2003-kb830352-x86-enu_d67acb6c784dd87961c8070943dadd8.exe',
 'sql2000-kb815495-8.00.0818-enu_4c77bb3f492fb1670b90b477d674e7e.exe',
 'windowsserver2003-kb823182-x86-enu_c7ee6a3716815554656d98ed9bc85d5.exe',
 'windowsxp-kb883939-x64-enu_9e1efe32675530155c34f7af1172a6d496e1e5ee.exe',
 'ndp10_sp_q321884_en_0fc8b14a073e01a03c27c948d254feedaa79feae.exe']
2015/05/19 12:07

Python decompress PACK_MAGIC

A file compressed with the pack format starts with the magic bytes \037\036 in octal, or 0x1f 0x1e in hex.

gzip can decode this, as can the pcat program. As an exercise I converted the unpack.c module in gzip into its Python equivalent.

The slowest part of the code is the look_bits function, and this is where you can see how an interpreted language grinds compared to C.

Using the excellent line profiler: https://pypi.python.org/pypi/line_profiler/

Timer unit: 1e-06 s = 1uS

Total time: 43.3399 s
File: unpack.py
Function: look_bits at line 36

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    36                                               @profile
    37                                               def look_bits(self,bits,mask):
    38    351442       265280      0.8      0.6          while(self.valid < bits):
    39    140575     10361348     73.7     23.9              self.bitbuf <<= 8
    40    140575     14576495    103.7     33.6              self.bitbuf |= next(self.get_byte)
    41    140575       189102      1.3      0.4              self.valid += 8
    42    210867     17947709     85.1     41.4          return (self.bitbuf >> (self.valid - bits)) & mask
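
One possible contributor to the cost of the shift lines, assuming the port never trims bitbuf elsewhere: in C, bitbuf is a fixed-width unsigned long, so consumed bits fall off the top on every shift, whereas a Python int just keeps growing. The following is a minimal sketch of a bounded variant; BitReader and skip_bits are illustrative names that mirror gzip's macros, not the code in unpack.zip.

```python
class BitReader:
    """MSB-first bit reader that keeps bitbuf no wider than the unread bits."""

    def __init__(self, data):
        self.get_byte = iter(data)  # yields ints when data is a bytes object
        self.bitbuf = 0
        self.valid = 0              # number of unread bits held in bitbuf

    def look_bits(self, bits, mask):
        """Peek at the next `bits` bits without consuming them."""
        while self.valid < bits:
            self.bitbuf = (self.bitbuf << 8) | next(self.get_byte)
            self.valid += 8
        return (self.bitbuf >> (self.valid - bits)) & mask

    def skip_bits(self, bits):
        """Consume `bits` bits and drop them from bitbuf so it stays small."""
        self.valid -= bits
        self.bitbuf &= (1 << self.valid) - 1


reader = BitReader(bytes([0b10110010, 0b01000001]))
first = reader.look_bits(3, 0b111)
reader.skip_bits(3)
second = reader.look_bits(5, 0b11111)
reader.skip_bits(5)
third = reader.look_bits(8, 0xFF)
print(first, second, third)
```

Whether this helps the real module would need to be confirmed by running the line profiler again.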

unpack.zip

2014/10/24 11:09
