Alanarchy
Posts 0
Alanarchy
#1
Suddenly the boot-up messages are complaining about ECC not being enabled or present in the BIOS, and my sound cards aren't recognised.

However inxi -Fxz gives this:

Code:

$ inxi -Fxz
System:    Host: antiX64 Kernel: 3.9.5-antix.1-amd64-smp x86_64 (64 bit, gcc: 4.7.3) 
           Desktop: Fluxbox 1.3.5 Distro: antiX-13_x64-full Luddite 01 June 2013
Machine:   Mobo: Gigabyte model: GA-78LMT-USB3 version: x.x Bios: Award version: F4 date: 10/19/2012
CPU:       Octa core AMD FX-8350 Eight-Core (-MCP-) cache: 16384 KB flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm) bmips: 64286 
           Clock Speeds: 1: 1400.00 MHz 2: 1400.00 MHz 3: 1400.00 MHz 4: 1400.00 MHz 5: 1400.00 MHz 6: 1400.00 MHz 7: 1400.00 MHz 8: 1400.00 MHz
Graphics:  Card: Advanced Micro Devices [AMD/ATI] Turks XT [Radeon HD 6670/7670] bus-ID: 01:00.0 
           X.Org: 1.12.4 driver: fglrx Resolution: 1920x1080@60.0hz 
           GLX Renderer: AMD Radeon HD 6670 GLX Version: 4.2.12217 - CPC 12.104 Direct Rendering: Yes
Audio:     Card-1: Advanced Micro Devices [AMD/ATI] Turks/Whistler HDMI Audio [Radeon HD 6000 Series] bus-ID: 01:00.1 
           Card-2: Advanced Micro Devices [AMD/ATI] SBx00 Azalia (Intel HDA) bus-ID: 00:14.2 
           Sound: Advanced Linux Sound Architecture ver: k3.9.5-antix.1-amd64-smp
Network:   Card: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller 
           driver: r8169 ver: 2.3LK-NAPI port: ee00 bus-ID: 03:00.0
           IF: eth0 state: up speed: 100 Mbps duplex: full mac: <filter>
Drives:    HDD Total Size: 1000.2GB (2.0% used) 1: id: /dev/sda model: MB1000EAMZE size: 1000.2GB 
Partition: ID: / size: 915G used: 19G (3%) fs: ext4 ID: swap-1 size: 2.17GB used: 0.00GB (0%) fs: swap 
Sensors:   System Temperatures: cpu: 32.0C mobo: 30.0C gpu: 34.50C 
           Fan Speeds (in rpm): cpu: 1194 fan-1: 2490 fan-3: 0 
Info:      Processes: 147 Uptime: 11 min Memory: 488.9/32180.2MB Runlevel: 5 Gcc sys: 4.7.3 
           Client: Shell (bash 4.2.45) inxi: 1.9.12 
Plus the volume icon is missing from the task bar and there is no sound. This is the log:

Code:

[    5.253225] Linux agpgart interface v0.103
[    5.254553] MCE: In-kernel MCE decoding enabled.
[    5.284661] microcode: CPU0: patch_level=0x0600081c
[    5.301991] AMD IOMMUv2 driver by Joerg Roedel <joerg.roedel@amd.com>
[    5.302225] AMD IOMMUv2 functionality not available on this system
[    5.311280] input: PC Speaker as /devices/platform/pcspkr/input/input5
[    5.359113] EDAC MC: Ver: 3.0.0
[    5.399273] parport_pc 00:08: reported by Plug and Play ACPI
[    5.399554] parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
[    5.542087] ACPI Warning: 0x0000000000000b00-0x0000000000000b07 SystemIO conflicts with Region \SOR1 1 (20130117/utaddress-251)
[    5.542599] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    5.545715] AMD64 EDAC driver v3.4.0
[    5.545978] EDAC amd64: DRAM ECC disabled.
[    5.546224] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
[    5.546224]  Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
[    5.546224]  (Note that use of the override may cause unknown side effects.)
[    5.548533] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver v0.05
[    5.548793] sp5100_tco: PCI Revision ID: 0x3c
[    5.549039] sp5100_tco: failed to find MMIO address, giving up.
[    5.583484] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
[    5.583724] microcode: CPU1: patch_level=0x0600081c
[    5.583962] microcode: CPU2: patch_level=0x0600081c
[    5.584202] microcode: CPU3: patch_level=0x0600081c
[    5.584440] microcode: CPU4: patch_level=0x0600081c
[    5.584685] microcode: CPU5: patch_level=0x0600081c
[    5.584925] microcode: CPU6: patch_level=0x0600081c
[    5.585165] microcode: CPU7: patch_level=0x0600081c
[    5.585438] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[    5.731678] fglrx: module license 'Proprietary. (C) 2002 - ATI Technologies, Starnberg, GERMANY' taints kernel.
[    5.732104] Disabling lock debugging due to kernel taint
[    5.737940] <6>[fglrx] Maximum main memory to use for locked dma buffers: 31522 MBytes.
[    5.738530] <6>[fglrx]   vendor: 1002 device: 6758 count: 1
[    5.738674] ACPI: acpi_idle registered with cpuidle
[    5.739388] <6>[fglrx] ioport: bar 4, base 0xce00, size: 0x100
[    5.739856] <6>[fglrx] Kernel PAT support is enabled
[    5.740101] <6>[fglrx] module loaded - fglrx 12.10.5 [Mar 28 2013] with 1 minors
[    5.776608] kvm: disabled by bios
[    5.885603] acpi-cpufreq: overriding BIOS provided _PSD data
at which point it takes a break from booting. This is weird because I changed neither the kernel, nor the BIOS settings.
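For anyone reading a similar boot log, a small filter can pull out the lines that deserve a closer look. This is only a minimal Python sketch: the keyword list and the name `suspicious_lines` are assumptions made up for illustration, and a match means "worth reading", not "this is the cause".

```python
import re

# Keywords that often mark trouble in kernel boot messages.
# This list is an illustrative assumption, not an official classification.
KEYWORDS = ("fail", "disabled", "will not load", "conflict", "taint")

# Matches the "[    5.546224] message" layout used by dmesg.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def suspicious_lines(log_text):
    """Return (timestamp, message) pairs whose message contains a keyword."""
    hits = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip anything that is not a timestamped kernel line
        stamp, message = match.groups()
        lowered = message.lower()
        if any(word in lowered for word in KEYWORDS):
            hits.append((float(stamp), message))
    return hits


# Four lines copied from the log above; the first one is harmless.
sample = """\
[    5.359113] EDAC MC: Ver: 3.0.0
[    5.545978] EDAC amd64: DRAM ECC disabled.
[    5.549039] sp5100_tco: failed to find MMIO address, giving up.
[    5.583484] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
"""

for stamp, message in suspicious_lines(sample):
    print(f"{stamp:10.6f}  {message}")
```

To use it on a real machine, the text of the log could be saved to a file and passed to `suspicious_lines` as a string.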
Last edited by Guest on 08 Sep 2013, 10:05, edited 1 time in total.
Alanarchy
Posts 0
Alanarchy
#2
OK, so this appears to be known about in case anybody else gets the same problem.


https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=cab4d27764d5a8654212b3e96eb0ae793aec5b94
Alanarchy
Posts 0
Alanarchy
#3
Installing the Liquorix kernel "fixes" this problem. Exactly why is beyond my area of expertise.
Alanarchy
Posts 0
Alanarchy
#4
I've reinstalled antiX several times and Linux Mint once trying to sort this out over the past week. Linux Mint had no audio either.

So the whole thing comes down to lm-sensors, which causes graphics problems and breaks audio on my machine. I quote from the lm-sensors web site:
September 5th, 2013: Hardware breakage reported. Over the past few months, we had several reports of sensors-detect causing serious trouble on recent hardware (most notably laptops). We still don't know what exactly is happening, and while it might be reversible, we don't know how, so in practice this is equivalent to the hardware itself being broken. The symptoms are that the display starts misbehaving (wrong resolution or wrong gamma factor). We have mitigated the risk by changing the default behavior of sensors-detect to no longer touch EDID EEPROMs and then to no longer probe graphics adapters at all unless the user asks for it. We urge maintainers to backport changesets r6040 and r6084 to all Linux distributions which are still shipping lm-sensors 3.3.2 or older. Versions 3.3.3 and newer are not affected.

http://lm-sensors.org/
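The version cut-off in that notice (3.3.2 or older affected, 3.3.3 and newer not) can be written as a simple comparison. This is a rough Python sketch under stated assumptions: the function names are hypothetical, and it assumes the version string is either plain upstream (`3.3.2`) or Debian-style with an optional epoch and revision (`1:3.3.2-2`). A distribution may have backported the fixes into an older version number, so a "True" here is a reason to check further, not proof of a problem.

```python
import re

# Per the lm-sensors notice quoted above: 3.3.2 or older is affected,
# 3.3.3 and newer is not.
FIRST_SAFE = (3, 3, 3)


def parse_version(text):
    """Turn '3.3.2' or a Debian-style '1:3.3.2-2' into a tuple of ints."""
    upstream = text.split(":", 1)[-1]     # drop a Debian epoch such as '1:'
    upstream = upstream.split("-", 1)[0]  # drop a Debian revision such as '-2'
    return tuple(int(part) for part in re.findall(r"\d+", upstream))


def is_affected(version_text):
    """True if the version predates the first release reported as safe."""
    return parse_version(version_text) < FIRST_SAFE


for candidate in ("3.3.2", "1:3.3.2-2", "3.3.3", "1:3.3.4-2"):
    print(candidate, "affected" if is_affected(candidate) else "not affected")
```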


Thanks for your help sorting out this serious problem which explains why two of my machines have stopped working recently, but then again, it isn't an antix problem per se, but an lm sensors problem, isn't it? Why should anybody here give a damn?
Posts: 1,308
BitJam
Joined: 31 Aug 2009
#5
Alanarchy wrote: Thanks for your help sorting out this serious problem which explains why two of my machines have stopped working recently, but then again, it isn't an antix problem per se, but an lm sensors problem, isn't it? Why should anybody here give a damn?
This is indeed a serious problem. I can't speak for others but for me at least please don't confuse an inability to help you with the lack of a desire to do so.

When you first reported this problem, it seemed likely it was an issue with failing hardware. It was certainly not something I had any previous experience with. I'm happy to help people here with non-antiX Linux problems, especially if it is something I'm familiar with. I will also help Google for solutions to non-antiX problems I'm not familiar with, but in this case I didn't even know what to Google for.

Then a day or two after your first report, you seemed to have tracked down the cause of the problem and to have come up with a solution. It still seemed like a strange problem but some kind of weird BIOS interaction and an over-reaction by the kernel seemed at least plausible. Since this was a problem that was introduced by a change to the kernel and then fixed, it seemed like something that would automatically self-correct when our kernel was updated.

Kudos to you for tracking this down to an extremely bizarre problem in sensors-detect. As you reported, even the lm-sensors devs don't understand how this is happening.

TBH, one of my difficulties in responding to posts I don't have a ready solution for is I sometimes totally misunderstand what the poster is talking about, often because I take things too literally. This is compounded by the fact that sometimes people don't report problems correctly. It would be a big waste of everyone's time if I responded to every post I didn't understand saying something like "I'm sorry you are having trouble but I have nothing useful to suggest".

I am sorry you've been having trouble. This is an extremely bizarre problem. I don't think any of us here could have helped you with it besides doing a Google search for the initial ECC error message. I have zero experience with ECC memory so I didn't think I would be much help. Your system has an 8-core processor so maybe it was a high-end system that did have ECC memory.

I'm not sure what it is you wanted us to do. If we all posted our condolences in threads we could not help answer then the forums would become unwieldy and nearly useless. Looking back now, I wonder if anti took on the task of responding to the posts no one else had an answer to or suggestion for. If so, then maybe we could assign this role to someone else so there is at least one response to trouble reports but we aren't flooding the forums with unhelpful responses.
Posts: 1,062
Dave
Joined: 20 Jan 2010
#6
Same here.
The only bit of information I have is the recent "no sound" issues in the forum.
This is with oss-compat being installed on an update when it either a) needs to be removed or b) needs to be configured.
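Whether oss-compat is actually installed can be checked before deciding to remove or configure it. This is a minimal Python sketch for Debian-based systems, not a tested fix: the function names are hypothetical, and it relies on `dpkg-query -W -f='${Status}'`, which prints a three-word status such as `install ok installed`.

```python
import subprocess


def parse_status(status_text):
    """Interpret the Status field printed by dpkg-query.

    The third word is 'installed' for an installed package (for example
    'install ok installed' or 'hold ok installed'); anything else, such as
    'deinstall ok config-files', is treated as not installed.
    """
    fields = status_text.split()
    return len(fields) == 3 and fields[2] == "installed"


def is_installed(package):
    """Ask dpkg-query whether a package is installed (dpkg systems only)."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Status}", package],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return False  # dpkg-query is missing, so this is not a dpkg system
    return result.returncode == 0 and parse_status(result.stdout)
```

Calling `is_installed("oss-compat")` would then report whether the package is present. Removing or configuring it is still a job for the package manager.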

When you said it was a known issue with the kernel and that the Liquorix kernel solves it... it seemed odd to me... It is also not my area of expertise.
Posts: 765
rust collector
Joined: 27 Dec 2011
#7
I wonder what the difference between the antiX and Liquorix kernels is that makes this happen...

I would also be happy to tell you to "Do this: xxxxxxx, and it will be ok", but I have no idea, so I cannot... Sorry.
Alanarchy
Posts 0
Alanarchy
#8
BitJam wrote: TBH, one of my difficulties in responding to posts I don't have a ready solution for is I sometimes totally misunderstand what the poster is talking about, often because I take things too literally. This is compounded by the fact that sometimes people don't report problems correctly. It would be a big waste of everyone's time if I responded to every post I didn't understand saying something like "I'm sorry you are having trouble but I have nothing useful to suggest".
I was totally confused. In ten years of using Linux I had never seen a kernel panic before. Now that I know what one is I won't be so vague.

I can now clarify.

The old computer "died" directly after lm-sensors was updated in 32-bit testing. I watched the temperature reading rise to silly levels and then everything cut out. "OK," I thought, "it's over ten years old and that buggered up fan must have cooked the chipset".

Then the new computer did the kernel panic right after the update. Because it was lm-sensors that updated, while this appeared to be a sound and graphics problem, I didn't connect the two.

I installed 32-bit antiX alongside the original 64-bit antiX, and LMDE too. LMDE installed and broke upon updating, revealing this is not an antiX problem. I kept the 32-bit install on Wheezy and updated that OK, thus the problem has to be in "Testing" cos LMDE follows "Testing" too. Then I thought "LM sensors update" and checked their web site.

Sorry if it looked like I was having a rant at you guys. I was so relieved it wasn't the new box broken; just a software problem. If I was really mad about it I wouldn't have posted to let you all know what was wrong in case it happens to you.

Does that make sense? I don't seem to be making much sense after several days of banging my head up against the walls.
Posts: 765
rust collector
Joined: 27 Dec 2011
#9
I suggest calling these people:


http://actionwallpads.com/index.htm
Alanarchy
Posts 0
Alanarchy
#10
rust collector wrote: I suggest calling these people:
Good idea RC.

You asked why I tried the Liquorix kernel. That kernel is 3.10.something, while the antix one is 3.9.5.
Posts: 765
rust collector
Joined: 27 Dec 2011
#11
Aha, so lm-sensors breaks older kernels... ok.

I did not really mean "why you used Liquorix"; it was more "why did it not have the same issue", but I guess the answer might be the same.
Alanarchy
Posts 0
Alanarchy
#12
rust collector wrote: Aha, so lm-sensors breaks older kernels... ok.
It must do, cos LMDE uses a 3.2.something kernel. The problem I then have is that Liquorix kernels are too far ahead of the proprietary graphics drivers.
Posts: 2,238
dolphin_oracle
Joined: 16 Dec 2007
#13
The testing repos are up to 1.3.3.4 on lm-sensors. Maybe this will work better with the antiX kernel now.
Alanarchy
Posts 0
Alanarchy
#14
Yes, that update screwed up my music computer, but following the advice in the "no sound" thread, removing oss-compat fixed that.

Maybe oss-compat and lm-sensors don't get on.