Intel J1900 flaw causing early failure for embedded devices

This morning I woke up, made some coffee and went to the computer room to work on some projects, when I noticed that the shared folders from my NAS weren’t showing up on my desktop. That’s odd, I thought. I went over to the NAS and my fears were confirmed: not only was it offline, it was also showing an error light. It would power on, but it wouldn’t start up.

Hoping that this was some simple fault like a bad stick of memory, I disconnected the unit, pulled out the drives and brought it over to my test bench. I confirmed the power supply was good and swapped the memory in and out, but with no effect. Ouch, this thing is really dead, I thought. Before consigning it to the e-waste bin, I searched around just to make sure and stumbled on a thread from 2020 about the CPU in these devices having a flaw. Not only that, there was a possible fix (if perhaps only a temporary one)!

At the time the flaw was discovered, Intel posted an addendum to their CPU specification update for the J1900 and related CPUs. (Intel has since removed these docs from the public-facing side of their website and requires a CNDA account to access them. Thankfully, the Wayback Machine has an archive of them, linked above.) Unfortunately, the problem lies in the silicon of the CPU itself and is not repairable.

The forum thread linked above and a similar Reddit thread from a year later both describe the same fix: connecting a 100-200 ohm resistor between the LPC clock signal and ground to pull the signal down. Thankfully, on the NAS model I have, this signal is exposed on a pin header that also supplies a ground pin. I first hooked up my oscilloscope to the clock pin and verified that it was operating out of spec. I rummaged through my component collection and found a 180 ohm resistor that would work. I had some jumper wires with Dupont connectors left over from another project and used them to make a dongle that bridges these two pins. I put power into the unit and it started right up. Amazing!
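For anyone picking a resistor from the 100-200 ohm range, here is a rough sanity check of the load it puts on the clock line, using nothing more than Ohm’s law. This is only a sketch: the 3.3 V logic level and the 50% duty cycle are my assumptions, not figures from Intel’s documents or the threads, and the function name `pulldown_load` is made up for this example. Measure the actual level on your own board with a scope before trusting the numbers.

```python
# Rough load estimate for a pull-down resistor on a clock line.
# ASSUMPTIONS (not from the Intel docs or the forum threads):
#   - the clock swings up to 3.3 V (measure your own board)
#   - the clock is high about 50% of the time
#   - the driver holds the full voltage, so these are upper bounds


def pulldown_load(v_high, r_ohms, duty=0.5):
    """Return (peak current in mA, average power in mW) for a resistor to ground."""
    i_peak = v_high / r_ohms               # Ohm's law, in amps
    p_avg = duty * v_high ** 2 / r_ohms    # power is only dissipated while the line is high
    return i_peak * 1000, p_avg * 1000


if __name__ == "__main__":
    # Both ends of the suggested range, plus the 180 ohm part used here.
    for r in (100, 180, 200):
        i_ma, p_mw = pulldown_load(3.3, r)
        print(f"{r} ohm: {i_ma:.1f} mA peak, {p_mw:.1f} mW average")
```

Under those assumptions, even the 100 ohm end of the range stays well below what a common 1/4 W through-hole resistor is rated for, so the wattage of the resistor should not be the limiting factor.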

Sadly, the problem with the J1900 CPUs is only going to get worse over time. I may be able to keep the unit running for a while, possibly by swapping in different resistors as the circuit continues to degrade. However, the real solution is to start planning a migration from this device to something new.

If you have an embedded device powered by a J-series, N-series or similar CPU and it’s been operating 24×7 for several years, you’re likely on borrowed time. Get a good backup of your data and start planning your migration now.