All Domestic Flights Grounded in the USA Last Night Due to a Massive FAA NOTAM Computer Failure


 
My entire 28-year Silicon Valley career was spent working as a systems engineer for a Fortune 50 company.

It's easy to throw stones at "antiquated systems" - and indeed there are many in both government and the private sector.

The problem is migrating from an old system architecture to a new one without any interruption in service. It's virtually impossible.

It is for this reason that many of these antiquated systems remain in production.
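
To give a rough flavor of why a live cutover is so hard, here's a toy sketch of the kind of dual-write, parallel-run shim a migration team might put in front of the old and new systems. The class names and checks are purely illustrative - this is not how the FAA's NOTAM system (or any real bank system) actually works.

```python
# Toy illustration only: a "dual-write, parallel-run" shim of the sort many
# migrations attempt. Real mission-critical systems have far more failure
# modes than this sketch shows.

class LegacySystem:
    """Stand-in for the old, authoritative system of record."""
    def __init__(self):
        self.records = {}

    def write(self, key, value):
        self.records[key] = value

    def read(self, key):
        return self.records.get(key)


class NewSystem:
    """Stand-in for the replacement system being validated in parallel."""
    def __init__(self):
        self.records = {}

    def write(self, key, value):
        self.records[key] = value

    def read(self, key):
        return self.records.get(key)


class MigrationShim:
    """Routes every write to both systems, serves reads from the legacy one,
    and logs any disagreement. Only when mismatches stay at zero for long
    enough would anyone dare flip `cutover` to True."""
    def __init__(self, legacy, new):
        self.legacy = legacy
        self.new = new
        self.cutover = False
        self.mismatches = []

    def write(self, key, value):
        self.legacy.write(key, value)   # old system stays authoritative
        self.new.write(key, value)      # new system shadows every change

    def read(self, key):
        old_val = self.legacy.read(key)
        new_val = self.new.read(key)
        if old_val != new_val:
            self.mismatches.append((key, old_val, new_val))
        return new_val if self.cutover else old_val


if __name__ == "__main__":
    shim = MigrationShim(LegacySystem(), NewSystem())
    shim.write("record/123", "OK")
    print(shim.read("record/123"))       # still served by the legacy system
    print("mismatches:", len(shim.mismatches))
```

Even in this toy version the catch is visible: the shim itself is new code sitting in the hot path, so a bug in it can take down the old and new systems at the same time. That risk calculus is a big part of why so many antiquated systems stay in production.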

Scott
 
I’m certain that the FAA uses even more antiquated computer architecture…
See my post in #8.

They must have upgraded to Windows 8 finally.
Even upgrading to a new OS on a Windows toy of a system requires a shutdown and restart. Imagine doing that on a $40M mission-critical system.

I supported a well-known, worldwide bank that used our systems. On the rare occasion we needed to shut down and restart our systems - systems that maintained a production database for their worldwide operations - let me tell you, it was stressful as eff. Overrun the two-hour planned outage window by just 45 seconds... a post mortem with interrogation lights and beatings was a certainty.

Unless one works or has worked in mission-critical production environments, they have no idea of the complexities and the h*** to pay for even the slightest misstep or delay.

I'm not throwing stones at you, Pew, I'm just explaining to the thread.

Scott
 
No worries! I completely understand and agree. A 10-minute server restart for an SMB during work hours can already have users complaining, so I can't imagine the stress and planning that go into maintaining international mission-critical systems.
 
Found the culprit 🇨🇳
[image attachment]
 
All I can say is that it was stressful. One time we had a disastrous migration to some new, fully custom software developed by the bank. It required a system shutdown on our part to support their extensive redesign.

People lost their jobs over this - literally fired within days. We're talking long-time bank employees, senior engineers who had intimate knowledge of the bank's custom applications.

The fact that they planned and tried their best was not even considered. The fact that they had erasers on their pencils is what got them fired. The bank employees I'm referring to were some of the best engineers I ever worked with in my career.

Scott
 
I agree with this, having worked in a large healthcare organization. We had a major system-wide change to our computers that entailed months of training plus a period of limited side-by-side operation. All went well until the go-live date, and then the system floundered for a week. It was a huge fiasco.
 
I can relate and feel your pain. It's not fun.

Scott
 
Well, the gov. was all over Southwest a few weeks ago for the delays... Let's see if the gov. is all over the FAA for this... and will the gov. pay all of the extra costs to the airlines and refund money if flights were cancelled because of this? I say NO... double standard... do as I say, not as I do...
 
Scott, as our BITOG resident expert on something like this (you volunteered), how does it get done? At some point it has to be updated, or does it? If we kick the can down the road far enough, will there be a complete system failure/shutdown?
 
Just like our electric grid... it gets kicked down the road... The gov. is very good at doing that... city, state, and federal...
 
Many of the oldest systems are far more secure and reliable than anything from the Windows era.

Security by obscurity is very real, as are sandboxed proprietary networks that are read-only and don't touch the normal internet.
 