NASA’s IRT publishes report on SpaceX’s CRS-7 failure

On Monday, NASA’s Independent Review Team (IRT) published the first public report on the SpaceX CRS-7 failure. The accident occurred on June 28th, 2015, when a Falcon 9 rocket’s second stage experienced an over-pressurization event during first stage ascent. The incident led to a launch vehicle failure approximately two minutes and nineteen seconds into flight. SpaceX blamed the accident on a manufacturing defect involving a steel strut. However, the IRT report blames a design error on SpaceX’s part.

Through NASA’s Commercial Resupply Services (CRS) program, SpaceX is contracted to resupply the International Space Station using its Falcon 9 rocket and Dragon spacecraft.

CRS-7 was the nineteenth flight of the Falcon 9 and the first to have a catastrophic anomaly.

At approximately 14:21 UTC on June 28th, 2015, the Falcon 9 lifted off from Space Launch Complex 40 (SLC-40) at Cape Canaveral Air Force Station. For the first two minutes of flight, everything appeared nominal, with no indications of any problems.

Then, at approximately T+2:19, while the first stage was still firing, gas was seen coming out of the second stage. Within milliseconds the vehicle began to disintegrate and telemetry was lost.

However, the Dragon capsule, which was mounted on top of the rocket, detached during the anomaly and remained intact. It continued to report telemetry until it crashed into the ocean.

Because Dragon survived the launch vehicle breakup, software was added to future Dragons, according to Elon Musk, to allow them to deploy their chutes and potentially survive such an incident.

According to the IRT report published by NASA, immediately after the incident “SpaceX established an Accident Investigation Team (AIT) in accordance with its own, FAA approved contingency response plan, and consistent and compliant with the NASA Contingency Action Plan.”

Additionally, on July 2nd, 2015, “the NASA Associate Administrator for Human Exploration and Operations designated the Launch Services Program (LSP) to be the coordinating Program within NASA to interface with the SpX CRS-7 accident investigation. The LSP would provide the required insight for the three directorate programs that interface with SpaceX for its launch services: the LSP, the Commercial Crew Program and the ISS Program.”

Then, on August 3rd, 2015, the LSP was designated as NASA’s IRT for the investigation. The IRT was given the following key objectives:

  • Independently determine the cause of the failure
  • Ensure the proper corrective measures are taken
  • “Validate the SpaceX AIT efforts”
  • “Inform the Agency’s risk posture in order to support SpaceX’s return-to-flight activities.”
  • Make recommendations to enhance reliability

After spending thousands of hours analyzing more than 3,000 telemetry channels, SpaceX published its initial findings on July 20th, 2015. It noted that the anomaly unfolded very quickly, with only 0.893 seconds passing between the first sign of an issue and loss of telemetry.

The report stated, “the strut that we believe failed was designed and material certified to handle 10,000 lbs of force, but failed at 2,000 lbs, a five-fold difference. Detailed close-out photos of stage construction show no visible flaws or damage of any kind.”

As a result of the strut failure, “the helium system integrity was breached. This caused a high pressure event inside the second stage within less than one second, and the stage was no longer able to maintain its structural integrity.”

To correct the anomaly, SpaceX promised to no longer use that type of strut for flight applications. Additionally, a full review of Falcon 9’s hardware would be conducted to ensure reliability.

While this report from SpaceX was preliminary, in the years that followed SpaceX never publicly changed its position. The contents of the IRT report confirm that SpaceX’s AIT analysis concluded that a faulty strut was to blame.

According to the IRT report, the SpaceX AIT found that “a helium filled composite overwrapped pressure vessel (COPV) within the Stage 2 LOx tank had become liberated” due to a strut failing. As a result, the COPV “hit the LOx tank dome causing it to rupture.”

The IRT’s own independent investigation into the accident “came to the conclusion that all but the Stage 2 Fault Tree block could be closed.”

In addition, “the IRT determined that the direct (or proximate) cause of the Falcon 9 launch vehicle failure was the rupture of the Stage 2 LOx tank.” This matched the findings of the AIT report.

Therefore, the next step in the IRT investigation was to confirm that a dislodged COPV was, in fact, the cause of the LOx tank rupture.

To do so, the IRT investigated two alternatives that could have caused the LOx tank to rupture while still matching the telemetry indications.

The first alternative involved RP-1 leaking into the casing of the LOx transfer tube. The transfer tube is used to transport liquid oxygen from the LOx tank to the Merlin MVac engine. To do so, it has to travel through the RP-1 tank.

In this scenario, the leaking RP-1 would warm the liquid oxygen “causing it to spew or geyser.” However, this theory was eventually dismissed.

The second alternative was similar to the first, except that this time LOx leaked into the casing of the transfer tube. As a result, thermal energy would be transferred to the RP-1 surrounding the transfer tube, potentially causing a failure. In the end, this alternative was also ruled out, as testing at NASA’s Marshall Space Flight Center showed that the amount of thermal energy available would be insufficient.

With the two alternatives ruled out, and having accounted for all but nine of the 115 major telemetry indications, the IRT was able to reach a conclusion.

The IRT determined that it was “credible” that a COPV was “liberated” due to a strut failing, and that the COPV then ruptured the Stage 2 LOx tank. Therefore, the IRT’s assessment of the “direct and immediate causes” of the anomaly aligned with the SpaceX AIT’s assessment.

However, the IRT did not agree with SpaceX on the “initiating cause.”

The SpaceX AIT blamed a manufacturing defect for the failure. The IRT agreed that a manufacturing defect could have been involved, but noted that there were other potential strut-related failures that were “credible,” including an installation failure.

Regardless, the IRT had a significant problem with the grade of strut that SpaceX chose. The report stated, “the key technical finding by the IRT with regard to this failure was that it was due to a design error: SpaceX chose to use an industrial grade (as opposed to aerospace grade) 17-4 PH SS (precipitation-hardening stainless steel) cast part in a critical load path under cryogenic conditions and strenuous flight environments.”

It went on to add, “the implementation was done without adequate screening or testing of the industrial grade part, without regard to the manufacturer’s recommendations for a 4:1 factor of safety when using their industrial grade part in an application, and without proper modeling or adequate load testing of the part under predicted flight conditions. This design error is directly related to the Falcon 9 CRS-7 launch failure as a ‘credible’ cause.”

In simpler terms, the steel strut that SpaceX chose was an industrial grade part that had not been adequately screened or tested for such conditions. Furthermore, SpaceX did not follow the 4:1 factor of safety that the manufacturer recommended for the part.
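The load figures quoted in this article can be put side by side with a short calculation. The following is only a rough sketch: it assumes, purely for illustration, that the manufacturer's recommended 4:1 factor of safety would be applied to the same 10,000 lbs rating cited in SpaceX's initial findings, which neither report excerpt quoted here confirms. The function names are hypothetical.

```python
# Figures quoted in the article; how they relate to each other is an assumption.
RATED_LOAD_LBS = 10_000    # load the strut was "designed and material certified" to handle
FAILURE_LOAD_LBS = 2_000   # load at which SpaceX believes the strut failed
RECOMMENDED_FACTOR = 4.0   # manufacturer's recommended 4:1 factor of safety


def shortfall_ratio(rated_lbs: float, failure_lbs: float) -> float:
    """How many times lower the failure load was than the rated load."""
    return rated_lbs / failure_lbs


def allowable_load(rated_lbs: float, factor_of_safety: float) -> float:
    """Maximum working load once a rating is derated by a factor of safety."""
    return rated_lbs / factor_of_safety


if __name__ == "__main__":
    # The "five-fold difference" from SpaceX's initial findings.
    print(shortfall_ratio(RATED_LOAD_LBS, FAILURE_LOAD_LBS))
    # Working load limit if the 4:1 factor applied to the 10,000 lbs rating (assumption).
    print(allowable_load(RATED_LOAD_LBS, RECOMMENDED_FACTOR))
```

Under that assumption, a 4:1 factor of safety would cap the working load at 2,500 lbs. That is what a factor of safety is for: it leaves margin for parts that turn out weaker than their rating.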

Therefore, the IRT recommended that SpaceX apply greater care when certifying commercially sourced parts for flight.

Interestingly, during the investigation the IRT also discovered another area of concern that was not directly related to the accident.

The report found that the telemetry architecture on the upcoming “Full Thrust” version of the Falcon 9 included a new method of handling packets that increased latency, and thus vital data could have been lost in the event of a similar anomaly.
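To see why added latency matters in a failure this fast, consider a simplified model in which data generated during the final stretch before breakup, equal in length to the telemetry latency, never reaches the ground. The sketch below is illustrative only. The 0.893 second window comes from SpaceX's initial findings on CRS-7, but the latency values are hypothetical, since the report's actual figures are not quoted here, and the function name is invented for this example.

```python
# Time between the first sign of trouble and loss of telemetry on CRS-7,
# as stated in SpaceX's initial findings.
ANOMALY_WINDOW_S = 0.893


def unsent_fraction(latency_s: float, window_s: float = ANOMALY_WINDOW_S) -> float:
    """Fraction of the anomaly window whose data would still be in transit
    (and therefore lost) when the vehicle breaks up, in this simplified model."""
    if latency_s < 0 or window_s <= 0:
        raise ValueError("latency must be >= 0 and window must be > 0")
    return min(latency_s, window_s) / window_s


if __name__ == "__main__":
    # Hypothetical latencies, chosen only to show the trend.
    for latency in (0.05, 0.25, 1.0):
        lost = unsent_fraction(latency)
        print(f"latency {latency:.2f} s -> {lost:.0%} of the anomaly window lost")
```

In this model, any latency approaching the length of the anomaly window would leave investigators with little or no data from the moments that matter most.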

The IRT report finished by noting that all of the key findings in the report were addressed by SpaceX in time for the successful Jason-3 mission for NASA.

However, that launch used a Falcon 9 version 1.1 vehicle. The “Full Thrust” variant is version 1.2, so it remains unclear whether the telemetry issue has been resolved.

That being said, in January of this year the Falcon 9 Full Thrust received certification to launch Category 2 NASA missions. Category 2 certification allows Falcon 9 to fly medium value NASA payloads. Such certification would not have been warranted if NASA did not have significant confidence in the Falcon 9 launch vehicle.

Following the CRS-7 investigation, SpaceX returned to flight just six months later with the successful launch of OG2 Mission 2. The now-historic launch featured the first-ever successful landing of a Falcon 9 first stage.

Falcon 9 went on to perform eight more successful missions before encountering another setback. On September 1st, 2016, a rocket exploded on the SLC-40 launch pad just minutes before a planned static fire test. The anomaly resulted in the loss of the AMOS-6 communications satellite.