Python for Industrial Engineers

Process Capability Analysis with Python

Measuring Process Performance

Roberto Salazar
May 14 · 5 min read
Image by Wim van ‘t Einde available at Unsplash

Process Capability Analysis

Process capability analysis is a significant component of the Measure phase of the DMAIC (Define, Measure, Analyze, Improve, Control) cycle in a Six Sigma project. The analysis measures how well a process’s performance meets the customer’s requirements, which are translated into specification limits for the characteristics of interest of the product being manufactured. Its results help industrial engineers identify variation within a process and develop action plans that lead to better yield, lower variation, and fewer defects.

Specifications are the voice of the customer. Every process should be capable of fulfilling the customer’s requirements, which must be quantified to be attainable. Specification limits are the numerical expressions of the customer requirements. Due to natural variations within the process, specifications usually are a range with upper and lower bounds. USL (Upper Specification Limit) is a value above which the process performance is unacceptable, while LSL (Lower Specification Limit) is a value below which the process performance is unacceptable.

Specifications must be realistic. To evaluate their validity, Six Sigma practitioners use the RUMBA method, where R stands for Reasonable, U for Understandable, M for Measurable, B for Believable, and A for Achievable.

Process performance is the voice of the process. A process can be considered to perform well when its output stays close to the target value with as little variation as possible. In the Six Sigma approach, the most common process performance measures are:

  • Yield (Y): the proportion of good products or items produced by the process. It can be assessed once the process is finished by counting the items that meet the specifications: $Y = \frac{\text{total} - \text{defects}}{\text{total}}$
  • First-time yield (FTY): takes into consideration the rework in the middle of the process. Regardless of the number of correct items at the end of the process, it counts only the items that were correct the first time, without rework: $FTY = \frac{\text{total} - \text{defects} - \text{rework}}{\text{total}}$
  • Rolled throughput yield (RTY): used when the process is formed by several linked processes. It is calculated by multiplying the FTY of every chained process: $RTY = \prod_{i=1}^{n} FTY_i$
  • Defects per unit (DPU): number of nonconformities per unit. When each defective item counts as one defect, DPU is the complement of the yield: $DPU = \frac{\text{defects}}{\text{total}}$
  • Defects per million opportunities (DPMO): number of nonconformities per million opportunities, where each unit may offer several opportunities for a defect. It is mainly used as a long-term performance measure of a process: $DPMO = \frac{10^6 \times \text{defects}}{\text{total} \times \text{opportunities}}$
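
The measures above can be sketched in a few lines of Python. The function names and the sample counts (1,000 units, 20 defective, 30 reworked, 5 defect opportunities per unit, three chained steps) are illustrative assumptions, not figures from a real process.

```python
from math import prod


def yield_rate(total, defects):
    """Proportion of items that meet the specifications."""
    return (total - defects) / total


def first_time_yield(total, defects, rework):
    """Proportion of items that were correct without any rework."""
    return (total - defects - rework) / total


def rolled_throughput_yield(ftys):
    """Product of the first-time yields of every chained process."""
    return prod(ftys)


def defects_per_unit(total, defects):
    """Number of nonconformities per unit."""
    return defects / total


def dpmo(total, defects, opportunities):
    """Defects per million opportunities (opportunities counted per unit)."""
    return 1_000_000 * defects / (total * opportunities)


# Hypothetical counts, chosen only for illustration.
total, defects, rework, opportunities = 1000, 20, 30, 5

print("Y    =", yield_rate(total, defects))
print("FTY  =", first_time_yield(total, defects, rework))
print("RTY  =", rolled_throughput_yield([0.95, 0.98, 0.99]))
print("DPU  =", defects_per_unit(total, defects))
print("DPMO =", dpmo(total, defects, opportunities))
```

Note how the RTY of a chain is always lower than the FTY of its weakest step, which is why long process chains need very high per-step yields.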

The sigma score of a process (Z) is a single number that conveys how well a process meets the customer specifications. Processes that reach a sigma level of 6 may be considered “almost perfectly” designed processes (i.e. with almost zero defects): under the conventional 1.5-sigma long-term shift, a sigma level of 6 corresponds to about 3.4 DPMO (defects per million opportunities). The sigma score is the number of standard deviations that fit between the process mean and the nearest specification limit. With process mean $\mu$ and standard deviation $\sigma$, it is calculated using the formula:

$$Z = \min\left(\frac{USL - \mu}{\sigma}, \frac{\mu - LSL}{\sigma}\right)$$
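
As a minimal sketch, the sigma score can be computed as the distance from the mean to the nearest specification limit, expressed in standard deviations. The function name `sigma_score` and the numbers below are hypothetical.

```python
def sigma_score(mean, std, lsl, usl):
    """Standard deviations between the mean and the nearest spec limit."""
    return min((usl - mean) / std, (mean - lsl) / std)


# Hypothetical process: mean 50, standard deviation 2, limits 42 and 56.
# The upper limit is 3 standard deviations away, the lower limit 4,
# so the nearer (upper) limit determines the score.
print(sigma_score(mean=50, std=2, lsl=42, usl=56))  # 3.0
```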

DPMO through sigma scores
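
The relationship between sigma scores and DPMO can be tabulated with a short sketch. It assumes a normally distributed process and the conventional 1.5-sigma long-term shift, which is what makes a sigma level of 6 correspond to about 3.4 DPMO. The function name `dpmo_from_sigma` is hypothetical, and only defects beyond the nearest specification limit are counted.

```python
from statistics import NormalDist


def dpmo_from_sigma(z, shift=1.5):
    """Long-term DPMO for a short-term sigma score z.

    Assumes normality and a mean that drifts by `shift` standard
    deviations toward the nearest specification limit.
    """
    # Tail area beyond the limit, which sits (z - shift) sigmas from the mean.
    tail = NormalDist().cdf(-(z - shift))
    return 1_000_000 * tail


for z in range(1, 7):
    print(f"sigma {z}: {dpmo_from_sigma(z):>10.1f} DPMO")
```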

Capability indices directly compare the customer specifications with the performance of the process. They are based on the fact that the natural (or effective) limits of a process lie at the mean ± 3 standard deviations, an interval that contains about 99.7% of the data when the process is normally distributed. With $\sigma$ denoting the process standard deviation, the capability of a process (Cp) is calculated using the formula:

$$C_p = \frac{USL - LSL}{6\sigma}$$
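
A minimal sketch of the Cp calculation follows: it divides the width of the specification range by the width of the natural process range (six standard deviations). The function name and values are hypothetical.

```python
def cp(lsl, usl, std):
    """Ratio of the specification width to the natural process width."""
    return (usl - lsl) / (6 * std)


# Hypothetical limits 42 and 56 with a standard deviation of 2:
# the specification range (14) is wider than the natural range (12).
print(round(cp(lsl=42, usl=56, std=2), 4))  # 1.1667
```

A value above 1 means the specification range is wider than the natural spread of the process. Cp says nothing about where the process mean actually sits.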

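However, this formula does not reveal whether the process is centered between the specification limits (which is desirable). To deal with this issue, the adjusted capability index (Cpk) is calculated from the process mean $\mu$ and standard deviation $\sigma$ using the formula:

$$C_{pk} = \min\left(\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right)$$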
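
The effect of centering can be illustrated with a small sketch that compares a centered and an off-center process sharing the same spread. The function name and values are hypothetical.

```python
def cpk(mean, std, lsl, usl):
    """Capability relative to the nearest specification limit."""
    return min((usl - mean) / (3 * std), (mean - lsl) / (3 * std))


lsl, usl, std = 42, 56, 2  # hypothetical limits and spread

# Centered process (mean midway between the limits): Cpk equals Cp.
print(round(cpk(mean=49, std=std, lsl=lsl, usl=usl), 4))  # 1.1667

# Same spread, mean shifted toward the upper limit: Cpk drops.
print(round(cpk(mean=50, std=std, lsl=lsl, usl=usl), 4))  # 1.0
```

Cpk can never exceed Cp, and the gap between the two indicates how far the process is off center.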

Like the sigma score, capability indices help determine how well a process is meeting customer specifications. In general, a Cpk of at least 1.33 is considered acceptable, and the greater its value, the better.

For the following example, let’s use some of the most popular Python libraries to perform a process capability analysis for a given process. Let’s take a look at the Python code!
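
What follows is a minimal sketch of such an analysis, not a definitive implementation. It relies only on Python’s standard library (`statistics` and `random`) so that it runs without extra installs; NumPy and SciPy offer equivalent functions. The simulated measurements, the specification limits (9.6 and 10.4), and the function name `capability_report` are all assumptions made for illustration, and the expected defect rate assumes the data are normally distributed.

```python
import random
from statistics import NormalDist, mean, stdev


def capability_report(data, lsl, usl):
    """Summarize the capability of a process from sample measurements."""
    mu = mean(data)
    sigma = stdev(data)  # sample standard deviation (n - 1 denominator)
    nearest = min(usl - mu, mu - lsl)  # distance to the nearest spec limit

    # Expected fraction outside the limits under a fitted normal model.
    fitted = NormalDist(mu, sigma)
    p_out = fitted.cdf(lsl) + (1 - fitted.cdf(usl))

    return {
        "mean": mu,
        "std": sigma,
        "Cp": (usl - lsl) / (6 * sigma),
        "Cpk": nearest / (3 * sigma),
        "Z": nearest / sigma,
        "expected_dpmo": 1_000_000 * p_out,
        "observed_out_of_spec": sum(1 for x in data if x < lsl or x > usl),
    }


# Simulated measurements standing in for real process data.
random.seed(42)
data = [random.gauss(10.0, 0.1) for _ in range(200)]

report = capability_report(data, lsl=9.6, usl=10.4)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

With real data, it is worth checking the normality assumption (for example with a histogram or a normal probability plot) before trusting the capability indices.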


Concluding Thoughts

Process capability analysis is a valuable tool that helps industrial and process engineers identify variation within a process, improve its yield, and make it more efficient. Python’s most popular libraries make it possible to obtain meaningful information about the capability of a process with just a few lines of code. Industrial, process, and quality engineers are encouraged to take advantage of this tool to fulfill the customer’s requirements with high quality and efficiency standards.

— —

If you found this article useful, feel welcome to download my code from GitHub. You can also email me directly or find me on LinkedIn. Interested in learning more about data analytics, data science, and machine learning applications in the engineering field? Explore my previous articles by visiting my Medium profile. Thanks for reading.

- Robert

Geek Culture
