Building an IoT Product — The Product(ion) Feedback Loop
This story shares what we learned from building the IoT consumer products we offer at tado°. It is written from the perspective of an engineer working on getting physical products produced with different remote EMS (Electronic Manufacturing Services) providers.
The Overall Concept
We are constantly working on improvements to the overall product life cycle of our physical products. To make that happen, we rely on two important aspects.
IIoT (Industrial Internet of Things)
At every stage of the manufacturing process, we collect various quality and process metrics. These include, for example, test results, test cycle times, and test station IDs.
IoT (Internet of Things)
From a product life cycle perspective, the biggest difference between traditional (non-connected) and IoT devices is that you can update and gather information about your devices after they leave the factory floor. This connection opens up the opportunity to create a feedback loop. This feedback loop allows us to continuously improve our product and the involved production process.
The key to recognizing and understanding problems along the product life cycle is to collect and aggregate data. Based on that data we formulate assumptions and draw conclusions that improve our product and processes slice by slice.
For each individual product we have a digital twin. In this digital representation we can look up all data associated with the product. Besides all test results, we collect timings, camera checks, traceability data (used components), and rework steps. Together with runtime data collected from the devices in the field, this forms a powerful tool for making data-driven decisions. Let’s have a look at the different sources of data.
Before diving into data collection, let me explain a few important elements and steps of our Electronic Manufacturing Service.
MES (Manufacturing Execution System)
An MES is an IT system to track, trace, and store data about the state and processing of raw goods until the production of the final product. This system is usually implemented by the manufacturer. As we want to work with this data, we have another lightweight MES on our side that is used to record and control the production flow.
There is a multitude of steps in the production process of electronic products. As I will refer to some of them later, below is a short overview of a few.
SMT (Surface Mount Technology)
Originally describing the technology itself, the abbreviation is also used to describe the process of picking, placing, and soldering components onto a PCB (Printed Circuit Board). In this step all components get reported into our MES, giving us traceability at the component level.
ICT (In-Circuit Test)
Once all components are assembled on the PCB, special test equipment places test needles on certain positions of the PCB to check whether all components were mounted and soldered correctly. These tests are developed and performed by our contract manufacturing partner. The test results are reported to our MES.
FCT (Functional Test)
The functional test is also performed on the assembled PCB. FCT in our case includes programming the firmware, checking communication interconnections between different chips, and verifying the functionality of all attached components (e.g. reading out chip revisions and IDs). This part is fully under our remote control. Thus we keep flexibility and establish a secure provisioning process (loading firmware and security credentials) for our devices.
EOL (End of Line Test)
This test step happens right before the product is packaged to be shipped to our customer. The device is almost or fully assembled, and the acceptance tests are performed to make sure everything was integrated correctly. As with the FCT, this process and equipment are also under our full control and easy to adapt at any time.
The Data Collection
Along all these steps we collect production data from our devices. All the data is aggregated in our lightweight MES. To visualize the data and draw conclusions we make heavy use of Elasticsearch and Kibana.
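To make the aggregation a bit more tangible, here is a minimal sketch of how per-station records could roll up into a per-device digital twin. The class names, fields, serial number, and sample values are all hypothetical and chosen for illustration; they do not describe our actual MES schema.

```python
from dataclasses import dataclass, field


@dataclass
class StationRecord:
    """One result reported by a production step (hypothetical schema)."""
    step: str            # "SMT", "ICT", "FCT" or "EOL"
    station_id: str
    passed: bool
    cycle_time_s: float
    details: dict = field(default_factory=dict)


@dataclass
class DigitalTwin:
    """All records collected for one device, keyed by its serial number."""
    serial: str
    records: list = field(default_factory=list)

    def add(self, record: StationRecord) -> None:
        self.records.append(record)

    def failed_steps(self) -> list:
        """Steps that reported at least one failed run, in order of occurrence."""
        return [r.step for r in self.records if not r.passed]

    def total_cycle_time_s(self) -> float:
        """Time spent on all runs, including repeated runs after rework."""
        return sum(r.cycle_time_s for r in self.records)


# Made-up history of one device: the first FCT run fails, the rerun passes.
twin = DigitalTwin(serial="EXAMPLE-0001")
twin.add(StationRecord("SMT", "smt-1", True, 42.0, {"chip_date_code": "1837"}))
twin.add(StationRecord("ICT", "ict-2", True, 11.5))
twin.add(StationRecord("FCT", "fct-1", False, 30.0, {"error": "chip id mismatch"}))
twin.add(StationRecord("FCT", "fct-1", True, 28.5, {"rework": True}))

print(twin.failed_steps(), twin.total_cycle_time_s())
```

Keeping failed runs in the history, rather than overwriting them with the passing rerun, is what makes rework visible later on.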
Some of the involved steps are particularly interesting to us. While SMT and ICT are quite similar and standardized across electronic products, FCT and EOL are more product-specific steps. They are usually defined through acceptance tests during the product qualification stage. As we want to continuously iterate on our production process, we own this equipment and have full remote control over these parts. This allows us to increase or change our test coverage and parameters on the fly in production. If we see a certain product change causing errors for customers, we can fine-tune the production process accordingly.
Let’s now have a look at a few examples of how we can use the collected data.
Generating value out of the data
Production related improvement examples:
- Due to the nature of our products, our production peaks between August and December, slightly ahead of the ‘heating season’. The production volume data collected per day helps us determine our maximum possible peak production throughput very precisely. This lets us estimate well where we will run into bottlenecks when scaling production volume from year to year. We know exactly where in the line we need to restructure equipment and workflows, and what equipment we need to buy.
- Sometimes it’s also hard to spot the best place for improvements inside a particular production line. Is it better to squeeze out a few seconds of the flashing process during firmware programming, or should we duplicate and parallelize a complete production step? Measured data guides us to the next most valuable steps to implement or fix in our process, to increase single line performance.
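The trade-off in the last example can be sketched with a few lines of arithmetic. This is a simplified illustration, assuming each step's sustainable throughput is its number of parallel stations divided by its mean cycle time and that the slowest step caps the whole line. All cycle times and station counts below are made up.

```python
from statistics import mean

# Hypothetical cycle times in seconds, as they might be read from the MES.
cycle_times_s = {
    "ICT": [11.0, 12.0, 13.0],
    "FCT": [58.0, 60.0, 62.0],
    "EOL": [39.0, 40.0, 41.0],
}
# Hypothetical number of parallel stations per step.
stations = {"ICT": 1, "FCT": 1, "EOL": 1}


def throughput_per_hour(times, n_stations):
    """Units per hour a step can sustain given its mean cycle time."""
    return 3600.0 / mean(times) * n_stations


def bottleneck(cycle_times, station_count):
    """Return (step, units/hour) for the slowest step; it caps the line."""
    rates = {
        step: throughput_per_hour(times, station_count[step])
        for step, times in cycle_times.items()
    }
    step = min(rates, key=rates.get)
    return step, rates[step]


# Option A: shave 5 s off every FCT run (e.g. faster flashing).
faster = dict(cycle_times_s, FCT=[t - 5.0 for t in cycle_times_s["FCT"]])
# Option B: add a second, parallel FCT station.
doubled = dict(stations, FCT=2)

print("today:   ", bottleneck(cycle_times_s, stations))
print("option A:", bottleneck(faster, stations))
print("option B:", bottleneck(cycle_times_s, doubled))
```

With these sample numbers, option A leaves the FCT as the limiting step, while option B moves the bottleneck to the EOL test. That is the kind of comparison measured data makes possible before any equipment is bought.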
Product related improvement examples:
- After a firmware release to one of our products in the field, we saw undefined behavior on a tiny fraction of our devices. After deep investigation (which makes a great embedded story by itself), we found the source of the bug inside the silicon in some of our chips. However, by clustering the devices in the field and tracing them back to the actual production data (time/date code of the chip manufacturing process), we could instantly make sure that no more faulty devices get shipped from the factory and identify all affected devices in the field.
- During an NPI (New Product Introduction), we usually run a few validation builds before going to pilot or mass production. Using our system we can see failures and process timings while production is running. This allows us to improve, fix, and rearrange the process without any delay. Overall this improves delivery speed and time to market significantly.
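The chip date code example above boils down to joining field data with production traceability data. The following is a minimal sketch of that idea, not our actual tooling: the serial numbers, date codes, and the 50% threshold are hypothetical.

```python
from collections import Counter

# Hypothetical traceability data: device serial -> chip date code from SMT.
produced = {
    "A001": "1835", "A002": "1835", "A003": "1836",
    "A004": "1836", "A005": "1836", "A006": "1837",
}
# Hypothetical serials showing the faulty behavior in the field.
faulty_in_field = {"A003", "A005"}


def failure_rate_by_date_code(produced, faulty):
    """Share of devices per chip date code that misbehave in the field."""
    total = Counter(produced.values())
    bad = Counter(code for serial, code in produced.items() if serial in faulty)
    return {code: bad[code] / total[code] for code in total}


def suspect_codes(rates, threshold):
    """Date codes whose failure rate reaches the (arbitrary) threshold."""
    return sorted(code for code, rate in rates.items() if rate >= threshold)


rates = failure_rate_by_date_code(produced, faulty_in_field)
suspects = suspect_codes(rates, threshold=0.5)
# Every device built with a suspect code, whether or not it has failed yet.
affected = sorted(s for s, code in produced.items() if code in suspects)

print(suspects, affected)
```

The same list of suspect codes can serve two purposes: blocking those components at the factory and identifying devices in the field that have not shown the problem yet.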
Adhering to our agile mindset, we try to keep the feedback loop as short as possible: we test as early as possible in the production flow. When producing hardware this is even more important than in software. Each failure discovered at a later stage can lead to manual work. Disassembling the plastic parts of an assembled product is different from a 1-click rollback in software ;-). Besides this, we learnt to…
- Reduce cost of change. Make it as easy as possible to introduce almost any change into production at any point of time.
- Make your product with all its modules and components traceable, e.g. by putting a QR code on each individual PCB.
- Collect any data you can gather from your production line, even if you don’t see an immediate use case. It can be useful for finding patterns once you receive data from the field.
- Define acceptance criteria well for your product. Take some time with your HW team to think about what tests need to be covered, and keep the flexibility to improve them, based on feedback data that you collect on the way.
- Shorten the feedback loop for executed tests. The higher the test coverage in earlier stages, the better the flow.
- Shorten communication feedback loops to the factory floor. If your production floor is remote, adapt to the communication tools of the line workers. For us, this meant adopting WeChat, Telegram, and Remote Desktop.
- Testing RF is hard. Production floor staff use smartphones, WiFi is everywhere, and the frequency spectrum is anything but clean. Add e.g. NFC to perform your final product tests.
We assume many engineers in different IoT companies share similar problems. Let us know which approaches you took to get your production flowing, so we can learn from them.
Further Reading / Information
- Demystifying Hardware Jargon — a great primer to production terminology
- Agile at Tesla — great inspiration for bringing the agile view to hardware
- eXtreme manufacturing
- Check out Blue Clover Devices, they build standardized equipment to get the production flow running for you!
Working at tado°
At tado°, we’re passionate about bringing the best user experience to our customers. We’re looking for engineers with a mindset to solve great technical challenges like this to deliver best in class products. If you think alike, join us on our way here: open positions