At the Intersection of Machine Learning and Image Analysis: Insights from a Data Scientist
My Data Guest — An Interview with Jérôme Treboux
Co-interviewer: Shantanu Tyagi
I am excited to welcome a machine learning specialist and skilled KNIME user to this new episode of My Data Guest. Jérôme shared insights into his Ph.D. research, his experience with machine learning and image analysis, and his engagement with the KNIME French-speaking community. Joining me as a special co-interviewer for this episode is Shantanu Tyagi, Community Event Coordinator at KNIME.
Jérôme Treboux is a Ph.D. student in Computer Sciences with a specialization in image analysis at the University of Fribourg (Switzerland). His dissertation focuses on vine line detection in aerial images using advanced machine learning algorithms. Additionally, he is a founding partner of datastory, a data analysis and visualization startup, and a research assistant at the Institute of Informatics at HES-SO Valais where he works on applied research projects that range from data manipulation to deployment.
Rosaria: Tell us more about your research and your Ph.D.
Jérôme: I started my Ph.D. a few years ago and have now almost completed it. My research focuses on image analysis to detect vine lines in aerial images. The idea is to fly a drone over vineyards to collect aerial images and rely on machine learning to detect each line. Afterwards, we can count the number of lines and conduct additional analyses on these images.
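The counting step Jérôme mentions can be illustrated with a minimal sketch. It is not his method: it assumes a detection model has already produced a binary mask (1 = vine pixel, 0 = background) and that the vine lines run roughly vertically in the image. The function name `count_vine_lines`, the `min_fraction` threshold, and the toy mask are all hypothetical.

```python
def count_vine_lines(mask, min_fraction=0.5):
    """Count roughly vertical stripes in a binary mask (list of rows of 0/1).

    Assumption: a column belongs to a vine line when at least
    `min_fraction` of its pixels are marked as vine.
    """
    if not mask or not mask[0]:
        return 0
    height = len(mask)
    width = len(mask[0])
    # Fraction of vine pixels in each column (a simple column projection).
    column_fill = [sum(row[x] for row in mask) / height for x in range(width)]
    is_line = [fill >= min_fraction for fill in column_fill]
    # Each background-to-line transition marks the start of a new line.
    count = 0
    previous = False
    for current in is_line:
        if current and not previous:
            count += 1
        previous = current
    return count


# Toy mask with three stripes (hypothetical data, not from the research).
toy_mask = [
    [0, 1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 1, 1],
]
print(count_vine_lines(toy_mask))  # prints 3
```

Real aerial images would need the lines to be aligned first, since vineyards on slopes are rarely axis-parallel in the frame.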
Rosaria: What are you looking for when you take pictures of a vineyard with a drone?
Jérôme: The vine lines. Each vineyard is structured in lines, so from above you can see clearly defined lines. The idea is to isolate each line so we can create a flight path that the drone can follow. In the Swiss Alps, the vineyards are on steep slopes, so it’s more convenient to use a drone instead of a traditional vehicle such as a truck or a helicopter.
Rosaria: What tools did you use for your research?
Jérôme: I used many different tools and languages. I used a lot of Python to do machine learning as well as KNIME Analytics Platform for quick prototyping. For example, I used the analytics nodes to compare the performance of different algorithms (e.g., the Random Forest). Additionally, I defined and trained a neural network directly in KNIME Analytics Platform, and experimented with different network configurations to evaluate how the performance changed. At the same time, I also worked with Python as it can be faster for computations that heavily rely on GPU usage.
Rosaria: Since you are a Python user, surely you know about the KNIME Python integration that you can use to develop KNIME nodes on top of Python code. It is quite a new feature. Have you already created a new Python-based KNIME node?
Jérôme: I tried the new feature but I haven’t used it for a practical case yet. I think it’s a very good idea and an interesting way to combine both technologies. I have to try to implement something with that!
Rosaria: When did you encounter KNIME for the first time?
Jérôme: Back in 2010, when I was doing my Bachelor’s at HES-SO Valais-Wallis, the professors teaching machine learning courses were already using KNIME Analytics Platform. When I started working at the Institute of Informatics, we decided to use it too. We liked it because we could see the results of each data operation and the evolution of our project straight away. Eventually, we started to explore this tool more and more for different clients and applied projects. We went even further with it and acquired KNIME’s commercial product, which helped us integrate different mobile applications thanks to the possibility of interacting with its API. Overall, we really enjoyed the new features coming out every six months to expand the experience and provide new solutions to our partners.
Rosaria: How did you find working professionally with a low code tool?
Jérôme: It’s really useful when you work with people who are not into IT or expert programmers. You can talk and explain to others what you are doing with your project easily. Without going into detailed descriptions, you can visually describe and show what has been done and what the output is.
Rosaria: As a KNIME expert, which feature or node of KNIME Software could you not do without?
Jérôme: All of them. I really like the GroupBy node because you can do almost anything with it. I use the Java Snippet node a lot when I feel like it’s easier to write just one line of code rather than searching for nodes. These days, I use the Python nodes to integrate the result of my Python code directly into my workflow.
Rosaria: When will you write the words “The End” on your Ph.D. program? When do you plan to finish?
Jérôme: It’s difficult to say. The experts are reading my thesis, so I’m waiting for their feedback. In the best-case scenario, it will be in one month. If I have some corrections to make, I hope to finish in March.
Rosaria: Close enough. Do you have any plans after graduation?
Jérôme: I’m seeking a job. I would like to leave Switzerland to gain new experiences. I want to discover something else so I want to move to Australia. I was there last month to talk to different people, look for a job, exchange experiences and present my research at a conference. The idea is to move to the other side of the planet soon!
Rosaria: Well, Jérôme will soon be on the job market and he can offer expertise with KNIME, Python, data science, and machine learning. If you are looking for a KNIME expert and a data scientist somewhere in Australia, just get in touch with him.
Let’s now move to another topic: your work with and for the KNIME French-speaking community. Shantanu has some questions for you.
Shantanu: In 2022, as COVID measures were gradually eased, we started a series of community events all over the world. These events are called Data Connect and usually consist of 2 short presentations by KNIME experts and one hour or so of networking. The latest event took place at Harvard University in Cambridge, Massachusetts and was centered around geospatial applications using the new Geospatial Analytics Extension developed by KNIME and Harvard’s Center for Geographic Analysis. What is your role in the KNIME French-speaking community?
Jérôme: I helped organize the first Data Connect: France in April and was a speaker myself at Data Connect: France in October last year. The aim is to expand the French-speaking community, invite different French speakers to share knowledge on different projects, and exchange ideas on machine learning and data science in general.
Shantanu: They were great and successful events, so congrats on that! When and where was the last Data Connect event you organized? Who joined the event?
Jérôme: The last one was in October 2022 in Paris at School 42. We had two speakers and the KNIME team. Our target audience was the students, but the event was open to anyone. We presented a few advanced machine learning-driven applications and reserved some time for networking (in French), where the students had the chance to connect with people from the industry.
Shantanu: What is the model of School 42 and how does it differ from other schools?
Jérôme: École (School) 42 is a French school where there aren’t any professors offering formal courses. The school is completely free and you start off with an exam that lasts many weeks. You have to work in a team to develop different projects and you get admitted if you succeed. It is still a school in the sense that there are people organizing projects for you to work on. However, you don’t go to class and listen to a professor. You start with small projects and you move on to more advanced topics. When you hand in your project, your peers evaluate your work. After that, you can go in different directions, such as machine learning, Python or other programming languages. When the students are satisfied with what they have learned or when they find a job, they can stop going to school. In that regard, they don’t get a Bachelor’s degree but they can come back to school whenever they want.
Shantanu: So they have a unique model that focuses on practical learning rather than obtaining a formal degree. What was the topic of your talk there during the Data Connect?
Jérôme: I presented a machine learning project on image analysis of rice grain. There is a big business behind rice grain quality. For quality control, instead of relying on humans to check the quality of each rice grain, we created an app that takes a picture of a rice grain and sends it to a machine learning model. Next, the model evaluates the grain and classifies it as “good” or “bad”. An interesting point was that we used KNIME Server to deploy our model so the application was communicating with the API on KNIME Server to call the model and receive answers. It was a very practical presentation that showed the implementation in KNIME Analytics Platform, what we had to do, the challenges, etc. The second talk was by Luc Dufour from EDF France, a really big electric utility company in France. Luc talked about energy management and machine learning using KNIME Analytics Platform in order to reduce energy consumption in commercial buildings, as well as in private houses.
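The interaction Jérôme describes, an app sending a picture of a grain to a deployed model and receiving a label back, might look roughly like the sketch below. It is only an illustration under stated assumptions: the endpoint URL, the JSON field names `image_base64` and `label`, and the function names are hypothetical placeholders and do not reflect the actual KNIME Server API or the project's code.

```python
import base64
import json
import urllib.request

# Hypothetical placeholder URL, not a real KNIME Server endpoint.
ENDPOINT = "https://example.com/rice-quality/classify"


def build_request(image_bytes, endpoint=ENDPOINT):
    """Wrap the raw image bytes in a JSON POST request (assumed payload shape)."""
    payload = {"image_base64": base64.b64encode(image_bytes).decode("ascii")}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_label(response_body):
    """Extract the assumed "label" field and check it is "good" or "bad"."""
    label = json.loads(response_body)["label"]
    if label not in ("good", "bad"):
        raise ValueError(f"unexpected label: {label!r}")
    return label


def classify_grain(image_bytes, endpoint=ENDPOINT, timeout=10):
    """Send one grain picture to the model service and return its label."""
    request = build_request(image_bytes, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_label(response.read())
```

Keeping request building and response parsing in separate functions means both can be checked without a running server; only `classify_grain` touches the network.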
Shantanu: Did you have fun at the event? How was the vibe?
Jérôme: We organized the event in the amphitheater which allowed a smooth and pleasant exchange between students and speakers. During the networking time after the presentations, everybody stayed and engaged in different conversations. For example, some people were curious about how we dealt with images to train our machine learning model. Others, on the other hand, were discovering machine learning for the first time. It was a good, mixed audience with various backgrounds and levels of expertise.
Shantanu: That is the idea of Data Connect events: everybody comes together, learns something new, and shares their experience. So it sounds like you fully achieved that! Where and when will the next French Data Connect event take place? Around what topic?
Jérôme: It’s not confirmed yet but we would like to organize an event in May. We have different ideas about the topic(s). It will certainly be some concrete application again. Perhaps image, tourism or medical analysis. We already have one speaker and are currently looking for a second one. Please get in touch if you’d like to present or have topic ideas. It will be in Lausanne or Geneva, at the local university or School 42 in Lausanne. We will keep you up-to-date.
Shantanu: Are there other Data Connect events around the world in other languages?
Jérôme: Besides the French Data Connect, which includes France and Switzerland, I know that there are often events in Italy, Germany, the UK, and the US, where there are strong KNIME communities. The concept is the same but in Italian, German or English. I like the idea of organizing Data Connects in different languages, as it’s not easy for everyone to follow the talks in English. Even if you cannot attend on-site, there is usually a live stream or a recording of the event, so you can watch the talks in your own language at any time.
Shantanu: The Data Connect events are reaching new communities in new regions this year. How did you become an organizer of the French-speaking Data Connect events?
Jérôme: I attended various KNIME events in the past, and helped Rosaria organize a few in Switzerland and France. In principle, it is always very helpful to have somebody who speaks the local language, which in my case is French for the French-speaking community. Additionally, it was easier for me to help with the logistics and arrange locations thanks to the many contacts I have in Lausanne, Geneva and Paris. So the process started with me attending different conferences, then I got to know KNIME and the KNIME team, and eventually I became an organizer for the French-speaking community.
Shantanu: We are reaching the end of our conversation. Before we say goodbye, we have a funny question for you. As on-site events are finally back and KNIME communities are scattered in many different cities, how about a traveling KNIME Data Connect event series? Do you have a VW van?
Jérôme: It could be very interesting! I don’t have a VW van, but why not? It’s not always about going to big companies or schools. We can tour local vineyards and organize a Data Connect event there. It could be fun, we have to talk about that!
Rosaria: For the vineyard tour, we could visualize the location(s) on an interactive map using the Geospatial Analytics extension that was released in December 2022.
How can people in the audience get in contact with you and your work?
Jérôme: The best way to get in touch is via LinkedIn. Otherwise, you can search my name on Google or the KNIME Forum, and you will find all my contact information.
Rosaria: On this happy note, we conclude our interview. Merci, Jérôme, for the great conversation. We have learned what machine learning can be used for and how to develop a vibrant and engaged community literally from nothing.
Watch the original interview with Jérôme Treboux on YouTube.