Settling the path for building an AI research group

Christian Stürmer
Published in CONTACT Research
6 min read · Nov 17, 2023

This text was written in close collaboration with my colleagues Leif Sabellek, Marianne Michaelis, and Lucas Kirsch.

AI research is a challenging and rewarding field, especially when starting with a new team. CONTACT Research embarked on this journey shortly after we were established, just as GPT-based chatbots like ChatGPT and Google Bard were attracting growing attention. We asked ourselves in which direction the development might go and how it would affect our work in the company. The first previews were as impressive as their true potential was uncertain, but the sheer velocity of development made a whole industry, including us, feel that there must be more to come. With AI among our prioritized research topics, it was clear that we needed to evaluate tools and technologies in this emerging field, but one question was there from day one:

“How should we start as a research team for AI?”

In this article, I’ll provide insight into how we actually started and offer helpful reminders for those looking to embark on a similar journey, like landmarks for finding the path into the world of AI.

We had done some initial work on NLP and data strategies, and we used this along with customer and internal feedback to steer our search for a first research topic. Knowing that generic similarity is relevant for many PLM use cases, and that our software solution contains a lot of information about products, projects, and processes, it was clear that we wanted to gain experience in exactly this field. To make the start as easy as possible, we needed to narrow down our scope and make it more feasible. We opted to start with small experiments and looked for a specific problem or use case that was close to the topic of similarity but still allowed a step-by-step approach.

Start by experimenting within your broader scope to get a feeling for where to go in detail!

The most important step in any AI project is to examine the available data and its sources. In our case, this meant narrowing the scope to a prototype for internal use with internal data. We wanted a data source that was as close to an enterprise software application as possible but that raised no concerns about sharing intellectual property, whether the data was used in our local AI pipeline or sent to an AI service. We considered various alternatives from an earlier data quality study and ultimately decided to start by crawling our web-based documentation. This documentation is accessible to every customer or partner, contains no restricted information, and can be easily crawled using a few lines of Python code and basic knowledge of HTML parsing. So, the path was settled: internal assistance for our own software solution, based on a dataset that pairs the text from each page's content area with the corresponding link from the portal's navigation tree.
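To give an impression of what "a few lines of Python code and basic knowledge of HTML parsing" can look like, here is a minimal sketch using only the standard library. It is not our actual crawler: the element ids `content` and `nav`, the sample page, and the example URL are assumptions made purely for illustration, and a real portal will use its own markup.

```python
from html.parser import HTMLParser


class DocPageParser(HTMLParser):
    """Collects text from the content area and links from the navigation tree.

    The ids "content" and "nav" are hypothetical; adapt them to the real portal.
    """

    def __init__(self, content_id="content", nav_id="nav"):
        super().__init__()
        self.content_id = content_id
        self.nav_id = nav_id
        self._region = None      # "content", "nav", or None when outside both
        self._region_tag = None  # tag name of the element that opened the region
        self._depth = 0          # nesting depth of that tag name inside the region
        self.text_parts = []
        self.nav_links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if self._region is None:
            # Enter a region when we meet the element carrying one of the ids.
            if attr.get("id") == self.content_id:
                self._region, self._region_tag, self._depth = "content", tag, 1
            elif attr.get("id") == self.nav_id:
                self._region, self._region_tag, self._depth = "nav", tag, 1
            return
        if tag == self._region_tag:
            self._depth += 1
        if self._region == "nav" and tag == "a" and attr.get("href"):
            self.nav_links.append(attr["href"])

    def handle_endtag(self, tag):
        if self._region is not None and tag == self._region_tag:
            self._depth -= 1
            if self._depth == 0:
                self._region = None  # left the region again

    def handle_data(self, data):
        if self._region == "content" and data.strip():
            self.text_parts.append(data.strip())


def parse_page(html, url):
    """Turn one HTML page into a dataset record: page link, text, navigation links."""
    parser = DocPageParser()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "text": " ".join(parser.text_parts),
        "nav_links": parser.nav_links,
    }


# Hypothetical sample page standing in for a downloaded documentation page.
SAMPLE_PAGE = """
<html><body>
<nav id="nav"><ul>
  <li><a href="/docs/intro.html">Introduction</a></li>
  <li><a href="/docs/setup.html">Setup</a></li>
</ul></nav>
<div id="content"><h1>Setup</h1><p>Install the <b>package</b> first.</p><div>Then restart.</div></div>
</body></html>
"""

record = parse_page(SAMPLE_PAGE, "https://docs.example.invalid/docs/setup.html")
print(record)
```

In a real crawl, the HTML would come from a download step (for example `urllib.request.urlopen`), and the collected navigation links would be queued as the next pages to visit.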

Invest time in data studies and internal data research, because choosing the data to work with sets the principal direction of your AI journey and has a strong influence on your success!

Documentation portal of the CONTACT Elements platform, © CONTACT Software

The next step is to get a feeling for the potential benefits of using an AI application in the chosen data context and to develop a more detailed vision that illustrates the impact the introduction of AI will have on this context.

Our vision was clear from early on: making the knowledge stored in diverse information resources, like handbooks, tutorials, specifications, and examples, readily available to everyone in an elegant and intuitive manner, simply by “asking” about it, would be a direct benefit to the entire company and later to our customers. Even the initial experiments, which were based on easy-to-use frameworks for fast progress, already produced promising results! Consequently, we evaluated different usage scenarios in depth, focused on developing a semantic search application, and began to gain a deeper understanding of the topic and its possible applications.

So, start small and basic with a reachable quick win, and clarify your vision!

More details on the actual similarity search and how we handle data will be provided in an upcoming article!
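Until then, the general idea of ranking documents by similarity to a question can be illustrated with a toy sketch. This is not how our search works: it compares simple word-count vectors with cosine similarity, whereas a semantic search would typically compare embeddings produced by a language model. The function names and the two sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Map a text to word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, link) pairs, best match first."""
    query_vec = vectorize(query)
    scored = [
        (cosine_similarity(query_vec, vectorize(text)), link)
        for link, text in documents.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical records: navigation link -> text from the content area.
docs = {
    "/docs/setup.html": "Install the package and restart the server",
    "/docs/workflow.html": "Define a workflow for document approval",
}

results = rank("how to install the package", docs)
for score, link in results:
    print(f"{score:.2f}  {link}")
```

Swapping `vectorize` for a function that returns dense embeddings, and adapting `cosine_similarity` accordingly, keeps the same ranking structure while letting questions match documents that use different wording.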

Looking back from this point, with scope, data, and vision clarified, it became clear that each of these preliminary decisions demands a specific profession within a distinct AI team:

Defining the scope of research efforts requires general AI knowledge, an estimation of challenges, requirements, and usability, as well as technical design decisions. This is the domain of data engineers, who bridge technical knowledge and strategic goals. Examining the data is the task of data scientists, who process and analyze large data sets in diverse software environments, interpret findings, and communicate results to diverse audiences. Defining the vision of AI value in a specific domain requires extensive domain knowledge and an understanding of customer needs, which may be derived from prior experience or from studies of customer environments. This cross-domain task is best moderated by AI architects, who can reflect on internal AI use and guide external stakeholders.

Find people for your AI team with a strong dedication to evaluating “data”, “scope” and “vision” of your AI activities!

These derived professions build the foundation of the first project phase for an AI team, with the addition of one further colleague: AI itself. AI-based tools boost productivity considerably when they are used with care and caution. They remove the “silly” robot-like tasks and free up time for creative ones! In addition, they help to overcome the feeling of being stuck when it seems impossible to find a starting point for the work. Of course, they won’t take all the work from the team, but they can ease it a lot. So, building up experience about where and how AI tools can be used is a crucial task not only in the project but also within the AI team itself.

Throughout the implementation of our semantic search, as described in our previous article about boosting efficiency with LLMs, we tried different AI-supported tools to speed up the development of simple building blocks, like parser functions or UX components. In contrast, the architecture itself was developed through a continuous questioning process and a lot of discussions within the team!

Ask tools to build the general blocks, while you can focus on their arrangement!

We concluded this first chapter by releasing a beta version of our semantic search for internal use within the company network, to collect feedback from our colleagues as early as possible. There is already great excitement about the further development of our semantic search capabilities, which should enable seamless exploration of diverse information sources and enhance the value of AI in our work.

Engage your colleagues with public demos as early as possible!

Conclusion

In conclusion, we successfully took our first steps toward doing AI research at CONTACT Research. There is a special kind of excitement when a journey starts! We also found our first landmarks, which have guided the way so far. The upcoming months will be a continuous interplay between all of the professions in our team and a continuous cycle of successive optimization.

As we refine our approach to semantic search, we anticipate even greater breakthroughs from combining AI-powered solutions, first for us and in the future for everybody using our tools. Our team will grow, both in members and in experience, and we are all looking forward to gaining ground on our path through this great AI adventure!

About CONTACT Research. CONTACT Research is a dynamic research group dedicated to collaborating with innovative minds from the fields of science and industry. Our primary mission is to develop cutting-edge solutions for the engineering and manufacturing challenges of the future. We undertake projects that encompass applied research, as well as technology and method innovation. As an independent corporate unit within the CONTACT Software Group, we foster an environment where innovation thrives.

Christian Stürmer
CONTACT Research

Doing Research for CONTACT Software based on Open Source components, the skills of Physics and the dedication to energize great minds for a sustainable future!