- Ruhr-Universität Bochum
More transparency in the data jungle of materials science
Big data – at first glance, the term sounds like a promise. But a lot of data is useless unless someone provides structure. Someone like Markus Stricker.
Many years ago, the decoding of the human genome opened up a new branch of research in biology: bioinformatics. A similar trend can currently be observed in materials science. At present, Professor Markus Stricker, head of the Bochum working group for materials informatics and data science, is still something of an oddity in Germany with his field of research. In an interview, he explains why this will probably change in the future and what potential materials informatics has to offer.
Professor Stricker, your research is in materials informatics. It is still a young field, isn’t it?
The term has existed in academic publications since 2005. It describes the idea of bringing together computer science and materials research. The trigger was big data. It became clear that we needed a concept for dealing with all that data. Even though high-throughput methods for synthesis and characterisation, such as those Alfred Ludwig has been using in Bochum for a long time, have been around for a while, they’ve only become established on a broad scale in recent years. Now, the huge data volume is becoming a problem.
Professorships like yours that specialise in data science applied to materials are probably few and far between.
Apart from my junior professorship here in Bochum, the only other professorship I knew of in Germany that focuses on materials informatics was in Jülich. Since April 2023, there’s been an additional professorship with Miquel Marques at the Interdisciplinary Centre for Advanced Materials Simulation, ICAMS, which focuses on data, artificial intelligence and materials. Apart from that, this field is more of an addition to research than its focus. It might eventually turn out that it’s not viable as a stand-alone discipline for a professorship. But that’s not what it looks like at the moment; rather the opposite.
Does that mean you’re taking a risk by focusing on this research field?
This junior professorship is a great opportunity for me, and I’m very happy about it. I’ve encountered many characterisation methods in the past and know about the different kinds of data that are generated in experiments and about the parameters that have to be documented. I also have experience using many simulation methods and working with their results. I find it exciting to now bring everything together on the basis of data – and to be able to contribute to the establishment of a new field!
Why do you think this field has a future?
Because materials science has the potential to answer the pressing questions of our time. Energy, transport, sustainability – all of these are ultimately material issues. It typically takes 10 to 20 years to develop a material using traditional methods. If we don’t find answers to today’s problems until 2050, it’s game over. We have to speed up the processes.
And this can be done with materials informatics?
Machine learning can help us accelerate materials development. To create a new material, we combine different elements in a certain mixing ratio. This is not always possible in a state of equilibrium, so for some element combinations you need special processes to combine them into a material. However, there are many elements, and the number of possible element combinations in different compositions is practically infinite. It’s impossible to produce and characterise all of them to find the best properties. Using algorithms, we can predict the properties of new element combinations based on a few measurements. This allows us to narrow down the directions in which it makes sense to keep searching.
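As a rough illustration of the idea in this answer, the following Python sketch estimates a property for untested compositions from a handful of measurements and ranks the candidates by that estimate. It is not the group's method: the measurement values are invented, the binary alloy and the function names `predict` and `rank_candidates` are hypothetical, and simple inverse-distance weighting stands in for a real machine learning model.

```python
# Hypothetical measurements for a binary A-B alloy:
# fraction of element A -> measured property (invented values, illustration only).
measured = {0.0: 1.0, 0.5: 3.0, 1.0: 2.0}


def predict(x, data, power=2.0):
    """Inverse-distance-weighted estimate of the property at composition x.

    This is a simple stand-in for a trained model, not an actual ML method.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi in data.items():
        distance = abs(x - xi)
        if distance == 0.0:
            return yi  # composition was measured directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * yi
        weight_total += weight
    return weighted_sum / weight_total


def rank_candidates(candidates, data):
    """Sort untested compositions by predicted property, highest first."""
    return sorted(candidates, key=lambda x: predict(x, data), reverse=True)


# Untested compositions we could synthesise next.
candidates = [0.1, 0.25, 0.4, 0.6, 0.75, 0.9]
ranking = rank_candidates(candidates, measured)
print(ranking)
```

The top of the ranking suggests which compositions might be worth producing and measuring next; each new measurement would be added to `measured` before ranking again. A real workflow would likely use a model that also reports uncertainty, so that the search does not only revisit regions close to existing data.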
For this, you have to compile a lot of data from different sources.
Yes, and they differ greatly – that’s quite a challenge. There are many different measurement, simulation and documentation methods. Currently, almost every researcher has their individual data filing system, which they maintain properly. But when the person in charge of the system leaves, the data can usually no longer be reused, or only with great effort, because others often don’t understand how the system works. Therefore, it’s important to develop standards for the storage and documentation of data that we can use to make and keep data accessible and usable across work groups and for use in algorithms.
Tell us about the cooperation with your colleagues.
I’m very happy about how open and appreciative my colleagues are at Ruhr University Bochum; in an environment like this, research is a lot of fun. In many projects, the materials informatics part is the only contribution that – to put it bluntly – doesn’t generate its own data, but rather requires data from others. I’m brought into many projects from the get-go to help think through the data science part. Later on, others will benefit from my research when my results help to accelerate the development of new materials.