BOSTON, May 16, 2024
/PRNewswire/ -- The impact of machine learning and data
science techniques on the materials industry has grown
exponentially since IDTechEx started covering the field of
materials informatics in 2020, impacting the real world from
lightweight alloys to new battery chemistries. However, the keys to
project success here have changed remarkably little in this time,
as outlined in IDTechEx's report, "Materials Informatics 2024-2034:
Markets, Strategies, Players". What, then, are the essential areas
of focus for an organization looking to deploy a materials
informatics strategy, whether as an end-user or as a provider of
software?
"Garbage in, garbage out" – the problem of sourcing good data
Machine learning models can only ever be as good as the data
they are trained on. The major choices consist of performing your
own experiments, computational simulation, pulling data from public
or private repositories, or scraping data from patents and
scientific literature.
Predictably, experimentation is the gold standard in terms of
accuracy and will almost universally be needed at the verification
stage of a materials informatics project, but the high costs can
limit data volume significantly. Simulation is cheaper, but the
financial expense of computation remains significant, and its
fidelity to reality can be dubious.
On the other hand, if data repositories and scraped data cover
the right problem space, they can yield higher data volumes, but
bias is likely to be a significant issue, including the
non-inclusion of negative results. Often, limited information is
available about the full experimental conditions. Some larger
end-users of materials informatics have told IDTechEx that the
limitations of external data have led them to reject it entirely,
but this is not an option for players without such deep
pockets.
Japanese player Preferred Computational Chemistry (PFCC) have
approached the challenge of speeding up data gathering with their
Matlantis software, which trains a graph neural network surrogate
model for potential energy surfaces on density functional theory
(DFT) data. By taking millions of DFT simulation results and
performing its own simulations on unstable structures close to the
stable structures in the pre-existing data, PFCC's model has a much
fuller picture of the infinite combinatorial space, allowing it to
model first-principles results more closely. The end product reduces the
time needed to get results to seconds, compared to DFT simulations,
which can take hours to months. The model will only ever be as good
as the DFT data it's trained on, but being able to produce more
results will offset much of this disadvantage.
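The surrogate idea can be illustrated with a deliberately small sketch. It is not PFCC's method or the Matlantis API: the graph neural network is swapped for plain piecewise-linear interpolation, and `reference_energy` is a hypothetical stand-in (a Lennard-Jones pair energy) for an expensive DFT calculation. What it shows is only the pattern described above: pay for the costly calculations once, include strained geometries around the stable one, then answer later queries cheaply.

```python
import bisect

def reference_energy(r):
    """Hypothetical stand-in for an expensive first-principles calculation:
    a Lennard-Jones pair energy whose minimum lies at r = 2**(1/6)."""
    return 4.0 * (r ** -12 - r ** -6)

def build_surrogate(train_r):
    """Build a toy surrogate: piecewise-linear interpolation of reference data."""
    xs = sorted(train_r)
    ys = [reference_energy(r) for r in xs]  # the costly step, done once up front

    def surrogate(r):
        # Like any data-driven model, it is only valid where it has seen data.
        if not xs[0] <= r <= xs[-1]:
            raise ValueError("query lies outside the training data")
        i = min(max(bisect.bisect_right(xs, r), 1), len(xs) - 1)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (r - x0) / (x1 - x0)

    return surrogate

# Training set: the stable separation plus compressed and stretched geometries
# (0.9 to 1.3 times the equilibrium distance; an assumed, illustrative range).
r_eq = 2 ** (1 / 6)
train_r = [r_eq * (0.9 + 0.01 * k) for k in range(41)]
surrogate = build_surrogate(train_r)

query = 1.2
print(surrogate(query), reference_energy(query))
```

The surrogate refuses queries outside its training range, which mirrors the caveat in the text: it can be no better than the reference data behind it.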
Pulling information together
Managing data is typically the key stumbling block for a
materials firm seeking digital transformation in its R&D
efforts, especially given the conservative nature of this industry.
Electronic lab notebook and laboratory information management
(ELN/LIMS) software generally form the easiest off-the-shelf tools
for moving away from disparate Excel files or even paper notebooks.
Problems arise when different business units take different
approaches here, which can lead to data siloing and
overspending.
Fortunately, most materials informatics software is designed to
interface with the APIs of common ELN/LIMS software. Indeed, the
offerings of some materials informatics providers, including
Uncountable Inc., MaterialsZone, and Albert Invent, focus heavily
on managing information in the lab while integrating advanced
machine learning features. For end-users looking for a one-stop
shop, opting for an integrated platform may make a lot of
sense.
Applying AI requires creative approaches
There are opportunities to use machine learning at every stage
of the materials informatics process, from scraping data to using
large language models to help design an experimental process.
However, the process commonly of most interest is modeling the
behavior of a class of materials to suggest candidates that meet a
desired set of properties.
This inverse design process will typically use high-dimensional
data that often has many missing values and may be pulled from many
data sources, offering a substantially different challenge from
"big data" problems. Active or sequential learning, where the
performance of suggested candidates is verified and the underlying
model retrained, is a common approach to forming an optimal
experimental strategy, pursued by players like Citrine Informatics.
Advanced AI approaches abound: a cherry-picked example from UK
player Intellegens modifies the input/output structure of neural
networks to allow missing properties to be estimated iteratively.
The peculiarity of this class of problem to materials science is
why materials informatics companies have, in general, been founded
to focus on materials R&D instead of pivoting from another class
of AI problem.
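As a rough illustration of the sequential learning loop described above (suggest a candidate, verify it, retrain), the sketch below uses a toy one-dimensional problem. Everything in it is assumed for illustration: `run_experiment` is a hypothetical stand-in for a lab measurement, the nearest-neighbor `predict` function stands in for a real model, and the scoring rule is a generic "predicted value plus uncertainty bonus" heuristic rather than any vendor's algorithm.

```python
import random

def run_experiment(x):
    """Hypothetical stand-in for measuring one candidate in the lab.
    The toy property peaks at x = 0.7 (an assumption for illustration)."""
    return -(x - 0.7) ** 2

def predict(x, observed):
    """Toy model: return the value of the nearest measured candidate and
    the distance to it as a crude uncertainty estimate."""
    nearest_x, nearest_y = min(observed, key=lambda p: abs(p[0] - x))
    return nearest_y, abs(nearest_x - x)

def sequential_learning(candidates, n_rounds=10, explore=1.0, seed=0):
    rng = random.Random(seed)
    # Begin with two randomly chosen measurements.
    observed = [(x, run_experiment(x)) for x in rng.sample(candidates, 2)]
    for _ in range(n_rounds):
        measured = {x for x, _ in observed}
        untested = [x for x in candidates if x not in measured]

        def score(x):
            # Favor candidates that look good or are poorly explored.
            mean, uncertainty = predict(x, observed)
            return mean + explore * uncertainty

        suggestion = max(untested, key=score)
        # Verify the suggestion, then "retrain": for this lazy
        # nearest-neighbor model, adding the data point is the retraining.
        observed.append((suggestion, run_experiment(suggestion)))
    best = max(observed, key=lambda p: p[1])
    return best, observed

candidates = [i / 20 for i in range(21)]
best, history = sequential_learning(candidates)
print(best)
```

The `explore` weight sets the balance between testing candidates the model already rates highly and probing regions with no nearby measurements. In a real project, the choice of model and of that balance would depend on how costly each experiment is.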
Usability is key
The role of a materials informatics provider is to link the
expertise of data scientists and materials scientists, who will
typically have significantly different backgrounds. Interfaces need to be accessible
to users who have no coding experience while offering more powerful
code inputs to those with programming and machine-learning skills.
Visualization of results should be flexible and intuitive to allow
users to get the most out of the platform. The materials
informatics SaaS companies that have seen the most success so far
have tended to excel in making the software easy to use while
offering more advanced tools for power users, allowing end users to
get enthusiastic about the platform long before its use on a
full-scale commercial project. Putting usability front and center
should be a top priority for anyone looking to enter this
industry.
Further insights
IDTechEx's recent report, "Materials Informatics 2024-2034:
Markets, Strategies, Players", is now in its fourth edition.
Informed by first-hand interviews with the industry's major
players, the report provides market forecasts, player profiles,
investments, roadmaps, and comprehensive company lists, making this
essential reading for anyone wanting to get ahead in this
field.
To find out more about this report, including downloadable
sample pages, please visit
www.IDTechEx.com/materialsinformatics.
For the full portfolio of advanced materials and critical
minerals market research from IDTechEx, please
visit www.IDTechEx.com/Research/AM.
About IDTechEx:
IDTechEx provides trusted independent research on emerging
technologies and their markets. Since 1999, we have been
helping our clients to understand new technologies, their supply
chains, market requirements, opportunities and forecasts. For more
information, contact research@IDTechEx.com or
visit www.IDTechEx.com.
Media Contact:
Lucy Rogers
Sales and Marketing Administrator
press@IDTechEx.com
+44(0)1223 812300
SOURCE IDTechEx