Let's put the hype and anti-hype of ML aside and discuss the possibilities it can offer the field of metal casting. But what challenges must be overcome first?

The Machine Learning hype is real. The term appears everywhere. You hear claims like “Artificial Intelligence is the new electricity.” The anti-hype is also beginning to gain traction. Some still think that any Machine Learning method is just a black box. One thing is clear: the hype is due to the unique and hard-to-dismiss impact of Machine Learning on fields ranging from computational medicine to finance. The field of metal casting has not yet felt that effect.

### Why Machine Learning for Metal Casting?

So far, metal casting simulations have had to rely completely on mesh-based numerical techniques such as the finite element or finite difference methods. Computer simulations based on these techniques have helped improve the quality of castings to the extent that, nowadays, using casting simulation software is standard practice in foundries. Despite the favorable effect that those simulations have had, simulating realistic solidification models with a spatial resolution that is high enough to fully resolve the physical phenomena included in the simulated model is still computationally intractable. Therefore, modelers are left with no choice but to rely on models that make oversimplifying assumptions. As an example, they generally use models that neglect melt convection entirely or account for it by just increasing the thermal conductivity. As another example, the dynamic coupling of solidification calculations to thermodynamic databases, which is required for realistic simulation of multicomponent alloys, remains beyond what can be performed in an industrial simulation. Even the oversimplified models can only be simulated on meshes that are generally not fine enough, and therefore the simulation results are typically not fully resolved. Despite all these simplifications, running simulations still takes a few hours to a few days and needs expensive hardware.

The underlying cause of all the above drawbacks is surprisingly simple: current simulations use numerical differentiation to compute the derivatives in an equation, and numerical differentiation suffers from discretization errors. These errors are proportional to the size of the differentiation step, and keeping them small requires a relatively small step size. That increases the computational cost of the simulations and limits the size of the simulation domain.
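The link between step size and discretization error can be illustrated with a minimal sketch. It assumes a first-order forward difference applied to a test function (`sin`) chosen only because its derivative is known exactly; the function names are hypothetical and are not taken from any casting software.

```python
import math

def forward_difference(f, x, h):
    """First-order forward-difference approximation of f'(x) with step h."""
    return (f(x + h) - f(x)) / h

def discretization_errors(steps, x=1.0):
    """Absolute error of the forward difference of sin at x, one per step size."""
    exact = math.cos(x)  # exact derivative of sin
    return [abs(forward_difference(math.sin, x, h) - exact) for h in steps]

# Assumed step sizes, each half of the previous one
steps = [0.1, 0.05, 0.025]
errors = discretization_errors(steps)

for h, err in zip(steps, errors):
    print(f"h = {h:<6} error = {err:.5f}")
```

In this sketch, halving the step roughly halves the error, which is the proportionality described above: a smaller error requires a smaller step and therefore more grid points.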

In addition to the issues associated with computational cost, current casting simulation methods have matured over the past few decades, and the chance of developing a radically new technique to address problems known to be very challenging in solidification simulations is slim. In other words, if we in the casting simulation community use only the methods that have been tried in the past decades, radical progress in the field is unlikely.

### Theory-Trained Neural Networks (TTNs)

ML owes its current popularity largely to deep neural networks. These networks are computing systems that consist of many simple but highly interconnected processing elements called neurons, which map an input array to one or more outputs. Each neuron has a bias and connection weights, which are determined in a process called training. After a network is trained, it can be used to make predictions on new input data.
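The mapping from inputs to outputs can be sketched in a few lines of plain Python. This is an illustration only: the layer sizes, the `tanh` activation, and the random initial weights are assumptions, and the network is untrained.

```python
import math
import random

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus the neuron's bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def forward(x, layers):
    """Map an input array to the network outputs.

    layers is a list of layers; each layer is a list of (weights, bias)
    pairs, one per neuron. tanh is applied on every layer except the last.
    """
    a = list(x)
    for i, layer in enumerate(layers):
        z = [neuron(a, weights, bias) for weights, bias in layer]
        a = z if i == len(layers) - 1 else [math.tanh(v) for v in z]
    return a

def init_layers(sizes, seed=0):
    """Random weights and zero biases for the given (assumed) layer sizes."""
    rng = random.Random(seed)
    return [[([rng.gauss(0.0, 1.0) for _ in range(n_in)], 0.0)
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# Hypothetical network: 2 inputs, two hidden layers of 4 neurons, 1 output
layers = init_layers([2, 4, 4, 1])
y = forward([0.5, 0.1], layers)
print(len(y))
```

Training would adjust every weight and bias in `layers`; the forward pass itself stays the same and is what gets evaluated when the trained network makes a prediction.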

Deep neural networks are now changing fields such as speech recognition, computer vision, and computational medicine. Their application was recently extended to the author's field of alloy solidification modeling. I used a theoretical solidification model to train neural networks for a solidification benchmark problem in a procedure called theory-training. Theory-Trained Neural Networks do not need any prior knowledge of the solution of the governing equations or any external data for training. They self-train by relying on the ability of a neural network to learn the solution of partial differential equations (PDEs). In the deep learning literature, that ability is sometimes referred to by the term “solving PDEs”. However, TTNs can predict the solution of a PDE without actually solving it, and the term “solving PDEs” obscures that powerful capability. Since TTNs can predict the solution of an equation without actually solving it, it is plausible to argue that TTNs have learned the solution of the equations they were trained on, and the term “learning the solution” rightly emphasizes that.
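One common way to express this kind of self-training, given here only as a sketch, is to minimize the residual of the governing equation at sampled points. The notation is assumed for illustration and is not taken from the author's formulation: $u_\theta$ is the network output with weights and biases $\theta$, $\mathcal{N}[\cdot] = 0$ is the governing equation in residual form, $g$ is the prescribed initial or boundary data, and $N_r$, $N_b$ are the numbers of sampled interior and boundary points.

```latex
\mathcal{L}(\theta) =
  \frac{1}{N_r} \sum_{i=1}^{N_r}
    \left| \mathcal{N}[u_\theta](x_i, t_i) \right|^2
  + \frac{1}{N_b} \sum_{j=1}^{N_b}
    \left| u_\theta(x_j, t_j) - g(x_j, t_j) \right|^2
```

Both terms can be evaluated from the equations and their initial and boundary conditions alone, which is consistent with the statement that no external data or prior solution is needed for training.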

Before examining the advantages that TTNs can potentially bring to the casting simulation community, consider a finite difference method and imagine we want to perform a *d*-dimensional simulation in a domain with length *L* in each spatial direction and from time zero to $t_1$. To estimate the CPU time of that simulation, let me perform a first-order analysis. Because all the nodes in the mesh need to be visited at every time step, the computational time $t_\mathrm{cpu}$ is proportional to the total number of time steps $N_t$ and the number of grid points in the simulation $N_x$:

$$t_\mathrm{cpu} \sim N_t N_x \sim \frac{t_1}{\Delta t} \left( \frac{L}{\Delta x} \right)^d \sim t_1 L^d (\Delta x)^{-(2+d)}$$

The last relation follows from the stability limit of an explicit time marching scheme in a diffusion-controlled problem, which requires $\Delta t \sim (\Delta x)^2$. The fact that *d* appears in the exponent causes two problems, which are discussed next.

The first issue is that the presence of *d* in the exponent makes fully resolved simulations computationally very expensive. Imagine *d* = 3. Refining the mesh by just a factor of two will increase the computational time by a factor of thirty-two. In other words, a simulation that was taking one day will now take one month. The second issue is that the presence of *d* in the exponent severely limits large-scale simulations. Doubling the length of the domain in each direction will increase CPU time by a factor of eight. Because of these two issues, a large-scale, fully resolved simulation is generally impractical and never fast. Again, these issues arise because the number of dimensions appears in the exponent, and they are linked to a problem referred to as the curse of dimensionality in fields such as finance.
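The two factors quoted above follow directly from the first-order estimate that CPU time scales with the domain length to the power *d* and with the grid spacing to the power -(2 + *d*). A minimal sketch of that arithmetic, with a hypothetical helper name and parameters:

```python
def cpu_time_factor(refine=1.0, enlarge=1.0, d=3):
    """Growth factor of CPU time under the first-order scaling estimate.

    refine:  factor by which the grid spacing is reduced (dx -> dx / refine)
    enlarge: factor by which the domain length grows (L -> L * enlarge)
    d:       number of spatial dimensions
    """
    return enlarge**d * refine**(2 + d)

# Refining the mesh by a factor of two in three dimensions
print(cpu_time_factor(refine=2))

# Doubling the domain length in each of three directions
print(cpu_time_factor(enlarge=2))
```

Applying both changes at once multiplies the two factors, which is why a simulation that is both large-scale and fully resolved becomes so costly on a mesh.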

As TTNs do not discretize the equations, they can be expected not to have those issues. In other words, one can expect to train networks that can simulate a phenomenon with full resolution and at large scale. This appears to be the first main benefit of TTNs compared to a mesh-based method. That expectation is further supported by the fact that, in fields such as mathematical finance, the curse of dimensionality has been successfully overcome by using deep neural networks.

The second benefit of TTNs compared to a mesh-based method is that, as mentioned earlier, TTNs can predict the solution of an equation without actually solving it. This decreases the computational cost associated with predictions to nearly zero. In other words, regardless of the computational cost associated with training TTNs, which as described above can be expected to be entirely tractable even for a fully resolved and large-scale simulation, their predictions are almost instantaneous. This opens the opportunity for fast, fully resolved, large-scale simulations.

### Future Directions

Properly exploiting the above two benefits can potentially reinvent casting simulations by allowing us to perform fast, fully resolved, and large-scale casting simulations. In practice, this could mean having a network that can instantaneously predict, for example, hard-to-resolve defects, such as channel segregates or porosity, regardless of the size of the simulation domain.

Although utilizing the above benefits of TTNs in the field seems very rewarding, actually achieving those goals is a difficult task, mainly because of difficulties in training TTNs. A few of the open research questions are:

- comparing the performance of different optimizers
- understanding the role of network depth and width and the size of the training dataset in the performance of a TTN
- theory-training using solidification models that include melt convection