Of these, the supervised model is a better choice for developing soft sensors or creating predictive tags.
Although there are hundreds of supervised machine learning models, only a handful of them—from a category known as regression algorithms—are useful for creating soft sensors. Following is a description of each:
Linear Regression: This is one of the simplest and most useful ways to create a soft sensor. The algorithm generates a function that predicts the value of a target variable as a linear combination of one or more input variables. When a single variable is used, it is called a univariate linear regression; with multiple variables, it is a multivariate linear regression. However, certain processes, such as measuring the viscosity of a polymer, are too complex to capture with a linear regression. The benefit of this model is its clarity: it is easy to determine which of the variables have the greatest effect on the target, which is known as feature importance.
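As a rough illustration, the sketch below fits a multivariate linear regression with scikit-learn. The process measurements and lab values are hypothetical placeholders, not data from any real plant; the fitted coefficients give the feature-importance view described above.

```python
# Minimal sketch of a multivariate linear regression soft sensor (scikit-learn).
# X holds hypothetical process measurements (e.g., temperature, pressure, flow);
# y holds the lab-measured quality value the soft sensor should predict.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[350.0, 2.1, 12.5],
              [355.0, 2.3, 12.1],
              [348.0, 2.0, 13.0],
              [360.0, 2.4, 11.8],
              [352.0, 2.2, 12.4]])
y = np.array([0.82, 0.88, 0.79, 0.91, 0.85])

model = LinearRegression().fit(X, y)

# The fitted coefficients show how strongly each input drives the target,
# which is the feature-importance view mentioned above.
print(model.coef_, model.intercept_)
print(model.predict([[353.0, 2.2, 12.3]]))
```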
Decision Tree: A decision tree builds rules from the independent variables, known as a group of features, and theoretically it can have as many rules and branches as it needs to fit the data. The result is a piecewise constant estimate of the target value. Because they can have many rules and branches, decision trees are very flexible, but they also carry a risk of overfitting the data. Overfitting happens when a model is trained for too long, allowing it to adapt to noise in the dataset, which it begins to treat as normal. Underfitting can also occur: the algorithm is not trained long enough and therefore has not learned enough from the data to determine how the independent variables relate to the target variable, or what influence they have on it. Both overfitting and underfitting lead to model failure: the model no longer generalizes to new data and cannot be used for a soft sensor. These concepts are not unique to the decision tree model.
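A minimal sketch of a decision-tree soft sensor is shown below, on synthetic stand-in data. Limiting the tree depth and comparing train and test scores are simple, common ways to watch for the over- and underfitting described above; the parameter values are illustrative, not recommendations.

```python
# Sketch of a decision-tree soft sensor (scikit-learn); the data below are
# synthetic stand-ins for real historian readings and lab values.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                                  # hypothetical process measurements
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)  # hypothetical lab-measured target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Capping the depth limits how many rules and branches the tree can grow,
# which is one simple guard against overfitting.
tree = DecisionTreeRegressor(max_depth=4).fit(X_train, y_train)

# A large gap between train and test scores suggests overfitting;
# low scores on both suggest underfitting.
print("train R^2:", tree.score(X_train, y_train))
print("test  R^2:", tree.score(X_test, y_test))
```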
Random Forest: This is essentially a combination of multiple decision tree models in a single model. It offers more flexibility, can handle more features and gives better predictive capability than a single tree. However, like any tree-based model, it still carries a risk of overfitting the data.
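The following sketch, again on synthetic stand-in data, fits a random forest and reads out the feature importances averaged over its trees; the number and depth of trees shown are arbitrary illustrative choices.

```python
# Sketch of a random-forest soft sensor (scikit-learn); data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                                       # hypothetical process measurements
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)       # hypothetical target variable

# An ensemble of many trees, each with limited depth to curb overfitting.
forest = RandomForestRegressor(n_estimators=100, max_depth=6, random_state=0)
forest.fit(X, y)

# Impurity-based feature importances, averaged across all trees in the forest.
print(forest.feature_importances_)
print(forest.predict(X[:3]))
```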
Gradient Boosting: In machine learning, gradient boosting is often referred to as an ensemble model. Like a random forest, it combines multiple decision trees, but it differs in that the trees are built sequentially, with each new tree fit to reduce the remaining loss of the trees before it. These models can be very effective, but they become harder to interpret as more trees are added.
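Below is a hedged sketch of gradient boosting on synthetic data. The number of trees, learning rate and depth are illustrative assumptions only; in practice they would be tuned for the process at hand.

```python
# Sketch of a gradient-boosting soft sensor (scikit-learn); data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))                                        # hypothetical process measurements
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)  # hypothetical target variable

# Each new tree is fit to reduce the remaining loss (squared error by default);
# the learning rate controls how much each tree contributes.
gbr = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05,
                                max_depth=3, random_state=0)
gbr.fit(X, y)

print(gbr.predict(X[:3]))
```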
Neural Network: The concept known as deep learning is built on neural network models. Applied to a regression problem, a neural network takes the input variables and generates a value for the target variable. The most basic neural network is the multilayer perceptron; in its simplest arrangement, a single layer of neurons connects the inputs to the output. More often, a neural network will have an input layer, one or more hidden layers (each with multiple neurons) and an output layer that produces the value.
Within each neuron in a hidden layer, the values of the weighted inputs are added up and passed through an activation function (such as the sigmoid function), which is what makes the model non-linear. The signal then passes through the remaining layers until it reaches the output layer, which for a regression problem contains a single neuron. The weights and biases that best fit the features and target values are determined while the model is being trained.
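A minimal sketch of such a network, using scikit-learn's MLPRegressor with an illustrative layer layout and a sigmoid (logistic) activation, is shown below; the data are synthetic placeholders.

```python
# Sketch of a neural-network (multilayer perceptron) soft sensor using scikit-learn;
# layer sizes, activation and data are illustrative assumptions only.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))                                          # hypothetical process measurements
y = np.tanh(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.05, size=500)  # hypothetical target variable

# Scaling the inputs keeps the weighted sums in a range where the
# activation function behaves well.
X_scaled = StandardScaler().fit_transform(X)

# Two hidden layers of neurons and a logistic (sigmoid) activation,
# with a single output neuron for the regression target.
mlp = MLPRegressor(hidden_layer_sizes=(16, 8), activation="logistic",
                   max_iter=2000, random_state=0)
mlp.fit(X_scaled, y)

print(mlp.predict(X_scaled[:3]))
```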
Collaborative design
A common misconception among those new to machine learning is that there is a single correct model for a given need. That is not the case. Choosing one model over another is a complex decision that rests partly on the experience of the data scientist.
Furthermore, none of these supervised regression models will produce the same results each time it is trained. There is therefore no such thing as a “best” model, although some suit certain situations better than others.
Collaboration between data scientists and operations experts on any machine learning exercise begins with a mutual understanding of the parameters involved, the targeted use and the method of development and deployment.
With a solid understanding of these algorithms, engineers can make important contributions to soft sensor design.
Eduardo Hernandez is customer success manager at TrendMiner.