Gradient Boosting Neural Networks: GrowNet
AI and machine learning permeate every aspect of modern life, from email spam filtering and e-commerce to financial security and medical diagnostics.
Deep learning has been one of the key innovations that have pushed the boundary of science beyond what was considered feasible.
Despite their seemingly limitless potential in theory and in demonstrated practice, it remains challenging to develop tailor-made deep neural networks for new application areas because of their inherent complexity.
Designing an architecture for any given application demands considerable skill, and often a measure of luck.
There is no established paradigm for creating application-specific DNNs, so practitioners resort to heuristics or outright hacks.
What is GrowNet?
GrowNet builds on gradient boosting, a technique with a formidable reputation in machine learning for incrementally constructing sophisticated models out of simpler components, and applies it to complex learning tasks.
GBDT frameworks such as XGBoost, LightGBM, and CatBoost combine weak decision-tree learners under gradient boosting, and the resulting models serve as reliable workhorses for everyday tasks across academia and industry.
Although decision trees are powerful in their own right, there are many domains in which deep neural networks perform much better, especially when dealing with unstructured data such as images and text.
GrowNet combines the power of gradient boosting with the flexibility and versatility of neural networks and introduces a new modeling paradigm that can build up a DNN layer by layer.
Rather than decision trees, GrowNet uses shallow neural networks as weak learners in a gradient boosting framework, which can be applied to a range of classification, regression, and ranking tasks.
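The stagewise idea can be illustrated with a minimal sketch: each round trains a small one-hidden-layer network on the current residuals and adds its scaled output to the ensemble. This is an illustrative simplification under assumed settings (NumPy instead of a deep learning framework, plain gradient descent, a fixed boost rate, and residual fitting for squared loss), not the GrowNet implementation, which also feeds each new learner the previous learner's hidden features.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_shallow_net(X, r, hidden=8, lr=0.2, epochs=300):
    """Train a one-hidden-layer net on residuals r with plain gradient descent."""
    n, d = X.shape
    W1 = rng.normal(0, 0.5, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden activations
        pred = (H @ W2 + b2).ravel()
        g = 2 * (pred - r) / n              # dLoss/dpred for squared error
        dW2 = H.T @ g[:, None]; db2 = g.sum(keepdims=True)
        dH = g[:, None] @ W2.T * (1 - H ** 2)
        dW1 = X.T @ dH; db1 = dH.sum(axis=0)
        W2 -= lr * dW2; b2 -= lr * db2; W1 -= lr * dW1; b1 -= lr * db1
    return lambda Xq: (np.tanh(Xq @ W1 + b1) @ W2 + b2).ravel()

# Toy regression problem
X = rng.uniform(-1, 1, (200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]

ensemble, boost_rate = [], 0.5
F = np.zeros(len(y))                        # running ensemble prediction
for _ in range(10):                         # add weak learners one at a time
    weak = fit_shallow_net(X, y - F)        # each learner fits the current residual
    ensemble.append(weak)
    F += boost_rate * weak(X)

mse = np.mean((y - F) ** 2)
```

Each weak learner is deliberately under-powered; it is the additive combination that produces an accurate model, which is the core of the boosting paradigm.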
GrowNet is an off-the-shelf optimization algorithm that is faster and easier to train than traditional deep neural networks.
As part of its training innovations, GrowNet uses second-order statistics and a global corrective step to improve stability and fine-tune the model for the task at hand.
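The two ingredients can be sketched concretely for binary classification. Second-order statistics means each weak learner fits Newton targets -g/h, where g and h are the first and second derivatives of the loss with respect to the current ensemble score. The snippet below is a simplified stand-in: the corrective step here only tunes a single scale on the ensemble output, whereas GrowNet's corrective step updates the parameters of all weak learners jointly.

```python
import numpy as np

def second_order_targets(y, F):
    """Per-example Newton targets -g/h for binary cross-entropy loss."""
    p = 1.0 / (1.0 + np.exp(-F))        # current probability estimates
    g = p - y                            # gradient of log-loss w.r.t. score F
    h = p * (1.0 - p)                    # second derivative, always positive
    return -g / np.maximum(h, 1e-6), h

def global_corrective_step(y, F, steps=20, lr=0.5):
    """Fine-tune a scale on the ensemble output by descending the full
    training loss (a stand-in for GrowNet's joint corrective step)."""
    a = 1.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-a * F))
        a -= lr * np.mean((p - y) * F)   # d(log-loss)/da
    return a

y = np.array([0., 1., 1., 0., 1.])
F = np.array([-0.2, 0.4, 1.5, -1.0, 0.1])   # current ensemble scores
targets, h = second_order_targets(y, F)
a = global_corrective_step(y, F)
```

Note that the Newton targets always point in the direction that corrects the current prediction: positive for under-scored positives, negative for over-scored negatives. This is the same second-order trick XGBoost applies to trees.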
GrowNet versus DNN
What would happen if we combined all these shallow networks into one deep neural network?
There are a couple of issues with this approach:
1) Selecting the model architecture (the number of units per hidden layer, batch normalization, dropout level, etc.) is very time-consuming.
2) DNNs require vast computational power and generally run slower.
GrowNet leverages shallow neural networks as "weak learners in a gradient boosting framework."
Using this flexible network structure, we can perform multiple machine learning tasks simultaneously while incorporating second-order statistics, corrective steps, and dynamic boost rates to avoid the pitfalls of gradient boosting decision trees.
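One of those safeguards, the dynamic boost rate, can be illustrated with a toy sketch: instead of a fixed shrinkage constant, the rate applied to each new weak learner is chosen to minimize the training loss. The line search below is a simplified assumption for illustration; GrowNet actually learns the rate by gradient descent jointly with the network parameters.

```python
import numpy as np

def dynamic_boost_rate(y, F, f_new, grid=np.linspace(0.0, 1.0, 101)):
    """Pick the boost rate for a new weak learner's output f_new by a
    line search on squared error over the candidate grid."""
    losses = [np.mean((y - (F + a * f_new)) ** 2) for a in grid]
    return grid[int(np.argmin(losses))]

y = np.array([1.0, 2.0, 3.0])
F = np.array([0.5, 1.0, 1.5])      # current ensemble prediction
f_new = np.array([1.0, 2.0, 3.0])  # new weak learner's output
a = dynamic_boost_rate(y, F, f_new)
```

Here the residual y - F is exactly half of f_new, so the search settles on a rate of 0.5; a fixed shrinkage constant would either under- or over-shoot in the same situation.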
The authors conduct an ablation study to explore the limits of neural networks as weak learners in the boosting paradigm and to analyze how each GrowNet component affects model performance and convergence.
GrowNet outperforms state-of-the-art boosting methods on regression, classification, and learning-to-rank tasks. On these tasks it also beats DNNs while requiring less training time and less tuning effort.