Don't use deep learning for tabular data

Most business data is tabular (think Excel spreadsheets or SQL tables), and deep learning is generally not the best tool for modelling it.

I think people already know this, but it’s good to hear it again. A new paper shows that XGBoost tends to beat deep neural networks (DNNs) on tabular data. Deep learning is great, but it’s not the right solution for most of the data problems businesses face today. If you have images, language or sound, the networks designed to deal with these are often very effective. But when you have rows and columns made up of a mixture of continuous and categorical data, XGBoost is probably a better bet.

“Our study shows that XGBoost outperforms these deep models across the datasets, including datasets used in the papers that proposed the deep models. We also demonstrate that XGBoost requires much less tuning.”

🛎️ Why this matters: Tabular data still rules in most businesses. Use the right tool for the job.
