Multimodal prediction for tabular data with text fields based on transformers
Abstract
This paper examines the use of machine learning and deep learning methods for prediction on structured tabular data that includes text fields. The aim is to improve on the methods that have performed best on tabular data (ensembles of decision and regression trees) by combining them with the methods that have performed best on sequences and text (attention-based transformer deep learning models). Several classical machine learning and text processing methods are also used as baselines for comparison.
Keywords: DistilBERT, Deep Learning, Linear Regression, Attention Mechanism, Natural Language Processing with Transformer Neural Network, PCA, Random Forest Regression, Transformer Neural Networks, Ensemble Learning, XGBoost
Published on website: 3.7.2023
Attached files: amicic.pdf