Cross-Attention Multimodal Transformer for Calibrated Binary Time-Series Forecasting of Rural Public Services
DOI: https://doi.org/10.11594/ijmaber.06.11.34

Keywords: Multimodal transformer, Cross-attention mechanism, Binary time-series classification, Rural public services

Abstract
Good governance, evidence-based planning, and sustainable rural development all depend on accurate predictions of rural public service performance. This study presents a Cross-Attention Multimodal Transformer for binary time-series classification of service conditions in agriculture, health, and the environment at the local government unit (LGU) level. The model fuses multiple temporal signals through bidirectional cross-attention layers, allowing the healthcare and agriculture-environment streams to interact with one another. A loss function that incorporates uncertainty weighting and calibration awareness helps ensure that the predicted confidence scores are well calibrated. On a rural public service dataset, the model achieved AUCs of 83.00% (agriculture) and 79.40% (environment), indicating strong discriminative performance, while healthcare was lower at 63.90%. Brier scores of 61.50%, 23.10%, and 18.90%, respectively, indicate that the environment and healthcare forecasts are well calibrated. These findings suggest that cross-attention multimodal transformers can produce accurate, calibrated binary predictions of rural service outcomes, supporting data-driven decision-making at the LGU level.
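To make the architecture described in the abstract concrete, the sketch below shows, in PyTorch, how a bidirectional cross-attention block might let a healthcare stream and an agriculture-environment stream attend to each other before a binary classification head, and how a calibration-aware loss could combine binary cross-entropy with a Brier-score penalty. This is a minimal illustration under stated assumptions: all module names, dimensions, and the `brier_weight` setting are hypothetical and do not reproduce the authors' exact implementation.

```python
# Minimal sketch of bidirectional cross-attention fusion for two time-series
# streams with a calibration-aware loss. Illustrative only; not the paper's code.
import torch
import torch.nn as nn


class BidirectionalCrossAttention(nn.Module):
    """Each stream attends to the other; both are pooled and jointly classified."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # Healthcare stream attends to agriculture-environment stream, and vice versa.
        self.health_to_agrienv = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.agrienv_to_health = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_h = nn.LayerNorm(d_model)
        self.norm_a = nn.LayerNorm(d_model)
        self.classifier = nn.Sequential(
            nn.Linear(2 * d_model, d_model), nn.ReLU(), nn.Linear(d_model, 1)
        )

    def forward(self, health_seq: torch.Tensor, agrienv_seq: torch.Tensor) -> torch.Tensor:
        # health_seq: (batch, T_h, d_model); agrienv_seq: (batch, T_a, d_model)
        h_ctx, _ = self.health_to_agrienv(health_seq, agrienv_seq, agrienv_seq)
        a_ctx, _ = self.agrienv_to_health(agrienv_seq, health_seq, health_seq)
        h = self.norm_h(health_seq + h_ctx).mean(dim=1)   # residual + temporal pooling
        a = self.norm_a(agrienv_seq + a_ctx).mean(dim=1)
        return self.classifier(torch.cat([h, a], dim=-1)).squeeze(-1)  # logits


def calibration_aware_loss(logits: torch.Tensor, targets: torch.Tensor,
                           brier_weight: float = 0.5) -> torch.Tensor:
    """Binary cross-entropy plus a Brier-score penalty to encourage calibrated probabilities."""
    bce = nn.functional.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    brier = ((probs - targets) ** 2).mean()
    return bce + brier_weight * brier


# Hypothetical usage: 8 LGUs, 12 time steps per stream, already embedded to d_model.
model = BidirectionalCrossAttention()
health = torch.randn(8, 12, 64)
agrienv = torch.randn(8, 12, 64)
labels = torch.randint(0, 2, (8,)).float()
loss = calibration_aware_loss(model(health, agrienv), labels)
```

Penalizing the squared gap between predicted probabilities and labels directly targets the Brier score reported above, which is one simple way to encourage calibrated confidence scores alongside discriminative accuracy.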
Data Availability Statement
The dataset used in this study is available on Kaggle.com.
License
Copyright (c) 2025 Daryl John C. Ragadio

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See the Effect of Open Access).