Hello, I’m Md Fahim. I am currently working as a Research Assistant in the CCDS Lab at IUB, and I also work part-time as an AI/ML Researcher at Penta Global Limited. I earned my B.Sc. degree in Computer Science and Engineering from the University of Dhaka.
My research interests are primarily in Natural Language Processing (NLP) and Multimodality. In NLP, I focus on low-resource languages, hate speech detection, and improving the inference speed of language models. In multimodal research, I work on visual question answering, modality alignment, modality gaps, and multimodal adapters. I have also explored several other important and popular topics, including Variational Autoencoders (VAEs), Posterior Collapse, State Space Models (SSMs), and Diffusion models.
News and Updates
- December 2024: Reviewing @ December ACL-ARR 2024
- October 2024: Reviewing @ October ACL-ARR 2024 [NAACL-2025]
- September 2024: One paper accepted at EMNLP 2024 - Findings 😍🎉
- August 2024: Two papers accepted at ICPR 2024
- July 2024: One paper accepted at ECAI 2024 😍
- May 2024: Finalists at Robi Datathon 3.0, Bangladesh’s largest data analysis event with 3,500+ participants.
- May 2024: Two shared task papers accepted at EXIST 2024.
Selected Publications
BanglaTLit: A Benchmark Dataset for Back-Transliteration of Romanized Bangla
EMNLP-2024 [Findings]
Md Fahim*, Fariha Tanjim Shifat*, Fabiha Haider*, Deeparghya Dutta Barua, Md Sakib Ul Rahman Sourove, Md Farhan Ishmam, Farhad Alam Bhuiyan
- First large-scale dataset for automated Bangla transliteration, BanglaTLit, with over 42.7k samples.
- A romanized Bangla pre-training corpus, BanglaTLit-PT, with over 245.7k samples.
- Novel T5-based dual encoder architecture achieving SOTA on BanglaTLit.
Improving the Performance of Transformer-based Models Over Classical Baselines in Multiple Transliterated Languages
ECAI-2024 [Full Talk Presentation]
Fahim Ahmed*, Md Fahim*, Amin Ahsan Ali, Ashraful Amin, AKM Mahabubur Rahman
HateXplain Space Model: Fusing Robustness with Explainability in Hate Speech Analysis
Md Fahim, Md Shihab Shahriar, Mohammad Ruhul Amin
Aambela at BLP-2023 Task 2: Enhancing BanglaBERT Performance for Bangla Sentiment Analysis Task with In Task Pretraining and Adversarial Weight Perturbation
BLP Workshop @ EMNLP [Best Paper Award]
Md Fahim
TinyLLM Efficacy in Low-Resource Language: An Experiment on Bangla Text Classification Task
ICPR 2024 [Oral Presentation]
Farhan Noor Dehan*, Md Fahim*, Amin Ahsan Ali, Ashraful Amin, AKM Mahabubur Rahman
ChitroJera: A Regionally Relevant Visual Question Answering Dataset for Bangla
Deeparghya Dutta Barua*, Md Sakib Ul Rahman Sourove*, Md Farhan Ishmam*, Fabiha Haider, Fariha Tanjim Shifat, Md Fahim, Farhad Alam Bhuiyan