Date of Award


Document Type


Degree Name

Master of Science in Computer Science


School of Computer Science and Engineering

First Reader/Committee Chair

Dr. Jennifer Jin


This report presents a novel technique for remote sensing image scene classification based on the Compact Vision Transformer (CVT) architecture. The model combines deep learning with self-attention to improve the accuracy and efficiency of scene classification in remote sensing imagery. After extensive training and evaluation on the RSSCNN7 dataset, our CVT-based model achieved an accuracy of 87.46% on the original dataset. This result highlights the potential of CVT models in remote sensing and their applicability to real-world scenarios. The report gives a detailed account of the model's architecture, training methodology, and evaluation process, and discusses the key insights gained for remote sensing image analysis. This work holds promise for applications such as agriculture, environmental monitoring, and disaster management, where precise scene classification is essential.