Journal of International Technology and Information Management

Over the last few years, big data has emerged as an important topic of discussion in most firms owing to its ability to create, store and process content at a reasonable price. Big data involves advanced tools and techniques for processing large volumes of data in organisations. Investment in big data analytics has become almost a necessity for large firms, particularly multinational companies, because of its distinctive benefits in predicting and identifying trends. Some of the most popular big data analytics tools in use today are MapReduce, Hive and Tableau, while the Hadoop framework enables easy processing of such extremely large data sets. The current research attempts a comparative assessment of five such applications, namely IBM SPSS, IBM Watson Analytics, R, Minitab and SAS. The test case used was the set of factors affecting housing affordability in the US. Based on statistics obtained from the American Housing Survey (AHS) database, the researcher identified different factors affecting affordability in the states. Variables were reduced through Principal Component Analysis (PCA), and a model based on partial least squares regression/polynomial regression was fitted to assess their impact on affordability. The primary findings suggest that the age of the head of the household and the income earned were the two most important factors affecting pricing in the region. Finally, a comparison is drawn at the end of the study, with an interpretation of the most and least effective applications.
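The two-stage approach described above (variable reduction through PCA, then a regression on the reduced variables) can be illustrated with a minimal sketch. This is not the study's implementation: the data are synthetic, the column meanings and the helper names `pca_reduce` and `fit_quadratic` are hypothetical, and a plain least-squares quadratic fit stands in for the partial least squares/polynomial model.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the standardized columns of X onto the top principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    scores = Xs @ Vt[:n_components].T
    return scores, explained[:n_components]

def fit_quadratic(scores, y):
    """Least-squares fit of y on an intercept, the scores, and their squares."""
    design = np.column_stack([np.ones(len(y)), scores, scores ** 2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    return coef, r2

# Synthetic stand-in data (assumption): NOT drawn from the AHS database.
rng = np.random.default_rng(0)
n = 200
age = rng.normal(50, 12, n)      # hypothetical "age of household head"
income = rng.normal(60, 15, n)   # hypothetical "household income"
X = np.column_stack([
    age,
    income,
    age + rng.normal(0, 1, n),     # noisy copy, correlated with age
    income + rng.normal(0, 1, n),  # noisy copy, correlated with income
    rng.normal(0, 1, n),           # unrelated noise column
])
y = 0.5 * income - 0.2 * age + rng.normal(0, 2, n)  # made-up response

scores, explained = pca_reduce(X, n_components=2)
coef, r2 = fit_quadratic(scores, y)
print("share of variance per component:", explained)
print("R^2 of quadratic fit on components:", r2)
```

Because the correlated columns collapse onto a small number of components, the regression is fitted on two derived variables rather than five raw ones, which is the purpose of the reduction step.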