The What and How of Data Analysis

Author(s): Chatterjee, S.
Abstract:

In this paper, we discuss the what and how of data science and data analysis, i.e., the approach and the mechanisms that analysts employ when working with data. We take a philosophical approach to data analysis and data science, one that peeks into the conceptual world of the epistemology of data science. The paper also highlights the role played by analysts, tools, and the specialised techniques that analysts employ in data science to derive insights from data. The discussion demonstrates the complexities associated with data science and the mechanisms by which organisations and businesses draw out the insights that constitute the real value of data: insights that lie hidden deep within datasets, which are themselves a resource and an asset for organisations.


© 2024 The Author(s). Published by RITHA Publishing. This article is distributed under the terms of the CC-BY 4.0 license, which permits any further distribution in any medium, provided the original work is properly cited.


How to cite:

Chatterjee, S. (2024). The What and How of Data Analysis. Journal of Research, Innovation and Technologies, Volume III, 1(5), 51-65. https://doi.org/10.57017/jorit.v3.1(5).04 

