Comparative Analysis of Reddit Posts and ChatGPT-Generated Texts’ Linguistic Features: A Short Report on Artificial Intelligence’s Imitative Capabilities

Authors

  • Erika Kristine E. Arcenal Department of Humanities and Social Sciences, De La Salle University Integrated School Senior High School
  • Licca Pauleen V. Capistrano Department of Humanities and Social Sciences, De La Salle University Integrated School Senior High School
  • Marielle Jessie D. De Guzman Department of Humanities and Social Sciences, De La Salle University Integrated School Senior High School
  • Micaela Isabel M. Forrosuelo Department of Humanities and Social Sciences, De La Salle University Integrated School Senior High School
  • Janeson M. Miranda Department of Humanities and Social Sciences, De La Salle University Integrated School Senior High School, 0922, Philippines

DOI:

https://doi.org/10.11594/ijmaber.05.09.06

Keywords:

appreciatory posts, artificial intelligence, ChatGPT, imitative capabilities, linguistic analysis, Reddit, comparative analysis

Abstract

In recent years, the rapid growth of artificial intelligence (AI), particularly generative AI, has substantially altered many fields of human activity and raised questions about how well generative AI can imitate human language. Because generative AI is a new and controversial phenomenon, there is an urgent need to examine closely how its linguistic outputs mimic human language produced in natural contexts. In this short report, we therefore discuss the similarities and differences observed between the linguistic features of spouse appreciatory posts from the subreddit r/Marriage and those of ChatGPT-4 outputs. These results are an offshoot of our genre analysis of the two data sets. Our analysis revealed four main differences: (1) ChatGPT-4-generated texts contain impeccable grammar, whereas the Reddit appreciatory posts contain grammatical errors, such as faulty subject-verb agreement, improper punctuation, and erroneous capitalization; (2) ChatGPT-4-generated texts have more complex syntactic structures; (3) the Reddit data set uses more internet jargon, slang, and profanity and appears unpredictable and arbitrary in textual length; and (4) ChatGPT-4 outputs appear to overuse emojis and underuse emoticons, and tend to use these digital linguistic elements without regard to their proper contexts. In light of these results, we claim that although AI-generated texts can mimic human language, they do so only at a surface level, and closer inspection can uncover distinct variations. We recommend that future studies use more comprehensive and more varied data sets and continue to employ comparative and contrastive linguistic analysis to further investigate AI’s imitative capabilities.

Published

2024-09-23

How to Cite

Arcenal, E. K. E., Capistrano, L. P. V., De Guzman, M. J. D., Forrosuelo, M. I. M., & Miranda, J. M. (2024). Comparative Analysis of Reddit Posts and ChatGPT-Generated Texts’ Linguistic Features: A Short Report on Artificial Intelligence’s Imitative Capabilities. International Journal of Multidisciplinary: Applied Business and Education Research, 5(9), 3475-3481. https://doi.org/10.11594/ijmaber.05.09.06