Center for Artificial Intelligence and Data Science

Research Pillar: Human-Centered AI

The research pillar "Human-Centered AI" focuses on the intersection of artificial intelligence and human interaction. Researchers in this area explore how people interact with AI, specifically speech-based interactive technologies like virtual assistants, and analyze user behavior and perceptions of these systems. They also study the integration of AI into information systems and collaboration platforms to enhance human abilities and create collective intelligence in mixed teams of humans and AI-driven agents. Additionally, the ethical and social implications of human-AI cooperation, as well as the potential impact on employment and the economy, are considered. Another area of focus is the democratization of language technology, specifically making state-of-the-art language technology accessible to a wider range of people, including speakers of low-resource languages. This is achieved through sustainable, modular, and sample-efficient NLP models, fair and ethical NLP, and truly multilingual NLP. The research also extends to the application of cutting-edge NLP methods to problems from other disciplines, particularly in the area of computational social science. In the area of recommender systems, the focus is on deep learning models for sequential user behavior and the integration of formalized knowledge from ontologies and unstructured information.

Research Areas

Artificial intelligence is often perceived as a competitor to human intelligence. However, a more likely scenario for the future digital society is one in which humans and AI collaborate. This collaboration is an active research area in computer science and social science, with work on interactive learning, trust in AI, and explainability of machine decisions. One key area of study is how people interact with AI, specifically speech-based interactive technologies like virtual assistants. Researchers analyze user behavior and perceptions of these systems, observe individuals as they use AI in their daily lives, and use data science methods to examine online resources such as reviews and opinion pieces. The research questions being addressed include "Can we develop an automated approach to understand user sentiment and perceptions towards AI-based systems, specifically in the context of smart speakers?", "What are the most common misconceptions about AI and how do they impact user interactions with the technology?" and "Can we develop an automatic method for identifying and addressing users' misconceptions about AI-based systems?". Additionally, the integration of AI into information systems and collaboration platforms can enhance human abilities and create collective intelligence in mixed teams of humans and AI-driven agents. It is important to consider the ethical and social implications of human-AI cooperation, as well as the potential impact on employment and the economy.

Projects:

Principal Investigators:

Publications:

  • Point me to your Opinion, SenPoi.
    In: Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1313-1323. Association for Computational Linguistics, Seattle, United States, 2022.
    Jan Pfister, Sebastian Wankerl and Andreas Hotho.
  • Willing to Revise? Confidence and Recommendation Adoption in AI-Assisted Image Recognition.
    In: HHAI2022: Augmenting Human Intellect. IOS Press, 2022.
    Leonore Röseler, Ingo Scholtes, Bernhard Sendhoff and Anikó Hannák.

The increasing volume of available data on social systems opens new opportunities for large-scale, quantitative studies of social phenomena. Such studies can help us to better understand how humans communicate and collaborate, what makes teams productive, which mechanisms are at work in successful social organizations, and how technology shapes human behavior. This research not only offers new ways to address long-standing issues in the social sciences, it is also crucial for modeling, designing, and managing socio-technical systems.

Addressing these questions, several groups at CAIDAS use data science and machine learning techniques to study social organizations. In a large-scale analysis of data on more than 30,000 developers in 58 Open Source Software projects, we validated and quantified the Ringelmann effect known from social psychology and organizational theory. We also showed how coordination structures in software development teams influence the productivity of team members. Studying large bibliographic data sets, we further showed how social mechanisms influence editorial processes and citation practices. These works provide actionable insights for project management and policy-making.

Principal Investigators:

Publications:

  • Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set.
    arXiv:2201.04588, 2022. Conference: ICSE 2022 - The 44th International Conference on Software Engineering. 25 pages, 4 figures, 3 tables.
    Christoph Gote, Pavlin Mavrodiev, Frank Schweitzer and Ingo Scholtes.
  • Modeling social resilience: Questions, answers, open problems.
    arXiv:2301.00183, 2022.
    Frank Schweitzer, Georges Andres, Giona Casiraghi, Christoph Gote, Ramona Roller, Ingo Scholtes, Giacomo Vaccario and Christian Zingg.
  • Quantifying the effect of editor–author relations on manuscript handling times.
    Scientometrics, 113(1):609-631, 2017.
    Emre Sarigöl, David Garcia, Ingo Scholtes and Frank Schweitzer.
  • From Aristotle to Ringelmann: a large-scale analysis of team productivity and coordination in Open Source Software projects.
    Empirical Software Engineering, 21(2):642-683, 2015.
    Ingo Scholtes, Pavlin Mavrodiev and Frank Schweitzer.
  • Predicting scientific success based on coauthorship networks.
    EPJ Data Sci., 3(1):9, 2014.
    Emre Sarigöl, René Pfitzner, Ingo Scholtes, Antonios Garas and Frank Schweitzer.

Software systems are at the heart of the digital society: they control critical infrastructures like communication or energy systems, fuel the increasing automation in industrial manufacturing, and are key drivers of the digital economy. Despite this importance, the development of complex software systems is still a fundamental challenge. Credible reports indicate that the majority of software projects run over time or budget, or fail altogether, resulting in billions of dollars wasted every year. While technical aspects such as programming techniques, testing methods, and developer support tools have improved significantly over the past years, our understanding of how human and social factors contribute to the success or failure of software projects is still in its infancy.

Addressing these challenges, we use data science to quantitatively study collaborative software engineering processes. As an example, we use network analysis and statistical modeling to study the evolution of software architectures based on large-scale data from software repositories. This allows us not only to trace the maintainability of software systems but also to assist developers in refactoring code. We further extract large data sets from online support tools and analyze them to better understand how social factors influence software development processes. This approach has helped us to uncover social mechanisms at work in software development, to quantify risks in Open Source communities, and to improve information systems used by software development teams.

Principal Investigator:

Publications:

  • git2net: Mining Time-Stamped Co-Editing Networks from Large git Repositories.
    Gesellschaft für Informatik e.V., 2019.
    Christoph Gote, Ingo Scholtes and Frank Schweitzer.
  • Automated software remodularization based on move refactoring.
    In: Proceedings of the 13th international conference on Modularity. ACM, 2014.
    Marcelo Serrano Zanetti, Claudio Juan Tessone, Ingo Scholtes and Frank Schweitzer.
  • Categorizing bugs with social networks: A case study on four open source software communities.
    In: 2013 35th International Conference on Software Engineering (ICSE), pages 1032-1041. 2013.
    Marcelo Serrano Zanetti, Ingo Scholtes, Claudio Juan Tessone and Frank Schweitzer.
  • The rise and fall of a central contributor: Dynamics of social organization and performance in the GENTOO community.
    In: 2013 6th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), pages 49-56. 2013.
    Marcelo Serrano Zanetti, Ingo Scholtes, Claudio Juan Tessone and Frank Schweitzer.

This research area focuses on making state-of-the-art language technology accessible to a wider range of people, in particular speakers of low-resource languages and those with limited access to large-scale computational resources. CAIDAS primarily uses deep learning and representation learning methods for semantic modeling of natural language, with a focus on multilingual representation learning and cross-language transfer of models for specific NLP tasks. In addition to this core line of work on multilinguality, CAIDAS researchers also address challenges of sustainability (reducing the carbon footprint of NLP research through more efficient training of language models) and fairness (removing negative societal stereotypes and biases from language models). Finally, we enable researchers in the social sciences and humanities, as well as practitioners from various application areas, to benefit from cutting-edge NLP models through interdisciplinary projects. In sum, CAIDAS develops sustainable, modular and computationally efficient NLP models, fair and ethical NLP, and truly multilingual NLP with a focus on low-resource languages.

Principal Investigator:

Publications:

  • Don't Stop Fine-Tuning: On Training Regimes for Few-Shot Cross-Lingual Transfer with Multilingual Language Models.
    In: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 10725-10742. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022.
    Fabian David Schmidt, Ivan Vulić and Goran Glavaš.
  • RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models.
    In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, 2021.
    Soumya Barikeri, Anne Lauscher, Ivan Vulić and Goran Glavaš.
  • Sustainable Modular Debiasing of Language Models.
    In: Findings of the Association for Computational Linguistics: EMNLP 2021. Association for Computational Linguistics, 2021.
    Anne Lauscher, Tobias Lueken and Goran Glavaš.
  • From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers.
    In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4483-4499. Association for Computational Linguistics, 2020.
    Anne Lauscher, Vinit Ravishankar, Ivan Vulić and Goran Glavaš.
  • Political Text Scaling Meets Computational Semantics.
    ACM/IMS Transactions on Data Science, 2(4):1-27, 2021.
    Federico Nanni, Goran Glavaš, Ines Rehbein, Simone Paolo Ponzetto and Heiner Stuckenschmidt.

Recommender systems support users in their decisions by suggesting personalized and suitable items in various domains. Recent research focuses on deep learning models for sequential user behavior, e.g., of customers in online shops or online games. Content-based recommenders focus on integrating formalized knowledge from ontologies with unstructured information such as natural language, with the aim of improving the treatment of patients in hospitals.

Projects:

Principal Investigator:

Publications:

  • Personalization through User Attributes for Transformer-based Sequential Recommendation.
    2022.
    Elisabeth Fischer, Alexander Dallmann and Andreas Hotho.
  • Towards Responsible Medical Diagnostics Recommendation Systems.
    2022. 4th FAccTRec Workshop on Responsible Recommendation.
    Daniel Schlör and Andreas Hotho.
  • A Case Study on Sampling Strategies for Evaluating Neural Sequential Item Recommendation Models.
    In: Fifteenth ACM Conference on Recommender Systems. ACM, 2021.
    Alexander Dallmann, Daniel Zoller and Andreas Hotho.

Principal Investigators

Marc Latoschik

Human-Computer Interaction

Goran Glavaš

Natural Language Processing

Andreas Hotho

Data Science

Ingo Scholtes

Machine Learning for Complex Networks

Carolin Wienrich

Psychology of Intelligent Interactive Systems