Main Data Inhibitors and Enablers for AI Applications
Wings, Sujit; Härkönen, Janne (2023-10-15)
Wings, Sujit
Härkönen, Janne
R. Piskac c/o Redaktion Sun SITE, Informatik V, RWTH Aachen
15.10.2023
Wings, S. & Härkönen, J. (2023). Main Data Inhibitors and Enablers for AI Applications. In J. Kasurinen & T. Päivärinta (eds.), Proceedings of the Annual Symposium of Computer Science 2023 co-located with The International Conference on Evaluation and Assessment in Software Engineering (EASE 2023) (pp. 59-71). R. Piskac c/o Redaktion Sun SITE, Informatik V, RWTH Aachen. https://ceur-ws.org/Vol-3506/paper05.pdf
https://creativecommons.org/licenses/by/4.0/
©️ 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:oulu-202401031043
Abstract
Companies are increasingly leveraging artificial intelligence (AI) in an attempt to gain competitive advantage. This paper focuses on AI applications for analytics that enable automated decision-making. These applications are especially attractive to companies because they can enable process automation and the wider adoption of robotic process automation (RPA). These appear to be the key drivers for reducing operational process expenses. The specific focus of this paper is on the data-related inhibitors and enablers for AI applications, as AI relies heavily on data. The methodology comprises a literature review and an in-depth case study, which includes a questionnaire covering roles in the data domain and the product domain, semi-structured interviews, and an analysis of internal use-case descriptions. The findings indicate that data fragmentation is among the main inhibitors. Data fragmentation appears to be the root cause of low quality in two intrinsic data quality dimensions, namely completeness and consistency. In addition, data fragmentation drives up the cost of AI modeling, as data scientists need to re-create data assets on a per-use-case basis. The findings also indicate that productized data assets could be the main enabler for leveraging AI applications, as they not only ensure quality across the intrinsic data quality dimensions (correctness, completeness, timeliness, and consistency) but also contribute to the re-use of data assets. Re-use is a driver for both lower AI modeling costs and faster AI model iterations, which in turn is a driver for AI model quality.
Collections
- Open access [37744]