The Ethics of AI Use in Healthcare Diagnostics
AI is used in many parts of health provision, but perhaps the most ethically fraught way it is being integrated into the field is its growing presence in diagnostics. AI algorithms and software are increasingly being developed to support clinical decision-making. In general, this is a process of predictive-analysis algorithms filtering and searching for patterns in multi-source datasets to generate probability analyses that healthcare providers can use to inform, rather than replace, their own decisions. The technology as it exists now would not allow these AI algorithms to have the final say in care provision or diagnostics – they are currently a screening tool and diagnostic aid. A pertinent example is research being done with patient electronic health records. Previously kept largely on paper, these records now exist electronically, and AI algorithms are being used to link them with other data sources like insurance claims, genome databanks, and research databases to generate clinically relevant information in real time.1 Clinical Decision Support Systems (CDSS) use AI algorithms and machine-learning techniques to assist clinicians with their ultimate decision-making, drawing on information such as past successful diagnoses and treatments. AI continuously monitors our medical histories, as individuals and as a vast collective, to support diagnostic decisions made by GPs. The human-to-human relationship of diagnostics is, with the increasing reliance on AI-supported technologies like CDSS, developing toward a tool-to-human relationship. However, AI as a tool is not neutral like a scalpel or stethoscope: it is a non-human entity that is changing a human-to-human relationship by being wielded by one party. Two key ethical issues surround AI-assisted diagnostic systems: first, accountability for the decisions (and mistakes) these systems make; and second, the potential for large-scale harm across population groups from the biases that can be built into AI algorithms and datasets and exacerbated by their diagnoses. Technology used in this way can be very helpful, but it can also alienate the patient from the practitioner and amplify existing biases in healthcare delivery, which can entrench problems like racism in healthcare and lead to worse health outcomes among already marginalised populations.
Even with strong data protections in place, AI-predicted risk profiles based on genetics, socio-demographics, and lifestyle – used by public health systems to allocate resources and by health insurers to sort people into risk groups and calculate premiums – have at times catastrophically misfired for certain groups, leaving them with precarious access to care and under financial stress.2 Correlations between diagnoses and genetic or social factors such as gender, ethnicity, and disability can be detected by AI-assisted CDSS if programmed to look for them. AI-assisted outcome predictions based on these social determinants of health could lead to further medical discrimination against groups that are already marginalised. People from impoverished backgrounds tend to have fewer economic and social resources with which to access care, and poorer health outcomes. If this correlation becomes, for example, a reason to deny them health insurance, it reinforces a harmful structural feedback loop rather than helping these patients.3 AI is created, programmed, evaluated, tweaked, and applied by people: the datasets it works with can be biased, and this bias can be controlled to a point, but ultimately AI is not a neutral or objective source of diagnostic knowledge. It is instead an amplifier of diagnostics’ strengths and weaknesses. Many of the people most likely to receive poor healthcare, or to struggle to access it at all, simply do not appear in these datasets, leaving their populations underrepresented. Structural factors shape who is more often misdiagnosed or inappropriately treated. AI technology is powerful: expecting this power to translate into objectivity is dangerous. Treating these new diagnostic technologies as flawed and fallible, with built-in biases to look carefully for, is a crucial starting point for accepting them into our already fractured and imperfect health systems.
In the current paradigm of healthcare provision, a person is medicalised (officially registered as possessing or suffering from a disease, issue, problem, or abnormality requiring medical intervention) via the observation and diagnosis of a clinician. This is a human-to-human interaction reliant on both the medical canon and the doctor’s access to and application of this knowledge set. The process is imperfect and at times alienating, but it has been (in varying forms) the backbone of all medical interaction since we first began to treat our wounds. Introducing that non-neutral instrument, an AI-assisted diagnostic system, alters this interaction in several ways. In general, the observation and diagnosis of a human body to produce categories of wellness and illness is human-generated. What does it mean for our relationship with the “normal” body when what is normal and what is not is sorted and applied by a non-human system? The “normal” or “good” human body is not a universal norm: the aesthetics and productivity of “good” bodies are temporally, geographically, evolutionarily, and culturally influenced. How do we reconcile the fundamental individuality of illness experience (my understanding of a sharp pain may differ from yours) and the fundamental individuality of the goals of health provision (a well person) with a highly macroscopic, categorising AI lens of “good” and “ill” bodies, fed with datasets that do not take invisible and social factors into account? AI seeks the norm to determine the abnormal, but what if the norm it seeks does not exist? Quantifying ultimate norms of the human body is a project incompatible both with heterogeneous societies that bring together multiple cultures, backgrounds, and contexts, and with the body itself. What, for example, is a normal human height? What about a normal Dutch height, female height, height below the equator, height 500 years ago? These all have different answers, and we can at least partially intuit why that is, but “why” is not part of a diagnostic programme’s process, regardless of its pertinence to the individual being diagnosed. As much as we know of the body in medicine, our knowledge of every disease that can arise, and of what each looks like in every individual, is not comprehensive. Non-human diagnostic systems must, in order to be trusted in a practical clinical context, operate as if this knowledge were near-complete. If the lacunae remaining in our biomedical knowledge systems are ignored in order to welcome such systems into healthcare delivery, those gaps will be exacerbated and can cause serious harm to the individuals incompletely represented by the algorithm.
Tess Buckley’s discussion of AI and justice touched astutely on another fundamental ethical question raised by the use of imperfect AI in an imperfect system: that of blame. If the clinician, who did not design the system and sees only a fraction of the data the AI tool uses to assist their diagnosis, offers a diagnosis that feeds the biases present in the AI’s algorithms and perpetuates structural injustice rather than centring the individual, are they to blame when their own biases have been amplified by their diagnostic tool? Could having this tool, and access to probabilistic diagnoses at such scale, weaken the human diagnostic gaze? What accountability do the AI and its programmers have to patients? The Colloquium’s discussions, particularly those around the contributions of Max Van Kleek, returned often to the Universal Declaration of Human Rights. There is serious uncertainty about whether AI-contributed diagnostics fit within the Hippocratic Oath and, if so, how their operation and development should change. There are social determinants of health – where you grew up, your race – but the flip side of this coin is that there are also social determinants of healthcare. Doctors and clinicians still predominantly come from certain backgrounds and view their patients through lenses that cannot entirely be set aside. The structure of health provision itself is often exclusionary, more accessible to some than to others. AI-assisted healthcare delivery adds a third party, but not a third person, to this already imperfect, biased, highly individualised dynamic. While we like to think of medicine as a linear progression toward holding ultimate truth over our internal mechanisms, medicine in practice is political, culturally influenced, and fragile. AI-assisted diagnostics will not alleviate or erase these influences, nor this fragility. Regulation of diagnostic AI must take into account the situation, history, and positionality of the AI itself and of its wielders, to avoid its potential harms while capitalising on its potential for improving care efficiency.4 A universal code of conduct for the development and use of these technologies is a natural next step: one that prioritises transparency about how the AI makes its choices and what it uses to do so, so that its operational applications are explainable and over-reliance and the insidious reinforcement of biases are avoided. It seems clear that mass information gathering on this level can be enormously helpful if approached and used for what it is, and not held up as a step toward the shimmering utopia of true objectivity in health provision. Rather than removing itself from context, empathic healthcare provision must immerse itself in context in order to provide truly informed care to individuals.
References
1. T. Lysaght et al., “AI-Assisted Decision-making in Healthcare”, Asian Bioethics Review 11 (2019)
2. T. Panch et al., “The ‘inconvenient truth’ about AI in healthcare”, NPJ Digital Medicine 2:77 (2019)
3. S. Bhattacharya et al., “Inequality and the future of healthcare”, J Family Med Prim Care 8:3 (2019)
4. J. Morley et al., “The ethics of AI in healthcare: a mapping review”, Social Science and Medicine 260 (2020)