Petter Ericson
2 records found
Helpful, harmless, honest?
Sociotechnical limits of AI alignment and safety through Reinforcement Learning from Human Feedback
This paper critically evaluates the attempts to align Artificial Intelligence (AI) systems, especially Large Language Models (LLMs), with human values and intentions through Reinforcement Learning from Feedback methods, involving either human feedback (RLHF) or AI feedback (RLAIF
...
This article explores the rapidly developing field of Critical AI Studies and its relation to issues of class and capitalism through a hybrid approach based on distant reading of a newly collected corpus of 300 full-text scientific articles, the creation of which is itself a firs
...