paper 2025 · AAAI/ACM Conference on AI, Ethics, and Society (AIES)

From Big Data to Valued Data: A Dataset Value Taxonomy for AI-Native Empirical Research

Scott Seidenberger, Anindya Maiti

Abstract

We propose a dataset value taxonomy for AI-native empirical research, moving beyond the "bigger is better" paradigm. The taxonomy evaluates datasets on dimensions of quality, representativeness, ethical sourcing, and fitness for purpose.
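
As a rough illustration only, the four dimensions named above could be recorded per dataset in a small structure like the following. This is a minimal sketch, not the authors' framework: the class name `DatasetValueProfile`, the 0-to-1 scoring scale, and the `weakest_dimension` helper are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DatasetValueProfile:
    """Hypothetical record of one dataset's standing on the four dimensions.

    The 0.0-1.0 scale is an assumption made for this sketch; the abstract
    does not say how (or whether) the dimensions are scored numerically.
    """

    quality: float
    representativeness: float
    ethical_sourcing: float
    fitness_for_purpose: float

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 0.0-1.0 range.
        for f in fields(self):
            score = getattr(self, f.name)
            if not 0.0 <= score <= 1.0:
                raise ValueError(f"{f.name} must be in [0, 1], got {score}")

    def weakest_dimension(self) -> str:
        """Return the name of the lowest-scoring dimension."""
        return min(fields(self), key=lambda f: getattr(self, f.name)).name


# Illustrative values only; they do not describe any real dataset.
profile = DatasetValueProfile(
    quality=0.9,
    representativeness=0.4,
    ethical_sourcing=0.8,
    fitness_for_purpose=0.7,
)
print(profile.weakest_dimension())
```

Keeping the dimensions as separate fields, rather than collapsing them into a single score, is a deliberate choice in this sketch: it lets a large dataset that is weak on one dimension be reported as such instead of being averaged away.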

Contribution

The paper provides a structured framework that researchers can use to evaluate and communicate dataset value, addressing a gap in the AI ethics and data governance literature.