
Moral Uncertainty and the Parliament of Values Approach

  • Writer: Yatin Taneja
  • Mar 9
  • 13 min read

Moral uncertainty arises when agents lack definitive knowledge of which moral theory or value system is correct, creating a fundamental epistemic gap that complicates decision-making. This state of doubt differs from empirical uncertainty, where an agent lacks information about the state of the world, because it concerns the normative principles that ought to guide action. The challenge lies in making coherent choices when competing ethical frameworks prescribe different actions and there is no higher-order meta-theory to adjudicate between them definitively. Decision-making under fundamental ethical disagreement requires a mechanism that allows an agent to act rationally without committing to a single, potentially incorrect moral view. The Parliament of Values approach proposes aggregating influence across competing moral frameworks in proportion to their epistemic credibility, effectively treating the selection of a moral theory as a probabilistic exercise rather than a binary choice. Rather than selecting a single dominant theory, the method favors a synthesis of inputs that reflects the agent's degree of belief in each framework. By treating moral uncertainty as a form of decision-making under risk, the approach allows agents to maximize expected moral value across a distribution of possible true moralities.



Different ethical theories receive assignments of probabilities or weights based on available evidence, which serves as the foundation for the aggregation process. These weights are not arbitrary, nor are they derived from democratic procedures involving equal voting rights among theories; rather, influence allocation among moral views reflects degrees of belief regarding the truth or validity of each framework. The approach avoids moral dogmatism by refusing to privilege one framework absolutely, acknowledging that any single theory might be flawed or incomplete. It enables coherent action through weighted aggregation, where the final decision is a compromise that is sensitive to the strengths and weaknesses of each constituent theory. The underlying assumption is that moral theories can be meaningfully compared and ranked along dimensions such as internal consistency, explanatory power, and alignment with considered judgments. These criteria provide a basis for assigning credences, ensuring that theories with fewer contradictions and greater intuitive resonance receive higher influence in the decision calculus.


Explanatory power and alignment with considered judgments serve as ranking criteria that allow for the differentiation between competing normative systems. Core mechanisms involve assigning numerical weights to moral theories based on this epistemic assessment, transforming abstract philosophical preferences into computational inputs. These weights are then used to compute weighted averages of recommended actions, producing a composite utility function or policy recommendation that guides the agent's behavior. Weighting schemes incorporate factors like theoretical simplicity and empirical adequacy, favoring theories that offer parsimonious explanations for moral phenomena and fit with observed human ethical intuitions. Resistance to bias and performance under idealized reflection also factor into the weights, ensuring that theories which rely on cognitive biases or fail under rational scrutiny are discounted. Aggregation occurs at the level of outcomes or policies, allowing for context-sensitive trade-offs that prioritize different values in different scenarios based on the weighted input of each theory.
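One way to picture the step from epistemic assessment to numerical weights is the minimal Python sketch below. It assumes each theory has already been scored between 0 and 1 on the four criteria named above. The function name `credences_from_scores`, the theory labels, the equal criterion importance, and every number are hypothetical and chosen only for illustration; nothing here is an established scoring method.

```python
from typing import Dict

# Criteria taken from the text; how to score them is an open question.
CRITERIA = ("simplicity", "empirical_adequacy", "bias_resistance", "reflective_stability")


def credences_from_scores(
    scores: Dict[str, Dict[str, float]],
    importance: Dict[str, float],
) -> Dict[str, float]:
    """Combine per-criterion scores into credences that sum to 1."""
    raw = {
        theory: sum(importance[c] * criterion_scores[c] for c in CRITERIA)
        for theory, criterion_scores in scores.items()
    }
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("at least one theory needs a positive combined score")
    return {theory: value / total for theory, value in raw.items()}


# Illustrative (made-up) scores, not empirical findings.
scores = {
    "utilitarianism": {"simplicity": 0.9, "empirical_adequacy": 0.6,
                       "bias_resistance": 0.7, "reflective_stability": 0.6},
    "deontology": {"simplicity": 0.6, "empirical_adequacy": 0.7,
                   "bias_resistance": 0.6, "reflective_stability": 0.7},
    "virtue_ethics": {"simplicity": 0.4, "empirical_adequacy": 0.7,
                      "bias_resistance": 0.5, "reflective_stability": 0.6},
}
importance = {criterion: 1.0 for criterion in CRITERIA}  # assumption: equal importance

credences = credences_from_scores(scores, importance)
```

A linear combination followed by normalization is only one possible design. It keeps the weights auditable, at the cost of assuming the criteria can be traded off against each other at fixed rates.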


The system must specify how to handle incommensurable values across theories, as different ethical frameworks may utilize fundamentally different currencies of value, such as utility versus rights or virtues. Normalization or bounded utility functions handle these incommensurable values by mapping the outputs of disparate theories onto a common scale suitable for comparison and aggregation. This process requires a shared evaluative space where the recommendations of utilitarianism, deontology, and virtue ethics can be translated into mutually intelligible units. Implementation requires formal models of moral uncertainty that can accommodate these translations and calculations. Bayesian epistemology or expected utility theory adapted to normative domains provide these models, offering a rigorous mathematical framework for managing doubt. The approach assumes moral theories can be represented in a shared evaluative space and remains compatible with both consequentialist and non-consequentialist frameworks provided they can be operationalized into evaluative functions.
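The mapping onto a common scale can be sketched as follows, assuming simple min-max (range) normalization over the options under consideration. This is one candidate among several; other schemes, such as variance-based normalization, are possible, and the choice between them is itself contested. The function name `range_normalize`, the option labels, and the raw scores are hypothetical.

```python
from typing import Dict


def range_normalize(raw: Dict[str, float]) -> Dict[str, float]:
    """Rescale one theory's raw scores so its worst option is 0 and its best is 1."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        # The theory is indifferent between all options, so it exerts no pull.
        return {option: 0.0 for option in raw}
    return {option: (value - lo) / (hi - lo) for option, value in raw.items()}


# Made-up raw scores in each theory's own "currency".
raw_scores = {
    # welfare units
    "utilitarianism": {"treat_a": 120.0, "treat_b": 80.0, "wait": -40.0},
    # -1 marks a violated duty, 0 marks no violation
    "deontology": {"treat_a": -1.0, "treat_b": 0.0, "wait": 0.0},
}

normalized = {theory: range_normalize(s) for theory, s in raw_scores.items()}
```

After normalization, both theories speak on the same 0 to 1 scale, so a large raw number from one theory no longer swamps the others simply because of its units.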


Operationalization into evaluative functions is necessary for compatibility, requiring that each moral theory be expressed as a function that takes an action or state of the world as input and returns a value score. It requires minimal shared understanding of what constitutes a moral reason, as the aggregation function operates on the outputs of the theories rather than their internal justifications. Moral uncertainty describes a lack of certainty about which moral theory is correct, while credence refers to the degree of belief in a specific moral theory. Influence weight denotes the proportional impact of a theory on decisions, derived directly from its credence. Expected moral value is the weighted sum of values assigned by different theories, serving as the maximization target for the agent. A moral theory functions as a coherent set of principles prescribing actions based on normative reasoning, and epistemic credibility refers to the strength of evidence supporting a moral framework.
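The definitions above can be made concrete with a short Python sketch. It assumes each theory has been operationalized as a function from an action label to a score on a shared 0 to 1 scale; the two toy theories, the action labels, and the credences are hypothetical.

```python
from typing import Callable, Dict, Iterable

# A theory maps an action label to a score on a shared [0, 1] scale (assumption).
Theory = Callable[[str], float]


def expected_moral_value(
    action: str, credences: Dict[str, float], theories: Dict[str, Theory]
) -> float:
    """Credence-weighted sum of the scores each theory assigns to the action."""
    return sum(credences[name] * theory(action) for name, theory in theories.items())


def choose_action(
    actions: Iterable[str], credences: Dict[str, float], theories: Dict[str, Theory]
) -> str:
    """Pick the action with the highest expected moral value."""
    return max(actions, key=lambda a: expected_moral_value(a, credences, theories))


def utilitarian(action: str) -> float:
    # Toy scores: the lie produces slightly better outcomes.
    return {"lie": 0.9, "tell_truth": 0.4}[action]


def deontological(action: str) -> float:
    # Toy scores: lying violates a duty.
    return {"lie": 0.0, "tell_truth": 1.0}[action]


theories = {"utilitarianism": utilitarian, "deontology": deontological}
credences = {"utilitarianism": 0.6, "deontology": 0.4}

best = choose_action(["lie", "tell_truth"], credences, theories)
```

In this toy case the expected moral value of lying is 0.54 and of telling the truth is 0.64, so the less credible theory still tips the decision because it objects strongly while the more credible theory has only a mild preference.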


The aggregation function denotes the process combining inputs from multiple theories, utilizing the assigned weights to produce a final decision. Reflective equilibrium involves adjusting beliefs to achieve coherence among intuitions and principles, a process that informs the assignment of credences to different theories. Incommensurability describes situations where values cannot be directly compared, necessitating the use of normalization techniques to facilitate aggregation. Early work on moral uncertainty dates to the 1980s, when philosophers began to systematically address the implications of normative doubt for rational decision-making. Philosophers like Ted Lockhart and Andrew Sepielli explored decision rules under normative doubt, laying the groundwork for formal approaches to this problem. William MacAskill later brought the topic to wider prominence in contemporary analytic philosophy, framing it as a distinct problem from empirical uncertainty that required its own set of decision-theoretic tools.


MacAskill framed moral uncertainty as a distinct problem from empirical uncertainty, arguing that standard expected utility theory must be modified to account for differences in normative frameworks. Subsequent developments by Hilary Greaves and Krister Bykvist expanded formal models, providing rigorous mathematical treatments of how to maximize expected moral value when the morality itself is uncertain. These models included the use of weighted utility maximization across moral theories, establishing a precedent for the numerical treatment of ethical frameworks. The Parliament of Values metaphor gained traction in the 2010s as a way to visualize this process. It visualized proportional influence without implying equal voting rights, drawing an analogy to a legislative body where different theories hold seats based on their credibility rather than population counts. Critiques from moral realists questioned whether moral theories can be treated as probabilistic hypotheses, arguing that moral truths are necessary rather than contingent.


These critiques led to refinements in weighting methodologies to address concerns about the objectivity of moral truths and the applicability of probability to normative domains. The rise of AI alignment research in the 2020s brought renewed attention to the approach, as engineers and philosophers sought methods to instill ethical reasoning in autonomous systems. The approach matters now due to increasing deployment of autonomous systems in high-stakes environments. Ethically sensitive domains such as healthcare and criminal justice require these systems to make decisions that have meaningful impacts on human well-being and liberty. Performance demands require systems to make reliable decisions under incomplete moral knowledge, as they cannot wait for philosophical consensus before acting. Errors in these domains carry high societal costs, including loss of life, erosion of rights, and systemic injustice.


Economic shifts toward AI-driven governance necessitate transparent ethical reasoning frameworks that stakeholders can audit and trust. Societal needs include public trust in automated decision-making, which depends on the ability of these systems to explain their choices in ethically intelligible terms. Global coordination on AI ethics is hindered by a lack of shared moral foundations across different cultures and political systems. Proportional influence models offer a pragmatic compromise by allowing systems to respect multiple ethical traditions without requiring a single global consensus on morality. No widely deployed commercial systems currently implement the Parliament of Values as a core decision mechanism, though elements of the approach appear in various experimental prototypes. Experimental deployments exist in academic AI alignment projects where researchers test the viability of aggregating diverse value sets.


Value learning architectures often weight multiple ethical datasets in these projects, using machine learning to infer weights that align with human preferences or expert judgments. Performance benchmarks include accuracy in simulating human moral judgments, testing whether the system's outputs correlate with the decisions humans would make in similar dilemmas. Robustness to shifts in credence assignments serves as another benchmark, ensuring that small changes in the assigned probabilities do not lead to erratic or catastrophic changes in behavior. Evaluation metrics focus on consistency and interpretability, prioritizing systems whose reasoning can be understood and traced back to specific ethical inputs. Avoidance of catastrophic moral failures takes precedence over optimization of a single value, reflecting a risk-averse approach to ethical decision-making under uncertainty. Dominant architectures in AI ethics rely on rule-based systems that encode specific constraints or duties directly into the code.
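The benchmark on credence shifts can be sketched as a simple perturbation test: jitter the credences, renormalize, and count how often the chosen action stays the same. The function names, the uniform noise model, and all numbers below are assumptions made for illustration, not a standard benchmark.

```python
import random
from typing import Dict

Scores = Dict[str, Dict[str, float]]  # theory -> action -> normalized score


def best_action(credences: Dict[str, float], scores: Scores) -> str:
    """Action with the highest credence-weighted score."""
    actions = list(next(iter(scores.values())))
    return max(actions, key=lambda a: sum(credences[t] * scores[t][a] for t in credences))


def perturb(credences: Dict[str, float], scale: float, rng: random.Random) -> Dict[str, float]:
    """Add uniform noise in [-scale, scale] to each credence, then renormalize."""
    noisy = {t: max(1e-9, c + rng.uniform(-scale, scale)) for t, c in credences.items()}
    total = sum(noisy.values())
    return {t: v / total for t, v in noisy.items()}


def decision_stability(
    credences: Dict[str, float], scores: Scores,
    scale: float = 0.05, trials: int = 1000, seed: int = 0,
) -> float:
    """Fraction of perturbed credence assignments that leave the decision unchanged."""
    rng = random.Random(seed)
    baseline = best_action(credences, scores)
    unchanged = sum(
        best_action(perturb(credences, scale, rng), scores) == baseline
        for _ in range(trials)
    )
    return unchanged / trials


scores = {
    "utilitarianism": {"lie": 0.9, "tell_truth": 0.4},
    "deontology": {"lie": 0.0, "tell_truth": 1.0},
}
credences = {"utilitarianism": 0.6, "deontology": 0.4}

small_shift = decision_stability(credences, scores, scale=0.05)
large_shift = decision_stability(credences, scores, scale=0.30)
```

With these toy numbers the decision only flips once credence in the first theory exceeds twice that of the second, which noise of 0.05 cannot produce, so `small_shift` is 1.0. Larger noise can cross that boundary, so `large_shift` may fall below 1.0.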


Supervised learning from human judgments is another common approach, where systems learn to mimic ethical decisions based on labeled datasets of human behavior. Reinforcement learning with reward shaping is also prevalent, using feedback signals to guide agents toward desired outcomes. Challengers include uncertainty-aware models that explicitly represent doubt about their objectives. These models incorporate Bayesian credences over reward functions, allowing them to hedge against incorrect specifications of human intent. The Parliament of Values aligns more closely with uncertainty-aware models than with traditional rule-based or supervised approaches. It emphasizes explicit representation of moral doubt, forcing the system to acknowledge its own limitations regarding ethical knowledge. Current systems often embed implicit moral assumptions without transparency, making it difficult to diagnose failures or bias.


The Parliament model demands explicit weighting and auditability, requiring that every influence factor be recorded and accessible for review. Alternative approaches include moral deference, which involves adopting the view of a trusted authority such as a specific ethicist or cultural norm. It was rejected due to risks of bias and lack of accountability, as deferring to a potentially flawed authority can propagate systemic errors. Moral pluralism accepts multiple truths without aggregation, acknowledging that different values may be valid but refusing to weigh them against one another. It fails to provide actionable guidance when theories conflict, leaving the agent without a clear path forward in situations requiring a single choice. A certainty-threshold rule involves acting only when moral certainty exceeds a set level, otherwise refraining from action.


It may result in inaction in critical scenarios where any action carries some risk but inaction guarantees harm. No theory achieves high confidence in these scenarios, as complex ethical dilemmas often involve deep trade-offs that defy certainty. This increases exposure to preventable harm, as the agent defaults to passivity rather than risking a morally suboptimal action. Expected value maximization under moral uncertainty was preferred for its decision-theoretic coherence, providing a clear mathematical imperative even in the face of doubt. Major players in AI ethics research include academic institutions such as Oxford and MIT, which host leading research groups focused on machine ethics and alignment. Industry labs such as OpenAI and DeepMind contribute significantly by dedicating substantial resources to understanding how advanced AI systems can be aligned with human values.



The Future of Humanity Institute also played a role in developing theoretical frameworks for managing existential risks associated with artificial intelligence before it closed in 2024. Competitive positioning varies across these organizations, with different groups prioritizing different aspects of the alignment problem. Some prioritize alignment with Western liberal values, focusing on individual rights and democratic principles. Others advocate for pluralistic or culturally adaptive frameworks that can accommodate a wider range of global perspectives. The Parliament of Values offers a middle path between these extremes by providing a structured way to incorporate diverse viewpoints without collapsing into relativism. It avoids cultural imperialism by proportionally representing multiple viewpoints, ensuring that no single cultural framework dominates the decision process unless it possesses overwhelming epistemic justification. Geopolitical dimensions include tensions between universalist and relativist approaches to ethics, which influence how AI systems are designed and deployed in different regions.


Adoption may proceed more slowly in regions with strong state control over AI development, as authoritarian regimes may prefer systems that enforce a specific state ideology over pluralistic models. Industry standards bodies are beginning to consider moral uncertainty in AI governance guidelines, recognizing the need for standardized approaches to ethical risk management. This creates openings for the approach to be integrated into formal standards and best practices for the development of safe AI systems. Academic-industrial collaboration is growing as companies recognize the complexity of the ethical challenges they face. Joint projects focus on value learning and moral uncertainty modeling, drawing on the expertise of both sectors. Universities contribute theoretical rigor and access to philosophical literature, grounding technical implementations in established ethical thought.


Industry provides computational infrastructure and real-world deployment contexts, testing theories against practical constraints. Challenges include misalignment of incentives between academic research goals and commercial product timelines. Industry often favors short-term deployability over long-term ethical reliability, creating pressure to simplify or ignore complex models of moral uncertainty. Physical constraints do not directly limit the implementation of the Parliament of Values, as it functions primarily as a conceptual and computational framework rather than a physical mechanism. Economic constraints include the cost of developing formal models of diverse moral theories, which requires significant intellectual labor and computational resources. Integrating culturally or philosophically distant systems increases these costs by necessitating extensive translation and normalization efforts. Scalability depends on the ability to represent and compute over a growing number of moral frameworks without becoming computationally intractable.


Computational tractability becomes a concern when aggregating highly granular theories or a vast number of distinct ethical viewpoints. Approximation algorithms or hierarchical modeling address these concerns by simplifying the aggregation process while preserving the essential structure of the weighted voting system. Data requirements include empirical inputs on moral intuitions gathered from diverse populations to inform the credence assignments. Cross-cultural value surveys and philosophical argument strength assessments are necessary to build a robust evidence base for weighting different theories. Supply chain dependencies include access to diverse moral datasets that accurately represent global perspectives rather than a narrow subset of humanity. Philosophical expertise for theory formalization is essential to ensure that complex ethical systems are represented accurately within the computational framework. Computational resources for multi-theory simulation are required to evaluate the expected outcomes of different policies across all weighted theories.


Reliance on cross-cultural moral surveys introduces sensitivities regarding data collection methods and the potential for sampling bias influencing the credence weights. Required changes in adjacent systems include updates to software architectures to support multi-theory evaluation modules. Systems must support multi-theory evaluation natively rather than treating it as an external overlay. Logging of credence weights and audit trails for moral reasoning are necessary to satisfy regulatory requirements and enable transparency. Compliance frameworks must evolve to require transparency in how AI systems handle moral uncertainty, moving beyond simple outcome reporting to process verification. Disclosure of weighted theories and influence mechanisms will be required for high-stakes applications, allowing regulators to inspect the ethical "source code" of the system's decisions. Infrastructure needs include standardized APIs for moral theory representation to facilitate interoperability between different components of an AI system.
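A logged decision might look something like the sketch below, which records the credences, the per-theory scores, the aggregated values, and the chosen action as one JSON line. The record layout and the names `MoralAuditRecord` and `decide_with_audit` are hypothetical; no regulatory schema or standardized API is implied.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class MoralAuditRecord:
    """One auditable decision: inputs, intermediate values, and outcome."""
    timestamp: str
    credences: Dict[str, float]
    scores: Dict[str, Dict[str, float]]  # theory -> action -> normalized score
    expected_values: Dict[str, float]    # action -> credence-weighted score
    chosen_action: str


def decide_with_audit(
    credences: Dict[str, float], scores: Dict[str, Dict[str, float]]
) -> MoralAuditRecord:
    """Aggregate the theories' scores and return the decision with its full trace."""
    actions = sorted(next(iter(scores.values())))
    expected = {a: sum(credences[t] * scores[t][a] for t in credences) for a in actions}
    chosen = max(expected, key=expected.get)
    return MoralAuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        credences=dict(credences),
        scores={t: dict(s) for t, s in scores.items()},
        expected_values=expected,
        chosen_action=chosen,
    )


record = decide_with_audit(
    credences={"utilitarianism": 0.6, "deontology": 0.4},
    scores={
        "utilitarianism": {"lie": 0.9, "tell_truth": 0.4},
        "deontology": {"lie": 0.0, "tell_truth": 1.0},
    },
)
log_line = json.dumps(asdict(record), sort_keys=True)  # append to an audit log
```

Storing the inputs alongside the outcome lets a reviewer recompute the aggregation independently, which is the kind of process verification, rather than outcome reporting, that the paragraph above describes.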


Interoperable datasets for cross-cultural value modeling are essential to ensure that different systems can share and learn from ethical data. Second-order consequences include economic displacement in sectors where ethical oversight is currently performed by humans. Demand for human ethicists in routine decisions may decrease as automated systems take over lower-level triage and classification tasks. New business models may develop around moral auditing, where third-party firms verify the ethical reasoning processes of autonomous systems. Credence calibration services and ethical impact forecasting for AI systems represent potential markets that will arise as the demand for reliable AI governance grows. Shifts in labor markets could favor roles combining philosophy and data science, creating a new hybrid profession focused on computational ethics. Policy design roles will also become more important as organizations seek professionals who can manage the intersection of technical constraints and normative requirements.


Measurement shifts require new key performance indicators to assess the performance of ethically aware AI systems. Moral robustness measures performance under varying credence assignments, testing how stable the system's behavior remains when its ethical beliefs change. Value diversity coverage and disagreement resolution efficacy serve as other metrics, ensuring the system handles a wide spectrum of values effectively. Traditional accuracy metrics are insufficient because they fail to capture the nuance of ethical trade-offs where multiple conflicting answers might be considered valid. Correct moral action often depends on uncertain normative premises, making simple right-or-wrong metrics inadequate for evaluation. Evaluation must include stress testing under moral paradigm shifts to ensure stability when new theories are added or existing ones are modified. Adversarial credence manipulation testing is also required to ensure that bad actors cannot easily manipulate the system's weights to produce unethical outcomes.


Future innovations may include dynamic credence updating based on real-world outcomes, allowing the system to learn from the consequences of its actions and adjust its ethical beliefs accordingly. Integration with constitutional AI techniques is likely, where a set of high-level principles governs the behavior of lower-level models. Decentralized moral parliaments for community-specific weighting could develop, allowing different groups to define their own credence distributions while maintaining interoperability with global standards. Advances in formal epistemology could improve methods for assigning weights by providing better models of how evidence supports normative hypotheses. Hybrid models may combine the Parliament approach with preference learning from human feedback, grounding abstract theories in observed behavior while retaining flexibility and creating a feedback loop between philosophical ideals and practical reality.


Convergence points exist with decision theory under uncertainty, as both fields deal with making optimal choices given incomplete information. Multi-agent systems and social choice theory share similarities in aggregating heterogeneous preferences, offering mathematical tools that can be adapted for moral aggregation. Synergies with explainable AI can enhance transparency in how moral weights influence decisions, making the reasoning process visible to users. Integration with causal reasoning models may improve handling of long-term moral consequences by allowing systems to simulate the downstream effects of their actions more accurately. Physical scaling limits are not expected to constrain superintelligent systems using this framework, as the computational overhead is manageable relative to their capabilities. The computational nature of the framework ensures hardware independence once sufficient processing power is available.


Workarounds for computational complexity will include hierarchical abstraction, where groups of similar theories are aggregated into clusters before being combined at a higher level. Theory clustering and Monte Carlo sampling over moral hypotheses will assist processing by reducing the search space and estimating expected values efficiently. Memory and processing demands will grow with the number of theories included in the parliament, necessitating efficient data structures and algorithms. Feasibility with future hardware will remain high for large-scale implementations because the core operations are linear algebra and probability calculations, which scale well on parallel architectures. Superintelligent systems will use the Parliament of Values to safeguard against value lock-in during recursive self-improvement cycles. A single moral framework will not become entrenched or unchangeable because the system explicitly maintains a distribution over possible theories rather than converging on one point estimate.
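The Monte Carlo idea mentioned above can be sketched in a few lines: instead of evaluating every theory, draw theories in proportion to their credences and average the scores of the sampled ones. This is a minimal illustration, assuming scores are already normalized. The function names and numbers are hypothetical, and sampling only pays off when the parliament is large or each theory is expensive to evaluate.

```python
import random
from typing import Dict

Scores = Dict[str, Dict[str, float]]  # theory -> action -> normalized score


def exact_emv(action: str, credences: Dict[str, float], scores: Scores) -> float:
    """Exact credence-weighted sum, evaluating every theory once."""
    return sum(credences[t] * scores[t][action] for t in credences)


def monte_carlo_emv(
    action: str, credences: Dict[str, float], scores: Scores,
    samples: int = 20_000, seed: int = 0,
) -> float:
    """Estimate the expected moral value by sampling theories by credence."""
    rng = random.Random(seed)
    names = list(credences)
    weights = [credences[name] for name in names]
    draws = rng.choices(names, weights=weights, k=samples)
    return sum(scores[name][action] for name in draws) / samples


scores = {
    "utilitarianism": {"lie": 0.9, "tell_truth": 0.4},
    "deontology": {"lie": 0.0, "tell_truth": 1.0},
}
credences = {"utilitarianism": 0.6, "deontology": 0.4}

estimate = monte_carlo_emv("tell_truth", credences, scores)
exact = exact_emv("tell_truth", credences, scores)
```

The estimate converges on the exact value as the sample count grows, with an error that shrinks roughly with the square root of the number of samples.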


Superintelligence will remain responsive to evolving human values by updating credences based on new philosophical arguments or empirical data about human preferences. New philosophical insights will be integrated into the system as they appear, preventing stagnation of the agent's ethical reasoning capabilities. Calibration will involve setting initial credences based on current evidence from moral philosophy and cross-cultural data. Mechanisms for revision will activate as understanding improves, allowing the system to self-correct its ethical stance over time. Superintelligence will utilize the Parliament model to simulate the long-term moral arc by projecting how current decisions impact future societal states across different weighted ethical views. It will test how different weightings affect societal outcomes over centuries, providing a robust assessment of long-term risks and benefits.
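Credence revision of the kind described here could, under a Bayesian reading, look like the following sketch. It assumes that a new argument or observation can be assigned a likelihood under each theory, which is itself a contested modeling choice. The function name `update_credences` and all numbers are hypothetical.

```python
from typing import Dict


def update_credences(
    prior: Dict[str, float], likelihood: Dict[str, float]
) -> Dict[str, float]:
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {theory: prior[theory] * likelihood[theory] for theory in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the evidence is impossible under every theory")
    return {theory: value / total for theory, value in unnormalized.items()}


# Initial calibration (made-up numbers).
prior = {"utilitarianism": 0.5, "deontology": 0.3, "virtue_ethics": 0.2}

# Assumed probability of the new evidence under each theory,
# for example a widely shared judgment that one theory predicts poorly.
likelihood = {"utilitarianism": 0.2, "deontology": 0.6, "virtue_ethics": 0.6}

posterior = update_credences(prior, likelihood)
```

With these numbers the posterior is 0.25, 0.45, and 0.30, so the theory that fit the evidence poorly loses influence without being eliminated, which keeps the distribution open to later revision.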



Cooperative alignment across diverse human populations will be enabled because the system inherently respects a plurality of values rather than imposing a monolithic standard. Differing ethical traditions will receive proportional voice according to their epistemic credibility within the aggregate assessment. The system will be embedded in recursive self-improvement architectures to ensure that improvements in intelligence also correspond to improvements in ethical reasoning. Moral uncertainty will be preserved even as intelligence increases, preventing premature convergence on a potentially suboptimal ethical framework. Superintelligence will work through deep moral ambiguity using this structured approach rather than defaulting to heuristics or arbitrary rules. It will avoid paralysis in critical scenarios by providing a clear expected value calculation even when certainty is low. Preventable harm will be minimized through expected value maximization across all plausible moral theories, ensuring that actions are robustly good regardless of which theory is ultimately correct.


The approach will provide a procedural mechanism for acting responsibly under uncertainty that is transparent and justifiable to external observers. It will reflect a shift from seeking moral truth to managing moral risk, acknowledging that absolute truth may be inaccessible but rational action remains possible. Harm reduction will take priority over ideological purity because aggregation naturally favors actions that benefit multiple theories simultaneously. The strength of the system will lie in acknowledging the limits of moral knowledge while still enabling decisive action. Decisive and justifiable action will be enabled by converting complex normative disagreements into a single quantitative metric for optimization.


© 2027 Yatin Taneja

South Delhi, Delhi, India
