Original research report

Estimating and Examining the Replicability of Belief System Networks

Authors: Mark J. Brandt


Belief system structure can be investigated by estimating belief systems as networks of interacting political attitudes, but we do not know if these estimates are replicable. In a sample of 31 countries from the World Values Survey (N = 52,826), I find that countries’ belief system networks are relatively replicable in terms of connectivity, proportion of positive edges, some centrality measures (e.g., expected influence), and the estimates of individual edges. Betweenness, closeness, and strength centrality estimates are more unstable. Belief system networks estimated with smaller samples or in countries with more unstable political systems tend to be less replicable than networks estimated with larger samples in stable political systems. Although these analyses are restricted to the items available in the World Values Survey, they show that belief system networks can be replicable, but that this replicability is related to features of the study design and the political system.

Keywords: Belief systems; networks; replication; political stability
DOI: http://doi.org/10.1525/collabra.312
Accepted on 01 Apr 2020. Submitted on 03 Jan 2020.

Estimating the structure of belief systems is a central activity in political psychology, political science, and sociology (e.g., Barker & Tinnick, 2006; Converse, 1964; Johnston & Ollerenshaw, 2020; Kinder & Kalmoe, 2017; Malka et al., 2019). Multiple teams have begun to conceptualize (Friedkin et al., 2016) and estimate (Fishman & Davis, 2019; Boutyline & Vaisey, 2017; Brandt et al., 2019) the structure of political belief systems as networks of relevant political attitudes and identities (for work looking at individual attitudes see Dalege et al., 2016; for work looking at moral values see Turner-Zwinkels et al., in press). The attitudes and identities of the belief system are the nodes of the network and the connections between them are the edges. After estimating the belief system network, the teams use centrality metrics from network science to identify the most central components of the belief system in the population (e.g., Boutyline & Vaisey, 2017; Brandt et al., 2019) or compare belief system density between different subgroups (e.g., Fishman & Davis, 2019). Although these teams focused on centrality and density, other edge, node and network characteristics could also be used to understand the structure of political belief systems, just as they have been used to understand the structure of other psychological constructs (e.g., psychopathology, Fried et al., 2018).

Prior work on belief system structure often assesses the association between pairs of beliefs in a population to inform how a belief system is structured (e.g., Chen & Goren, 2016; Kinder & Kalmoe, 2017; Malka et al., 2019). For example, Malka and colleagues (2019) recently demonstrated that the link between economic beliefs and cultural beliefs is not always positive when looking across countries. These emerging methods rooted in network science allow researchers to go beyond pairwise associations to analyze the entire belief system simultaneously (rather than just two or three nodes at a time). This allows individual cultural and economic beliefs, such as those used by Malka and colleagues, to be situated with the other beliefs and identities in the belief system (Boutyline & Vaisey, 2017; Brandt et al., 2019). This also allows scholars to use modeling techniques that best match the tendency to theorize about belief systems as if they are networks. The purpose of this paper is to document and explore the replicability of belief system networks in a range of countries, so that researchers interested in these methods and ideas have the necessary information about the replicability of the technique.

What is the Technique?

There are multiple methods for analyzing belief systems as networks depending on one’s theoretical assumptions (e.g., Boutyline & Vaisey, 2017; Brandt et al., 2019). One approach (Brandt et al., 2019) builds on work conceptualizing a variety of psychological constructs as networks (e.g., Borsboom & Cramer, 2013; Costantini & Perugini, 2016; Dalege et al., 2016; Sayans-Jiménez et al., 2019) and models belief systems using a partial correlation approach. This approach assumes that nodes that are positively connected want to be like one another, that connected nodes reciprocally affect one another, and that nodes that are unconnected are independent conditional on all of the other nodes of the network. These assumptions are consistent with a pairwise Markov random field and can be estimated as Gaussian graphical models (Epskamp & Fried, 2018; Lauritzen, 1996).1

Theoretically, this approach is consistent with the idea that people prefer to have consistent belief systems and worldviews (Festinger, 1957; Gawronski et al., 2012; Randles et al., 2015) and with theory (e.g., Converse, 1964; Gerring, 1997) and quantitative models (e.g., Friedkin et al., 2016) that conceptualize belief systems as interconnected political attitudes and beliefs. Empirically, this approach estimates partial correlations between all of the variables (i.e., nodes) in the network and adopts regularization and model selection techniques to reduce the size of the parameter space and decrease the false discovery rate (Epskamp, Borsboom, & Fried, 2018; Epskamp & Fried, 2018; Williams, 2018; Williams et al., 2018). These estimates become the edges (or paths) in the belief system networks. They can be the target of investigations in and of themselves, or be used to calculate other features of the belief system (e.g., connectivity, centrality of a node). We use these methods to estimate the overall belief system in countries and test the replicability of these estimates.

Why Investigate Replicability?

Belief system networks are a recent methodological technique adopted from the psychopathology literature. Although this technique may be promising, it is important to understand the extent to which estimates from belief system networks replicate before extending the technique to study a wide range of phenomena in the belief system literature. Assessing the replicability of estimates of belief system networks can provide justification for the use (or abandonment) of these estimates in the political psychology and political science literatures, and can also help address theoretical predictions about the stability of belief system structure.

I aim to address two questions with this study. First, I aim to document the extent to which belief system networks are replicable. I will do this by comparing belief systems estimated for a country at two different time points and assessing how similar the estimates of edge weights, centrality metrics, connectivity, and other features are between the two time points. Second, I aim to explore how methodological features and characteristics of the political systems are associated with the replicability of political belief system networks.

Are Belief System Networks Replicable?

It is important to document the replicability of belief system networks because we do not know the extent to which we can expect a belief system estimated in a population to replicate in the same population. There is some indication that belief systems are replicable. For example, in both the United States and New Zealand, operational components of belief systems (i.e., political policies) tend to be less central than symbolic components of belief systems (i.e., identification with political symbols) across multiple time points (Brandt et al., 2019; Fishman & Davis, 2019). Moreover, when researchers test network replicability in the network psychopathology literature, the typical result is that the networks are similar in different samples (Borsboom et al., 2017; Fried et al., 2018; Jones, Williams, & McNally, 2019), although others disagree (Forbes et al., 2017; Forbes et al., in press). Taken together, one might suspect that the estimation of belief system networks is replicable.

However, there are also reasons to suspect that belief system networks are difficult to estimate reliably. One challenge is that they require researchers to estimate a large number of parameters. A 10-node belief system has 45 potential edges between nodes. A 20-node belief system has 190 potential edges. And a 30-node belief system has 435 potential edges. Although the estimation techniques are designed to reliably estimate the networks, even when faced with many parameters (e.g., Epskamp, Borsboom, & Fried, 2018; Epskamp & Fried, 2018; Williams, 2018; Williams et al., 2018), it is not yet clear if this is the case with real political data. A second challenge is that researchers typically use single items to estimate each node in a belief system network. This is because the instruments used to assess belief systems are not designed to assess each potential node with multiple items. Instead, each possible policy or identity is typically represented with just one item. This practice may result in less replicable networks overall.
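The edge counts quoted above follow from the fact that an undirected network with p nodes has p(p − 1)/2 potential edges. The short sketch below reproduces those counts and, for reference, the count for a 19-node network like the ones estimated in this study. The function name `potential_edges` is illustrative and is not taken from the study's replication code.

```python
def potential_edges(n_nodes: int) -> int:
    """Number of distinct undirected edges among n_nodes nodes (no self-loops)."""
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    return n_nodes * (n_nodes - 1) // 2


# Node counts mentioned in the text, plus the 19 items used in this study.
for p in (10, 20, 30, 19):
    print(p, potential_edges(p))
```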

Another reason that political belief system networks may not be replicable is that the political system changes over time. This may be due to shifts in political coalitions, the salience of particular issues, high profile discrete events (e.g., a terrorist attack), or the experience of large-scale social upheavals (e.g., economic recessions). Such changes are likely to be reflected in the structure of the belief systems themselves (e.g., Ciuk & Yost, 2016; Converse, 1964; Federico & Malka, 2018). For example, partisan cues about changing the New Zealand flag shifted the link between party identity and support for changing the flag (Satherley et al., 2018). Similarly, major societal events like wars, economic recessions, and major terrorist events can shift political attitudes (Van de Vyver et al., 2016; Zaller, 1992). Zaller (1992), for example, highlights how the link between political dispositions and support for the Vietnam war changes as the elite rhetoric changes. These are examples that might lead to less replicable belief system networks over time and highlight the importance of the political context for understanding the structure of political belief systems.

What Predicts Belief System Network Replicability?

The methodological and theoretical reasons to expect that belief systems may not be replicable can also be used to generate expectations for what might be related to belief system network replicability.

Methodologically, sample size and time between assessments may affect belief system replicability. First, larger samples can increase replicability by helping to precisely estimate the large number of parameters in the belief system network. This should increase the chances that a belief system network is replicable (Epskamp & Fried, 2018). Moreover, prior meta-science research suggests that the sample size of the original study is associated with its replicability (Open Science Collaboration, 2015). Second, belief system networks may be less replicable when there is more time between their estimation because of a variety of subtle and not-so-subtle changes in the political system. Just as the correlation between personality assessments decreases with time (Anusic & Schimmack, 2016), so might the replicability of belief system networks in the countries we examine.

Theoretically, indicators of instability in a political system and the fragmentation of political parties may be associated with less replicable belief systems. First, countries with more political changes should have less replicable belief system estimates. Although some political systems are relatively stable over time, other political systems are not (e.g., Carlsen & Bruggemann, 2017). Such instability may result in belief systems that are less replicable. This may be because the salient issues in the system change over time (e.g., Ciuk & Yost, 2016; Zaller, 1992), or because the precise packages of beliefs propagated by elites shift with the shifting political system (e.g., Converse, 1964; Federico & Malka, 2018; Zaller, 1992). Second, party fragmentation (Gallagher & Mitchell, 2008; Laakso & Taagepera, 1979) may also be associated with less replicable belief system estimates. Party fragmentation occurs when there are more political parties competing for votes. When there are more political parties each vying for votes, influence, coalitions, and legitimacy, there may be more elite packages of beliefs to choose from, leading to less replicability over time. Whereas the instability of the political system may produce less replicable belief system estimates due to changes in the political system, party fragmentation may produce less replicable belief system estimates due to the greater availability of belief system packages (i.e., issue combinations) in the system.2

The Current Study

To answer my two key questions, I analyze data from the World Values Survey. This allows me to estimate belief system networks for a variety of countries using exactly the same items for multiple countries at multiple points of time. This means that any differences in the belief system estimates cannot be attributable to differences in the items. I estimate the replicability of belief system networks by comparing belief system networks estimated from a single country at multiple time points. I assess how stable the edge-characteristics (e.g., size of the edges), node-level characteristics (e.g., centrality), and overall network characteristics (e.g., connectivity) are across time. In addition to mapping on to the metrics used in the belief system network and psychological network literatures, this broad selection of metrics allows us to ascertain if some aspects of the belief system (e.g., overall characteristics) are more replicable than others (e.g., edge-characteristics). By holding method constant and only varying time and country, we are able to estimate and compare replicability between countries.

After examining overall rates of replicability, we test if sample size, years between assessments, the stability of the political system, and party fragmentation are associated with variation in replicability across countries. In addition to furthering our understanding of belief system networks, these latter analyses also build on work on the replicability of psychological networks (Borsboom et al., 2017; Forbes et al., 2017; Forbes et al., in press; Fried et al., 2018; Jones, Williams, & McNally, 2019). Only by estimating networks for more samples than is typical (e.g., Fried et al., 2018 investigated four samples) are we able to investigate the correlates of network replicability.


Participants and Procedure

I used data from the World Values Survey (Inglehart et al., 2014). After excluding countries, waves, and participants who did not complete all of the relevant measures, my analyses included data from 52,826 participants (52% men, 48% women, 0.001% missing gender data, M age = 42.0, SD = 15.9) from 31 countries (mean N/country/wave = 724, SD = 349) who were part of the 3rd (1995–1998), 4th (1999–2004), and 5th (2005–2009) waves of the World Values Survey (see Table S1 for list of sample sizes and countries). This allows me to estimate the replicability of belief system networks estimated at Wave 3 by comparing it with those estimated at Waves 4 and 5, and the replicability of belief system networks estimated at Wave 4 by comparing it with those estimated at Wave 5. For narrative simplicity, I refer to the earlier network (i.e. Wave 3 or Wave 4 networks depending on the comparison) as the “original network”.


Belief System Measures

I included 19 items assessing political attitudes and identities in the belief system networks. I chose items if they were available across the three waves and if they were measures of political attitudes. One challenge for item selection is that we ideally would include items that are relevant in the countries, yet countries have different relevant issues. To guard against this issue, we included a broader array of items than past work on political beliefs using the World Values Survey (e.g., Malka et al., 2019).3 The items we chose included items assessing social issues (e.g., immigration policy, justifiability of euthanasia), economic issues (e.g., the role of government in businesses, inequality), environmental issues (e.g., protecting the environment vs. economic growth), government types (e.g., preference for army rule, democracy), and self-identification as right-wing or left-wing (all items are in Table 1). Ideological identification, economic issues, environmental issues, and social issues were all recoded so that higher scores indicated more traditionally right-wing positions. Governing types were scored so that higher scores indicated more support for non-democratic and anti-democratic governing types.

Table 1

Items used to estimate the belief system networks.

Number Items

Ideological Identification
1 In political matters, people talk of “the left” and “the right.” How would you place your views on this scale, generally speaking? (1 = left, 10 = right).
Economic Issues
2 1 = Incomes should be made more equal, 10 = We need larger income differences as incentives for individual effort
3 1 = People should take more responsibility to provide for themselves, 10 = The government should take more responsibility to ensure that everyone is provided for; reverse scored
4 1 = Private ownership of business and industry should be increased, 10 = Government ownership of business and industry should be increased; reverse scored
5 1 = Competition is good, 10 = Competition is harmful; reverse scored
Environmental Issues
6 Increase in taxes if used to prevent environmental pollution (1 = Strongly agree, 2 = Agree, 3 = Disagree, 4 = Strongly disagree)
7 Here are two statements people sometimes make when discussing the environment and economic growth. Which of them comes closer to your own point of view? 1 = Protecting the environment should be given priority, even if it causes slower economic growth and some loss of jobs, 2 = No answer, 3 = Economic growth and creating jobs should be the top priority, even if the environment suffers to some extent
Social Beliefs
8 Euthanasia (1 = Never justifiable, 10 = Always justifiable); reverse scored
9 Prostitution (1 = Never justifiable, 10 = Always justifiable); reverse scored
10 Homosexuality (1 = Never justifiable, 10 = Always justifiable); reverse scored
11 Abortion (1 = Never justifiable, 10 = Always justifiable); reverse scored
12 Men make better political leaders than women do (1 = Strongly agree, 2 = Agree, 3 = Disagree, 4 = Strongly disagree); reverse scored
13 When jobs are scarce, men should have more right to a job than women (1 = Disagree, 2 = Neither, 3 = Agree)
14 When jobs are scarce, employers should give priority to people of this country over immigrants (1 = Disagree, 2 = Neither, 3 = Agree)
15 How about people from other countries coming here to work. Which one of the following do you think the government should do? (1 = Let anyone come who wants to? 2 = Let people come as long as there are jobs available? 3 = Place strict limits on the number of foreigners who can come here? 4 = Prohibit people coming here from other countries?)
Governing Types
16 Having a strong leader (1 = Very good, 2 = Fairly good, 3 = Fairly bad, 4 = Very bad); reverse scored
17 Having experts make decisions (1 = Very good, 2 = Fairly good, 3 = Fairly bad, 4 = Very bad); reverse scored
18 Have the army rule (1 = Very good, 2 = Fairly good, 3 = Fairly bad, 4 = Very bad); reverse scored
19 Having a democratic political system (1 = Very good, 2 = Fairly good, 3 = Fairly bad, 4 = Very bad)

Note: Numbers are used to label nodes in network figures in the supplemental materials.

Country-Level Measures

To explore correlates of replicability across countries, I used the original network’s sample size (see Table S1), years between assessments, indicators of the stability of the country’s political system, and the number of effective parties in the political system.

I assessed the stability of the country’s political system in two ways. First, I used the 2006 values of the Fragile States Index. This index uses a variety of content analyses, qualitative data, and quantitative data to assess the stability of the country’s political, economic, and social system. Because the different facets of the index all tap into issues that could affect the items in the belief systems I estimate, I use the total score for the Fragile States Index (Fragile States Index, 2006). Ideally, I would have used values from the same years I have data for the countries; however, 2006 is the earliest year for which the index is available. Second, I used changes in the levels of democracy between waves to assess overall changes in the political system. To estimate democracy, I used the average democracy score as assessed by the Varieties of Democracy project (Coppedge et al., 2019) and calculated the absolute value of the difference between two waves as an indicator of political change. In addition, I explored if the level of democracy (rather than change) is associated with belief system replicability by using the democracy estimates from the initial year in the replicability comparison (e.g., Wave 3 democracy to predict Wave 3 to Wave 4 replicability). Party fragmentation was assessed using the effective number of parties at the parliamentary or legislative level (Gallagher, 2019; Gallagher & Mitchell, 2008; Laakso & Taagepera, 1979). This measure is a combination of the number of parties and relative party size within a political system. Higher numbers indicate greater fragmentation.
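As an illustration of two of these country-level measures, the sketch below computes the effective number of parties using the standard Laakso and Taagepera formula (one divided by the sum of squared party shares) and the absolute change in a democracy score between two waves. The text does not spell out the formula or the input format, so the function names, the choice of seat shares as input, and the example values are all assumptions made for illustration.

```python
def effective_number_of_parties(shares):
    """Laakso-Taagepera index: 1 / sum of squared party shares.

    `shares` may be seat counts or proportions; they are normalized to sum to 1.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    proportions = [s / total for s in shares]
    return 1.0 / sum(p * p for p in proportions)


def democracy_change(score_earlier, score_later):
    """Absolute difference in democracy scores between two waves."""
    return abs(score_later - score_earlier)


# Hypothetical legislatures: two equal parties vs. four unequal parties.
two_party = effective_number_of_parties([50, 50])
four_party = effective_number_of_parties([0.4, 0.3, 0.2, 0.1])
print(two_party, four_party)

# Hypothetical democracy scores at an earlier and a later wave.
print(democracy_change(0.62, 0.55))
```

Two equally sized parties give a value of 2, while adding smaller parties raises the index by less than one per party, which is why the measure combines the number of parties with their relative size.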

Estimating Belief System Networks

Belief system networks were estimated for each country/wave combination, 73 networks in total. Partial correlation networks are a type of Gaussian graphical model and can be used to estimate networks that meet the assumptions outlined in the introduction (Epskamp & Fried, 2018; Lauritzen, 1996). Although a Gaussian graphical model can be encoded in a partial correlation matrix where each edge in the network is a partial correlation, the number of parameters makes it necessary to adopt techniques that perform well with many parameters. There are multiple methods that do this (e.g., Epskamp & Fried, 2018).
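To illustrate how a Gaussian graphical model maps onto a partial correlation matrix, the sketch below converts a precision (inverse covariance) matrix into partial correlations using the standard identity: the partial correlation between nodes i and j is the negative of the (i, j) precision entry divided by the square root of the product of the two diagonal entries. This is only the unregularized identity, not the Bayesian estimation procedure used in this study, and the example matrix is hypothetical.

```python
import math


def partial_correlations(precision):
    """Convert a precision matrix (list of lists) into partial correlations.

    Off-diagonal entry (i, j) is -k_ij / sqrt(k_ii * k_jj); the diagonal is
    left at 0 because a node has no edge with itself.
    """
    p = len(precision)
    pcor = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                pcor[i][j] = -precision[i][j] / scale
    return pcor


# Hypothetical 3-node precision matrix: nodes 0 and 2 are conditionally
# independent given node 1, so their edge is zero.
K = [
    [2, -1, 0],
    [-1, 2, -1],
    [0, -1, 2],
]
edges = partial_correlations(K)
```

A zero entry in the precision matrix yields a zero partial correlation, which corresponds to the assumption that unconnected nodes are independent conditional on all other nodes.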

We chose a Bayesian estimation technique with a Wishart prior distribution (Williams, 2018) implemented in the R package BGGM (Williams & Mulder, 2019). This method has an acceptable false discovery rate, is computationally efficient, and efficiently incorporates network comparisons (a key aspect of this study). All of the belief system measures are included in the analysis. The output is a matrix of edges between all of the nodes in the network (i.e., the partial correlations between all of the issues and identities in the belief system). We keep edges (i.e. partial correlations) in the belief system network when the probability of a positive or negative effect is 95% and set the remaining edges to zero (see e.g., Williams, 2018). These are the belief system networks and they can be interpreted as the partial correlations between issues and identities when controlling for all of the other nodes in the network. The networks are visualized in Figures S1–S3.
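The edge-selection rule described above can be sketched for a single edge as follows. The sketch assumes that posterior draws of the partial correlation are available as a plain list, keeps the posterior mean when the share of draws on one side of zero reaches the threshold, and otherwise sets the edge to zero. The function name, the use of the posterior mean as the retained estimate, and the example draws are assumptions for illustration; the BGGM package may implement selection differently.

```python
def select_edge(draws, threshold=0.95):
    """Return the posterior mean if the edge is credibly positive or negative,
    otherwise 0.0.

    `draws` is a list of posterior samples of one partial correlation.
    """
    n = len(draws)
    if n == 0:
        raise ValueError("need at least one posterior draw")
    prob_positive = sum(d > 0 for d in draws) / n
    prob_negative = sum(d < 0 for d in draws) / n
    if max(prob_positive, prob_negative) >= threshold:
        return sum(draws) / n
    return 0.0


# Hypothetical draws: 96% of samples are positive, so the edge is kept.
kept = select_edge([0.1] * 96 + [-0.1] * 4)
# Hypothetical draws: only 60% are positive, so the edge is set to zero.
dropped = select_edge([0.1] * 60 + [-0.1] * 40)
```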

I compare belief system networks from the same country estimated in different years. This analysis holds the country constant and examines how similar the belief system networks are at different time points. I compare overall features of the edge-level, node-level and overall network metrics. For each metric, I also computed benchmark expectations using simulations.

Description of Benchmark Simulations

It is not clear how replicable we should expect belief system networks to be. The complexity of belief system networks and their estimation makes it impossible to rely on a typical null distribution or benchmarks developed for assessing the replicability of psychological scales. Therefore, I conducted simulations to identify replicability benchmarks. I first simulated 1000 random graph networks (Erdős & Rényi, 1960; Yin & Li, 2011) using BDgraph’s (Mohammadi & Wit, 2019) bdgraph.sim function. Then, I simulated two datasets based on each network and estimated the network from each dataset using the same methods described above. Finally, I calculated the similarity between the two estimated networks using the same methods used to compute replicability (see below). For each network, the probability of nodes connecting randomly was randomly determined and could take on the values [.6, .7, .8, .9]. The sample sizes for the simulated datasets were randomly chosen from the Wave 3 Ns (simulated dataset 1) and the Wave 5 Ns (simulated dataset 2).
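The design of the benchmark simulations can be sketched in two parts: drawing the settings for one simulated network, and summarizing the resulting similarity values as a median with 2.5th and 97.5th percentiles (the reference lines reported in the figures). The sketch below does not simulate or estimate any networks; the sample sizes and similarity values are hypothetical placeholders, and the linear-interpolation percentile is an assumption about how the percentiles are computed.

```python
import random

EDGE_PROBABILITIES = [0.6, 0.7, 0.8, 0.9]


def draw_design(rng, wave3_ns, wave5_ns):
    """Randomly pick the edge probability and the two sample sizes."""
    return {
        "edge_probability": rng.choice(EDGE_PROBABILITIES),
        "n_dataset_1": rng.choice(wave3_ns),
        "n_dataset_2": rng.choice(wave5_ns),
    }


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("values must not be empty")
    position = (len(xs) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (position - lower)


def benchmark_summary(similarities):
    """Median and 2.5th/97.5th percentiles of simulated similarity values."""
    return {
        "median": percentile(similarities, 50),
        "lower": percentile(similarities, 2.5),
        "upper": percentile(similarities, 97.5),
    }


def within_benchmark(observed, summary):
    """True if an observed value falls inside the benchmark interval."""
    return summary["lower"] <= observed <= summary["upper"]


rng = random.Random(1)
design = draw_design(rng, wave3_ns=[500, 900, 1200], wave5_ns=[600, 1000])
# Placeholder similarity values standing in for 101 simulation results.
summary = benchmark_summary([i / 100 for i in range(101)])
```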


Replication code is available here: https://osf.io/csx2g/.

Replicability of Edge-Level Metrics

First, I compared networks on edge-level metrics. I correlated the edges from the original network with the edges from subsequent networks from the same country. This assesses how replicable the connections between the nodes (i.e., the partial correlations) are across time. Figure 1 shows these correlations. The median wave-to-wave correlation ranges between .65 and .69. Although this suggests that for most countries there is some correspondence between edges at two time points, all of these estimates are below the median benchmark and nearly all are outside of the benchmark expectations. There is variation in the edge-to-edge correlation between countries. For example, Montenegro’s wave 3 to wave 4 correlation is .38, suggesting relatively less correspondence between edges at two time points. Other countries, such as Albania, India, and Moldova, had correlations below .50. On the other side, across all three comparisons the estimate for the United States is within the benchmark expectations and is greater than .83.
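The edge-level comparison can be illustrated with a small sketch that extracts the unique edges (the upper triangle) from two symmetric edge matrices and computes their Pearson correlation. Whether the original analysis includes edges that were set to zero is not stated here, so including every edge is an assumption, and the two example networks are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Unique off-diagonal edges of a symmetric matrix, row by row."""
    p = len(matrix)
    return [matrix[i][j] for i in range(p) for j in range(i + 1, p)]


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def edge_replicability(original, replication):
    """Correlation between the edges of two networks with the same nodes."""
    return pearson(upper_triangle(original), upper_triangle(replication))


# Hypothetical 3-node networks at two waves.
wave_a = [[0, 0.3, 0.0], [0.3, 0, -0.2], [0.0, -0.2, 0]]
wave_b = [[0, 0.25, 0.05], [0.25, 0, -0.1], [0.05, -0.1, 0]]
r = edge_replicability(wave_a, wave_b)
```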

Figure 1 

Boxplots of the stability of edges. High values imply higher replicability. The y-axis is the correlation between the edges in the original network and replication network. The top and bottom edges of the box indicate the 75th and 25th percentiles, respectively, and the black line near the middle of the box is the 50th percentile. The whiskers represent the lowest and highest data points within 1.5 times the interquartile range of the lowest quartile and the highest quartile, respectively. Points are horizontally jittered to improve clarity. Country abbreviations are in Table S1. Horizontal dashed grey lines are the medians from the benchmark simulations. Horizontal dotted grey lines are the 2.5th and 97.5th percentiles from the benchmark simulations.

A more direct way to test if edges differ between waves is to directly compare them using Bayesian hypothesis testing. Here I follow the example of Jones and colleagues (2019) and I compute Bayes Factors (H0 = equality, H1 = not equal) for each edge. I used a somewhat unrestricted and uninformative prior (sd = .35) that is agnostic to the size of the edges (a less informative prior finds higher replication rates). For each edge, I can see if evidence is primarily in favor of equality, inequality, or if it’s inconclusive. I use a Bayes Factor of 3 to make this determination.

The results of these comparisons are summarized in Figure 2. On average, approximately 73% of the edges were equal when comparing belief system networks across waves (i.e., a Bayes factor >3 for the “equal” hypothesis; Median range [.71, .75]), whereas approximately 5% were not equal (i.e., a Bayes factor >3 for the “not equal” hypothesis; Median range [.05, .06]). For the remaining edges, the evidence was inconclusive (Median range [.18, .20]). That is, across countries there is relatively little evidence of dramatic differences in edges across waves. The edges of belief system networks are largely stable. As before, there is variation in these estimates. Belief system networks in countries like India, Moldova, and Montenegro tended to have fewer edges identified as equal and more edges identified as not equal or inconclusive. The proportions of equal and inconclusive edges are similar to the benchmarked estimates; however, the proportion of not equal edges generally appears to exceed the benchmarked estimates, suggesting that the edges that are not equal may indicate genuine changes in the underlying network (i.e., a genuine change in belief system structure in the country from wave to wave).
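The three-way classification of edges can be sketched as follows, assuming each edge comparison yields a Bayes factor expressed as evidence for equality over inequality (here called `bf01`). An edge counts as equal when that Bayes factor exceeds 3, as not equal when its reciprocal exceeds 3, and as inconclusive otherwise. The function names and the example Bayes factors are hypothetical.

```python
from collections import Counter

LABELS = ("equal", "inconclusive", "not equal")


def classify_edge(bf01, threshold=3.0):
    """Label one edge from its Bayes factor for equality vs. inequality."""
    if bf01 <= 0:
        raise ValueError("Bayes factors must be positive")
    if bf01 > threshold:
        return "equal"
    if 1.0 / bf01 > threshold:
        return "not equal"
    return "inconclusive"


def edge_proportions(bf01_values, threshold=3.0):
    """Proportion of edges falling in each of the three categories."""
    if not bf01_values:
        raise ValueError("need at least one Bayes factor")
    counts = Counter(classify_edge(b, threshold) for b in bf01_values)
    return {label: counts[label] / len(bf01_values) for label in LABELS}


# Hypothetical Bayes factors for five edges.
proportions = edge_proportions([10, 5, 4, 1, 0.1])
```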

Figure 2 

Boxplots of the proportion of edges identified as equal, inconclusive, and not equal. The top and bottom edges of the box indicate the 75th and 25th percentiles, respectively, and the black line near the middle of the box is the 50th percentile. The whiskers represent the lowest and highest data points within 1.5 times the interquartile range of the lowest quartile and the highest quartile, respectively. Points are horizontally jittered to improve clarity. Country abbreviations are in Table S1. Horizontal dashed grey lines are the medians from the benchmark simulations. Horizontal dotted grey lines are the 2.5th and 97.5th percentiles from the benchmark simulations.

Replicability of Node-Level Metrics

Second, I compare the networks on node-level metrics. Although the node-level metrics are often composites of the edges compared above, it is necessary to also estimate the replicability of the node-level metrics. This is because these metrics are sometimes used as outcomes in and of themselves (e.g., centrality estimates found in Brandt et al., 2019) and because researchers in other domains have noted instances where some node-level metrics (e.g., betweenness centrality) are unreplicable even when the edges are replicable (e.g., Epskamp, Borsboom, & Fried, 2018; Fried et al., 2018). I calculate betweenness centrality, closeness centrality, strength centrality, eigenvector centrality, 1-step expected influence, and 2-step expected influence for each node (see Table 2; Bonacich, 1987; Epskamp & Fried, 2018; Opsahl, Agneessens, & Skvoretz, 2010; Robinaugh, Millner, & McNally, 2016). These metrics give an indication of the centrality of the node in the network and its potential for influencing the nodes around it. They have all been used in research on psychological and belief system-related networks (Boutyline & Vaisey, 2017; Brandt et al., 2019; Epskamp & Fried, 2018; Robinaugh et al., 2016).

Table 2

Summary of measures of centrality.

Centrality Metric Definition

Betweenness The number of times a node sits on the shortest path between two other nodes in the network.
Closeness The inverse of the total length between a node and all other nodes in the network.
Strength The sum of the absolute value of the connections between a node and its immediate neighbors.
Eigenvector The extent a node is connected to other prominent nodes in the network.
Expected Influence (1 Step) The sum of the value of the connections between a node and its immediate neighbors.
Expected Influence (2 Step) A node’s 1-Step Expected Influence plus the 1-Step Expected Influence of the other nodes in the network weighted by their connections with the target node.

Note: Consistent with practices in the field, the first four centrality metrics treat all edge weights as positive (i.e., they take the absolute value of all the edges). The last two centrality metrics use both positive and negative edges.
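Three of the definitions in Table 2 can be computed directly from a matrix of edge weights, as the sketch below illustrates for strength and the two expected influence metrics. It follows the verbal definitions in the table; the function names and the small example network are hypothetical, and the software used for the reported analyses may differ in details.

```python
def strength(weights):
    """Sum of absolute edge weights between each node and its neighbors."""
    return [
        sum(abs(w) for j, w in enumerate(row) if j != i)
        for i, row in enumerate(weights)
    ]


def expected_influence_1(weights):
    """Sum of signed edge weights between each node and its neighbors."""
    return [
        sum(w for j, w in enumerate(row) if j != i)
        for i, row in enumerate(weights)
    ]


def expected_influence_2(weights):
    """1-step expected influence plus neighbors' 1-step expected influence,
    weighted by each neighbor's edge with the target node."""
    ei1 = expected_influence_1(weights)
    return [
        ei1[i] + sum(w * ei1[j] for j, w in enumerate(row) if j != i)
        for i, row in enumerate(weights)
    ]


# Hypothetical 3-node network with one negative edge.
W = [
    [0.0, 0.3, -0.2],
    [0.3, 0.0, 0.1],
    [-0.2, 0.1, 0.0],
]
node_strength = strength(W)
node_ei1 = expected_influence_1(W)
node_ei2 = expected_influence_2(W)
```

In this example the first node has the highest strength but a low 1-step expected influence, because its positive and negative edges partly cancel once signs are retained.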

I also calculate the Bayesian R2 for each node (i.e. the percent variance explained in the node by all of the other nodes in the network). This gives an estimation of the upper bound on the extent of controllability of the node (i.e. if all edges go towards this node, this tells how much we can influence the node by changing its neighbors; Haslbeck & Waldorp, 2018). For each of the node-level metrics, I correlate the node-level metrics from one belief system network to the node-level metrics of the belief system network of the same country from another wave (e.g., Argentina’s betweenness centrality on all nodes at Wave 3 correlated with Argentina’s betweenness centrality on all nodes at Wave 4). Higher correlations indicate greater replicability.

Results of the node-level comparisons are in Figure 3. Although median stability is greater than zero across all of the metrics and comparisons (Median range [.21, .93]), there is substantial variability across each of the measures. For example, betweenness centrality ranges from –.17 (Moldova) to .68 (Norway) for the wave 3 to wave 5 comparison, suggesting anything from slight anti-stability to moderate stability. Although the medians are generally higher, similarly wide ranges are found for closeness, strength, and eigenvector centrality. Of the centrality metrics, eigenvector centrality had the highest overall stability (Median range [.75, .81]). The two expected influence metrics had moderate overall stability (Median range [.63, .75]), although the 2-step version had somewhat higher stability. Node predictability tended to be relatively high (Median range [.88, .92]). Notably, the replicability of the betweenness, closeness, strength, and expected influence (1 step) metrics was typically outside the benchmarked expectations. The replicability of eigenvector centrality, expected influence (2 step), and predictability was typically within the benchmarked expectations. These findings suggest that betweenness and closeness centrality, which have featured prominently in work on belief system networks (Boutyline & Vaisey, 2017; Brandt et al., 2019), should be treated with caution.

Figure 3 

Boxplots of the stability of node-level characteristics. High values imply higher replicability. Each panel shows the correlation between the original network and replication network for each node-level metric. The top and bottom edges of the box indicate the 75th and 25th percentiles, respectively, and the black line near the middle of the box is the 50th percentile. The whiskers represent the lowest and highest data points within 1.5 times the interquartile range of the lowest quartile and the highest quartile, respectively. Points are horizontally jittered to improve clarity. Country abbreviations are in Table S1. Horizontal dashed grey lines are the medians from the benchmark simulations. Horizontal dotted grey lines are the 2.5th and 97.5th percentiles from the benchmark simulations.

Replicability of Overall Network Characteristics

Third, I compare the networks on overall features of the network. These include the overall connectivity and the proportion of positive edges. Average shortest path length was used as the measure of network connectivity (Dalege et al., 2018; Wasserman & Faust, 1994). Shortest path length was calculated using Dijkstra’s algorithm. This algorithm minimizes the inverse distance between two nodes using the absolute value of the edge weights. Higher connectivity is indicated by lower average shortest path length. The proportion of positive edges is simply the proportion of edges greater than zero. This indicates whether the overall “logic” of the network is replicable (cf. Boutyline & Vaisey, 2017). I examine how large the absolute value of the difference of connectivity and the proportion of positive connections are between waves. Larger differences indicate worse replicability.
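A minimal sketch of the two overall network measures follows, assuming that the distance along an edge is the inverse of its absolute weight and that zero-weight entries are treated as absent edges (both are assumptions of this sketch, as are the function names and the two small example matrices). It is meant to show the logic of the comparison and is not the code used for the reported analyses.

```python
import heapq


def shortest_path_lengths(weights, source):
    """Dijkstra's algorithm; edge distance is 1 / |weight|, zero weights are skipped."""
    n = len(weights)
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v in range(n):
            w = abs(weights[u][v])
            if v == u or w == 0:
                continue
            candidate = d + 1.0 / w
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


def average_shortest_path_length(weights):
    """Mean shortest path length over all unordered node pairs."""
    n = len(weights)
    total, pairs = 0.0, 0
    for i in range(n):
        dist = shortest_path_lengths(weights, i)
        for j in range(i + 1, n):
            total += dist[j]
            pairs += 1
    return total / pairs


def proportion_positive(weights):
    """Share of non-zero edges (upper triangle) that are greater than zero."""
    n = len(weights)
    edges = [weights[i][j] for i in range(n) for j in range(i + 1, n)
             if weights[i][j] != 0]
    return sum(e > 0 for e in edges) / len(edges)


# Hypothetical networks for the same country at two waves.
wave_a = [[0.0, 0.5, -0.25], [0.5, 0.0, 0.0], [-0.25, 0.0, 0.0]]
wave_b = [[0.0, 0.5, 0.25], [0.5, 0.0, 0.5], [0.25, 0.5, 0.0]]

# Larger absolute differences indicate worse replicability.
connectivity_diff = abs(average_shortest_path_length(wave_a)
                        - average_shortest_path_length(wave_b))
positive_diff = abs(proportion_positive(wave_a) - proportion_positive(wave_b))
print(connectivity_diff, positive_diff)
```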

The differences in connectivity and the proportion of positive edges of the networks are compared in Figure 4. Across the three possible comparisons (Wave 3 to 4, Wave 3 to 5, and Wave 4 to 5), the median connectivity differences are relatively small (Median range [1.05, 1.64]), but primarily fall outside of the benchmark expectations.4 The median differences in the proportion of positive edges are small (Median range [.02, .04]) and all estimates are consistent with the benchmark expectations. These data suggest relative similarity between belief system networks estimated at different time points. That said, there is variability, with some countries (e.g., Montenegro, Philippines, Serbia) showing larger differences in connectivity and the proportion of positive edges.

Figure 4 

Boxplots of the stability of overall network characteristics. Low values imply higher replicability. The top panel shows the absolute value of the difference in connectivity. The bottom panel shows the absolute value of the difference in the proportion of positive connections. The top and bottom edges of the box indicate the 75th and 25th percentiles, respectively, and the black line near the middle of the box is the 50th percentile. The whiskers represent the lowest and highest data points within 1.5 times the interquartile range of the lowest quartile and the highest quartile, respectively. Points are horizontally jittered to improve clarity. Country abbreviations are in Table S1. Horizontal dashed grey lines are the medians from the benchmark simulations. Horizontal dotted grey lines are the 2.5th and 97.5th percentiles from the benchmark simulations.

Robustness Check

I tested if the replicability estimates in the prior sections are consistent with replicability estimates for the same data and networks after removing a subset of items. This helps us understand if the replicability estimates are due to the specific combination of items. For these checks, I randomly removed 4 of the items, reestimated all of the networks for each country/wave combination, and estimated the replicability metrics in the prior sections. These estimates are presented in Figures S4–S7 in the supplemental materials. The distribution of replicability estimates in the original networks (examined above) and the networks using a subset of items are highly overlapping. This suggests that the replicability estimates are not due to the specific combination of items.

Replicability Associations

Belief system networks appear to be relatively replicable in absolute terms. However, average levels of replicability mask underlying variation: belief system networks tend to be highly replicable in some countries and seem to be substantially less replicable in others. This could mean that belief systems are meaningfully different across countries, but it may also indicate that differences in precise methodological details could play a role. I tested if sample size, years between assessments, the stability of the countries’ political system, and the number of effective parties in the countries’ political system are associated with replicability.

I use the original network’s sample size, the number of years between assessments, and the countries’ scores on either the Fragile States Index, changes in democracy, overall levels of democracy, or the number of effective parties in the countries’ political system to predict replicability on all of the indexes included here (i.e. the y-axis in Figures 1, 2, 3, 4). I regressed replicability for each index on sample size, number of years between assessments, and either the Fragile States Index, changes in democracy, overall levels of democracy, or the number of effective parties in the countries’ political system using a multilevel model estimated with lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2017), nesting observations within the three wave-comparison groups (48 models total). All predictors and outcomes were rescaled to range from 0 to 1. I reverse scored the replicability indices for the overall network so that higher scores indicated more replicability. Finally, I averaged all of the indicators of replicability to create an aggregate index of replicability across all possible metrics.
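The rescaling, reverse scoring, and averaging steps described above can be sketched as follows. The sketch assumes min-max rescaling and rescaling before averaging (the text does not specify either), and the variable names and values are hypothetical. The multilevel models themselves are not reproduced here.

```python
def rescale_01(values):
    """Min-max rescale a list of numbers so that it ranges from 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant variable")
    return [(v - lo) / (hi - lo) for v in values]


def reverse_score(scaled):
    """Flip a 0-1 variable so that higher scores indicate more replicability."""
    return [1.0 - v for v in scaled]


def aggregate(indices):
    """Average several equally long 0-1 indices observation by observation."""
    return [sum(column) / len(column) for column in zip(*indices)]


# Hypothetical replicability values for three country/wave comparisons.
edge_correlation = [0.5, 0.75, 1.0]        # higher = more replicable
connectivity_difference = [1.0, 2.0, 3.0]  # higher = less replicable

edge_index = rescale_01(edge_correlation)
connectivity_index = reverse_score(rescale_01(connectivity_difference))
aggregate_index = aggregate([edge_index, connectivity_index])
print(aggregate_index)
```

After these steps every index points in the same direction, so the coefficients for different outcomes can be read on a common 0 to 1 scale.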

The results of these analyses are in Figure 5. The confidence intervals for sample size, difference in years, and changes in democracy are, in general, relatively wide, which means that these estimates are unlikely to be precise. The confidence intervals for state fragility, overall democracy, and number of effective parties were relatively narrower. In general, we see that larger sample sizes are associated with higher replicability, more unstable states are associated with lower replicability, and more democracy is associated with higher replicability. Changes in democracy, the number of effective parties, and the difference in years did not have clear effects. These overall impressions should be interpreted cautiously given the wide confidence intervals; however, it does appear that sample size and political context are associated with replicability.

Figure 5 

Multilevel estimate and 95% confidence interval of sample size and state fragility index on replicability. All replicability outcomes were scored so that higher scores indicate higher replicability. All variables were rescaled to range from zero to one.


Conceptualizing and analyzing belief systems as networks can give insight into the overall structure of belief systems, including their central components (Brandt et al., 2019b) and how they change over time (Fishman & Davis, 2019). However, to be confident in these insights, we need to know how replicable the method is. I find that belief system networks are, on average, replicable across a range of countries. For five of the 11 metrics, I find that the median replicability fell within the expected range of the benchmark simulations. For the remaining six metrics, the median fell outside of the expected range; however, it often represented at least a moderate correlation between waves (e.g., the median replicability for expected influence [1 step] was >.60). It does appear that estimates based on the identification of shortest paths through the network (e.g., connectivity, betweenness centrality, and closeness centrality) tend to be less replicable than estimates that are based on directly connected edges (e.g., strength centrality, predictability).

The overall relative replicability of the belief system networks masks underlying variation in replicability. For each indicator of replicability there was variation in replicability across countries. I found that this variation in replicability was associated with state fragility, overall democracy, and the sample size. When sample sizes are small and political systems are unstable, we should not expect the estimate of the belief system network to be stable. Theoretically, these results suggest that the stability of the belief system corresponds to the stability of the broader political system. It is, of course, possible for belief systems to change in stable political systems; however, these changes appear to be larger and more readily apparent in unstable political systems. This is consistent with work suggesting that changes in the political environment can shift the structure of the belief system (e.g., Zaller, 1992). It also suggests that findings on belief systems in less stable contexts may themselves be less stable overall, which is an empirical pattern in need of study in and of itself. In practice, these results (re)highlight simulations which have shown the need for large samples in order to estimate replicable belief system networks (Epskamp, Borsboom, & Fried, 2018; Epskamp & Fried, 2018). By analyzing a large number of belief system networks, I was able to empirically show that small sample sizes are associated with less replicable network estimates.

By taking advantage of the World Values Survey, I was able to estimate replicability across a range of countries in representative samples (for the importance of representative samples when studying ideology, see Kalmoe, in press); however, these data did not allow me to incorporate other important features of political belief systems like partisan identities or all possible relevant political beliefs for all countries. For example, if a subset of particularly relevant beliefs was not included in a particular country and the belief system network of this subset was more replicable than that of the less relevant political beliefs, this might have underestimated replicability for such countries. The finding that belief system networks with four randomly chosen items removed had similar replicability to the full belief system networks suggests that the specific items do not have large effects on replicability indices (although they may have effects on the specific structure in specific countries, something that I did not examine).

I was also able to analyze differences between countries that are associated with replicability; however, other issues, such as the proximity to an election, may also induce belief system change and affect replicability. This is an important question subject to ongoing research (e.g., Fishman & Davis, 2019). Moreover, my study uses between-subject associations between variables, which highlight belief cleavages in society (Martin, 2000), rather than the belief system “in someone’s head”. Future work may take advantage of intensive longitudinal designs (e.g. >20 waves) to begin to estimate and assess the stability and heterogeneity of individual-level belief systems. Despite these limitations, the current study shows that belief system networks are largely replicable, although the replicability varies by features of both the sample and the political system.

Data Accessibility Statement

Data is publicly available. This is detailed along with replication code at https://osf.io/csx2g/?view_only=0233b894fd1b40e391175e84f22b312a.

Additional File

The additional file for this article can be found as follows:

Supplemental Materials

Additional methodological and analytic details. DOI: https://doi.org/10.1525/collabra.312.s1


1Boutyline and Vaisey (2017) make a different set of assumptions and so adopt a different analytic approach. They assume that belief systems start with a single ancestor belief. Over time, subsequent generations build off of this single ancestor and branch out from it. A person’s position on each new belief corresponds to their position on the parental belief, plus error representing imperfect inferences from parental beliefs to their ancestors. Conceptually, it is a tree network or a directed acyclic graph with a potentially infinite number of generations. Empirically, this work estimates the network using correlations between nodes because, when the assumptions of the approach hold, the node with the highest betweenness centrality is also the original ancestor belief. While acknowledging the conceptual similarities with and intellectual debt to this approach, I focus on partial correlation networks because the assumptions are more flexible and the model has already been applied to a large number of psychological constructs. 

2Personality traits are not as reliable in low and middle-income countries compared to high-income countries (Laajaj et al., 2019). This is due to several interrelated reasons (e.g., survey enumerators, education levels). Finding that countries with less stable political systems have less stable belief systems over time might be another manifestation of the personality trait finding. However, it is important to note that the personality traits were examined within one time point, showing that they did not have the factor structure typical in the United States. Whether the factor structure was replicable over time within the same country was not tested. My research question is more conceptually similar to whether the factor structure is the same across time than to whether the structure is the same between countries, although I use a different theoretical and empirical approach. 

3This issue would be more consequential if we were comparing the content of the belief systems (e.g., the centrality of ideological identification across countries). However, this issue should be less consequential for testing the replicability of the belief systems within countries. 

4The minimum possible average shortest path length (i.e. maximum connectivity) is 1 and the approximate maximum average shortest path length is 100 (i.e. when all edges are .001) suggesting that the maximum possible difference is ~99. Because the observed differences were between 1.06% and 1.66% of this maximum, I interpret these differences as relatively small. 


This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 759320). Nick Davis provided helpful comments on a prior version of this manuscript.

Competing Interests

The author has no competing interests to declare.

Author Contributions

Mark Brandt is responsible for the contents of this manuscript.


  1. Anusic, I., & Schimmack, U. (2016). Stability and change of personality traits, self-esteem, and well-being: Introducing the meta-analytic stability and change model of retest correlations. Journal of Personality and Social Psychology, 110, 766–781. DOI: https://doi.org/10.1037/pspp0000066 

  2. Barker, D. C., & Tinnick, J. D. (2006). Competing visions of parental roles and ideological constraint. American Political Science Review, 100, 249–263. DOI: https://doi.org/10.1017/S0003055406062149 

  3. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1–48. DOI: https://doi.org/10.18637/jss.v067.i01 

  4. Bonacich, P. (1987). Power and centrality: A family of measures. American Journal of Sociology, 92, 1170–1182. DOI: https://doi.org/10.1086/228631 

  5. Borsboom, D., & Cramer, A. O. (2013). Network analysis: an integrative approach to the structure of psychopathology. Annual Review of Clinical Psychology, 9, 91–121. DOI: https://doi.org/10.1146/annurev-clinpsy-050212-185608 

  6. Borsboom, D., Fried, E. I., Epskamp, S., Waldorp, L. J., van Borkulo, C. D., van der Maas, H. L. J., & Cramer, A. O. J. (2017). False alarm? A comprehensive reanalysis of “Evidence that psychopathology symptom networks have limited replicability” by Forbes, Wright, Markon, and Krueger (2017). Journal of Abnormal Psychology, 126, 989–999. DOI: https://doi.org/10.1037/abn0000306 

  7. Boutyline, A., & Vaisey, S. (2017). Belief network analysis: A relational approach to understanding the structure of attitudes. American Journal of Sociology, 122, 1371–1447. DOI: https://doi.org/10.1086/691274 

  8. Brandt, M. J., Sibley, C. G., & Osborne, D. (2019). What Is Central to Political Belief System Networks? Personality and Social Psychology Bulletin, 45, 1352–1364. DOI: https://doi.org/10.1177/0146167218824354 

  9. Carlsen, L., & Bruggemann, R. (2017). Fragile state index: Trends and developments. A partial order data analysis. Social Indicators Research, 133, 1–14. DOI: https://doi.org/10.1007/s11205-016-1353-y 

  10. Chen, P. G., & Goren, P. N. (2016). Operational ideology and party identification: A dynamic model of individual-level change in partisan and ideological predispositions. Political Research Quarterly, 69, 703–715. DOI: https://doi.org/10.1177/1065912916658551 

  11. Ciuk, D. J., & Yost, B. A. (2016). The effects of issue salience, elite influence, and policy content on public opinion. Political Communication, 33, 328–345. DOI: https://doi.org/10.1080/10584609.2015.1017629 

  12. Converse, P. E. (1964). The nature of belief systems in mass publics. In D. E. Apter (Ed.), Ideology and discontent (pp. 206–261). New York, NY: The Free Press. 

  13. Coppedge, M., Gerring, J., Knutsen, C. H., Lindberg, S. I., Teorell, J., Altman, D., … Ziblatt, D. (2019). V-Dem [Country-Year/Country-Date] Dataset v9. Varieties of Democracy (V-Dem) Project. DOI: https://doi.org/10.23696/vdemcy19 

  14. Costantini, G., & Perugini, M. (2016). The network of conscientiousness. Journal of Research in Personality, 65, 68–88. DOI: https://doi.org/10.1016/j.jrp.2016.10.003 

  15. Dalege, J., Borsboom, D., van Harreveld, F., van den Berg, H., Conner, M., & van der Maas, H. L. (2016). Toward a formalized account of attitudes: The Causal Attitude Network (CAN) model. Psychological Review, 123, 2–22. DOI: https://doi.org/10.1037/a0039802 

  16. Dalege, J., Borsboom, D., van Harreveld, F., & van der Maas, H. L. (2018). A network perspective on attitude strength: Testing the connectivity hypothesis. Social Psychological and Personality Science, 10, 746–756. DOI: https://doi.org/10.1177/1948550618781062 

  17. Epskamp, S., Borsboom, D., & Fried, E. I. (2018). Estimating psychological networks and their accuracy: A tutorial paper. Behavior Research Methods, 50, 195–212. DOI: https://doi.org/10.3758/s13428-017-0862-1 

  18. Epskamp, S., & Fried, E. I. (2018). A tutorial on regularized partial correlation networks. Psychological Methods, 23, 617–634. DOI: https://doi.org/10.1037/met0000167 

  19. Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5, 17–60. 

  20. Federico, C. M., & Malka, A. (2018). The contingent, contextual nature of the relationship between needs for security and certainty and political preferences: Evidence and implications. Political Psychology, 39, 3–48. DOI: https://doi.org/10.1111/pops.12477 

  21. Festinger, L. (1957). A theory of cognitive dissonance. Stanford, CA: Stanford University Press. 

  22. Fishman, N., & Davis, N. T. (2019). Change we can believe in: Structural and content dynamics within belief networks. DOI: https://doi.org/10.31235/osf.io/6r3ym 

  23. Forbes, M. K., Wright, A. G., Markon, K. E., & Krueger, R. F. (2017). Evidence that psychopathology symptom networks have limited replicability. Journal of Abnormal Psychology, 126, 969–988. DOI: https://doi.org/10.1037/abn0000276 

  24. Forbes, M. K., Wright, A. G., Markon, K. E., & Krueger, R. F. (in press). Quantifying the reliability and replicability of psychopathology network characteristics. Multivariate Behavioral Research. DOI: https://doi.org/10.1080/00273171.2019.1616526 

  25. Fragile States Index. (2006). FSI 2006. Retrieved from https://fragilestatesindex.org/excel/ 

  26. Fried, E. I., Eidhof, M. B., Palic, S., Costantini, G., Huisman-van Dijk, H. M., Bockting, C. L., … & Karstoft, K. I. (2018). Replicability and generalizability of posttraumatic stress disorder (PTSD) networks: a cross-cultural multisite study of PTSD symptoms in four trauma patient samples. Clinical Psychological Science, 6, 335–351. DOI: https://doi.org/10.1177/2167702617745092 

  27. Friedkin, N. E., Proskurnikov, A. V., Tempo, R., & Parsegov, S. E. (2016). Network science on belief system dynamics under logic constraints. Science, 354, 321–326. DOI: https://doi.org/10.1126/science.aag2624 

  28. Gallagher, M. (2019). Electoral systems. Retrieved from https://www.tcd.ie/Political_Science/people/michael_gallagher/ElSystems/index.php 

  29. Gallagher, M., & Mitchell, P. (2008). The politics of electoral systems. Oxford University Press. DOI: https://doi.org/10.1093/0199257566.001.0001 

  30. Gawronski, B., Brochu, P. M., Sritharan, R., & Strack, F. (2012). Cognitive consistency in prejudice-related belief systems: Integrating old-fashioned, modern, aversive, and implicit forms of prejudice. In B. Gawronski & F. Strack (Eds.), Cognitive consistency: A fundamental principle in social cognition (pp. 369–389). New York, NY, US: Guilford Press. 

  31. Gerring, J. (1997). Ideology: A definitional analysis. Political Research Quarterly, 50, 957–994. DOI: https://doi.org/10.1177/106591299705000412 

  32. Haslbeck, J. M., & Waldorp, L. J. (2018). How well do network models predict observations? On the importance of predictability in network models. Behavior Research Methods, 50, 853–861. DOI: https://doi.org/10.3758/s13428-017-0910-x 

  33. Inglehart, R., Haerpfer, C., Moreno, A., Welzel, C., Kizilova, K., Diez-Medrano, J., Lagos, M., Norris, P., Ponarin, E., & Puranen, B., et al. (2014). World Values Survey: All Rounds – Country-Pooled Datafile (Version v20180912) [Data set]. Retrieved from http://www.worldvaluessurvey.org/WVSDocumentationWVL.jsp 

  34. Johnston, C. D., & Ollerenshaw, T. (2020). How different are cultural and economic ideology? Current Opinion in Behavioral Sciences, 34, 94–101. DOI: https://doi.org/10.1016/j.cobeha.2020.01.008 

  35. Jones, P. J., Williams, D. R., & McNally, R. J. (2019). Sampling variability is not nonreplication: A Bayesian reanalysis of Forbes, Wright, Markon, & Krueger. DOI: https://doi.org/10.31234/osf.io/egwfj 

  36. Kalmoe, N. P. (in press). Uses and abuses of ideology in political psychology. Political Psychology. 

  37. Kinder, D. R., & Kalmoe, N. P. (2017). Neither liberal nor conservative: Ideological innocence in the American public. University of Chicago Press. 

  38. Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: tests in linear mixed effects models. Journal of Statistical Software, 82, 1–26. DOI: https://doi.org/10.18637/jss.v082.i13 

  39. Laajaj, R., Macours, K., Hernandez, D. A. P., Arias, O., Gosling, S. D., Potter, J., … & Vakis, R. (2019). Challenges to capture the big five personality traits in non-WEIRD populations. Science Advances, 5, eaaw5226. DOI: https://doi.org/10.1126/sciadv.aaw5226 

  40. Laakso, M., & Taagepera, R. (1979). “Effective” number of parties: a measure with application to West Europe. Comparative Political Studies, 12, 3–27. DOI: https://doi.org/10.1177/001041407901200101 

  41. Malka, A., Lelkes, Y., & Soto, C. J. (2019). Are cultural and economic conservatism positively correlated? A large-scale cross-national test. British Journal of Political Science, 49, 1045–1069. DOI: https://doi.org/10.1017/S0007123417000072 

  42. Martin, J. L. (2000). The relation of aggregate statistics on beliefs to culture and cognition. Poetics, 28, 5–20. DOI: https://doi.org/10.1016/S0304-422X(00)00010-3 

  43. Mohammadi, R., & Wit, E. C. (2019). BDgraph: An R Package for Bayesian Structure Learning in Graphical Models. Journal of Statistical Software, 89, 1–30. DOI: https://doi.org/10.18637/jss.v089.i03 

  44. Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349, aac4716. DOI: https://doi.org/10.1126/science.aac4716 

  45. Opsahl, T., Agneessens, F., & Skvoretz, J. (2010). Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks, 32, 245–251. DOI: https://doi.org/10.1016/j.socnet.2010.03.006 

  46. Randles, D., Inzlicht, M., Proulx, T., Tullett, A. M., & Heine, S. J. (2015). Is dissonance reduction a special case of fluid compensation? Evidence that dissonant cognitions cause compensatory affirmation and abstraction. Journal of Personality and Social Psychology, 108, 697–710. DOI: https://doi.org/10.1037/a0038933 

  47. Robinaugh, D. J., Millner, A. J., & McNally, R. J. (2016). Identifying highly influential nodes in the complicated grief network. Journal of Abnormal Psychology, 125, 747–757. DOI: https://doi.org/10.1037/abn0000181 

  48. Satherley, N., Yogeeswaran, K., Osborne, D., & Sibley, C. G. (2018). If they say “yes,” we say “no”: Partisan cues increase polarization over national symbols. Psychological Science, 29, 1996–2009. DOI: https://doi.org/10.1177/0956797618805420 

  49. Sayans-Jiménez, P., van Harreveld, F., Dalege, J., & Rojas Tejada, A. J. (2019). Investigating stereotype structure with empirical network models. European Journal of Social Psychology, 49, 604–621. DOI: https://doi.org/10.1002/ejsp.2505 

  50. Turner-Zwinkels, F. M., Sibley, C. G., Johnson, B. B., & Brandt, M. J. (in press). Conservatives’ moral foundations are more densely connected than liberals’ moral foundations. Personality and Social Psychology Bulletin. 

  51. Van de Vyver, J., Houston, D. M., Abrams, D., & Vasiljevic, M. (2016). Boosting belligerence: How the July 7, 2005, London bombings affected liberals’ moral foundations and prejudice. Psychological Science, 27, 169–177. DOI: https://doi.org/10.1177/0956797615615584 

  52. Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511815478 

  53. Williams, D. R. (2018, September 20). Bayesian estimation for Gaussian graphical models: Structure learning, predictability, and network comparisons. DOI: https://doi.org/10.31234/osf.io/x8dpr 

  54. Williams, D. R., & Mulder, J. (2019, June 14). BGGM: A R package for Bayesian Gaussian Graphical Models. DOI: https://doi.org/10.31234/osf.io/3b5hf 

  55. Williams, D. R., Rast, P., Pericchi, L., & Mulder, J. (2019, February 26). Comparing Gaussian Graphical Models with the posterior predictive distribution and Bayesian model selection. DOI: https://doi.org/10.31234/osf.io/yt386 

  56. Zaller, J. R. (1992). The nature and origins of mass opinion. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511818691