Forum:2017 Community Survey Results

I totally forgot to upload these results. Discuss freely! 22:17, 15 January 2018 (UTC)
 * It's a redlink here. Wilder Bicycle 22:27, 15 January 2018 (UTC)

= Raw Results =
Check 'em out here:


 * Google's autogenerated results
 * Google Sheet of results

= Main conclusions =
RationalWiki editors are:


 * younger than the average internet user (median age 25)
 * very US-centric (over 60% Americans)
 * very non-cisgendered (14% trans or nonbinary)
 * very non-heterosexual (36% bi-, homo-, pan-, a-, or other-sexual)
 * whiter than the average American (75% white only)
 * very non-religious (fully 75% are irreligious, though 25% are religious)
 * very anti-authority

10:26, 16 January 2018 (UTC)

= Demographics =

Age
Editors are younger than noneditors, but not by much. The median age is about 26 and the mean age is about 31. 22:31, 15 January 2018 (UTC)

Country


Most respondents live in the United States or other English-speaking countries. 22:42, 15 January 2018 (UTC)



Over 60% of editors live in the United States, dwarfing all other countries. In contrast, noneditors live in a broader spread of countries. 22:42, 15 January 2018 (UTC)



As a proportion of their country's population, however, other English-speaking countries outperformed the United States among overall respondents. 22:42, 15 January 2018 (UTC)



However, again, editors were more likely to come from the United States (excluding Georgia, New Zealand, and Costa Rica, which all had N=1). In contrast, the top-10 noneditor countries had a wider distribution over English-speaking countries and European countries. 22:42, 15 January 2018 (UTC)

Education


Editors were most likely to report having less than a high school diploma, a high school diploma, a bachelor's degree, and a master's degree, in that order. Noneditors were less likely than editors to report having less than a high school diploma, and were more likely to report having a high school diploma or bachelor's degree.

In the future, I'd love to have a question that better covers this topic without being so Americentric. Any suggestions? 22:54, 15 January 2018 (UTC)


 * The answer is in the chart: change the question to number of years of formal education with hints to some common national equivalencies. Number of years is probably more general even in the US, as some people get multiple master's degrees, go directly from bachelor's to PhD, or in rare cases go directly to law school from high school. Bongolian (talk) 07:38, 16 January 2018 (UTC)


 * Dang. Right under my nose, this whole time! It might be interesting to split "education" into a "number of years" and "degree" question, which is what this tried to combine. 07:44, 16 January 2018 (UTC)

Gender


About 75% of editors were male (cis), 8% female (cis), 8% female (trans), 5% nonbinary, 2.5% male (trans), and 1% another gender.

Non-editors were more likely to be male (cis) (80%) or female (cis) (over 10%) and less likely to be trans or nonbinary than editors. 23:02, 15 January 2018 (UTC)

Sexuality


About 65% of editors were heterosexual, 15% bisexual, 8% homosexual, 5% asexual, 5% pansexual, and 4% another sexuality.

Editors and noneditors were about equally likely to report any given sexuality except heterosexuality (only 65% among editors, 70% among noneditors) and bisexuality (only 10% among noneditors, 15% among editors). 23:02, 15 January 2018 (UTC)

Gender or Sexual Minority
          sexuality Cisgender Not cisgender
1:     Heterosexual 0.6071429    0.02380952
2: Not heterosexual 0.2500000    0.11904762

About 40% of editors were gender or sexual minorities (GSMs).

          sexuality Cisgender Not cisgender
1:     Heterosexual 0.6954887   0.007518797
2: Not heterosexual 0.2142857   0.082706767

In comparison, only about 30% of noneditors were gender or sexual minorities (GSMs).

For comparison, somewhere between 1% (very low) and 20% (very high) of US adults are GSMs. A reasonable number is 10%. 23:02, 15 January 2018 (UTC)
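For anyone who wants to check the 40% and 30% figures, here is a minimal sketch in base R. It assumes that a respondent counts as a GSM unless they fall in the cisgender-and-heterosexual cell; the cell proportions are copied from the two tables above, and the helper name `gsm_share` is made up for illustration.

```r
# Cell proportions copied from the editor and noneditor tables.
# Rows: sexuality; columns: gender. matrix() fills column by column.
dn <- list(c("Heterosexual", "Not heterosexual"),
           c("Cisgender", "Not cisgender"))
editors <- matrix(c(0.6071429, 0.2500000, 0.02380952, 0.11904762),
                  nrow = 2, dimnames = dn)
noneditors <- matrix(c(0.6954887, 0.2142857, 0.007518797, 0.082706767),
                     nrow = 2, dimnames = dn)

# Assumption: GSM = everyone outside the cisgender + heterosexual cell.
gsm_share <- function(tab) {
  1 - tab["Heterosexual", "Cisgender"]
}

gsm_editors <- gsm_share(editors)
gsm_noneditors <- gsm_share(noneditors)
print(round(c(editors = gsm_editors, noneditors = gsm_noneditors), 3))
```

Summing the other three cells of each table gives the same shares, since each table's proportions add up to 1.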

Race


About 75% of editors were White only, 8% multiracial, 6% another race, and less than 5% Hispanic, Asian, or Black. Noneditors were slightly more likely to be white (80%). In comparison, the United States (where 60% of editors reside) is about 65% White only.

In the future, are there less-Americentric terms to use for "race"? 23:08, 15 January 2018 (UTC)


 * As I recall, I was impressed by how inclusive and non-US centric that part was. After all, you have to include the common American terms as well as common British, Australian and Canadian ones. I suggest having separate options for African American, Black British, Afro-Caribbean, Black African and Black other. I'm sure that an increasing number of people in the UK would choose to identify as Black British, rather than Afro-Caribbean or Black African, in the future. Spud (talk) 03:50, 17 January 2018 (UTC)

= Religion =

Religious identity & denomination


Here are the words people used to describe their religious identity, sized in proportion to the square root of frequency (if a word is 2x as large, it was 4x as common). The most common word by far was "none", followed by "atheist" or "atheism", followed closely by "agnostic", in turn followed closely by "Christianity" or "Christian". After a sharp drop in frequency, "humanism" and "athiest" (sic) were the next most common words, after which no words clearly dominated.
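The size rule in the parenthetical can be sketched in a few lines of base R. This assumes displayed size is proportional to the square root of a word's frequency; `word_size` is a hypothetical helper and the counts are placeholders, not survey data.

```r
# Hypothetical helper: displayed size proportional to sqrt(frequency).
word_size <- function(freq, base_size = 1) {
  base_size * sqrt(freq)
}

# Placeholder counts chosen only so that one word is 4x as common as the other.
freq <- c(rare = 5, common = 20)
sizes <- word_size(freq)

# How much larger the 4x-as-common word is drawn.
size_ratio <- unname(sizes["common"] / sizes["rare"])
print(size_ratio)
```

Under this rule a word that is 4x as common is drawn 2x as large, so visual area rather than height tracks frequency.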



When forced to use the top world religions as defined by Pew (while replacing "unaffiliated" with "irreligious" to more clearly indicate nonreligion) (and with specific subdivisions for Christianity, Islam, "irreligious"), respondents replicated the results of the wordcloud. 50.6% of editors reported being a part of irreligious atheism, 19.2% irreligious agnosticism, 5.5% Protestant Christianity, 5.5% other Christianity, 4.2% Catholic Christianity, 4.2% other irreligion, 4.2% Buddhism, 4.2% other religion, and 2.7% Judaism. Noneditors were somewhat more likely to be Protestant or Muslim and less likely to be other Christian.



Irreligion dominates on RationalWiki, with about 75% of both editors and noneditors identifying as one of the three irreligious denominations ("irreligious atheism", "irreligious agnosticism", "other irreligion"). For comparison, just 5% of Americans and between 0-20% of Europeans identify as "atheist", and just 25% of Americans and between 0-60% of Europeans identify as one of "atheist", "agnostic", or "nonreligious". 07:14, 16 January 2018 (UTC)

Centrality of Religiosity Scale (CRS) scores
The Centrality of Religiosity Scale (CRS) attempts to measure religiosity. The CRS appears to be a highly-valid measure of religiosity across several (mostly-European) cultures. Usually, the CRS-5 asks these five questions:


 * 1) (Intellect) How often do you think about religious issues?
 * 2) (Ideology) To what extent do you believe that God or something divine exists?
 * 3) (Public practice) How often do you take part in religious services?
 * 4) (Private practice) How often do you pray? 04b: How often do you meditate?
 * 5) (Experience) How often do you experience situations in which you have the feeling that God or something divine intervenes in your life?

However, because RationalWiki is an insular community dedicated to exploring pseudoscience and authoritarianism, both of which are frequently tied to religion, the "intellect" question seemed likely to produce many false positives. (For example, I am sure that Matt Dillahunty often thinks about religious issues, despite being staunchly atheist.) As such, I dropped it and added one question from the CRS-10 to create the following five questions:


 * 1) (Ideology) To what extent do you believe that God or something divine exists?
 * 2) (Ideology) To what extent do you believe in an afterlife — e.g. immortality of the soul, resurrection of the dead, or reincarnation?
 * 3) (Public practice) How often do you take part in religious services?
 * 4) (Private practice) How often do you pray? 04b: How often do you meditate?
 * 5) (Experience) How often do you experience situations in which you have the feeling that God or something divine intervenes in your life?

I have graphed each of these questions for "Nonreligious" and "Religious" responses (as defined by one's stated religious denomination), as well as each of the three irreligious responses. (I chose not to break down by religion because sample size was too small for most religions.) 07:28, 16 January 2018 (UTC)

God


Only about 60% of nonreligious respondents believe "not at all" that God or something divine exists -- while about 10% of nonreligious respondents believe "moderately" to "very much so" that God or something divine exists. In contrast, 40% of religious respondents believe "very much so" that God or something divine exists -- while fully 20% of religious respondents believe "not at all" that God or something divine exists.



About 80% of atheists believe "not at all" that God or something divine exists, leaving nearly 20% of atheists in the "not very much" or "moderately" categories. In contrast, just below 35% of agnostics believe "not at all" that God or something divine exists, although the majority (over 50%) believe "not very much" that God or something divine exists. Responses from other irreligious groups were more likely than agnostics (but less likely than atheists) to believe "not at all" that God or something divine exists, while at the same time many more responses -- about 30% -- fell into the "moderately" to "very much so" categories.

This lines up with my personal experience that "atheist" tends to just mean "strong rejection of the existence of the divine", while "agnostic" just means "less strong rejection of the existence of the divine". 07:42, 16 January 2018 (UTC)

Afterlife


As before, only about 60% of nonreligious respondents believe "not at all" that an afterlife exists. About 5% fewer religious respondents believe "very much so" that an afterlife exists as compared to God or something divine (about 35% versus 40%).



A similar pattern as before emerges: atheists are very skeptical of an afterlife, agnostics less so, and other irreligion both more skeptical of an afterlife and more likely to strongly believe in it. 07:50, 16 January 2018 (UTC)

Religious services


Almost 70% of nonreligious respondents "never" attend religious services. (Nonreligious respondents appear more likely to believe in religious tenets than to participate in religious practices.) Interestingly, almost 50% of religious respondents "never" attend religious services or attend them less often than once per year!



Unlike before, agnostics and atheists exhibit the same broad pattern of religious attendance (though agnostics are about 10% more likely to attend religious services at all, rather than "never"). Respondents in another irreligious denomination were much more likely to say that they attend religious services. 07:56, 16 January 2018 (UTC)

Prayer


More so than in previous responses, 80% of nonreligious respondents say that they "never" pray. (This is 10 percentage points more than those who "never" attend religious services and 20 more than those who believe "not at all" in God or something divine or an afterlife.) The most common response among religious respondents was "never" at over 25%, with the next most popular being "more than once a day" at almost 20%.



Again, agnostics and atheists exhibit the same broad pattern of prayer. However, atheists were much more likely to "never" pray (96%) than agnostics (80%). [Aside: I'm not sure why the graph doesn't display "atheists" up to 96%. Blame the ggplot2 gods.] As with religious services, other irreligious denomination responses were more likely to pray frequently, with just 66.7% saying that they "never" prayed. 08:06, 16 January 2018 (UTC)

Divine situations


80% of nonreligious respondents say they "never" experience divine situations. In contrast, just 35% of religious respondents agree. Fully 10% of religious respondents "very often" feel divine situations in their life.



In a fairly linear fashion, atheists were less likely to experience divine situations than agnostics, while agnostics were less likely to experience divine situations than other irreligious. 08:11, 16 January 2018 (UTC)

CRS scores


A CRS score of 1.0 to 2.0 is defined as "not-religious", 2.2 to 3.8 as "religious", and 4.0 to 5.0 as "highly-religious". As you would expect, nonreligious respondents were much more likely to be "not-religious" (91%) than religious. Interestingly, not a single nonreligious respondent fell into the "highly-religious" category. Religious respondents were about equally likely to fall into any of the three categories.
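Here is a rough sketch of how the three categories can be derived, in base R. It assumes that each of the five items is coded 1 to 5 and that the CRS score is the mean of the five items (which is why scores move in steps of 0.2 and nothing falls between 2.0 and 2.2). The function names and the example response patterns are hypothetical, not taken from the survey data.

```r
# Assumption: five items, each coded 1 (lowest) to 5 (highest);
# the CRS score is their mean.
crs_score <- function(items) {
  stopifnot(length(items) == 5, all(items >= 1 & items <= 5))
  mean(items)
}

# Category cutoffs as stated in the text: 1.0-2.0, 2.2-3.8, 4.0-5.0.
# Break points sit in the gaps (2.1 and 3.9) to avoid floating-point edge cases.
crs_category <- function(score) {
  cut(score,
      breaks = c(1, 2.1, 3.9, 5),
      labels = c("not-religious", "religious", "highly-religious"),
      include.lowest = TRUE)
}

# Hypothetical response patterns, not survey data.
low <- crs_score(c(1, 1, 1, 1, 1))
mixed <- crs_score(c(3, 3, 2, 2, 3))
high <- crs_score(c(5, 5, 4, 4, 5))
categories <- as.character(crs_category(c(low, mixed, high)))
print(categories)
```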



As the previous graphs suggested, atheists (98% not-religious) tended to be less religious than agnostics (80% not-religious), and agnostics less than other irreligious respondents (66.7% not-religious). Interestingly, the rate of "religious" was not much lower among other irreligious respondents (33.3%) than among religious respondents (36%).



These observations are supported by violinplots of the data. While most of the religious denominations encompass a roughly-equal-in-size range of religiosity from 1 to 5, irreligious atheism is highly clustered towards 1 and agnosticism towards about 1.5, while other irreligion visually appears more similar to a low-religiosity religious distribution (roughly rectangular; compare Judaism) than to the downward-skewed distributions of atheism or agnosticism.



Indeed, the difference in CRS scores between atheism and agnosticism is outside their standard errors of the mean. However, agnosticism and other irreligion are not distinguishable in this graph. Interestingly, the only religions within the SEM of other irreligion were Sunni Islam (N = 2 and unreliable), Buddhism, and Judaism. 09:06, 16 January 2018 (UTC)

= Politics =

Political + Party Identity


Here are the words people used to describe their political identity, sized in proportion to the square root of frequency. Only words mentioned 2 times or more are included.



Here are the words people used to describe their party identity, sized in proportion to the square root of frequency. Only words mentioned 2 times or more are included. 09:19, 16 January 2018 (UTC)

Authoritarianism-Conservatism-Traditionalism (ACT) scale
The Authoritarianism-Conservatism-Traditionalism Model is based on Bob Altemeyer's theory of Right-Wing Authoritarianism. Both this newer model and the RWA model have tons of empirical research backing up their validity. Recently, the theory has gotten attention as a predictor for Trump votes. In this survey, I asked 18 questions from the reduced ACT questionnaire, 9 of them regular-coded and 9 of them reverse-coded (R):

Conservatism ("Authoritarian Submission"):


 * 1) It’s great that many young people today are prepared to defy authority (R).
 * 2) What our country needs most is discipline, with everyone following our leaders in unity.
 * 3) Students at high schools and at university must be encouraged to challenge, criticize, and confront established authorities (R).
 * 4) Obedience and respect for authority are the most important virtues children should learn.
 * 5) Our country will be great if we show respect for authority and obey our leaders.
 * 6) People should be ready to protest against and challenge laws they don’t agree with (R).

Traditionalism:


 * 1) Nobody should stick to the “straight and narrow.” Instead people should break loose and try out lots of different ideas and experiences (R).
 * 2) The “old-fashioned ways” and “old-fashioned values” still show the best way to live.
 * 3) God’s laws about abortion, pornography, and marriage must be strictly followed before it is too late.
 * 4) There is absolutely nothing wrong with nudist camps (R).
 * 5) This country will flourish if young people stop experimenting with drugs, alcohol, and sex, and pay more attention to family values.
 * 6) There is nothing wrong with premarital sexual intercourse (R).

Authoritarianism ("Authoritarian Aggression"):


 * 1) Strong, tough government will harm not help our country (R).
 * 2) Being kind to loafers or criminals will only encourage them to take advantage of your weakness, so it’s best to use a firm, tough hand when dealing with them.
 * 3) Our society does NOT need tougher government and stricter laws (R).
 * 4) The facts on crime and the recent public disorders show we have to crack down harder on troublemakers, if we are going to preserve law and order.
 * 5) Our prisons are a shocking disgrace. Criminals are unfortunate people who deserve much better care, instead of so much punishment (R).
 * 6) The way things are going in this country, it’s going to take a lot of “strong medicine” to straighten out the troublemakers, criminals, and perverts.

Unfortunately, I didn't ask any other political questions along which we could examine these results -- as such, I have only displayed the ACT scores alongside other measures, such as religion. 09:28, 16 January 2018 (UTC)

Individual questions


The above bargraph compiles all 18 questions into one. The first three questions are the reverse-coded conservatism questions; the second three are the regular-coded conservatism questions; and so on. There aren't too many interesting trends on particular questions, with two exceptions. For the question about "God's laws about abortion etc. should be followed", almost 80% chose "strongly disagree", as opposed to the more spread-out distributions for other questions. For the first question about "strong, tough government", however, the distribution was more spread out than for most questions. To me, this suggests that RationalWiki users are more unified on questions of secularism than on questions about the role of government. This is supported by the generally tighter distributions for questions 7-12 (traditionalism) than for questions 13-18 (authoritarianism). 09:46, 16 January 2018 (UTC)

T scores, C scores, and A scores
For each score, I summed the response to each question (1 to 9 for normal-coded questions, 9 to 1 for reverse-coded questions) and normalized the result to a -1 to 1 range.
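To make the scoring concrete, here is a minimal sketch in base R. It assumes the normalization is a plain linear rescaling of the summed score from its possible minimum and maximum onto -1 to 1, and it uses the Conservatism subscale's pattern of reverse-coded items as listed above. The function names and the example respondents are hypothetical.

```r
# Reverse-code an item on a 1-9 scale: 1 becomes 9, 9 becomes 1.
reverse_code <- function(x) {
  10 - x
}

# Sum the (re)coded items and rescale linearly to [-1, 1].
# Assumption: linear min-max rescaling of the sum.
subscale_score <- function(responses, reversed) {
  stopifnot(length(responses) == length(reversed),
            all(responses >= 1 & responses <= 9))
  coded <- ifelse(reversed, reverse_code(responses), responses)
  n <- length(coded)
  lo <- 1 * n  # lowest possible sum
  hi <- 9 * n  # highest possible sum
  2 * (sum(coded) - lo) / (hi - lo) - 1
}

# Conservatism items 1, 3 and 6 are reverse-coded (R).
reversed <- c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)

# Hypothetical respondents, not survey data.
anti_authority <- subscale_score(c(9, 1, 9, 1, 1, 9), reversed)
neutral <- subscale_score(rep(5, 6), reversed)
pro_authority <- subscale_score(c(1, 9, 1, 9, 9, 1), reversed)
print(c(anti_authority, neutral, pro_authority))
```

With six items the raw sum runs from 6 to 54, so a respondent who answers 5 everywhere lands exactly at 0.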









RationalWiki users scored lowest on Traditionalism, next lowest on Conservatism, and highest of the three on Authoritarianism. In addition, the Traditionalism scores are more tightly clustered toward the bottom than the Conservatism scores, and the same is true of Conservatism relative to Authoritarianism. This suggests that RationalWiki is more strongly anti-tradition than it is anti-obeying-authority (and more in agreement about the former), and is more strongly anti-tradition or anti-obeying-authority than it is anti-enforcing-authority.

Of course, all of the above assumes that we should expect equivalent distributions for each of these questions. If you think otherwise, then you're free to draw your own conclusions. 10:00, 16 January 2018 (UTC)

ACT scores and religion


Unsurprisingly, lower CRS scores (indicating lower religiosity) and lower ACT scores (indicating lower "respect for power") were significantly correlated. (The regression was just OLS regression, so take with a heap of salt.) An increase from minimum to maximum CRS religiosity (1 to 5) would result in an increase in normalized ACT score of 0.508 (approximately half the range of the normalized ACT scale)! However, the r^2 of this correlation was just 0.191, indicating that differences in CRS could explain only about 20% of the variance in ACT scores.
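The quoted figures can be unpacked with a little arithmetic, sketched here in base R. This only reuses the two numbers given in the text (0.508 and 0.191) and assumes a simple one-predictor OLS fit, in which the correlation coefficient is the square root of r^2 with the sign of the slope. The variable names are made up for illustration.

```r
# Figures quoted in the text.
act_change <- 0.508  # predicted ACT change going from CRS 1 to CRS 5
r_squared <- 0.191

# Implied slope: ACT change per one-point increase in CRS.
crs_min <- 1
crs_max <- 5
slope <- act_change / (crs_max - crs_min)

# Assumption: single-predictor OLS, so r = sign(slope) * sqrt(r^2).
r <- sign(slope) * sqrt(r_squared)

# Share of ACT variance that CRS leaves unexplained.
unexplained <- 1 - r_squared

print(round(c(slope = slope, r = r, unexplained = unexplained), 3))
```

So each extra CRS point goes with roughly 0.13 more on the normalized ACT scale, while about 80% of the variation in ACT scores is left to other factors.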



Interestingly, Buddhists and Jews had lower median ACT scores than the irreligious, who were lower again than other religions. 10:14, 16 January 2018 (UTC)

ACT scores and age


Interestingly, lower ACT scores were correlated with higher age -- which is the opposite of what one would expect, given that youth are more likely to believe in more liberal/leftist ideologies. However, this relationship has an r^2 of just 0.5%, making it a worthless predictor on its own. 10:17, 16 January 2018 (UTC)

Economic scale
Does anyone know of other widely-used and empirically-validated scales to use, especially about economic left-right issues? 09:28, 16 January 2018 (UTC)

= RationalWiki =

Edits


Unsurprisingly, editors were much more likely to state that they edited RationalWiki more frequently. The most common response among editors was "once or more per week", while it was "never" among noneditors. However, over 1 in 10 editors say that they edit RationalWiki more than daily, while 2 in 10 say that they edit RationalWiki more than yearly. This is a huge variance in editing frequencies. 22:42, 15 January 2018 (UTC)

Views


Unsurprisingly again, editors were more likely to state that they visited RationalWiki more frequently. The most common response among editors was "once or more per day" (about one order of magnitude higher than the most-common response for edits) while the most common response among non-editors was "once or more per week". 22:42, 15 January 2018 (UTC)

= Comparison Statistics =
Feel free to compare the above statistics with those of YouTube's premier "skeptics":

AGE (0:30): 17 or under 25.8%; 18-25 50.6%; 26-35 17.7%; 36 or over 6.0%

GENDER (1:10): Female (cis) 17.9%; Female (trans) 1.4%; Male (cis) 78.2%; Male (trans) 1.0%; Nonbinary 0.8%; Other 0.7%

SEXUALITY (1:50): Heterosexual 70.2%; Homosexual 5.8%; Bisexual/Pansexual 21.5%; Other 2.6%

COUNTRY (2:30): United States 50.0%; Europe 18.4%; United Kingdom 11.0%; Canada 8.3%; Australia 5.1%; South America 1.9%; others below 1%

EDUCATION (11:40): Some high school / still a student 24.4%; High school or equivalent 16.3%; Some College/University or equivalent 34.0%; College diploma 6.7%; Bachelor's degree 13.6%; Master's degree 3.7%; PHD, Doctorate or better 1.3%

RELIGIOUS DENOMINATION (6:50): Atheist 50.0%; Nonreligious 30.6%; Christian 8.3%; Other 5.9%; Catholic 3.6%; others below 1%

"POLITICAL LEANINGS" (9:40): Progressive / Left 5.6%; Liberal / Left 28.6%; Somewhere In The Center 44.2%; Conservative / Right 11.2%; Alt-right 1.4%; Not very political 9.1%

AGE (0:10): little babby under 14 years 2.6%; edgy teenaged asshole 14-19 45.6%; oh shit i have to be an adult now 20-30 42.6%; you are slowly dying 31-40 6.7%; why are you on the internet grandpa? over 40 2.5%

GENDER: Female 13.5%; Male 85.3%; Nonbinary 1.1%

"POLITICAL LEANINGS" (3:20): left 14.0%; right 11.8%; libertarian 11.5%; classical liberal disappointed 14.5%; alt-right MAGA 3.1%; moderate/center 24.33%; I don't care about politics just give me the memes mommy 20.9%

AGE (0:50): 13-17 male about 3%, female negligible; 18-24 male about 30%, female about 3%; 25-34 male about 40%, female about 3%; 35-44 male about 15%, female negligible; 45-54 male about 3%, female negligible; 55-64 male about 1%, female negligible; 65+ male about 1%, female negligible

GENDER (0:50): Male 93.8%; Female 6.2%

COUNTRY (views) (1:10): United States 33,500,000; United Kingdom 10,500,000; Canada 5,800,000; Australia 3,500,000; Germany 2,700,000; Sweden 2,400,000; Netherlands 1,300,000; Finland 1,200,000; Norway 900,000; Denmark 870,000

POLITICAL COMPASS TEST QUADRANT (2:00): Libertarian & Left 55%; Libertarian & Right 35%; Authoritarian & Left 4%; Authoritarian & Right 7%

RELIGIOUS DENOMINATION (2:20): Atheist 57%; Agnostic 15%; Catholic 6%; Protestant 6%; Other Christian 5%; Non-Affiliated Deist 6%; Muslim 3%; Jewish 1%; others less than 1%

22:42, 15 January 2018 (UTC)

= Code =

library(data.table)
library(ggplot2)
library(treemap)
library(cowplot)
library(gridExtra)
library(grid)
library(tm)
library(wordcloud)

setwd("C:/Users/.../Downloads")
rw_dt = fread("RationalWiki Community Survey 2017 (Responses) - Form Responses 1.csv")

# cleaning: ASCII-only, lowercase, underscored column names, cut to 16 characters
colnames(rw_dt) = substr(tolower(gsub('"', "", gsub(" ", "_", iconv(colnames(rw_dt), "latin1", "ASCII", sub="")))), 1, 16)

# remove trolls
table(rw_dt[["country"]])
rw_dt = rw_dt[!(country %in% c("Holy See (Vatican City)", "Korea, North", "Bouvet Island")), ]
# remove trolls
table(rw_dt[["age"]])
rw_dt = rw_dt[!(age == 150), ]
# remove trolls
table(rw_dt[["religious_identi"]])
rw_dt = rw_dt[!(religious_identi == "Stalinism"), ]

# convert timestamp to posix
rw_dt[, timestamp := as.POSIXct(timestamp, tz = "EST", format = "%m/%d/%Y %H:%M:%S")]

# recode "" as "No response"
rw_dt[i_am == "", i_am := "No response"][grep("bunch", i_am), i_am := "bunch of numbers"]

# recode registered editors and sysops to "editor"
# recode bunch of numbers and No response to "not an editor"
from_vec = c("a registered editor", "a sysop", "bunch of numbers", "No response")
to_vec = c("editor", "editor", "not an editor", "not an editor")
rw_dt[.(i_am = from_vec, to = to_vec), on = "i_am", i_am := i.to]

# timestamps
ggplot(rw_dt, aes(x = timestamp, fill = i_am)) +
  stat_ecdf(aes(ymin = 0, ymax = ..y..), geom = "ribbon") +
  stat_ecdf() +
  coord_cartesian(expand = 0) +
  facet_wrap( ~ i_am) +
  theme_minimal() +
  theme(legend.position="none", axis.text.x = element_text(hjust = 0, vjust = 0.5, angle = 90)) +
  scale_x_datetime(date_breaks = "2 days", date_labels = "%D", limits = c(as.POSIXct("2017-11-10"),as.POSIXct("2017-11-26"))) +
  labs(title = "Cumulative density of responses over time", x = "date", y = "proportion of responses by category")
ggsave("timestamp_cdfplot.png", width = 8, height = 5)


# age
# # density plot
# ggplot(rw_dt, aes(x = age, fill = i_am)) +
#   geom_density() +
#   coord_cartesian(expand = 0) +
#   scale_x_continuous(breaks = seq(10,78,4), limits = c(10,78)) +
#   facet_wrap( ~ i_am, scales = "free") +
#   theme(legend.position = "none")
# ggsave("age_density.png")

# violinplot
ggplot(rw_dt, aes(x = "", y = age, fill = i_am)) +
  geom_violin() +
  geom_boxplot(width = 0.2, fill = "white") +
  facet_wrap( ~ i_am) +
  coord_cartesian(expand = 0) +
  scale_y_continuous(breaks = seq(10,78,4), limits = c(10,78)) +
  geom_hline(data = copy(rw_dt)[, mean(age), by = i_am], aes(yintercept = V1)) +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(title = "Distribution of respondent age with mean horizontal line", x = "", y = "age in years")
ggsave("age_violinplot.png", width = 8, height = 5)

# country
# subsetted datatable
c_dt = copy(rw_dt)[, .N, by = c("country")][, nproportion := N/sum(N)][country != "", ]

# treemap
png("country_treemap.png", width = 8, height = 5, res = 250, units = "in")
treemap(c_dt[, country_n := paste0(country,"\n", N)], "country_n", "N",
        title = "Number of responses by country (treemap)", force.print.labels = TRUE)
dev.off()

# bargraph
c_dt = copy(rw_dt)[, .N, by = c("country", "i_am")][, nproportion := N/sum(N), by = c("i_am")][country != "", ]
c_dt_e = c_dt[i_am == "editor", ]
setorderv(c_dt_e, "N", order = -1L)
c_dt_ne = c_dt[i_am == "not an editor", ]
setorderv(c_dt_ne, "N", order = -1L)
ggplot(c_dt_e, aes(x = reorder(country, -N), y = nproportion)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90)) +
  coord_cartesian(expand = 0) +
  labs(title = "Distribution of editor respondent countries", x = "Country", y = "Proportion of responses")
ggsave("country_barplot_editor.png", width = 8, height = 5)
ggplot(c_dt_ne, aes(x = reorder(country, -N), y = nproportion)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90)) +
  coord_cartesian(expand = 0) +
  labs(title = "Distribution of non-editor respondent countries", x = "Country", y = "Proportion of responses")
ggsave("country_barplot_noteditor.png", width = 8, height = 5)

# country proportions (pop = country population in millions)
pop_dt = data.table(country = c("United States", "United Kingdom", "Canada", "Australia", "Germany", "Brazil",
                                "Sweden", "South Africa", "Netherlands", "France", "Georgia", "India",
                                "Spain", "Taiwan", "Costa Rica", "Belgium", "New Zealand"),
                    pop = c(320.9, 65.1, 35.9, 23.8, 81.7, 206,
                            9.8, 55, 16.9, 66.6, 3.7, 1309,
                            46.6, 23.4, 4.8, 11.3, 4.6))
tc_dt = merge(copy(c_dt)[, N := sum(N), by = "country"][i_am == "editor", ][1:10][, country_n := paste0(country, "\n", N)],
              pop_dt, by = "country", all.x = TRUE)[, Nperpop := N/pop]
# the plots use a country_n label, so build it for the per-group tables too
tc_dt_e = merge(c_dt_e, pop_dt, by = "country", all.x = TRUE)[, Nperpop := N/pop][, country_n := paste0(country, "\n", N)]
tc_dt_ne = merge(c_dt_ne[1:10], pop_dt, by = "country", all.x = TRUE)[, Nperpop := N/pop][, country_n := paste0(country, "\n", N)]
# bargraphs of responses per million persons

ggplot(tc_dt, aes(x = reorder(country_n, -Nperpop), y = Nperpop)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90)) +
  coord_cartesian(expand = 0) +
  labs(title = "Number of respondents divided by 2015 country population ", x = "Country and number of responses", y = "Number of responses per million persons")
ggsave("countryproportion_barplot.png", width = 8, height = 5)
ggplot(tc_dt_e, aes(x = reorder(country_n, -Nperpop), y = Nperpop)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90)) +
  coord_cartesian(expand = 0) +
  labs(title = "Number of editor respondents divided by 2015 country population ", x = "Country and number of responses", y = "Number of responses per million persons")
ggsave("countryproportion_barplot_editor.png", width = 8, height = 5)
ggplot(tc_dt_ne, aes(x = reorder(country_n, -Nperpop), y = Nperpop)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90)) +
  coord_cartesian(expand = 0) +
  labs(title = "Number of non-editor respondents divided by 2015 country population ", x = "Country and number of responses", y = "Number of responses per million persons")
ggsave("countryproportion_barplot_noteditor.png", width = 8, height = 5)

# education
# recode "" as "No response"
rw_dt[education == "", education := "No response"]
# data.table
e_dt = copy(rw_dt)[, .N, by = c("i_am", "education")][education != "No response", ][, proportion := N/sum(N), by = i_am]
# factor
e_dt[, education := factor(education, levels = c("Less than a high school diploma", "High school diploma (12+ years)",
                                                 "Associates degree (14+ years)", "Bachelor's degree (16+ years)",
                                                 "Master's degree (18+ years)", "Doctoral degree (20+ years)"))]
# graph
ggplot(e_dt, aes(x = education, y = proportion, fill = education, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ i_am, nrow = 2) +
  scale_y_continuous(limits = c(0, 0.3), breaks = seq(0, 0.3, by = 0.05)) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = 0) +
  labs(title = "Proportion of respondents with a given education", x = "Education level", y = "Proportion of respondents")
ggsave("education_barplot.png", width = 8, height = 5)

# gender
# data.table
g_dt = copy(rw_dt)[, .N, by = c("i_am", "gender")][gender != "No response", ][, proportion := N/sum(N), by = i_am]
# factor
g_dt[, gender := factor(gender, levels = c("Male (cis)", "Male (trans)", "Female (cis)", "Female (trans)", "Nonbinary", "Other"))]
# graph
ggplot(g_dt, aes(x = gender, y = proportion, fill = gender)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ i_am, nrow = 2) +
  scale_y_continuous(limits = c(0, 0.8), breaks = seq(0, 0.8, by = 0.1)) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = 0) +
  labs(title = "Proportion of respondents with a given gender", x = "Gender", y = "Proportion of respondents")
ggsave("gender_barplot.png", width = 8, height = 5)

# race-ethnicity
# recode "" as "No response"
rw_dt[`race-ethnicity` == "", `race-ethnicity` := "No response"]
# data.table: answers containing a comma are recoded as "Multiple"
r_dt = copy(rw_dt)[, commacount := nchar(as.character(`race-ethnicity`)) - nchar(gsub(",", "", `race-ethnicity`))][
  commacount > 0, `race-ethnicity` := "Multiple"][
  , .N, by = c("i_am", "race-ethnicity")][
  `race-ethnicity` != "No response", ][
  , proportion := N/sum(N), by = i_am][
  , `race-ethnicity` := gsub(" /.+", "", `race-ethnicity`)]
# graph
ggplot(r_dt, aes(x = `race-ethnicity`, y = proportion, fill = `race-ethnicity`)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ i_am, nrow = 2) +
  scale_y_continuous(limits = c(0, 0.85), breaks = seq(0, 0.85, by = 0.1)) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = 0) +
  labs(title = "Proportion of respondents with a given race or ethnicity", x = "Race or ethnicity", y = "Proportion of respondents")
ggsave("race_barplot.png", width = 8, height = 5)

# sexuality
# recode "" as "No response"
rw_dt[sexuality == "", sexuality := "No response"]
# data.table
s_dt = copy(rw_dt)[, .N, by = c("i_am", "sexuality")][sexuality != "No response", ][, proportion := N/sum(N), by = i_am]
# graph
ggplot(s_dt, aes(x = sexuality, y = proportion, fill = sexuality, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ i_am, nrow = 2) +
  scale_y_continuous(limits = c(0, 0.75), breaks = seq(0, 0.75, by = 0.1)) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents with a given sexuality", x = "Sexuality", y = "Proportion of respondents")
ggsave("sexuality_barplot.png", width = 8, height = 5)

colnames(rw_dt)

# gender or sexual minority
# data.table
gsm_dt = copy(rw_dt)
# not cis
gsm_dt = gsm_dt[gender != "No response"][gender %in% c("Male (cis)", "Female (cis)"), gender := "Cisgender"][gender != "Cisgender", gender := "Not cisgender"]
# not het
gsm_dt = gsm_dt[sexuality != "No response", ][sexuality != "Heterosexual", sexuality := "Not heterosexual"]
# data.table by grouping
gsm_dt = gsm_dt[, .N, by = c("i_am", "gender", "sexuality")][, proportion := N/sum(N), by = i_am]
# graph
# png("gsm_table_editor.png")
dcast(gsm_dt[i_am == "editor", ], sexuality ~ gender, sum, value.var = "proportion")
# grid.table(dcast(gsm_dt[i_am == "editor", ], sexuality ~ gender, sum, value.var = "proportion"))
# dev.off()
# png("gsm_table_noteditor.png", width = 8, height = 5, res = 250, units = "in")
dcast(gsm_dt[i_am == "not an editor", ], sexuality ~ gender, sum, value.var = "proportion")
# grid.table(dcast(gsm_dt[i_am == "not an editor", ], sexuality ~ gender, sum, value.var = "proportion"))
# dev.off()

# religious identity
# stolen from http://www.sthda.com/english/wiki/text-mining-and-word-cloud-fundamentals-in-r-5-simple-steps-you-should-know
identity = Corpus(VectorSource(rw_dt[["religious_identi"]]))
toSpace = content_transformer(function (x, pattern ) gsub(pattern, " ", x))
identity = tm_map(identity, toSpace, "/")
identity = tm_map(identity, toSpace, "@")
identity = tm_map(identity, toSpace, "\\|")
identity = tm_map(identity, content_transformer(tolower))
identity = tm_map(identity, removeNumbers)
identity = tm_map(identity, removeWords, stopwords("english"))
identity = tm_map(identity, removePunctuation)
identity = tm_map(identity, stripWhitespace)
dtm = TermDocumentMatrix(identity)
m = as.matrix(dtm)
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v), freq = v)
set.seed(1)
png("religious_identity_wordcloud.png", width = 8, height = 5, res = 250, units = "in")
wordcloud(words = d$word, freq = sqrt(d$freq), min.freq = 1, max.words = 1000, scale = c(6.5, 0.1),
          random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))
dev.off()

# religious denomination
religious_denomi_levels = c("Catholic Christianity", "Protestant Christianity", "Orthodox Christianity", "Other Christianity", "Sunni Islam", "Shia Islam", "Other Islam", "Hinduism", "Buddhism", "Chinese religion", "Judaism", "Other religion", "Irreligious atheism", "Irreligious agnosticism", "Other irreligion")
rw_dt[, religious_denomi := factor(religious_denomi, levels = religious_denomi_levels)]

# recode "" as "No response"
rw_dt[religious_denomi == "", religious_denomi := "No response"]
# data.table
r_dt = copy(rw_dt)[, .N, by = c("i_am", "religious_denomi")][religious_denomi != "No response", ][, proportion := N/sum(N), by = i_am]
# factor
# graph
ggplot(r_dt, aes(x = religious_denomi, y = proportion, fill = religious_denomi, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ i_am, nrow = 2) +
  scale_y_continuous(limits = c(0, 0.51), breaks = seq(0, 0.5, by = 0.125)) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents with a given religious denomination", x = "Denomination", y = "Proportion of respondents")
ggsave("religion_denomination_barplot.png", width = 8, height = 5)

# religious or not religious
rw_dt[religious_denomi %in% c("Irreligious atheism", "Irreligious agnosticism", "Other irreligion"), religious := "Nonreligious"][is.na(religious) & religious_denomi != "No response", religious := "Religious"]
# data.table
r_dt = copy(rw_dt)[, .N, by = c("i_am", "religious")][!is.na(religious), ][, proportion := N/sum(N), by = i_am]
# graph
ggplot(r_dt, aes(x = religious, y = proportion, fill = religious, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ i_am, nrow = 2) +
  scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, by = 0.2)) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who are religious", x = "", y = "Proportion of respondents")
ggsave("religion_barplot.png", width = 8, height = 5)

# religiosity
# https://pdfs.semanticscholar.org/ba9f/86cb4db4ed58b53993a4c06e1d29c79ba786.pdf
pray_levels = c("More than once a day", "Once a day", "More than once a week", "Once a week", "One to three times a month", "A few times a year", "Less often", "Never")
religious_servic_levels = c("More than once a week", "Once a week", "One to three times a month", "A few times a year", "Less often", "Never")
divine_situation_levels = c("Very Often", "Often", "Occasionally", "Rarely", "Never")
god_afterlife_levels = c("Very much so", "Quite a bit", "Moderately", "Not very much", "Not at all")

rw_dt[, prayer := factor(prayer, levels = pray_levels)]
rw_dt[, religious_servic := factor(religious_servic, levels = religious_servic_levels)]
rw_dt[, divine_situation := factor(divine_situation, levels = divine_situation_levels)]
rw_dt[, god := factor(god, levels = god_afterlife_levels)]
rw_dt[, afterlife := factor(afterlife, levels = god_afterlife_levels)]

rw_dt[religious_denomi %in% c("Irreligious atheism", "Irreligious agnosticism", "Other irreligion"), irreligion := religious_denomi]

# data.table
r_dt = copy(rw_dt)[, .N, by = c("religious", "prayer")][!is.na(religious) & prayer != "", ][, proportion := N/sum(N), by = religious]
# graph
ggplot(r_dt, aes(x = prayer, y = proportion, fill = prayer, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ religious, nrow = 2, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who pray with [x] frequency", x = "Frequency", y = "Proportion of respondents")
ggsave("religious_prayer_barplot.png", width = 8, height = 5)
# data.table
r_dt = copy(rw_dt)[, .N, by = c("irreligion", "prayer")][!is.na(irreligion) & prayer != "", ][, proportion := N/sum(N), by = irreligion]
# graph
ggplot(r_dt, aes(x = prayer, y = proportion, fill = prayer, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ irreligion, nrow = 3, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  labs(title = "Proportion of respondents who pray with [x] frequency", x = "Frequency", y = "Proportion of respondents")
ggsave("irreligious_prayer_barplot.png", width = 8, height = 5)

# data.table
r_dt = copy(rw_dt)[, .N, by = c("religious", "religious_servic")][!is.na(religious) & religious_servic != "", ][, proportion := N/sum(N), by = religious]
# graph
ggplot(r_dt, aes(x = religious_servic, y = proportion, fill = religious_servic, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ religious, nrow = 2, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who attend religious service with [x] frequency", x = "Frequency", y = "Proportion of respondents")
ggsave("religious_religious_service_barplot.png", width = 8, height = 5)
# data.table
r_dt = copy(rw_dt)[, .N, by = c("irreligion", "religious_servic")][!is.na(irreligion) & religious_servic != "", ][, proportion := N/sum(N), by = irreligion]
# graph
ggplot(r_dt, aes(x = religious_servic, y = proportion, fill = religious_servic, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ irreligion, nrow = 3, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who attend religious service with [x] frequency", x = "Frequency", y = "Proportion of respondents")
ggsave("irreligious_service_barplot.png", width = 8, height = 5)

# data.table
r_dt = copy(rw_dt)[, .N, by = c("religious", "divine_situation")][!is.na(religious) & divine_situation != "", ][, proportion := N/sum(N), by = religious]
# graph
ggplot(r_dt, aes(x = divine_situation, y = proportion, fill = divine_situation, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ religious, nrow = 2, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who experience divine situations with [x] frequency", x = "Frequency", y = "Proportion of respondents")
ggsave("religious_divine_situations_barplot.png", width = 8, height = 5)
# data.table
r_dt = copy(rw_dt)[, .N, by = c("irreligion", "divine_situation")][!is.na(irreligion) & divine_situation != "", ][, proportion := N/sum(N), by = irreligion]
# graph
ggplot(r_dt, aes(x = divine_situation, y = proportion, fill = divine_situation, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ irreligion, nrow = 3, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who experience divine situations with [x] frequency", x = "Frequency", y = "Proportion of respondents")
ggsave("irreligious_divine_situations_barplot.png", width = 8, height = 5)

# data.table
r_dt = copy(rw_dt)[, .N, by = c("religious", "god")][!is.na(religious) & god != "", ][, proportion := N/sum(N), by = religious]
# graph
ggplot(r_dt, aes(x = god, y = proportion, fill = god, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ religious, nrow = 2, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who believe that God or something divine exists with [x] strength", x = "Strength", y = "Proportion of respondents")
ggsave("religious_god_barplot.png", width = 8, height = 5)
# data.table
r_dt = copy(rw_dt)[, .N, by = c("irreligion", "god")][!is.na(irreligion) & god != "", ][, proportion := N/sum(N), by = irreligion]
# graph
ggplot(r_dt, aes(x = god, y = proportion, fill = god, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ irreligion, nrow = 3, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who believe that God or something divine exists with [x] strength", x = "Strength", y = "Proportion of respondents")
ggsave("irreligious_god_barplot.png", width = 8, height = 5)

# data.table
r_dt = copy(rw_dt)[, .N, by = c("religious", "afterlife")][!is.na(religious) & afterlife != "", ][, proportion := N/sum(N), by = religious]
# graph
ggplot(r_dt, aes(x = afterlife, y = proportion, fill = afterlife, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ religious, nrow = 2, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who believe that an afterlife exists with [x] strength", x = "Strength", y = "Proportion of respondents")
ggsave("religious_afterlife_barplot.png", width = 8, height = 5)
# data.table
r_dt = copy(rw_dt)[, .N, by = c("irreligion", "afterlife")][!is.na(irreligion) & afterlife != "", ][, proportion := N/sum(N), by = irreligion]
# graph
ggplot(r_dt, aes(x = afterlife, y = proportion, fill = afterlife, group = 1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ irreligion, nrow = 3, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who believe that an afterlife exists with [x] strength", x = "Strength", y = "Proportion of respondents")
ggsave("irreligious_afterlife_barplot.png", width = 8, height = 5)

# recode (collapse the 8 prayer levels and 6 service levels onto a 1-5 scale)
to_vec = c(1, 1, 2, 3, 3, 4, 4, 5)
rw_dt[.(prayer = pray_levels, to = to_vec), on = "prayer", prayer_numeric := i.to]
to_vec = c(1, 1, 2, 3, 4, 5)
rw_dt[.(religious_servic = religious_servic_levels, to = to_vec), on = "religious_servic", religious_servic_numeric := i.to]
# new variable (1 = most religious on each item, so 6 - mean inverts the scale: 5 = max religiosity)
rw_dt[, crs_score := 6 - (prayer_numeric + religious_servic_numeric + as.numeric(divine_situation) + as.numeric(god) + as.numeric(afterlife)) / 5]
# new variable
rw_dt[, crs_category := factor(cut(crs_score, c(1, 2.1, 4.0, 6), labels=FALSE, right = FALSE), labels = c("not-religious", "religious", "highly-religious"))]
# data.table
r_dt = copy(rw_dt)[!is.na(crs_category) & !is.na(religious), ][, .N, by = c("crs_category", "religious")][, proportion := N/sum(N), by = religious]
# graph
ggplot(r_dt[!(religious %in% c("Chinese religion", "Hinduism", "Orthodox Christianity")), ], aes(x = crs_category, y = proportion, fill = crs_category)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ religious, nrow = 1, scales = "free_y") +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(title = "CRS score categories by religious or nonreligious", x = "CRS category", y = "Proportion of respondents")
ggsave("crs_religious_score_barplot.png", width = 8, height = 5)
# data.table
r_dt = copy(rw_dt)[!is.na(crs_category) & !is.na(irreligion), ][, .N, by = c("crs_category", "irreligion")][, proportion := N/sum(N), by = irreligion]
# graph
ggplot(r_dt, aes(x = crs_category, y = proportion, fill = crs_category)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ irreligion, nrow = 1, scales = "free_y") +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(title = "CRS score categories by irreligion", x = "CRS category", y = "Proportion of respondents")
ggsave("crs_irreligious_score_barplot.png", width = 8, height = 5)

# data.table (mean and standard error of the mean per denomination)
r_dt = copy(rw_dt)[!is.na(crs_score) & religious_denomi != "No response", ][, .(mean = mean(crs_score), sem = sd(crs_score)/sqrt(length(crs_score[!is.na(crs_score)]))), by = c("religious_denomi")]
r_dt = r_dt[!(is.na(sem)), ]
# graph
ggplot(r_dt, aes(x = religious_denomi, y = mean, fill = religious_denomi, group = 1)) +
  geom_bar(stat = "identity") +
  geom_errorbar(aes(ymax = mean + sem, ymin = mean - sem)) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE, ylim = c(1, 5)) +
  labs(title = "Inverted Centrality of Religious Scale scores (5 = max, 1 = min religiosity)", x = "Religious denomination", y = "Mean score")
ggsave("crs_score_denomination_barplot.png", width = 8, height = 5)
# graph
ggplot(rw_dt[!(religious_denomi %in% c("Chinese religion", "Hinduism", "Orthodox Christianity")) & !is.na(religious_denomi), ], aes(x = religious_denomi, y = crs_score, fill = religious_denomi, group = religious_denomi)) +
  geom_violin(width = 2) +
  geom_jitter(width = 0.2, height = 0.05) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  labs(title = "Inverted Centrality of Religious Scale scores (5 = max, 1 = min religiosity)", x = "Religious denomination", y = "Mean score")
ggsave("crs_score_violinplot.png", width = 8, height = 5)


# authoritarianism
# http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9221.2010.00781.x/abstract

# stolen from http://www.sthda.com/english/wiki/text-mining-and-word-cloud-fundamentals-in-r-5-simple-steps-you-should-know
identity = Corpus(VectorSource(rw_dt[["political_identi"]]))
toSpace = content_transformer(function (x, pattern ) gsub(pattern, " ", x))
identity = tm_map(identity, toSpace, "/")
identity = tm_map(identity, toSpace, "@")
identity = tm_map(identity, toSpace, "\\|")
identity = tm_map(identity, content_transformer(tolower))
identity = tm_map(identity, removeNumbers)
identity = tm_map(identity, removeWords, stopwords("english"))
identity = tm_map(identity, removePunctuation)
identity = tm_map(identity, stripWhitespace)
dtm = TermDocumentMatrix(identity)
m = as.matrix(dtm)
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v), freq = v)
set.seed(1)
png("political_identity_wordcloud.png", width = 8, height = 5, res = 250, units = "in")
wordcloud(words = d$word, freq = sqrt(d$freq), min.freq = 2, max.words = 200, scale = c(6, 0.1),
          random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))
dev.off()

# stolen from http://www.sthda.com/english/wiki/text-mining-and-word-cloud-fundamentals-in-r-5-simple-steps-you-should-know
identity = Corpus(VectorSource(rw_dt[["party_identity"]]))
toSpace = content_transformer(function (x, pattern ) gsub(pattern, " ", x))
identity = tm_map(identity, toSpace, "/")
identity = tm_map(identity, toSpace, "@")
identity = tm_map(identity, toSpace, "\\|")
identity = tm_map(identity, content_transformer(tolower))
identity = tm_map(identity, removeNumbers)
identity = tm_map(identity, removeWords, stopwords("english"))
identity = tm_map(identity, removePunctuation)
identity = tm_map(identity, stripWhitespace)
dtm = TermDocumentMatrix(identity)
m = as.matrix(dtm)
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v), freq = v)
set.seed(1)
png("party_identity_wordcloud.png", width = 8, height = 5, res = 250, units = "in")
wordcloud(words = d$word, freq = sqrt(d$freq), min.freq = 2, max.words = 200, scale = c(5, 0.1),
          random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))
dev.off()

# survey column name = short label used in the plots
acs_questions = c("its_great_that_m" = "young should defy authority",
                  "people_should_be" = "people should protest unjust laws",
                  "students_at_high" = "students should challenge authority",
                  "what_our_country" = "country needs discipline + unity",
                  "obedience_and_re" = "children should learn obedience",
                  "our_country_will" = "country should respect authority",
                  "nobody_should_st" = "people should try many experiences",
                  "there_is_absolut" = "nothing wrong with nudist camps",
                  "there_is_nothing" = "nothing wrong with premarital sex",
                  "the_old-fashione" = "old-fashioned ways are best",
                  "gods_laws_about_" = "god's laws must be followed",
                  "this_country_wil" = "people should follow family values",
                  "strong,_tough_go" = "tough government will harm country",
                  "our_society_does" = "tough government is unnecessary",
                  "our_prisons_are_" = "criminals deserve better care",
                  "being_kind_to_lo" = "kindness only encourages criminals",
                  "the_facts_on_cri" = "we should crack down on crime",
                  "the_way_things_a" = "country needs 'strong medicine'")
# tabulate the responses to question x and convert the counts to proportions
acs_table_maker = function(x) {
  var = names(acs_questions)[x]
  question = acs_questions[x]
  dt = data.table(table(rw_dt[[var]]))[, proportion := N / sum(N)][, question := question]
  return(dt)
}
acs_dt = rbindlist(lapply(1:18, acs_table_maker))[, question := factor(question, levels = acs_questions)]

# graph
ggplot(acs_dt, aes(x = V1, y = proportion, fill = V1)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ question, nrow = 6, scales = "free_y") +
  theme_minimal() +
  theme(legend.position = "none") +
  coord_cartesian(expand = FALSE) +
  labs(title = "Authoritarianism-Conservatism-Traditionalism (ACT) questions", x = "1 = strongly disagree, 9 = strongly agree", y = "Proportion of respondents")
ggsave("acs_barplot.png", width = 8, height = 5)

# new variable
# reverse-key the first three items of each six-item block (10 - x maps 1..9 to 9..1)
cols = names(acs_questions)[c(1:3, 7:9, 13:15)]
rw_dt[, (cols) := lapply(.SD, function(x){return(10 - x)}), .SDcols = cols]
# 18 items scored 1-9 sum to 18..162, so (sum - 90)/72 lies in [-1, 1]
rw_dt[, act_score := (rowSums(.SD) - 90)/72, .SDcols = names(acs_questions)]
# each 6-item subscale sums to 6..54, so (sum - 30)/24 lies in [-1, 1]
rw_dt[, c_score := (rowSums(.SD) - 30)/24, .SDcols = names(acs_questions)[1:6]]
rw_dt[, t_score := (rowSums(.SD) - 30)/24, .SDcols = names(acs_questions)[7:12]]
rw_dt[, a_score := (rowSums(.SD) - 30)/24, .SDcols = names(acs_questions)[13:18]]
# graph
ggplot(rw_dt, aes(x = 0, y = act_score)) +
  geom_violin() +
  geom_boxplot(width = 0.1) +
  theme_minimal() +
  theme(legend.position = "none") +
  coord_cartesian(expand = FALSE, ylim = c(-1, 1)) +
  labs(title = "Normalized Authoritarianism-Conservatism-Traditionalism (ACT) scores (min -1, max 1)", x = "Proportion", y = "Score")
ggsave("act_score_violinplot.png", width = 8, height = 5)
# graph
ggplot(rw_dt, aes(x = 0, y = c_score)) +
  geom_violin() +
  geom_boxplot(width = 0.1) +
  theme_minimal() +
  theme(legend.position = "none") +
  geom_hline(yintercept = mean(rw_dt[["c_score"]], na.rm = TRUE)) +
  coord_cartesian(expand = FALSE, ylim = c(-1, 1)) +
  labs(title = "Normalized Conservatism scores (min -1, max 1) with mean line", x = "Proportion", y = "Score")
ggsave("c_score_violinplot.png", width = 8, height = 5)
# graph
ggplot(rw_dt, aes(x = 0, y = t_score)) +
  geom_violin() +
  geom_boxplot(width = 0.1) +
  theme_minimal() +
  theme(legend.position = "none") +
  geom_hline(yintercept = mean(rw_dt[["t_score"]], na.rm = TRUE)) +
  coord_cartesian(expand = FALSE, ylim = c(-1, 1)) +
  labs(title = "Normalized Traditionalism scores (min -1, max 1) with mean line", x = "Proportion", y = "Score")
ggsave("t_score_violinplot.png", width = 8, height = 5)
# graph
ggplot(rw_dt, aes(x = 0, y = a_score)) +
  geom_violin() +
  geom_boxplot(width = 0.1) +
  theme_minimal() +
  theme(legend.position = "none") +
  geom_hline(yintercept = mean(rw_dt[["a_score"]], na.rm = TRUE)) +
  coord_cartesian(expand = FALSE, ylim = c(-1, 1)) +
  labs(title = "Normalized Authoritarianism scores (min -1, max 1) with mean line", x = "Proportion", y = "Score")
ggsave("a_score_violinplot.png", width = 8, height = 5)
# graph
ggplot(rw_dt[!(religious_denomi %in% c("Chinese religion", "Hinduism", "Orthodox Christianity")) & !is.na(religious_denomi), ], aes(x = religious_denomi, y = act_score)) +
  geom_violin() +
  geom_boxplot(width = 0.1) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90), legend.position = "none") +
  coord_cartesian(expand = FALSE, ylim = c(-1, 1)) +
  labs(title = "Normalized ACT scores (min -1, max 1) by denomination", x = "Denomination", y = "Score")
ggsave("act_score_denomination_violinplot.png", width = 8, height = 5)

crs_act_lm = lm(act_score ~ crs_score, rw_dt)

ggplot(rw_dt, aes(x = crs_score, y = act_score)) +
  geom_jitter(width = 0.05, height = 0.05) +
  geom_abline(slope = crs_act_lm$coefficients[2], intercept = crs_act_lm$coefficients[1]) +
  theme_minimal() +
  geom_text(label = paste0("act = ", round(crs_act_lm$coefficients[2], 4), " * crs + ", round(crs_act_lm$coefficients[1], 3), "\np-value ~= 0\nr-squared = ", round(summary(crs_act_lm)$r.squared, 3)), aes(x = 2, y = 0.88)) +
  coord_cartesian(expand = TRUE, ylim = c(-1, 1)) +
  labs(title = "ACT scores (min -1, max 1) plotted against CRS scores (min 1, max 5)", x = "CRS score", y = "ACT score")
ggsave("act_score_scatterplot.png", width = 8, height = 5)

age_act_lm = lm(act_score ~ age, rw_dt)

ggplot(rw_dt, aes(x = age, y = act_score)) +
  geom_jitter(width = 0.05, height = 0.05) +
  geom_abline(slope = age_act_lm$coefficients[2], intercept = age_act_lm$coefficients[1]) +
  theme_minimal() +
  geom_text(label = paste0("act = ", round(age_act_lm$coefficients[2], 4), " * age + ", round(age_act_lm$coefficients[1], 3), "\np-value ~= 0\nr-squared = ", round(summary(age_act_lm)$r.squared, 3)), aes(x = 40, y = 0.88)) +
  coord_cartesian(expand = TRUE, ylim = c(-1, 1)) +
  labs(title = "ACT scores (min -1, max 1) plotted against age", x = "Age", y = "ACT score")
ggsave("act_age_scatterplot.png", width = 8, height = 5)

# rationalwiki
rationalwiki_levels = c("Once or more per hour", "Once or more per day", "Once or more per week", "Once or more per month", "Once or more per year", "Never")

# data.table
r_dt = rw_dt[i_visit_rational != "", ][, .N, by = c("i_visit_rational", "i_am")][, proportion := N/sum(N), by = i_am]
# factor
r_dt[, i_visit_rational := factor(i_visit_rational, levels = rationalwiki_levels)]
ggplot(r_dt, aes(x = i_visit_rational, y = proportion)) +
  geom_bar(stat = "identity") +
  facet_wrap( ~ i_am, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90)) +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who visit RationalWiki with [x] frequency", x = "Frequency", y = "Proportion of responses")
ggsave("view_barplot.png", width = 8, height = 5)

# data.table
r_dt = rw_dt[i_edit_rationalw != "", ][, .N, by = c("i_edit_rationalw", "i_am")][, proportion := N/sum(N), by = i_am]
# factor
r_dt[, i_edit_rationalw := factor(i_edit_rationalw, levels = rationalwiki_levels)]
ggplot(r_dt, aes(x = i_edit_rationalw, y = proportion)) +
  geom_bar(stat = "identity") +
  facet_wrap( ~ i_am, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1, vjust = 0.5, angle = 90)) +
  coord_cartesian(expand = FALSE) +
  labs(title = "Proportion of respondents who edit RationalWiki with [x] frequency", x = "Frequency", y = "Proportion of responses")
ggsave("edit_barplot.png", width = 8, height = 5)

11:15, 16 January 2018 (UTC)