Title: | Detecting Politeness Features in Text |
---|---|
Description: | Detecting markers of politeness in English natural language. This package allows researchers to easily visualize and quantify politeness between groups of documents. This package combines prior research on the linguistic markers of politeness. We thank the Spencer Foundation, the Hewlett Foundation, and Harvard's Institute for Quantitative Social Science for support. |
Authors: | Mike Yeomans, Alejandro Kantor, Dustin Tingley |
Maintainer: | Mike Yeomans <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.9.4 |
Built: | 2025-03-12 06:18:28 UTC |
Source: | https://github.com/myeomans/politeness |
A dataset containing the purchase offer message and a label indicating whether the writer was assigned to be warm (1) or tough (0).
bowl_offers
A data frame with 70 rows and 2 variables:
message: character purchase offer message
condition: binary label indicating whether the message is warm (1) or tough (0)
Jeong, M., Minson, J., Yeomans, M. & Gino, F. (2019).
"Communicating Warmth in Distributed Negotiations is Surprisingly Ineffective."
Study 3. https://osf.io/t7sd6/
Finds examples of most or least polite text in a corpus
exampleTexts(text, covar, type = c("most", "least"), num_docs = 5L)
text | a character vector of texts. |
covar | a vector of politeness labels (from human or model), or other covariate. |
type | a string indicating whether the function should return the most polite texts, the least polite texts, or both (see details). |
num_docs | integer number of documents to be returned. Default is 5. |
The function returns a data.frame ranked by (most or least) politeness.
If type == 'most', the num_docs most polite texts will be returned.
If type == 'least', the num_docs least polite texts will be returned.
If type == 'both', both the most and least polite texts will be returned: if num_docs is even, half will be most polite and half least polite; otherwise half + 1 will be most polite.
df_polite must have the same number of rows as length(text) and length(covar).
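The num_docs split described above can be sketched in a few lines (a minimal sketch of the counting logic only; exampleTexts performs the actual ranking internally):

```r
# How num_docs is divided when type == 'both':
num_docs <- 5
n_most  <- ceiling(num_docs / 2)  # half + 1 when num_docs is odd
n_least <- floor(num_docs / 2)    # the remaining half
# for num_docs = 5, that is 3 most polite and 2 least polite texts
```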
data.frame with texts ranked by (most or least) politeness. See details for more information.
data("phone_offers")
polite.data <- politeness(phone_offers$message, parser="none", drop_blank=FALSE)
exampleTexts(phone_offers$message, phone_offers$condition, type = "most", num_docs = 5)
exampleTexts(phone_offers$message, phone_offers$condition, type = "least", num_docs = 10)
This table describes all the text features extracted in this package. See vignette for details.
feature_table
feature_table
A data.frame with information about the politeness features.
Plots the prevalence of politeness features in documents, divided by a binary covariate.
featurePlot(
  df_polite,
  split = NULL,
  split_levels = NULL,
  split_name = NULL,
  split_cols = c("firebrick", "navy"),
  top_title = "",
  drop_blank = 0.05,
  middle_out = 0.5,
  features = NULL,
  ordered = FALSE,
  CI = 0.68
)
df_polite | a data.frame with politeness features calculated from a document set, as output by politeness(). |
split | a vector of covariate values. Must have a length equal to the number of documents included in df_polite. |
split_levels | character vector of length 2, default NULL. Labels for covariate levels for the legend. If NULL, this will be inferred from split. |
split_name | character, default NULL. Name of the covariate for the legend. |
split_cols | character vector of length 2. Names of colors to use. |
top_title | character, default "". Title of the plot. |
drop_blank | Features less prevalent than this value in the sample are excluded from the plot. To include all features, set to 0. |
middle_out | Features less distinctive than this value (measured by the p-value of a t-test) are excluded. Default is 0.5; set to 1 to include all features. |
features | character vector of feature names. If NULL, all will be included. |
ordered | logical. Should features be ordered according to the features param? Default is FALSE. |
CI | Coverage of the error bars. Default is 0.68 (i.e. standard error). |
The length of split must be the same as the number of rows of df_polite. Typically split should be a two-category variable. However, if a continuous covariate is given, the top and bottom terciles of its distribution are treated as the two categories (data from the middle tercile are dropped).
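The tercile treatment of a continuous covariate can be sketched as follows (a toy illustration with hypothetical data; featurePlot performs this split internally):

```r
# A continuous covariate is cut at its terciles: the top and bottom
# thirds become the two plotting groups; the middle third is dropped.
covar <- 1:9
cuts  <- quantile(covar, probs = c(1/3, 2/3))
group <- ifelse(covar <= cuts[1], "bottom",
                ifelse(covar > cuts[2], "top", NA))
# rows where group is NA (the middle tercile) would be excluded
```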
a ggplot of the prevalence of politeness features, conditional on split. Features are sorted by variance-weighted log odds ratio.
data("phone_offers")
polite.data <- politeness(phone_offers$message, parser="none", drop_blank=FALSE)
politeness::featurePlot(polite.data, split=phone_offers$condition,
  split_levels = c("Tough","Warm"), split_name = "Condition",
  top_title = "Average Feature Counts")
politeness::featurePlot(polite.data, split=phone_offers$condition,
  split_levels = c("Tough","Warm"), split_name = "Condition",
  top_title = "Average Feature Counts",
  features = c("Positive.Emotion","Hedges","Negation"))
polite.data <- politeness(phone_offers$message, parser="none", metric="binary", drop_blank=FALSE)
politeness::featurePlot(polite.data, split=phone_offers$condition,
  split_levels = c("Tough","Warm"), split_name = "Condition",
  top_title = "Binary Feature Use")
Deprecated. This function has been renamed; see exampleTexts for details.
findPoliteTexts(text, covar, ...)
text | a character vector of texts. |
covar | a vector of politeness labels, or other covariate. |
... | other arguments passed on to exampleTexts. See exampleTexts for details. |
a data.frame of texts ranked by (most or least) politeness. See exampleTexts for details.
data("phone_offers")
polite.data <- politeness(phone_offers$message, parser="none", drop_blank=FALSE)
findPoliteTexts(phone_offers$message, phone_offers$condition, type = "most", num_docs = 5)
Plots feature counts and coefficients from a trained LASSO model
This plots the coefficients from a trained LASSO model.
modelPlot(model1, counts, model2 = NULL, dat = FALSE)
model1 | a trained glmnet model. |
counts | feature counts, either from training data or test data (choose based on the application of interest). |
model2 | a trained glmnet model (optional), if you want the Y axis to reflect a second set of coefficients instead of feature counts. |
dat | logical. If TRUE, the function returns a list with the data.frame used for plotting, as well as the plot itself. |
a ggplot object. Layers can be added as with any ggplot object.
Negative Emotions List
Negative words.
negative_list
A list of 4783 negatively-valenced words.
phone_offers
A data frame with 355 rows and 2 variables:
message: character purchase offer message
condition: binary label indicating whether the message is warm (1) or tough (0)
Hedge Words List
Hedges.
hedge_list
A list of 72 hedging words.
Feature Dictionaries
Six dictionary-like features for the detector: Negations; Pauses; Swearing; Pronouns; Formal Titles; and Informal Titles.
polite_dicts
A list of six quanteda::dictionary objects.
Purchase offers for phone
A dataset containing the purchase offer message and a label indicating whether the writer was assigned to be warm (1) or tough (0).
Jeong, M., Minson, J., Yeomans, M. & Gino, F. (2019).
"Communicating Warmth in Distributed Negotiations is Surprisingly Ineffective."
Study 1. https://osf.io/t7sd6/
A dataset to train a model for detecting politeness.
polite_train
A list of two objects: x contains pre-calculated politeness features for each document, and y contains standardized human annotations of politeness.
Danescu-Niculescu-Mizil, C., Sudhof, M., Jurafsky, D., Leskovec, J. & Potts, C. (2013). A computational approach to politeness with application to social factors. Proc. 51st ACL, 250-259.
Detects linguistic markers of politeness in natural language.
This function is the workhorse of the politeness package, taking an N-length vector of text documents and returning an N-row data.frame of feature counts.
politeness(
  text,
  parser = c("none", "spacy"),
  metric = c("count", "binary", "average"),
  drop_blank = FALSE,
  uk_english = FALSE,
  num_mc_cores = 1
)
text | character. A vector of texts, each of which will be tallied for politeness features. |
parser | character. Name of the dependency parser to use (see details). Without a dependency parser, some features are approximated, while others cannot be calculated at all. |
metric | character. What metric to return: raw feature count totals ("count"), binary presence/absence of features ("binary"), or feature counts per 100 words ("average"). Default is "count". |
drop_blank | logical. Should features that were not found in any text be removed from the data.frame? Default is FALSE. |
uk_english | logical. Does the text contain any British English spelling, including variants (e.g. Canadian)? Default is FALSE. |
num_mc_cores | integer. Number of cores for parallelization. Default is 1, but we encourage users to try parallel::detectCores() if possible. |
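The three metric options relate to one another as in this toy sketch (hypothetical counts and document lengths; politeness() computes these per feature):

```r
counts  <- c(2, 0, 1)              # raw totals, metric = "count"
binary  <- as.integer(counts > 0)  # presence/absence, metric = "binary"
n_words <- c(50, 20, 25)           # words per document
average <- counts / n_words * 100  # counts per 100 words, metric = "average"
```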
Some politeness features depend on part-of-speech tagged sentences (e.g. "bare commands" are a particular verb class). To include these features in the analysis, a POS tagger must be initialized beforehand; we currently support SpaCy, which must be installed separately in Python (see examples for implementation).
a data.frame of politeness features, with one row for every item in 'text'. Possible politeness features are listed in feature_table
Brown, P., & Levinson, S. C. (1987). Politeness: Some universals in language usage (Vol. 4). Cambridge university press.
Danescu-Niculescu-Mizil, C., Sudhof, M., Jurafsky, D., Leskovec, J., & Potts, C. (2013). A computational approach to politeness with application to social factors. arXiv preprint arXiv:1306.6078.
Voigt, R., Camp, N. P., Prabhakaran, V., Hamilton, W. L., ... & Eberhardt, J. L. (2017). Language from police body camera footage shows racial disparities in officer respect. Proceedings of the National Academy of Sciences, 201702413.
data("phone_offers")
politeness(phone_offers$message, parser="none", drop_blank=FALSE)
colMeans(politeness(phone_offers$message, parser="none", metric="binary", drop_blank=FALSE))
colMeans(politeness(phone_offers$message, parser="none", metric="count", drop_blank=FALSE))
dim(politeness(phone_offers$message, parser="none", drop_blank=FALSE))
dim(politeness(phone_offers$message, parser="none", drop_blank=TRUE))
## Not run:
# Detect multiple cores automatically for parallel processing
politeness(phone_offers$message, num_mc_cores=parallel::detectCores())
# Connect to SpaCy installation for part-of-speech features
install.packages("spacyr")
spacyr::spacy_initialize(python_executable = PYTHON_PATH)
politeness(phone_offers$message, parser="spacy", drop_blank=FALSE)
## End(Not run)
Detects linguistic markers of politeness in natural language. This function emulates the original features of the Danescu-Niculescu-Mizil Politeness paper. This primarily exists to contrast with the full feature set in the main package, and is not recommended otherwise.
politenessDNM(text, uk_english = FALSE)
text | character. A vector of texts, each of which will be tallied for politeness features. |
uk_english | logical. Does the text contain any British English spelling, including variants (e.g. Canadian)? Default is FALSE. |
a data.frame of politeness features, with one row for every item in 'text'. The original names are used where possible.
Danescu-Niculescu-Mizil, C., Sudhof, M., Jurafsky, D., Leskovec, J., & Potts, C. (2013). A computational approach to politeness with application to social factors. arXiv preprint arXiv:1306.6078.
## Not run:
# Connect to SpaCy installation for part-of-speech features
install.packages("spacyr")
spacyr::spacy_initialize(python_executable = PYTHON_PATH)
data("phone_offers")
politenessDNM(phone_offers$message)
## End(Not run)
Pre-trained model to detect politeness based on data from Danescu-Niculescu-Mizil et al. (2013)
politenessModel(texts, num_mc_cores = 1)
texts | character. A vector of texts, each of which will be given a politeness score. |
num_mc_cores | integer. Number of cores for parallelization. |
This is a wrapper around a pre-trained model of "politeness" for all the data from the 2013 DNM et al. paper. The model requires grammar parsing via SpaCy; please see spacyr for details on installation.
a vector of politeness scores
Danescu-Niculescu-Mizil, C., Sudhof, M., Jurafsky, D., Leskovec, J. & Potts, C. (2013). A computational approach to politeness with application to social factors. Proc. 51st ACL, 250-259.
## Not run:
data("phone_offers")
politenessModel(phone_offers$message)
## End(Not run)
Deprecated. This function has been renamed; see featurePlot for details.
politenessPlot(df_polite, ...)
df_polite | a data.frame with politeness features calculated from a document set, as output by politeness(). |
... | other arguments passed on to featurePlot. See featurePlot for details. |
a ggplot of the prevalence of politeness features, conditional on split. Features are sorted by variance-weighted log odds ratio.
data("phone_offers")
polite.data <- politeness(phone_offers$message, parser="none", drop_blank=FALSE)
politeness::politenessPlot(polite.data, split=phone_offers$condition,
  split_levels = c("Tough","Warm"), split_name = "Condition",
  top_title = "Average Feature Counts")
Deprecated. This function is now called trainModel.
politenessProjection(df_polite_train, covar = NULL, ...)
df_polite_train | a data.frame with politeness features, as output by politeness(). |
covar | a vector of politeness labels, or other covariate. |
... | additional parameters to be passed. See trainModel for details. |
See trainModel for details.
list of model objects.
data("phone_offers")
data("bowl_offers")
polite.data <- politeness(phone_offers$message, parser="none", drop_blank=FALSE)
polite.holdout <- politeness(bowl_offers$message, parser="none", drop_blank=FALSE)
project <- politenessProjection(polite.data, phone_offers$condition, polite.holdout)
# Difference in average politeness across conditions in the new sample.
mean(project$test_proj[bowl_offers$condition==1])
mean(project$test_proj[bowl_offers$condition==0])
A pre-trained model for detecting conversational receptiveness. Estimated with glmnet using annotated data from a previous paper. Primarily for use within the receptiveness() function.
receptive_model
A fitted glmnet model
Minson, J., Yeomans, M., Collins, H. & Dorison, C.
"Conversational Receptiveness: Improving Engagement with Opposing Views"
This is the list of variables to be extracted for the receptiveness algorithm. For internal use only, within the receptiveness() function.
receptive_names
Character vector containing variable names
Minson, J., Yeomans, M., Collins, H. & Dorison, C.
"Conversational Receptiveness: Improving Engagement with Opposing Views"
A dataset to train a model for detecting conversational receptiveness.
receptive_polite
Pre-calculated politeness features for the receptive_train dataset
A dataset to train a model for detecting conversational receptiveness.
receptive_train
A data frame with 2860 rows and 2 variables:
character written response about policy disagreement
numeric standardized average of annotator ratings for "receptiveness"
Primarily for use within the receptiveness() function. The data was compiled from Studies 1 and 4 of the original paper, as well as an unpublished study with a very similar design, in which text responses were rated by disagreeing others.
Yeomans, M., Minson, J., Collins, H., Chen, F. & Gino, F. (2020).
"Conversational Receptiveness: Improving Engagement with Opposing Views"
Pre-trained model to detect conversational receptiveness
receptiveness(texts, num_mc_cores = 1)
texts | character. A vector of texts, each of which will be tallied for politeness features. |
num_mc_cores | integer. Number of cores for parallelization. |
This is a wrapper around a pre-trained model of "conversational receptiveness". The model, trained on Study 1 of that paper, can be applied to new text with a single function. It requires grammar parsing via SpaCy; please see spacyr for details on installation.
a vector with receptiveness scores
Yeomans, M., Minson, J., Collins, H., Chen, F. & Gino, F. (2020). Conversational Receptiveness: Improving Engagement with Opposing Views. OBHDP.
## Not run:
data("phone_offers")
receptiveness(phone_offers$message)
## End(Not run)
Training and projecting a regression model using politeness features.
trainModel(
  df_polite_train,
  covar = NULL,
  df_polite_test = NULL,
  classifier = c("glmnet", "mnir"),
  cv_folds = NULL,
  ...
)
df_polite_train | a data.frame with politeness features, as output by politeness(). |
covar | a vector of politeness labels, or other covariate. |
df_polite_test | optional data.frame with politeness features, as output by politeness(). |
classifier | name of the classification algorithm. Defaults to "glmnet"; "mnir" is also available. |
cv_folds | number of outer folds for projection of the training data. Default is NULL (i.e. no nested cross-validation). However, positive values (e.g. 10) are highly recommended for in-sample accuracy estimation. |
... | additional parameters to be passed to the classification algorithm. |
A list:
train_proj: projection of the politeness model within the training set.
test_proj: projection of the politeness model onto the test set (i.e. out-of-sample).
train_coef: coefficients from the trained model.
train_model: the LASSO model itself (for modelPlot).
List of df_polite_train and df_polite_test with projection. See details.
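The outer-fold idea behind cv_folds can be sketched as follows (fold assignment only, with hypothetical sizes; trainModel handles the fitting internally, so each document's in-sample projection comes from a model trained on the other folds):

```r
set.seed(1)
n <- 20        # documents in the training set
cv_folds <- 5  # outer folds
folds <- sample(rep(1:cv_folds, length.out = n))
# then, for each k in 1:cv_folds, fit on documents with folds != k
# and project onto the held-out documents with folds == k
```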
data("phone_offers")
data("bowl_offers")
polite.data <- politeness(phone_offers$message, parser="none", drop_blank=FALSE)
polite.holdout <- politeness(bowl_offers$message, parser="none", drop_blank=FALSE)
project <- trainModel(polite.data, phone_offers$condition, polite.holdout)
# Difference in average politeness across conditions in the new sample.
mean(project$test_proj[bowl_offers$condition==1])
mean(project$test_proj[bowl_offers$condition==0])
For internal use only. This dataset contains a quanteda dictionary for converting UK words to US words. The models in this package were all trained on US English.
uk2us
A quanteda dictionary with named entries. Names are the US version, and entries are the UK version.
Borrowed from the quanteda.dictionaries package on GitHub (from user kbenoit).
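The conversion idea can be sketched with a toy named vector (hypothetical pairs; the real uk2us object is a quanteda dictionary whose names are US spellings and whose entries are UK spellings):

```r
us_for_uk <- c(colour = "color", analyse = "analyze")  # toy UK-to-US pairs
text <- "We analyse the colour data"
for (uk in names(us_for_uk)) {
  text <- gsub(uk, us_for_uk[[uk]], text, fixed = TRUE)
}
# text is now "We analyze the color data"
```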