{"id":2374,"date":"2017-08-10T17:28:14","date_gmt":"2017-08-10T17:28:14","guid":{"rendered":"https:\/\/www.kolabtree.com\/blog\/?p=2374"},"modified":"2017-08-14T10:11:42","modified_gmt":"2017-08-14T10:11:42","slug":"correct-outliers-regression-trumps-vote","status":"publish","type":"post","link":"https:\/\/www.kolabtree.com\/blog\/correct-outliers-regression-trumps-vote\/","title":{"rendered":"How to Correct Outliers in Regression Models: An example with race, education, and the uninsured on Trump\u2019s vote"},"content":{"rendered":"<p><em>This post originally appeared in my column on the site <a href=\"http:\/\/datadrivenjournalism.net\/news_and_analysis\/correcting_outliers_the_effect_of_race_education_and_the_uninsured_on_trump\">data driven journalism<\/a>.<\/em><\/p>\n<p>In my <a href=\"http:\/\/datadrivenjournalism.net\/news_and_analysis\/regression_for_journalists\">last post I talked about how regression<\/a> can be a useful tool to tease apart the different relationships between correlational variables. I also talked about how outliers can be problematic. One way of dealing with an outlier is simply to delete it from the analysis. Doing so decreases statistical power (the probability of finding significant predictor when it does exist) and removes potentially valuable information from the model. It could be a more fruitful endeavor as valuable information can be gained. I did this in my post on how Washington, DC differs from the other states and it did give me an idea for another covariate that should be considered in addition the ones already considered: concentration of hate groups, % uninsured, % with a bachelor\u2019s degree or higher, and % in poverty.<\/p>\n<p>In my <a href=\"http:\/\/datadrivenjournalism.net\/news_and_analysis\/how_is_washington_dc_an_outlier_lets_count_the_ways\">post on the characteristics of Washington, DC as an outlier<\/a> I found that it is the least white compared to any of the states considered. 
Only 40.2% of the district\u2019s population identifies as white; only Hawaii has a smaller % white, at 25.4%. In the exit poll for last year\u2019s election, 60% of white women without a college education voted for Trump, as did 71% of white men without a college education. 74% of nonwhites voted for Clinton.<\/p>\n<p>Adding % white to the model significantly improved its precision with DC included: 78.5% of the variability in Trump\u2019s vote is now accounted for. The variables for hate groups and % poverty were not significant and were excluded, as keeping them in the model decreases statistical power. The variables % bachelor\u2019s, % White, and % uninsured were significant (meaning the p-value is less than 0.05, a concept I will explain in a future post). The output below is typical of most statistical packages:<\/p>\n<table class=\"m_-6998272163864735663ydp443f502dMsoTableGrid m_-6998272163864735663ydpfe588df9yahoo-compose-table-card\" border=\"1\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>78.5% of the variability <\/i><\/p>\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>accounted for<\/i><\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>Coefficients<\/i><\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>Standard Error<\/i><\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>t Stat<\/i><\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>P-value<\/i><\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" 
align=\"center\"><i>Lower <\/i><\/p>\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>95%<\/i><\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>Upper <\/i><\/p>\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"center\"><i>95%<\/i><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\">Intercept<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">51.55<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">8.92<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">5.78<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">5.75E-07<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">33.61<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">69.48<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\">% bachelor\u2019s degree<\/p>\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\">or higher<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">-1.11<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.15<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">-7.55<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" 
align=\"right\">1.2E-09<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">-1.41<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">-0.82<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\">% White<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.31<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.06<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">4.95<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">1.01E-05<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.18<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.43<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\">% uninsured<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.74<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.26<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">2.86<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">0.006319<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" 
align=\"right\">0.22<\/p>\n<\/td>\n<td valign=\"top\" nowrap=\"nowrap\">\n<p class=\"m_-6998272163864735663ydp443f502dMsoNormal\" align=\"right\">1.26<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The column labeled \u201ccoefficients\u201d gives the estimated values for the regression equation that I spelled out in previous posts. The current equation reads:<\/p>\n<p>Trump % of the vote = 51.55 \u2013 1.11*(% bachelor\u2019s) + 0.31*(% White) + 0.74*(% Uninsured)<\/p>\n<p>This says that when all of the covariates are equal to zero, Trump is predicted to have 51.55% of the vote. For every 1% increase in the % bachelors there is an estimated 1.11% decrease in Trumps vote. For every 1% increase in the % white population in the state there is an estimated increase of 0.31% and for every 1% increase in the % uninsured in the state.<\/p>\n<p>The column labeled \u201cstandard error\u201d is an estimate of the uncertainty in the coefficients. The column labeled \u201ct stat\u201d is the test statistic for determining whether the coefficients are significantly different from zero. The \u201cp-value\u201d is the estimated probability of observing this estimated coefficient when the true coefficient is zero. By convention, when the p-value is less than 0.05 we conclude that it the true coefficient is different from zero. The last two columns show the upper and lower bounds for a 95% confidence interval for a coefficient. The confidence interval says that 95% of the time that the estimates are made, the true coefficient will be between the upper and lower bounds. 
In this case, if the upper and lower bounds do not straddle the number zero, that is equivalent to the coefficient being significantly different from zero.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-2384 size-large\" src=\"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-State-Race-1024x744.png\" alt=\"\" width=\"702\" height=\"510\" srcset=\"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-State-Race-1024x744.png 1024w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-State-Race-300x218.png 300w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-State-Race-768x558.png 768w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-State-Race-1080x785.png 1080w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-State-Race.png 1423w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-State-Race-300x218@2x.png 600w\" sizes=\"(max-width: 702px) 100vw, 702px\" \/><\/p>\n<p>The scatterplot above shows the actual values (blue diamonds) and the predicted values (red squares) for % white and % Trump for the model adjusting for % bachelor\u2019s and % uninsured. The actual and predicted values for the District of Columbia (DC) and Hawaii are very close to each other, which suggests a good fit. 
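As a quick illustration (a minimal sketch, not part of the original analysis), the fitted equation can be wrapped in a small prediction function to show how the model turns a state\u2019s characteristics into a predicted vote share; the example state values below are hypothetical:

```python
# Prediction from the fitted model reported in the table:
# Trump % = 51.55 - 1.11*(% bachelor's) + 0.31*(% white) + 0.74*(% uninsured)

def predict_trump_share(pct_bachelors, pct_white, pct_uninsured):
    """Predicted Trump vote share (%) from the fitted coefficients."""
    return 51.55 - 1.11 * pct_bachelors + 0.31 * pct_white + 0.74 * pct_uninsured

# Hypothetical state: 30% bachelor's or higher, 70% white, 10% uninsured
print(round(predict_trump_share(30, 70, 10), 2))  # 47.35
```

Comparing such predicted values to the actual vote in each state is exactly what the scatterplots below do.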
One state that is poorly fit is Vermont, where the actual vote for Trump is 10% lower than the predicted vote, which can be seen directly above the blue diamond for Vermont.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-2385 size-large\" src=\"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-educ-race-unins-1024x744.png\" alt=\"\" width=\"702\" height=\"510\" srcset=\"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-educ-race-unins-1024x744.png 1024w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-educ-race-unins-300x218.png 300w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-educ-race-unins-768x558.png 768w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-educ-race-unins-1080x785.png 1080w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-educ-race-unins.png 1423w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-educ-race-unins-300x218@2x.png 600w\" sizes=\"(max-width: 702px) 100vw, 702px\" \/><\/p>\n<p>The scatter plot for % bachelor\u2019s degree or higher suggests that the fit is not as good as for % white as the predictor. This is reflected in the greater standard error for this predictor (0.15) than for % white (0.06). The prediction for DC is not as good for this predictor, as DC has the highest % bachelor\u2019s of any jurisdiction considered. 
The trend is still significant in the negative direction.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-large wp-image-2386\" src=\"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-unins-educ-race-1024x744.png\" alt=\"\" width=\"702\" height=\"510\" srcset=\"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-unins-educ-race-1024x744.png 1024w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-unins-educ-race-300x218.png 300w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-unins-educ-race-768x558.png 768w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-unins-educ-race-1080x785.png 1080w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-unins-educ-race.png 1423w, https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Trump-unins-educ-race-300x218@2x.png 600w\" sizes=\"(max-width: 702px) 100vw, 702px\" \/><\/p>\n<p>The scatterplot for % uninsured as a predictor shows an even weaker fit for Trump\u2019s % of the vote. DC and Alaska, among many other states, are poorly fit points for this predictor. The standard error for this predictor (0.26) is larger than for the other predictors, though it\u2019s still statistically significant.<\/p>\n<p>Multiple regression is a potentially powerful tool for teasing apart the relationships between predictor variables for a specific outcome when conducted correctly. Adding the right covariates, such as race, can help alleviate the effects of an outlier such as Washington, DC. It\u2019s always better to include all of the data to give as complete a picture as possible.<\/p>\n<p>We now see that as the % of a state\u2019s population with a bachelor\u2019s degree or higher increases, the % of the vote for Trump decreases. At the same time, as the percentages of white and uninsured residents in a state increase, Trump\u2019s % of the vote increases. 
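The kind of coefficient table shown earlier can be reproduced in outline with plain NumPy. This is a hedged sketch on synthetic data, not the actual state data or the author\u2019s code; it simply shows where the coefficients, standard errors, and t statistics come from:

```python
import numpy as np

# Sketch: ordinary least squares by hand on SYNTHETIC data, recovering
# the quantities in a regression output table -- coefficients, standard
# errors, and t statistics.
rng = np.random.default_rng(0)
n = 51                                        # 50 states + DC
X = np.column_stack([
    np.ones(n),                               # intercept
    rng.uniform(20, 45, n),                   # % bachelor's or higher
    rng.uniform(25, 95, n),                   # % white
    rng.uniform(3, 18, n),                    # % uninsured
])
true_beta = np.array([51.55, -1.11, 0.31, 0.74])
y = X @ true_beta + rng.normal(0, 5, n)       # noisy "vote share"

beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # coefficient estimates
resid = y - X @ beta
dof = n - X.shape[1]                          # degrees of freedom
sigma2 = resid @ resid / dof                  # residual variance
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))  # standard errors
t_stats = beta / se                           # t statistics, as in the table
```

A coefficient whose |t| is large (roughly greater than 2 at this sample size) is the same thing as a p-value below 0.05 and a 95% confidence interval that does not straddle zero.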
In the presence of these variables, the concentration of hate groups and the % of the state in poverty are no longer significant predictors of Trump\u2019s vote.<\/p>\n<p>As Trump and the Republican-controlled Congress prepare to repeal the Affordable Care Act (ACA, or, as the GOP calls it, Obamacare), the Congressional Budget Office estimates that 23 million Americans would lose their health insurance under the House version of the bill and an estimated 22 million under the Senate version. In this model the uninsured rate in each state is positively correlated with Trump\u2019s vote. Does Trump believe that increasing the uninsured rate will increase his share of the vote in 2020?<\/p>\n<p>Poverty was not associated with Trump\u2019s vote in 2016. The decrease in the uninsured rate since the ACA went into effect in 2014 is mostly due to Medicaid expansion for the poorest individuals and to subsidies that allow lower-income individuals to purchase health insurance. Increasing the number of uninsured may not decrease Trump\u2019s vote, but it is unlikely to increase it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post originally appeared in my column on the site data driven journalism. In my last post I talked about how regression can be a useful tool to tease apart the different relationships between correlated variables. I also talked about how outliers can be problematic. 
One way of dealing with an outlier is simply to<\/p>\n<div class=\"read-more\"><a href=\"https:\/\/www.kolabtree.com\/blog\/correct-outliers-regression-trumps-vote\/\" title=\"Read More\">Read More<\/a><\/div>\n","protected":false},"author":31,"featured_media":2400,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[247],"tags":[360,175,361,246,362],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v20.1 (Yoast SEO v20.1) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to Correct Outliers in Regression Models: An analysis of Trump&#039;s vote<\/title>\n<meta name=\"description\" content=\"Using Trump&#039;s vote as an example, Paul Ricci writes about how introducing the right covariate can correct for outliers in regression models\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.kolabtree.com\/blog\/correct-outliers-regression-trumps-vote\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to Correct Outliers in Regression Models: An example with race, education, and the uninsured on Trump\u2019s vote\" \/>\n<meta property=\"og:description\" content=\"Using Trump&#039;s vote as an example, Paul Ricci writes about how introducing the right covariate can correct for outliers in regression models\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.kolabtree.com\/blog\/correct-outliers-regression-trumps-vote\/\" \/>\n<meta property=\"og:site_name\" content=\"The Kolabtree Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/kolabtree\" \/>\n<meta property=\"article:published_time\" content=\"2017-08-10T17:28:14+00:00\" \/>\n<meta property=\"article:modified_time\" 
content=\"2017-08-14T10:11:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/donald-2147250_1280-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1280\" \/>\n\t<meta property=\"og:image:height\" content=\"853\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Paul Ricci\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@CSIwoDB\" \/>\n<meta name=\"twitter:site\" content=\"@kolabtree\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Paul Ricci\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"How to Correct Outliers in Regression Models: An analysis of Trump's vote","description":"Using Trump's vote as an example, Paul Ricci writes about how introducing the right covariate can correct for outliers in regression models","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.kolabtree.com\/blog\/correct-outliers-regression-trumps-vote\/","og_locale":"en_US","og_type":"article","og_title":"How to Correct Outliers in Regression Models: An example with race, education, and the uninsured on Trump\u2019s vote","og_description":"Using Trump's vote as an example, Paul Ricci writes about how introducing the right covariate can correct for outliers in regression models","og_url":"https:\/\/www.kolabtree.com\/blog\/correct-outliers-regression-trumps-vote\/","og_site_name":"The Kolabtree 
Blog","article_publisher":"https:\/\/www.facebook.com\/kolabtree","article_published_time":"2017-08-10T17:28:14+00:00","article_modified_time":"2017-08-14T10:11:42+00:00","og_image":[{"width":1280,"height":853,"url":"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/donald-2147250_1280-1.jpg","type":"image\/jpeg"}],"author":"Paul Ricci","twitter_card":"summary_large_image","twitter_creator":"@CSIwoDB","twitter_site":"@kolabtree","twitter_misc":{"Written by":"Paul Ricci","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/#article","isPartOf":{"@id":"https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/"},"author":{"name":"Paul Ricci","@id":"https:\/\/www.kolabtree.com\/blog\/#\/schema\/person\/d3ae828656a4c84a3a1b7cdba371820f"},"headline":"How to Correct Outliers in Regression Models: An example with race, education, and the uninsured on Trump\u2019s vote","datePublished":"2017-08-10T17:28:14+00:00","dateModified":"2017-08-14T10:11:42+00:00","mainEntityOfPage":{"@id":"https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/"},"wordCount":1063,"commentCount":0,"publisher":{"@id":"https:\/\/www.kolabtree.com\/blog\/#organization"},"keywords":["analytics","Data Science","statistics","trump","US elections"],"articleSection":["Guest posts"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/","url":"https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/","name":"How to Correct Outliers in Regression Models: An analysis of Trump's 
vote","isPartOf":{"@id":"https:\/\/www.kolabtree.com\/blog\/#website"},"datePublished":"2017-08-10T17:28:14+00:00","dateModified":"2017-08-14T10:11:42+00:00","description":"Using Trump's vote as an example, Paul Ricci writes about how introducing the right covariate can correct for outliers in regression models","breadcrumb":{"@id":"https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.kolabtree.com\/blog\/pt\/correct-outliers-regression-trumps-vote\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.kolabtree.com\/blog\/"},{"@type":"ListItem","position":2,"name":"How to Correct Outliers in Regression Models: An example with race, education, and the uninsured on Trump\u2019s vote"}]},{"@type":"WebSite","@id":"https:\/\/www.kolabtree.com\/blog\/#website","url":"https:\/\/www.kolabtree.com\/blog\/","name":"The Kolabtree Blog","description":"Expert Views on Science, Innovation and Product Development","publisher":{"@id":"https:\/\/www.kolabtree.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.kolabtree.com\/blog\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.kolabtree.com\/blog\/#organization","name":"Kolabtree","url":"https:\/\/www.kolabtree.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.kolabtree.com\/blog\/#\/schema\/logo\/image\/","url":"","contentUrl":"","caption":"Kolabtree"},"image":{"@id":"https:\/\/www.kolabtree.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/kolabtree","https:\/\/twitter.com\/kolabtree","https:\/\/instagram.com\/kolabtree","https:\/\/www.linkedin.com\/company\/kolabtree","https:\/\/en.m.wikipedia.org\/wiki\/Kolabtree"]},{"@type":"Person","@id":"https:\/\/www.kolabtree.com\/blog\/#\/schema\/person\/d3ae828656a4c84a3a1b7cdba371820f","name":"Paul Ricci","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.kolabtree.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Race-4-the-Cure-006-96x96.jpg","contentUrl":"https:\/\/www.kolabtree.com\/blog\/wp-content\/uploads\/2017\/08\/Race-4-the-Cure-006-96x96.jpg","caption":"Paul Ricci"},"description":"Paul Ricci is a statistician, neuropsychologist and data analyst based in the USA. 
He writes a regular column for the website Data Driven Journalism and has an MA in Research Methodology and Neuroscience, and an MS in Biostatistics.","sameAs":["https:\/\/csiwodeadbodies.blogspot.com\/","https:\/\/twitter.com\/@CSIwoDB"],"url":"https:\/\/www.kolabtree.com\/blog\/author\/paul-ricci\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/posts\/2374"}],"collection":[{"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/users\/31"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/comments?post=2374"}],"version-history":[{"count":7,"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/posts\/2374\/revisions"}],"predecessor-version":[{"id":2401,"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/posts\/2374\/revisions\/2401"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/media\/2400"}],"wp:attachment":[{"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/media?parent=2374"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/categories?post=2374"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kolabtree.com\/blog\/wp-json\/wp\/v2\/tags?post=2374"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}