
How to manage multiple categorical variables (with multiple levels) in STATA

Hi peers,
I am starting an econometric analysis of the outcomes of a nationwide survey about the role of pesticides in the beekeeping sector. In the survey, people were asked to rate each environmental hazard from 1 ("less risky") to 4 ("higher riskiness"); for each hazard I created four dummy variables, one per category. The dependent variable is also a dummy variable for pesticides (pest_d1, where 1 = high risk perceived and 0 = otherwise).
Then I ran a logit model in this way (every independent variable has 3 dummy variables, 4-1 to avoid multicollinearity):
logit pest_d cc_d1 cc_d2 cc_d3 l_habitatd1 l_habitatd2 l_habitatd3 org_agrid1 org_agrid2 org_agrid3
I then requested odds ratios and got this:
Iteration 0:  log likelihood = -197.92648
Iteration 1:  log likelihood = -140.03682
Iteration 2:  log likelihood = -133.65316
Iteration 3:  log likelihood = -133.54577
Iteration 4:  log likelihood = -133.54573
Iteration 5:  log likelihood = -133.54573

Logistic regression                                     Number of obs =    360
                                                        LR chi2(8)    = 128.76
                                                        Prob > chi2   = 0.0000
Log likelihood = -133.54573                             Pseudo R2     = 0.3253

------------------------------------------------------------------------------
      pest_d | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       cc_d1 |   .1522803   .3224346    -0.89   0.374     .0024007     9.65955
       cc_d2 |   .1455828   .0669988    -4.19   0.000     .0590714    .3587919
       cc_d3 |   .2762304   .0966967    -3.68   0.000     .1390918    .5485818
 l_habitatd1 |   .0609412   .0550912    -3.09   0.002     .0103614    .3584287
 l_habitatd2 |   .0629113   .0306199    -5.68   0.000     .0242345    .1633134
 l_habitatd3 |    .123277   .0463748    -5.56   0.000     .0589758    .2576857
  org_agrid1 |   .1900488   .1100751    -2.87   0.004     .0610737    .5913926
  org_agrid2 |   .3860638   .1414074    -2.60   0.009     .1883133    .7914748
       _cons |   42.90251   16.91879     9.53   0.000      19.8065    92.93038
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.
My questions are:
- Am I doing the right analysis by using the "logit" command? I don't actually know how to manage categorical variables with multiple levels.
- I was thinking of doing a marginsplot to show the relations among those variables, but I don't know how to make one with categorical variables with multiple levels.
Could you suggest any kind of analysis? Thanks in advance.
submitted by chiaranota to stata
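
The "4-1" dummy coding described in this post can be sketched in a few lines. This is only an illustration in Python of what the indicators look like, assuming level 4 is the omitted reference category; the function name `dummy_code` and the sample ratings are made up. In Stata itself, factor-variable notation (the i. prefix on the original 1-4 variable) builds such indicators automatically.

```python
# Hypothetical illustration: turn a 4-level rating (1-4) into 3 indicator
# columns, leaving one level out as the reference category.
def dummy_code(value, levels=(1, 2, 3, 4), reference=4):
    """Return a dict of 0/1 indicators for every level except the reference."""
    return {f"d{lvl}": int(value == lvl) for lvl in levels if lvl != reference}


ratings = [1, 4, 2, 3]  # made-up responses for a single hazard
rows = [dummy_code(r) for r in ratings]

for rating, row in zip(ratings, rows):
    print(rating, row)
```

A respondent in the reference category has all three indicators equal to 0, which is why each reported odds ratio is read relative to that omitted level.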

Qualtrics separates categorical select-all questions into discrete variables for each answer. For example, instead of Q1 with three answers being represented by one variable, it is Q1-1, Q1-2, Q1-3 as three separate variables. How do I combine these under one question variable in Stata?

submitted by Waterspritehere to stata
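
One way to think about combining select-all columns is to collect the ticked options of each respondent into a single value. The sketch below shows that logic in Python, not Stata; the column names `Q1_1` to `Q1_3`, the 0/1 coding, and the answer labels are all assumptions for illustration.

```python
# Column names, 0/1 coding and labels are hypothetical.
respondents = [
    {"Q1_1": 1, "Q1_2": 0, "Q1_3": 1},
    {"Q1_1": 0, "Q1_2": 0, "Q1_3": 0},
    {"Q1_1": 0, "Q1_2": 1, "Q1_3": 0},
]
labels = {"Q1_1": "A", "Q1_2": "B", "Q1_3": "C"}


def combine(row):
    """Join the labels of every ticked option; empty string if none is ticked."""
    return ";".join(labels[col] for col in labels if row.get(col) == 1)


q1 = [combine(r) for r in respondents]
print(q1)
```

Because respondents can tick several options, the combined variable is a set of answers rather than a single category, so a delimited string (or one row per ticked option) is needed to keep all the information.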

Recoding Categorical variables to be binary

Hi all, I'm trying to recode my categorical variable to be binary. Right now the variable is in Y and N format and I want to change it to 1's and 0's. When I run the code to recode the variable, it produces an error. For context, the name of the dataset is bots1 and the variable I'm trying to recode is named "Bot".
This is the line of code I'm trying to run to recode:
bots1$Bot = recode(bots1$Bot, "Y"="1", "N"="0")
Here's the error:
Error in recode(bots1$Bot, Y = "1", N = "0") :
unused arguments (Y = "1", N = "0")
Does anyone know what I'm doing wrong? I already have dplyr installed and imported into my R script.
submitted by kornlovespoop to rstats
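
The recoding itself is just a lookup from old values to new ones. The sketch below shows that logic in Python with made-up data; it is not a fix for the R call. The "unused arguments" message suggests that the `recode` being called does not accept named replacements, for example because another attached package defines a function with the same name, but that is a guess from the error text alone.

```python
# Made-up values; None stands for a missing entry.
bot = ["Y", "N", "N", "Y", None]

mapping = {"Y": 1, "N": 0}

# dict.get returns None for anything not in the mapping, so unexpected
# or missing values stay missing instead of raising an error.
bot_binary = [mapping.get(value) for value in bot]
print(bot_binary)
```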

How can I test the total effect of an interaction term between two categorical variables in STATA ?

Hi there :)
I have a negative binomial regression model with two categorical variables and their interaction. However, STATA's output displays comparisons between the coefficients of the levels of each categorical variable instead of the total effect. For instance, one of my variables is Profile, with 5 categories, and what STATA gives me is the effect of each category compared to the category set as the reference point. The same logic is applied to the interaction term. Does anyone know how I can find the total effect of the interaction?
submitted by Kytherian_Aphrodite to rstats

Stata treating categorical variables as measurements

My analysis needs BankID and CountryID. Each bank and country is assigned a number. The problem is that Stata treats these as measurements and not categories. Is there a way to fix this?
Thank you.
submitted by Random8765434567 to stata

I have a bunch of date-time variables in Stata (non-string, already converted), e.g. January 1 2020 23:58. How do I categorize shift times (e.g. am shift 8a-4p, pm shift 4p-12a, and night shift 12a-8a)? Thanks!

submitted by goldendado82 to stata
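
The shift classification only depends on the hour of the timestamp. Here is a minimal Python sketch of that rule, assuming each shift includes its start hour and excludes its end hour; the function name `shift_of` is hypothetical.

```python
from datetime import datetime


def shift_of(ts):
    """Classify a timestamp by its hour: 08-16 am, 16-24 pm, 00-08 night."""
    if 8 <= ts.hour < 16:
        return "am"
    if ts.hour >= 16:
        return "pm"
    return "night"


print(shift_of(datetime(2020, 1, 1, 23, 58)))
```

The same three conditions carry over to any package that can extract the hour from a date-time value.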

[Q] How do I recode a scale variable (that I made from multiple categorical variables) into a dichotomous variable on SPSS?

TL;DR: After hours of playing around in SPSS I still cannot recode my newly created scale variable into a dichotomous one, as I keep getting Error # 4631 and Warning # 4461. Any tips on how to solve these?
Hi everyone, before anything, I know this sounds like I messed up (and I have), but I am in desperate need of some guidance right now. Some quick background: my analysis uses the ESS data set from 2016 (which is basically a ton of Likert scales). I'm looking into the potential influence of climate anxiety on domestic political trust within the Netherlands, and decided to recode a bunch of political trust variables into a single scale variable. I thought all was going well until I got some absurd values in my one-way ANOVA, and it turns out that my data is extremely heteroscedastic AND non-normally distributed. After hours of playing around I still cannot manage to recode my newly created scale variable into a dichotomous one, as I keep getting Error # 4631 and Warning # 4461. I am fully aware that I will lose valuable information (variability), be forced to work with attenuated correlation coefficients, etc. (please don't hate me), but I do not see any other way of performing a regression analysis. Has anyone got any tips on how to solve this? Thanks in advance for any help!!
submitted by Complexicity to statistics

How do I recode a date variable into a categorical one?

I have a date variable currently in the format dd-mm-yyyy hh:mm:ss. I want to create a new variable with 3 categories based on date ranges: 1 July – 30 November 2020, 1 December 2020 – 30 April 2021, and 1 May – 30 September 2021.
I have tried the syntax below (the variable name is 'StartDate'), but although this creates a new variable in the data file ('Treatment'), there are no values in it; it's blank.
if (StartDate >= date.dmy(1,7,20) & StartDate < date.dmy(30,11,20)) Treatment = 1.
if (StartDate >= date.dmy(1,12,20) & StartDate < date.dmy(30,4,21)) Treatment = 2.
if (StartDate >= date.dmy(1,5,21) & StartDate < date.dmy(30,9,21)) Treatment = 3.

Appreciate it if anyone can help with this problem.

submitted by WJ_L to spss
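
The intended rule can be written down independently of SPSS. The Python sketch below is only an illustration of the three ranges from the post, with two deliberate choices that are assumptions rather than a diagnosis of the syntax above: years are written with four digits to avoid any ambiguity, and the end date of each range is included. The names `PERIODS` and `treatment` are hypothetical.

```python
from datetime import date, datetime

# (first day, last day, category code), taken from the ranges in the post.
PERIODS = [
    (date(2020, 7, 1), date(2020, 11, 30), 1),
    (date(2020, 12, 1), date(2021, 4, 30), 2),
    (date(2021, 5, 1), date(2021, 9, 30), 3),
]


def treatment(start_date):
    """Return 1, 2 or 3 for a date in one of the ranges, else None."""
    # Drop the time of day so that timestamps on the last day still match.
    if isinstance(start_date, datetime):
        start_date = start_date.date()
    for first, last, code in PERIODS:
        if first <= start_date <= last:
            return code
    return None


print(treatment(datetime(2021, 4, 30, 23, 0)))
```

Dropping the time component matters because a timestamp late on 30 April 2021 is later than midnight on that date and would otherwise fall outside the second range.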

What test for 2 categorical variables when one category has 3 choices vs 2 in the other category.

I am looking to test significance between Group 1 (n=60), Group 2 (n=15) and Group 3 (n=20), testing for a difference between yes and no responses to the questions at hand.

What is the proper analysis to run in SPSS?

I am comfortable with chi-square for testing 2 categorical variables but get confused when you add the third category. The groups are exclusive: a respondent can only be in Group 1, 2 or 3.

Thank you so much
submitted by cgd5054 to spss
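
A chi-square test of independence extends from a 2x2 table to a 3x2 table without any change in the formula; only the degrees of freedom change, to (3-1)(2-1) = 2. The Python sketch below computes the statistic by hand. The yes/no counts are invented for illustration; only the group sizes (60, 15, 20) come from the post.

```python
import math

# Hypothetical (yes, no) counts per group; only the row totals are from the post.
observed = {"Group 1": (40, 20), "Group 2": (6, 9), "Group 3": (9, 11)}


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    rows = list(table.values())
    col_totals = [sum(r[j] for r in rows) for j in range(len(rows[0]))]
    n = sum(col_totals)
    stat = 0.0
    for r in rows:
        row_total = sum(r)
        for j, obs in enumerate(r):
            expected = row_total * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(rows[0]) - 1)
    return stat, df


stat, df = chi_square(observed)
# For df = 2 the chi-square upper-tail probability has the closed form exp(-x/2).
p_value = math.exp(-stat / 2)
print(round(stat, 3), df, round(p_value, 4))
```

With groups as small as 15 and 20, it is worth checking the expected cell counts before relying on the chi-square approximation.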

Recoding Date variables into categorical in SPSS?

Anyone know the best way for me to pull this off?

I've tried the syntax below, but it's not working.

Thank you!
do if (referral_date>=date.ddmmyy(02,Apr,12) and referral_date < date.ddmmyy(26,Nov,18)).
comp refdateREC = 0.
else if (referral_date>=date.ddmmyy(01,Jul,14) and referral_date < date.ddmmyy(26,Nov,18)).
comp refdateREC = 1.
end if.
submitted by lumberepi to statistics

[Q] Categorical Variables for Inference in Regression - Resources and Techniques

Hi there,
I'm prepping for a take-home assignment for an interview that will likely involve regression analysis. I am familiar with how to work with continuous variables, but lack some sophistication in working with categorical variables.
How can I decide whether or not to include categorical variables in a regression model? I was thinking of using F-tests to see if there is a difference between models with or without a categorical variable, although I seem to be doing it wrong. Would appreciate thoughts on this.
How do I find out if categorical variables are highly correlated with each other or with other continuous variables?
Does anyone have any general tips and resources for how to incorporate categorical variables into a regression analysis?
submitted by i_am_baldilocks to statistics

Looking for Stata code, to generate new variable in a sequence based on specific criteria

I'm trying to figure out a way to identify the first time a number in a sequence by group (ID) is greater than 29 (number). For example, I would like to end up with the following:
ID Number NewVar Date
1 28 0 11/4/21
1 28 0 11/5/21
1 30 1 11/6/21
1 28 0 11/7/21
1 30 0 11/8/21
However, what I have been doing is generating a plain bysort ID: gen NewVar=1 if Number>29, which doesn't solve my problem: I want only the FIRST time a number is greater than 29 to generate NewVar=1. Is this possible to do in Stata?
submitted by Reader_West7112 to stata
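
The rule "flag only the first exceedance within each ID" can be sketched as a single pass that remembers which IDs have already been flagged. This is Python rather than Stata, using the example rows from the post and assuming they are already sorted by date within each ID; the function name `flag_first_exceed` is hypothetical.

```python
# (ID, Number, Date) rows copied from the example in the post.
rows = [
    (1, 28, "11/4/21"),
    (1, 28, "11/5/21"),
    (1, 30, "11/6/21"),
    (1, 28, "11/7/21"),
    (1, 30, "11/8/21"),
]


def flag_first_exceed(records, threshold=29):
    """Return 1 for the first row per ID whose number exceeds the threshold, else 0."""
    already_flagged = set()
    flags = []
    for id_, number, _date in records:
        if number > threshold and id_ not in already_flagged:
            flags.append(1)
            already_flagged.add(id_)
        else:
            flags.append(0)
    return flags


new_var = flag_first_exceed(rows)
print(new_var)
```

The key difference from a plain "if Number>29" condition is the per-ID memory of whether an exceedance has already been seen.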

[D] dealing with categorical variables in a classification problem

I am working on a model where I have to predict whether the outcome is a yes or a no. The data I have is completely categorical: age in certain brackets, gender, has license, etc. The metric used to evaluate models is log loss. I used logistic regression as my first model and got a log loss score of 0.68. That is regarded as a baseline score for the log loss metric, in the sense that it is roughly the score we get when we give a probability of 0.5 for each row in the test data.
I have used the chi-square test to see the relation between the independent variables and the dependent variable. There are a few variables that are strongly associated with the dependent variable, but the model with just those variables also gets a similar score.
I want to know what transformations or preprocessing I can do on this dataset. I have searched for resources but couldn't find any.
I would appreciate it if someone could share their thoughts on this. Thank you.
submitted by yagami_light_1210 to statistics
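
The baseline mentioned in this post follows directly from the definition of log loss: predicting 0.5 for every row gives -ln(0.5), which is about 0.693 regardless of the labels. A small Python check, with made-up labels:

```python
import math


def log_loss(y_true, p_pred):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(y_true, p_pred)
    )
    return -total / len(y_true)


y = [1, 0, 0, 1, 1]  # made-up labels
baseline = log_loss(y, [0.5] * len(y))
print(round(baseline, 4))
```

A score of 0.68 is therefore only slightly better than predicting 0.5 everywhere.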


By selecting the 50 most common possible values (from a total of 71) of a categorical variable, I'll cover 95% of the dataset. I would like to group all the others under an "Others" label. If I do this, the "Others" value will be the most common in the dataset. How could I deal with this problem?

submitted by MysteriousWealth2423 to learnmachinelearning
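
The grouping step described in this post can be sketched as follows, in Python with toy data; the function name `lump_rare` and the `keep` parameter are hypothetical. Counting the result afterwards makes it easy to see how large the lumped category becomes for a given cutoff.

```python
from collections import Counter


def lump_rare(values, keep, other="Others"):
    """Keep the `keep` most frequent values and replace the rest with `other`."""
    top = {value for value, _count in Counter(values).most_common(keep)}
    return [value if value in top else other for value in values]


values = ["a"] * 5 + ["b"] * 3 + ["c", "d"]  # toy data
lumped = lump_rare(values, keep=2)
print(Counter(lumped))
```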

[Q] Can I use a discrete (categorical) variable coded 0 and 1 as the dependent variable in linear regression in SPSS?

I’m trying to predict the toxicity in some of the patients receiving a treatment.
So I coded toxicity as 1 or 0:
1 = patient has toxicity, 0 = patient does not have any toxicity.

All of the independent variables used in the analysis are continuous.
submitted by iNooon to statistics

Tidying up the data - multiple categorical variables in different columns - for Likert scale

Hi redditors of Rstats,
I have trouble wrapping my head around this.
I have the following raw dataset (df):
Participant No   Question 1          Question 2       Question 3
1                Agree               Strongly Agree   Neutral
2                Strongly Agree      Neutral          Agree
3                Strongly Disagree   Disagree         Disagree
4                Neutral             Strongly Agree   Agree
...              ...                 ...              ...
Answers to Questions 1, 2, and 3 are categorical variables/factors with 5 levels (Strongly Disagree-Disagree-Neutral-Agree-Strongly Agree).
I want to summarize the answers to each question and get the final output in a table like this:
Item         Strongly Agree   Agree   Neutral   Disagree   Strongly Disagree
Question 1   20%              15%     30%       20%        15%
Question 2   10%              10%     10%       30%        40%
Question 3   20%              30%     40%       0%         10%
What would be the best way to tidy up the data in R?
I tried to create counts with dplyr and build prop.tables, but I need to iterate this across numerous questions and get the output in a single table with ordered factor levels, similar to the table shown above.
submitted by hurmash1ca to rstats
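
The summary described in this post is a count of each answer level per question, divided by the number of respondents. The Python sketch below shows that logic on the four example rows from the post (so the percentages differ from the illustrative target table); in R the same idea is usually expressed by reshaping to long format and counting. The names `LEVELS` and `percent_table` are hypothetical.

```python
from collections import Counter

# Fixed column order for the output table.
LEVELS = ["Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree"]

# The four example participants from the post, stored per question.
df = {
    "Question 1": ["Agree", "Strongly Agree", "Strongly Disagree", "Neutral"],
    "Question 2": ["Strongly Agree", "Neutral", "Disagree", "Strongly Agree"],
    "Question 3": ["Neutral", "Agree", "Disagree", "Agree"],
}


def percent_table(answers_by_item):
    """Return {item: {level: percentage}} with every level present, even at 0%."""
    table = {}
    for item, answers in answers_by_item.items():
        counts = Counter(answers)
        table[item] = {lvl: 100 * counts[lvl] / len(answers) for lvl in LEVELS}
    return table


table = percent_table(df)
for item, row in table.items():
    print(item, [f"{row[lvl]:.0f}%" for lvl in LEVELS])
```

Iterating over a fixed list of levels is what keeps empty categories in the table as 0% instead of dropping them.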

I have a continuous variable with several missing values. They are not random; they indicate that the variable cannot be computed in those cases. How can I deal with a situation like this? (For example, if it were a categorical variable, I could create a new category.)

submitted by MysteriousWealth2423 to learnmachinelearning
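
One commonly used analogue of "a new category" for a continuous variable is to add a separate 0/1 indicator for "not computable" and fill the gaps with a placeholder value. The Python sketch below shows only the mechanics, with made-up numbers; whether this is appropriate depends on the model and is not something the post settles.

```python
# Made-up values; None marks cases where the variable cannot be computed.
values = [3.2, None, 1.5, None]

# 1 where the value is structurally missing, 0 otherwise.
not_computable = [int(v is None) for v in values]

# Placeholder fill (0.0 here is an arbitrary choice); the indicator lets a
# model treat these rows differently from genuine zeros.
filled = [0.0 if v is None else v for v in values]

print(not_computable, filled)
```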

Categorical variable in regression and pairwise comparisons

When running a regression with dummy-coded predictors that have 3 or more levels, we end up comparing n-1 levels to 1 baseline level only.
Is there a way to do a pairwise analysis in regression analysis? Like in post-hoc Anova analysis.
Thanks in advance!
Edit: I'm actually running an ordered regression, with a 5-point Likert item as the response variable and a factor with 5 levels (5 groups/conditions) as the IV. I want to do a pairwise comparison between groups.
submitted by nirvana5b to AskStatistics

recode variables based on a table in excel

I have a database with many variables whose values I need to recode, and I have an Excel table with the recoded values. How can I recode quickly and automatically, without having to do it manually?
I give a small and easy example. I have this base:
     sport    music         city
 1   soccer   tango         San Pablo
 2   voley    pop           Buenos Aires
 3   rugby    heavy metal   Buenos Aires
 4   basket   bossa nova    Buenos Aires
 5   tenis    rock          San Pablo
 6   tenis    pop           Buenos Aires
 7   voley    bossa nova    London
 8   rugby    heavy metal   London
 9   basket   tango         San Pablo
10   tenis    bossa nova    San Pablo
11   basket   heavy metal   Paris
12   basket   tango         Paris
13   basket   tango         Buenos Aires
14   voley    tango         Buenos Aires
15   tenis    heavy metal   London
And here are the excel charts with how to recode.

This example can be done manually without problem, but I'm posing the question for a 20-variable scenario where each variable has 20 values to recode.
submitted by International_Mud141 to rstats
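
Recoding many variables from a lookup table comes down to loading the table once and applying it column by column. The Python sketch below assumes the Excel sheet has been exported in a long layout with columns `variable`, `old` and `new`; that layout, the example codes, and the function names are all hypothetical, since the recoding tables are not shown in the post.

```python
import csv
import io

# Hypothetical lookup table in long format (e.g. exported from Excel as CSV).
lookup_csv = """variable,old,new
sport,soccer,1
sport,voley,2
city,San Pablo,SP
city,Buenos Aires,BA
"""


def load_lookup(text):
    """Build {variable: {old_value: new_value}} from the long-format table."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(text)):
        lookup.setdefault(row["variable"], {})[row["old"]] = row["new"]
    return lookup


def recode(records, lookup):
    """Recode every variable that has a mapping; leave unmapped values unchanged."""
    return [
        {var: lookup.get(var, {}).get(val, val) for var, val in rec.items()}
        for rec in records
    ]


data = [
    {"sport": "soccer", "music": "tango", "city": "San Pablo"},
    {"sport": "voley", "music": "pop", "city": "Buenos Aires"},
]
recoded = recode(data, load_lookup(lookup_csv))
print(recoded)
```

Keeping the mapping in one long table means that adding a twentieth variable only adds rows to the lookup, not code.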

Can I run Binary Logistic Regression in which one independent variable(categorical) is a part of another independent variable (categorical)

Hi, so the research I'm doing is about bladder cancer and its association with various factors such as age, gender, risk factors, associated symptoms, smoking history, etc.
I am applying binary logistic regression to these factors to calculate odds ratios for bladder cancer.
Now the problem is that there are a total of 8 risk factors in my study of 259 patients. However, some risk factors occur only a few times (1-5). So such a low number of cases can't yield a reliable odds ratio? (I think there have to be at least 10 cases, or some rule like that?)
Initially I just ran a binary logistic regression of having bladder cancer (yes/no) on presence of any risk factor (yes/no). But I want to calculate odds ratios separately for the most commonly occurring risk factors, such as stones (around 20 cases) or UTI (around 20 cases).
Now, can I include presence of risk factors (Y/N), presence of stones (Y/N) and UTI (Y/N)? Does it violate the assumptions of logistic regression? (The independent variables are correlated?)
Or do I have to separate them as presence of stones (Y/N), presence of UTI (Y/N) and "risk factors other than UTI & stones" (Y/N)?
submitted by GogaReborn to biostatistics
