Specifically, I was wondering whether count() would not accept string variables in earlier versions, but I think that it did in 13.1.

Let's see how _n and _N work. Stata has two built-in variables called _n and _N. _n is Stata notation for the current observation number. When _n is combined with by, however, _n is the observation number within the by-group, in this case, within oldid. If there were three oldid == 1 observations followed by two oldid == 2 observations in the dataset, _n would take on the values 1, 2, 3, 1, 2. Notice that the numbering restarts with each group.

Descriptive statistics give you a basic understanding of one or more variables and how they relate to each other. If you are new to Stata, we strongly recommend reading all the articles in the Stata Basics section. Moreover, as bug fixes and new features are issued frequently by StataCorp, make sure that you update your Stata before posting a query, as your problem may already have been solved.

Stata: using egen group() to create unique identifiers. I have a dataset where each row is a firm-year pair, with a firmid that is a string. Besides, I would like to account for missing values: if all values of var1 for company x are missing, then sum1 for company x and that specific interval must be missing, not 0.

Stata's answer in table is arguably what would be expected: given an instruction to calculate maximums, it does that by group and for the total dataset. egen max_X = max(X), by(G) is a safer way to do it.

MA and MN passed a law in 2009. I want one graph per group of states that passed a mandate in the same mandate year.
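The behaviour of _n and _N within by-groups, and the group maximum from egen, can be checked on a small example. This is only a sketch: the data are made up, and obs_all, obs_in_group, group_size and max_X are hypothetical variable names chosen for illustration.

```stata
* Hypothetical toy data: three oldid == 1 rows followed by two oldid == 2 rows
clear
input oldid X
1 4
1 7
1 2
2 5
2 .
end

* Observation number over the whole dataset: 1, 2, 3, 4, 5
generate obs_all = _n

* Observation number within each oldid group: 1, 2, 3, 1, 2
* (obs_all in parentheses keeps the original order inside each group)
bysort oldid (obs_all): generate obs_in_group = _n

* Number of observations in the group, repeated on every row: 3, 3, 3, 2, 2
by oldid: generate group_size = _N

* Group maximum; egen max() ignores the missing X in group 2
egen max_X = max(X), by(oldid)

list, sepby(oldid)
```

With these data, max_X should be 7 for oldid == 1 and 5 for oldid == 2, even though one X in the second group is missing.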
Total   2443.45946   73   33.4720474
Bartlett's test for equal variances: chi2(1) = 3.4818   Prob>chi2 = 0.062

The F statistic is 13.18, and the difference between foreign and domestic cars' mileage ratings is ...

Stata for Students: Descriptive Statistics. This article is part of the Stata for Students series.

Create New, or Modify Existing, Variables: Commands generate/replace and egen.

_n is 1 in the first observation, 2 in the second, 3 in the third, and so on. _N is Stata notation for the total number of observations. Using by causes this numbering to occur independently by group. Using _N: _N gives a count of the total number of observations being worked with.

My data contains individual observations (taking a value 0-8 on independent variable X) divided into small unequal groups, where each group is uniquely identified by a grouping variable (G). You want the maximums by group, but also to see their total or sum. That seems puzzling, but it can be done indirectly. bysort G (X) : gen max_X = X[_N] would do it if no X were ever missing; missing values sort last, so a group containing a missing X would get a missing maximum.

I am using egen total() with Stata 10.1. The following works correctly but does not store the result in a variable so I can use it: total X if stu_id==710740 & hsflag==1. The following produces missing values: egen points=total(X) if stu_id==710740 & hsflag==1. What am I doing wrong?

bysort id eventid: egen _sum = total(var1) or, more simply, egen _sum = total(var1), by(id eventid) should both give you the total you want.

If I get back to my previous example: CA and CO passed a law in 2008. I want one graph with CA and CO only that shows the total number of people enrolled in each type of plan (HMO etc.) across interview years.
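The points about egen total(), the if qualifier, and missing values can be illustrated on made-up data. Treat this as a sketch under stated assumptions: the dataset and the names sum0, sum1, sum_id1, max_sorted and max_safe are hypothetical, and the missing option of total() is assumed to be available in the Stata version in use.

```stata
* Hypothetical toy data: id 1 has one missing var1, id 2 has only missing var1
clear
input id var1
1 10
1 .
1 5
2 .
2 .
end

* total() treats missing as 0, so the all-missing group gets 0
egen sum0 = total(var1), by(id)

* With the missing option, an all-missing group gets missing instead of 0
bysort id: egen sum1 = total(var1), missing

* An if qualifier limits both the rows that are summed and the rows that
* receive a value; rows failing the condition are left missing
egen sum_id1 = total(var1) if id == 1

* Sorted approach: missing sorts last, so var1[_N] is missing whenever
* the group contains any missing value
bysort id (var1): generate max_sorted = var1[_N]

* egen max() ignores missing values unless the whole group is missing
egen max_safe = max(var1), by(id)

list, sepby(id)
```

Here sum_id1 holds 15 on the id == 1 rows and is missing elsewhere, which may be the kind of missing values the egen ... if question above ran into.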