*Total 80 pts;
*This workshop practices dataset concatenation and transposition;
*Q1. (20pts) Add proper statements in the two data steps below to read the instream
data properly. Check the provided data descriptions and make sure the created
variable names are intuitive.
Hint: when reading instream data with a DATALINES statement, if you need to specify a
delimiter other than a blank, add an INFILE statement following the example below:
INFILE DATALINES DLM=YourDelimiter;
* The use of this INFILE statement and DLM= option is similar to that when reading
external data files;
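* Illustration only, not part of any answer: a minimal sketch of the hint above,
  assuming comma-delimited instream data. The data set Demo_Fruit, its variables,
  and its values are hypothetical;
data Demo_Fruit;
    infile datalines dlm=',';
    * LENGTH leaves room for values longer than the default 8 characters;
    length Fruit $ 12;
    input Fruit $ Price Quantity;
    datalines;
Green Apple,1.25,10
Banana,0.5,24
;
run;
* Because the comma is the only delimiter here, the embedded blank in Green Apple
  is read as part of the value;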
* Below are annual GDP data (in millions of dollars) for selected countries from 2017
to 2022. Add the proper statements to complete the data step;
data GDP;
datalines;
Australia 1220649.3 1291061.7 1368951.8 1382826.1 1557538.5 1809327.4
Canada 1765763.2 1852987.8 1899695 1847838.8 2133084.7 2415839.7
Germany 4386729.2 4576056.5 4840311.8 4815445 5153137.1 5582292.1
Japan 5262255 5344060.9 5404463.3 5358320.7 5599031.8 5895686.8
United Kingdom 3041891.8 3129647.8 3335878.3 3220295.9 3541780.7 3847919.4
United States 19612102.5 20656515.5 21521395 21322949.5 23594030.8 25744108.3
India 8053222.3 8919533.1 9375503.1 8832913.6 10088351.4 11515567.6
;
run;
* Below is the population for selected countries from 2019 to 2022. A three-letter
country code is included in addition to the country names. Add proper statements
to complete the data step;
data Population;
datalines;
Japan JPN 126633000 126261000 125681593 125124989
United Kingdom GBR 66836327 67081234 67026292 66971395
United States USA 328329953 331511512 332031554 333287557
Canada CAN 37601230 38007166 38226498 38929902
Australia AUS 25334826 25649248 25685412 26005540
Germany DEU 83092962 83160871 83196078 83797985
India IND 1383112050 1396387127 1407563842 1417173173
China CHN 1407745000 1411100000 1412360000 1412175000
;
run;
*Q2. (20pts) Concatenate the two datasets above, so that for each country, there is one
observation for the GDP amounts and another observation for the population.
In the same data step, use a proper method to create a character variable that indicates
which observations are GDP and which observations are population.;
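* Illustration only, on hypothetical data sets Demo_A and Demo_B: a minimal sketch
  of one way to stack two data sets and flag the source of each observation,
  assuming the IN= data set option is an acceptable method. All names are made up;
data Demo_A;
    input ID $ X1 X2;
    datalines;
a 1 2
b 3 4
;
run;
data Demo_B;
    input ID $ X1 X2;
    datalines;
a 10 20
b 30 40
;
run;
data Demo_Stacked;
    length Source $ 1;
    set Demo_A(in=fromA) Demo_B(in=fromB);
    * fromA is 1 when the current observation was read from Demo_A, and likewise
      for fromB. These temporary variables are not written to the output;
    if fromA then Source = 'A';
    else if fromB then Source = 'B';
run;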
*Q3. (20pts) The final goal is to calculate GDP per capita (in Q4), which is the GDP amount
divided by the population for the corresponding year. Use the concatenated data
created in Q2 and restructure it so that GDP per capita is easier to calculate.
Consider what your desired data structure is and use PROC TRANSPOSE to achieve
it.
The output data only needs to cover the years for which both GDP and population values
are available.;
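* Illustration only, on a hypothetical data set Demo_Long: a minimal sketch of
  PROC TRANSPOSE with BY, ID, and VAR statements. Within each ID it produces one
  row per variable listed in VAR and one column per value of Source. All names
  and values are made up;
data Demo_Long;
    input ID $ Source $ Y1 Y2;
    datalines;
a A 1 2
a B 10 20
b A 3 4
b B 30 40
;
run;
* The input must already be sorted by the BY variable, as it is here;
proc transpose data=Demo_Long out=Demo_Wide name=VarName;
    by ID;
    id Source;
    var Y1 Y2;
run;
* Demo_Wide has the columns ID, VarName, A, and B, so values from the two sources
  sit side by side on the same row;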
*Q4. (20pts) Calculate GDP per capita as described above using the data created in Q3.
Note that in the initial instream data, GDP is recorded in millions of dollars.
In the same data step, create a NUMERIC variable called Year to indicate the
corresponding year for the GDP per capita values;
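* Illustration only, on hypothetical data: a minimal sketch of deriving a numeric
  variable from digits embedded in a character value, and of rescaling units
  before dividing. The data set Demo_Ratio and its variables are made up;
data Demo_Ratio;
    input Label $ AmountInMillions Count;
    * COMPRESS with the kd modifiers keeps only the digits of Label, and INPUT
      converts that text to a number;
    Period = input(compress(Label, , 'kd'), 4.);
    * Convert millions to single units before dividing;
    AmountPerUnit = AmountInMillions * 1e6 / Count;
    datalines;
P2019 1.5 300000
P2020 2.0 400000
;
run;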