Module 10 focuses on the use of macro variables, which can add a lot of flexibility to your SAS coding.
- Let’s continue to work on the M10 Workshop Q2.
- In the Workshop, you found the variables from MON1001 data in SASHELP library with no missing values. However, they may not be enough for a comprehensive analysis. Let’s expand the selection criteria to include the variables with <=10 missing values. Use same steps as Workshop Q2.1 – Q2.5 to create a data set with this updated logic, and call it WorkingData.
- How many variables did you get this time? What you have to change in the Workshop code to make this update?
- Now we have more variables with a few missing values. Let’s try to fill the missing values with mean of the variable. We want to set up the code flexibly to fit any interested variables with minimum input.
The first step is to calculate the mean for all the variables. Use a PROC MEANS to do this and save the result in a dataset.
- Similar to the workshop, we also need to transpose the means data to manage easily.
- This time, the goal is to treat one variable at a time to fill the missing. Use a PROC SQL to create two series of macro variables: mean1 to meanN and var1 to varN, where N is the number of variables in your workingdata. You should already have created this macro variable in the last step of Q1.1 above. Do not hard code this number.
To clarify further, you will create N*2 macro variables in total. The value of your macro variable var1 should be the name of the first variable. The value of your macro variable mean1 should be the mean value of the first variable, and so force. Use %PUT statement to spot check the results.
- Assign a value 3 to a macro variable and call it i. We are going to fill the missing values for the third variable without explicitly hard code the variable name. This would allow for flexible changes of the target variable.
Then, check the class material regarding indirect reference of macro variables. Write a macro variable reference expression combining i and the macro variable series you created in 1.5 above. Do this for the variable names and mean values series separately. So that with any value of i, one of your reference expression will be resolved to the name of the i-th variable; and the other expression will be resolved to the mean of the i-th variable.
Use %PUT statement to check the result. Your only manual input should be the number i to specify one variable. Leave the expressions in the %PUT statement as the answer of this question.
- In a data step, create a new dataset and read your WorkingData in. Then use the macro reference expression above to point to the target variable, and fill any missing values with its mean. Create a new variable to store the values and add a suffix _fixed to the target variable name as the name of the new variable. E.g. if you are processing a variable called T. You will need to add a variable called T_fixed to store the values with all missing value filled with mean.
- You would want to verify if the result is as expected. To verify, use a PROC PRINT to display the rows with either unequal values between the target variable and the fixed variable, or the rows with missing values in the fixed variable. Add a title to display the value of i, the name and the mean value of the target variable. Why checking these two areas? What are you expecting to see?
- How many observations did you see in Q1.8 when checking the result? Do they make sense to you?
Change the value of the macro variable i to be 6 then rerun the code. How many observations did you see in the checking data this time?
You don’t need to make a copy of any code to process the new target variable. Just change the value of i and run the rest of the program again to check the result.
We will learn macro code in the next class, which would enable you to do this process iteratively for all the variables. However, the common practice to design a macro loop is to first make sure the code runs well on one iteration like this assignment leads you to do, then change it to a loop. After learning the next class, you should be able to convert this code to be a macro easily.