1. (80 pts) Continue to work with the 6 datasets of a bookstore: customers, items, list, orders, prices, and salesrep. Use PROC SQL only for this question. No data steps or other PROCs are allowed.
- You’ve worked to see which rep sold the most copies of books. In the real world of business, it’s all about $ of sales. There is enough info in the provided data tables to calculate the $ sales of each order. Your task is to find out the annual $ sales of each sales representative. You may notice that prices change periodically per book. Be sure to fetch the right price for the orders based on the order date.
- Considering all the years together, which sales rep has the most $ sales?
- Comparing total sales of all years may not make sense, because there may be sales reps who joined the sales force later than others. Which summary statistic would be more appropriate then? Carry out that calculation, and then find the best sales rep.
- For each sales rep, which year is his personal best year?
- Consider just the personal best years. Which year is the most common?
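The price lookup by order date described above can be sketched as a join with a date-range condition. This is only an illustration under stated assumptions: the column names rep_id, item_id, order_date, quantity, price, start_date, and end_date are hypothetical, as is the idea that the orders table carries the rep identifier directly. Check the actual tables and adapt the joins accordingly.

```sas
/* Sketch only: every column name here is an assumption, not taken from the data. */
proc sql;
  create table rep_year_sales as
  select o.rep_id,
         year(o.order_date)        as sales_year,
         sum(o.quantity * p.price) as dollar_sales format=dollar12.2
  from orders as o
       inner join prices as p
         on  o.item_id = p.item_id
         /* keep only the price row in effect on the order date */
         and o.order_date between p.start_date and p.end_date
  group by o.rep_id, calculated sales_year
  order by o.rep_id, calculated sales_year;
quit;
```

If the prices table stores only a start date for each price, the range condition would need to be rewritten, for example by selecting the latest start date that is not after the order date.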
6. (20 pts) Create an experimental design for a clinical trial comparing two drugs.
6.1: Create three SAS datasets (using DATA steps):
– the 1st called site, having the character variable center with values DE, PA, NJ, and MD. You have four sites listed in this dataset.
– the 2nd called arm, having the numeric variable trt with values 0 and 1. 0 will be the control group, and 1 will be the treatment group. This dataset has 2 obs.
– the 3rd called dose, containing the character variable dosage with values Low, Medium, and High.
Make sure that the character variable has sufficient length.
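One possible way to build the three datasets is sketched below, using LENGTH statements so that the character variables are declared wide enough before any value is read. The choice of DATALINES input is an assumption; any DATA step approach is acceptable. Note that when a character variable is first created by an assignment statement instead, its length is fixed by the first value assigned, which is how a value such as Medium can end up truncated.

```sas
data site;
  length center $2;      /* two-letter site codes */
  input center $;
  datalines;
DE
PA
NJ
MD
;
run;

data arm;
  input trt;             /* 0 = control, 1 = treatment */
  datalines;
0
1
;
run;

data dose;
  length dosage $6;      /* "Medium" is the longest value: 6 characters */
  input dosage $;
  datalines;
Low
Medium
High
;
run;
```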
6.2: Using a single PROC SQL step, display all possible combinations (4x2x3) of the three variables center, trt, and dosage, ordered first by ascending order of center, then by descending order of trt, and then randomly ordered within each value of dosage.
Hint: how to sort randomly? Use ranuni(12345) as you would use a variable name in the ‘order by’ clause. You do not need to include this as a variable in your dataset. You can use any other seed number instead of 12345, such as your birthday.
The function ranuni() provides a random draw from a uniform distribution using the integer value provided in parentheses as the seed for the random number generation. By specifying the same random seed, each run of the SAS code will produce the same result.
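A minimal sketch of the idea, assuming the datasets site, arm, and dose from 6.1 already exist in the WORK library. It follows one reading of the ordering requirement: rows are sorted by center ascending and trt descending, and the dosage rows within each center and trt combination appear in random order.

```sas
proc sql;
  /* Listing the tables separated by commas, with no WHERE clause,
     yields the Cartesian product: 4 x 2 x 3 = 24 rows. */
  select center, trt, dosage
  from site, arm, dose
  /* ranuni(12345) is used only as a sort key; it is not selected. */
  order by center, trt desc, ranuni(12345);
quit;
```

SAS may write notes to the log about the Cartesian product and about ordering by an item that does not appear in the SELECT clause; both are expected here.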