/*M7 Assignment, STAT674*/

/*Q1: Customer information in a bank is separated in two datasets as below.*/

data customer_demo;
length ID 3 Gender $1 income $5;
input ID Gender $ income $;
datalines;
123 F 10k
234 M 15K
345 F 20K
678 F 21K
987 M 25K
;
run;

data customer_balance;
length ID 3 zipcode $5 balance $6;
input ID zipcode $ balance $;
datalines;
123 14528 500k
234 37482 650k
987 63783 540k
678 78392 320k
345 02934 700k
;
run;

*Q1.1 (10pts)Merge the two datasets using an appropriate data step.;
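
/* A minimal sketch of one possible approach to Q1.1, not the only valid answer.
   MERGE with a BY statement needs both inputs sorted by the BY variable, and
   customer_balance is not in ID order. The dataset names demo_sorted,
   balance_sorted and customer_merged are illustrative choices. */
proc sort data=customer_demo out=demo_sorted;
  by ID;
run;

proc sort data=customer_balance out=balance_sorted;
  by ID;
run;

/* Match-merge on ID: one row per customer, with the columns of both inputs */
data customer_merged;
  merge demo_sorted balance_sorted;
  by ID;
run;

proc print data=customer_merged;
run;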

*Q1.2 (10pts)Now try a different way to combine the datasets – concatenate the provided two datasets above. ;
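
/* A minimal sketch for Q1.2, assuming "concatenate" means stacking the two
   datasets with a single SET statement. The name customer_concat is illustrative. */
data customer_concat;
  set customer_demo customer_balance;
run;

proc print data=customer_concat;
run;

/* The result has 10 rows (5 + 5). Rows that come from customer_demo have missing
   zipcode and balance, and rows that come from customer_balance have missing
   Gender and income. */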

*Q1.3 (10pts)Compare the output data to Q1.1. What is the difference between the merged result and the concatenated result?
           Which approach is better for analyzing this data?;
          
*Q1.4 (10pts)Both income and balance are stored as character variables in the data, which doesn’t really make sense.
           In your combined dataset, use one or more appropriate SAS functions to convert them to numeric. Then use appropriate
           PROCs to find out whether females or males have the higher income on average.;
          
*Q1.5 (5pts)Do females or males have the higher balance on average? The answer is limited to the data we have
           here. There is no need to generalize to the bigger population.;
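
/* A minimal sketch for Q1.4 and Q1.5, offered as one possible approach. It assumes
   every income and balance value is a whole number followed by "k" or "K", meaning
   thousands. The names demo_by_id, balance_by_id, customer_numeric, income_num and
   balance_num are illustrative. */
proc sort data=customer_demo out=demo_by_id;
  by ID;
run;

proc sort data=customer_balance out=balance_by_id;
  by ID;
run;

data customer_numeric;
  merge demo_by_id balance_by_id;
  by ID;
  /* COMPRESS with the 'kd' modifiers keeps only the digits. INPUT then reads
     the remaining text as a number, which is scaled to dollars. */
  income_num  = input(compress(income, , 'kd'), 8.) * 1000;
  balance_num = input(compress(balance, , 'kd'), 8.) * 1000;
run;

/* Mean income and mean balance for each gender */
proc means data=customer_numeric mean;
  class Gender;
  var income_num balance_num;
run;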



/*Q2: Below you will create two datasets about a store’s sales information. Use them to answer the questions
below. You may need multiple data steps and PROC steps for each question. This time you will have less guidance than before,
so you may want to write a step-by-step plan for yourself.
You can put your code for all five questions together, then answer them together. There is no need to create a separate
code block for each question.*/

*Q2.1 (10pts)Find the total sales amount in dollars for each item.;
*Q2.2 (5pts)Which item sold the best in terms of quantity? ;
*Q2.3 (5pts)Which item sold the best in terms of cash flow (total sale amount)?;
*Q2.4 (10pts)Which item sold the best in terms of total profit for the store?;
*Q2.5 (10pts)Which material sold the best in terms of total profit for the store?;

data items;
input Item $ 1-7 Material $ 8-14 cost 16-18 price 19-21;
datalines;
shirts silk    31 79
pants  cotton  25 58
suits  cotton  78 158
belts  leather 16 29
shoes  leather 32 65
;
run;


data SalesReport;
input Item $ 1-7 sales_quantity 8-15;
datalines;
suits  832
shoes  1656
shirts 1820
pants  2532
belts  3350
;
run;
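
/* A minimal sketch of one possible plan for Q2.1 to Q2.5. It assumes profit per
   unit is price minus cost. The names items_sorted, sales_sorted, item_sales,
   total_sales and total_profit are illustrative. */
proc sort data=items out=items_sorted;
  by Item;
run;

proc sort data=SalesReport out=sales_sorted;
  by Item;
run;

/* Combine price, cost and quantity per item, then derive the dollar amounts */
data item_sales;
  merge items_sorted sales_sorted;
  by Item;
  total_sales  = price * sales_quantity;
  total_profit = (price - cost) * sales_quantity;
run;

/* List the items from most to least profitable. Quantity and total sales
   can be compared in the same printed table. */
proc sort data=item_sales;
  by descending total_profit;
run;

proc print data=item_sales;
run;

/* Total profit per material */
proc means data=item_sales sum;
  class Material;
  var total_profit;
run;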


/*Q3:      (15pts)A company hired new employees and new interns this year. There are three datasets describing the new
           employees, new interns and department managers in this company. Use one or more data steps or PROC steps to
           figure out which manager will have the most new hires (including new employees and new interns). 
           Again, you may want to first lay out the steps needed to answer the question.*/

data New_Employee;
length name $15;
input name $ team training_course work_experience;
datalines;
Bohr,Neils 1 0 3
Zook,Carla 2 2 2
Penrose,Roger 1 3 1
Martinez,Maria 3 1 2
Orfali,Philip 2 1 1
;
run;

data New_Intern;
length name $15 major $20;
input name $ team training_course major $;
datalines;
Capalleti,Jimmy 3 1 FashionDesign
Chen,Len 2 0 BusinessAnalytics
Cannon,Annie 1 0 Mathematics
Davis,Brad 3 0 Art
Einstein,Albert 1 1 ComputerScience
;
run;

data Manager;
length name $15 department $10;
input name $ team department $;
datalines;
Wilson,Kenneth 1 Operation
Bardeen,John 2 Marketing
Sorrell,Joseph 3 Design
;
run;
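
/* A minimal sketch of one possible plan for Q3: stack the two groups of new hires,
   count them per team, then attach each team's manager. The names new_hires,
   hires_by_team, manager_sorted, manager_name and hires_per_manager are
   illustrative. COUNT is the variable PROC FREQ writes to its output dataset. */
data new_hires;
  set New_Employee New_Intern;
run;

/* One row per team with the number of new hires in COUNT */
proc freq data=new_hires noprint;
  tables team / out=hires_by_team (keep=team count);
run;

/* Rename the manager's name so it is not confused with a new hire's name */
proc sort data=Manager out=manager_sorted (rename=(name=manager_name));
  by team;
run;

data hires_per_manager;
  merge manager_sorted hires_by_team;
  by team;
run;

/* The manager with the most new hires appears first */
proc sort data=hires_per_manager;
  by descending count;
run;

proc print data=hires_per_manager;
run;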







 
