Displaying Absent Combinations of Classification Variables with Proc SUMMARY
Proc SUMMARY with the NWAY option will produce a summary of a dataset based on all the combinations of the CLASS variables found in the data.
    proc summary data=spend_info nway;
        class accnum date;
        var spend;
        output out=sum_spend(drop=_freq_ _type_) sum=;
    run;

The example above produces the following table:

    accnum  date     spend
    111111  JAN2009  147.56
    111111  FEB2009  385.99
    111111  APR2009  252.04
    222222  JAN2009  70
    222222  FEB2009  302.1
    222222  MAR2009  682.84
    222222  APR2009  552.31
    222222  JUN2009  486.55
    333333  FEB2009  315.44
The table lists every combination of accnum and date that occurs in the dataset. Looking at the data, some dates appear for one account but not for others (e.g. 'MAR2009' is listed for accnum '222222' but not for '111111').
You may want every possible combination of the classification variable values to be listed, even when a combination does not appear in the data, so that a row appears for date 'MAR2009' and accnum '111111'.
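The article does not show the contents of spend_info, so the following is only a minimal sketch of an input dataset that has the same kind of gap. The rows are hypothetical. It also assumes that date is a numeric SAS date displayed with the MONYY7. format, which would be consistent with the months appearing in calendar order in the output.

```sas
/* Hypothetical sample data: the real spend_info is not shown in the article. */
/* Assumption: date is a numeric SAS date formatted as MONYY7., so that       */
/* PROC SUMMARY groups the CLASS variable by month.                           */
data spend_info;
    input accnum date :date9. spend;
    format date monyy7.;
    datalines;
111111 05JAN2009 100.00
111111 20JAN2009 47.56
111111 11FEB2009 385.99
111111 03APR2009 252.04
222222 15JAN2009 70.00
222222 02FEB2009 302.10
222222 09MAR2009 682.84
;
run;
```

In this sample, account 111111 has no March transactions and account 222222 has none in April, so an NWAY summary by accnum and date would omit those two combinations.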
This can be achieved using the COMPLETETYPES option as in the example below.
    proc summary data=spend_info completetypes nway;
        class accnum date;
        var spend;
        output out=sum_spend2(drop=_freq_ _type_) sum=;
    run;
Proc SUMMARY now lists every possible combination of the classification variables, based on the values that occur anywhere in the dataset. Where a particular combination does not actually appear in the data, the analysis variable is set to missing.
    accnum  date     spend
    111111  JAN2009  147.56
    111111  FEB2009  385.99
    111111  MAR2009  .
    111111  APR2009  252.04
    111111  MAY2009  .
    111111  JUN2009  .
    111111  JUL2009  .
    222222  JAN2009  70
    222222  FEB2009  302.1
Here we can see that MAR2009 is matched with account 111111 even though there is no data for that combination in the dataset.
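If a zero is more useful than a missing value for the absent combinations, one possible approach is to keep _freq_ in the output and use it to identify the rows that COMPLETETYPES created, since those rows have a frequency of 0. This is a sketch rather than part of the original example; the output name sum_spend3 is hypothetical.

```sas
/* Keep _freq_ this time so that generated combinations can be identified. */
proc summary data=spend_info completetypes nway;
    class accnum date;
    var spend;
    output out=sum_spend3(drop=_type_) sum=;
run;

/* Rows with _freq_ = 0 had no observations in the input data. */
data sum_spend3;
    set sum_spend3;
    if _freq_ = 0 then spend = 0;
    drop _freq_;
run;
```

Testing _freq_ rather than spend itself leaves untouched any combination that does occur in the data but whose spend values are all missing.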