Data Management Studio – Performing Tasks on Multiple Variables

Data jobs within SAS Data Management Studio allow users to perform a wide range of data processing tasks on data items, including applying business rules and monitoring data. Typically, these actions are performed on specified fields within specified datasets. However, it is possible to make your data jobs and process jobs more versatile and perform tasks on multiple variables.

This tip shows an example of using a data job to list all of the variables within a dataset. Subsequent tasks can then use this list to cycle through each variable and perform the same task on each. This is something that a SAS programmer would routinely code, but it is less intuitive using Data Management Studio. A typical use of this would be to apply a business rule to each variable.

Data Management Studio Image 1

The data job consists of the following nodes:

  • Read the first row of data 
  • List the variables in the dataset
  • Output the required rows
  • Create a work dataset with a list of variables

Read the first row of data

Use the SQL query node to read all of the fields from the data.

Data Management Studio Image 2

The query can be made more efficient by reading in just the first row of data, since only the variable names are of interest. This is set in the ‘Advanced Properties’.

Data Management Studio Image 3
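As a sketch, the query itself can simply select every field; the table name below is a placeholder, and the one-row restriction is applied through the node’s ‘Advanced Properties’ rather than in the SQL:

```sql
/* Select every field; 'my_input_table' is a placeholder table name.
   The limit to a single row is set in the node's Advanced Properties,
   not in the query itself. */
SELECT * FROM my_input_table
```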

List the variables in the dataset

Here we use an expression node (‘List the variables in the dataset’), which makes use of the fieldcount, fieldname and fieldtype functions in the Expression Engine Language (EEL) to extract the number of variables and each variable’s name and type from the dataset. The pre-expression initialises the variables to be added, and the expression loops over each variable, assigning values to the fields ‘column_name’ and ‘column_type’.

Data Management Studio Image 4
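A minimal EEL sketch of this node might look as follows; the exact declarations may differ from the screenshot, but the loop-and-pushrow pattern is the key point:

```eel
// Pre-expression: declare the fields to be added to each output row
string column_name
string column_type

// Expression: loop over every field in the input, recording its
// name and type, and push a row onto the stack for each one
integer i
for i = 1 to fieldcount()
begin
   column_name = fieldname(i)
   column_type = fieldtype(i)
   pushrow()
end
```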

The pushrow function pushes the current row values onto a stack.

Data Management Studio Image 5

When the pushed row status field is set via the ‘Settings’ tab, a defined field (‘prow’ in this case) is set to ‘true’ for pushed rows and ‘false’ otherwise. The output dataset contains the added columns ‘column_name’, ‘column_type’ and ‘prow’, which can be viewed via the 'Preview' tab:

Data Management Studio Image 6

Output the required rows

Note that the final row from the previous node is not a pushed row. To avoid a duplicate value in the list of variables, this is dealt with via a further expression node (‘Output if pushrow is true’). This node uses the return function to determine which rows are included in the output; in this case, only the pushed rows from the previous expression are kept.

Data Management Studio Image 7
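As a sketch, the filtering expression might look like this, assuming ‘prow’ holds the string ‘true’ for pushed rows as set on the ‘Settings’ tab:

```eel
// Output only the pushed rows; returning false drops a row
if prow == "true" then
   return true
else
   return false
```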
Note that if you are only interested in performing subsequent tasks on a certain type of variable, rows can be filtered here by modifying the expression. For example, to keep ‘real’ variables only, modify the expression to:

Data Management Studio Image 8
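A sketch of that modified expression follows; the type string ‘real’ is an assumption here, as the exact values returned by fieldtype depend on the data source:

```eel
// Output only pushed rows whose source variable is of type 'real'
// (the exact type string returned by fieldtype may vary by source)
if prow == "true" and column_type == "real" then
   return true
else
   return false
```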

Create a work dataset with a list of variables

The final step is to write the list of variables to a work dataset, achieved simply with a ‘Work Table Writer’ node. The work table includes just the variable ‘column_name’. The node properties are as illustrated.

Data Management Studio Image 9

The data job is now complete. Typically, it would be used within a process job. Process jobs allow looping, so it is possible to use the work dataset output by this data job to loop through each variable in the dataset and perform a task. A typical example is applying a custom metric to check the attributes of each value of each variable. A schematic for an example process job is illustrated below.

Data Management Studio Image 10