Amadeus cookies policy - you'll see this message only once.

Amadeus use cookies on this website. They help us to know a little bit about you and how you use our website, which improves the browsing experience and marketing - both for you and for others. They are stored locally on your computer or mobile device. To accept cookies, continue browsing as normal. Or, go to the privacy policy for more information.

Using Macros to Input Multiple CSV Files into a Data Set

This tip highlights how simple it can be to read in multiple CSV files and output them as a single SAS Data Set utilising the FILENAME command and the PIPE device-type keyword.

The PIPE device-type keyword in the FILENAME statement allows SAS to execute a command outside of SAS and redirect its results to SAS.

Here, we use the FILENAME PIPE combination to run the "DIR" command which is a Windows command which returns the files and directory listings of a specified file path. This example concatenates multiple CSV files using just two data steps; the first data set records all of the data set names whilst the second data step performs multiple concatenations.

A high school uses CSV files to track the grades of students in separate classes. Each class has their own CSV file which details the grades of the students. Providing these files are saved in a single location, the data can be easily combined:

Using Macros to Input Multiple CSV Files into a Single Data Set Image 1

We assume that all CSV files follow the same format - they consist of: gender, first name, surname and the grades the students have attained in five different subjects.

Using Macros to Input Multiple CSV Files into a Single Data Set Image 2

Three files have been created to simulate three different classes; 5J, 5L and 5M. Each file has been saved as a CSV file in the same directory C:Exam_Results. A macro variable is then created in SAS which points to this directory:

Using Macros to Input Multiple CSV Files into a Single Data Set Image 4

The macro variable is then utilised in a filename statement along with the PIPE device-type keyword to read from the windows “DIR” command results to identify all of the CSV files within this location. The /b option allows for only the filenames to be picked up from the directory and does not include the directory path (See Note).

Using Macros to Input Multiple CSV Files into a Single Data Set Image 5

The data set file_list can now be created. This data set contains the names of each of the files that will be input into the main data set. This data set identifies the three CSV files saved within the C:Exam_Results directory.

Using Macros to Input Multiple CSV Files into a Single Data Set Image 6

Using Macros to Input Multiple CSV Files into a Single Data Set Image 7

A separate data statement can then be issued in order to read all of the file names from the file_list data set. Each file name identified is input into the filepath variable to detect its location. The do loop ensures that the data set inputs every variable within each CSV file until the end of the file_list data set is reached. This inputs all of the students' details into the all_students data set and also includes a classname variable derived from the filename to identify the class of each student.

Using Macros to Input Multiple CSV Files into a Single Data Set Image 8

The output data set all_students contains the contents of all three of the CSV files in the order in which they are input. This data can be used to perform further analysis.

Using Macros to Input Multiple CSV Files into a Single Data Set Image 9

Note: In order to use the filename pipe statement you will need to ensure that XCMD is enabled. This can be checked using a proc options.

Using Macros to Input Multiple CSV Files into a Single Data Set Image 10

If XCMD is enabled the following will be visible within the log:

Using Macros to Input Multiple CSV Files into a Single Data Set Image 11