PharmaSUG 2013 - Paper CC11
Combining SAS ® LIBNAME and VBA Macro to Import Excel ® file in an
Intriguing, Efficient way
Ajay Gupta, PPD Inc, Morrisville, NC
ABSTRACT
There are different methods such PROC IMPORT/LIBNAME to Import Excel file using SAS. While importing the data
in SAS, due to technical limitations these traditional methods might corrupt your data by applying their own automatic
formatting. Few examples are, importing column with mixed data types in numeric format or importing the regular
data containing hyphen/slash/colon/letter ‘e’ into date/time/scientific formats. The best way to avoid automatic
formatting is to import Excel data into pre-defined format for e.g. character. This paper will introduce a unique method
where Excel file is imported using a combination of Visual Basic for Applications (VBA) macro via Dynamic Data
Exchange (DDE) in SAS and SAS LIBNAME statement. This convenient and reliable solution will help SAS
Programmers/Statisticians to have better control over the quality of data and save significant time with minimal
coding.
INTRODUCTION
While SAS can import from a variety of formats, Excel files (XLS) are among the most common. Currently there are
many ways to import Excel data into SAS: The Import Wizard, PROC IMPORT, and using a LIBNAME statement.
But, due to technical limitations in SAS these traditional methods might create missing values while reading the
columns in Excel file containing mixed data types (both numeric and character). For e.g. we receive clinical lab data
from different vendors in Excel format which normally has the mixed data in the results column.
In particular, when the Excel file is large then it is hard to debug the loss of data. These methods mostly depend
upon SAS to decide the formats of the data and can create different attributes for similar variables when same Excel
file is read by different traditional methods. So, both programmer and validator might need to do some extra
programming to match the data which can be time consuming. If the header in the Excel file is not on the first line
then it will require further processing of the data to get the right header which might be time consuming. Due to the
above mentioned issues there is a desperate need for a new method to import Excel which can import the data
properly and programmers can predefine the variable formats with minimal coding.
In my previous paper, to meet the above challenge I had introduced a method to read excel using Microsoft Access.
However, previous method requires having Microsoft Access on your PC and also Access is not very popular among
the SAS programmers. There was a need to have a simpler and cost effective method with minimal interaction with
other system and SAS.
This paper introduces another useful (and probably simpler) method to import data from Excel file using SAS. For
Import, the entire process involves SAS, Excel VBA macro which is automated by a SAS macro called
%Convert_Excel2_SAS. It will import the entire contents while reserving the Excel layout and structures, including
tables with multiple columns, into SAS data. Since variable formats will be predefined as character using the VBA
macro, this will avoid any formatting changes occur to the data during the conversion process. In this macro the user
can define the line number containing the header and the macro will assign the header variables from the given line
number.
TECHNIQUE & MECHANISM
The general process to convert an Excel file into a SAS dataset is as follows (note that the first item is a one-time
setup, and the rest can be managed via SAS code):
1. Record a macro ‘IMPORT’ in Excel which will perform the following functions:
Open the target file in Excel and insert a line on the row one. After insertion you will have a blank line on row
one.
Insert a character for e.g. ‘C’ in blank row (here less than 256 columns). This will later help to predefine
Excel column as character.
Save the macro in Personal.xls and stop recording.
2. Open the relevant Excel file ‘Raw.xls’ and execute the VBA macro ‘IMPORT’. The updated Excel file will have an
extra first row with character ‘C’. Further, save the Excel as ‘Raw_1’.xl