Lesson 20: More on Importing Data -- Part I


Introduction

In Stat 480, we learned how to read only the most basic data files into a SAS data set. In this lesson (and the next), we'll extend our knowledge in this area by learning how to read just about any data file into SAS, no matter how messy or unstructured the input data file is. In most cases, the data files will be raw ASCII data files obtained by exporting data from some other PC software.

Learning objectives & outcomes

Upon completing this lesson, you should be able to do the following:

  • read raw data separated by spaces into a SAS data set (that is, use list input)
  • read raw data arranged in columns into a SAS data set (that is, use column input)
  • read raw data not in standard format into a SAS data set (that is, use formatted input)
  • mix list, column, and formatted input styles to read raw data into a SAS data set
  • determine when list input, column input, formatted input, or some combination of the three styles should be used to read in a raw data file
  • understand that the lengths of numeric variables are set to 8 by default and therefore do not necessarily coincide with the widths of the numeric informats used in an INPUT statement
  • know the difference between fixed-length record data files and variable-length record data files
  • know when it is appropriate, and how, to use the INFILE statement's PAD option
  • know when it is appropriate, and how, to use the INFILE statement's MISSOVER option
  • know when it is appropriate, and how, to use the INFILE statement's DLM= option
  • know when it is appropriate, and how, to use the INFILE statement's DSD option
  • know when it is appropriate, and how, to use the INFILE statement's FIRSTOBS= option
  • know how to read missing values when using list input
  • know when it is appropriate, and how, to specify a range of numeric or character variables in the INPUT statement
  • know how to use the LENGTH statement to modify the length of a character or numeric variable
  • use the ampersand (&) modifier with list input to read character values that contain embedded blanks
  • use the colon (:) modifier with list input to read nonstandard data values and character values that are longer than eight characters, but which have no embedded blanks
  • know that with formatted input, the informat determines both the length of character variables and the number of columns that are read
  • know that the informat in modified list input determines only the length of the modified variable, not the number of columns that are read
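
To make the last two objectives concrete, here is a minimal sketch contrasting formatted input with modified list input. It is not taken from the lesson pages that follow; the data set names, variable names, and data values are all made up for illustration.

```sas
/* Formatted input: the $12. informat reads exactly 12 columns     */
/* (columns 1-12) and sets the length of NAME to 12. The 3.        */
/* informat then reads the next 3 columns (13-15) for SCORE.       */
data formatted;
   input name $12. score 3.;
   datalines;
Christopher  87
Ann          92
;
run;

/* Modified list input: the colon (:) modifier tells SAS to read   */
/* NAME up to the next blank, however many columns that takes.     */
/* Here the $12. informat only sets the length of NAME to 12.      */
data modified_list;
   input name : $12. score;
   datalines;
Christopher 87
Ann 92
;
run;

/* Inspect the stored variable lengths in each data set. */
proc contents data=formatted;
run;

proc contents data=modified_list;
run;
```

In both data sets, the PROC CONTENTS output should report a length of 12 for name and a length of 8 for score. The second point illustrates the earlier objective: the 3. informat controls how many columns are read, not how the numeric variable is stored.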

Our "to do" list for this lesson

In order to complete the lesson you should:

  1. Read the on-line lesson pages that follow.
  2. Type up your answers to the homework problems that follow the lesson in a Word file named homework20_yourPSUloginid. That is, if your PSU user id is xyz123, then name your file homework20_xyz123. Upload the file to the Lesson #20 Homework Dropbox.
  3. Post any questions or comments you have concerning the lesson's material to the Lesson #20 General Discussion Board.
  4. Take the Lesson #20 Mastery Quiz. Remember two things: i) You have 20 minutes to complete the quiz, and ii) as soon as you hit the "submit" button, your answers are submitted and graded, and the quiz becomes closed to you.