SAS Notebook

Yuan Tian Posted at — May 15, 2019

1 My Little SAS Book

The aim is to document code chunks that are likely to be reused, so that they are quick to search and index.

1.1 Using `PROC SORT` to remove duplicates

There are three options that might be helpful: DUPOUT=, NODUPRECS, and NODUPKEYS. Code examples are from this article:

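[Recommended] The NODUPKEYS (or NODUPKEY) option of PROC SORT removes observations with duplicate keys. Specify the keys that uniquely identify an observation in the BY statement. In the example below, the variable Title uniquely identifies a movie. DUPOUT= names the dataset that receives the removed observations. Because no OUT= is given, the sorted, de-duplicated result replaces Movies itself.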

PROC SORT DATA=Movies
 DUPOUT=Movies_Sorted_Dupout_NoDupkey
 NODUPKEY;
 BY Title;
RUN ;

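The NODUPRECS option removes observations that have identical values for all columns. Each observation is compared only with the previous one in sort order, so identical records that are not adjacent after sorting by Title can survive. Sorting with BY _ALL_ avoids this.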

PROC SORT DATA=Movies
 OUT=Movies_Sorted_without_DupRecs
 NODUPRECS ;
 BY Title ;
RUN ;
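
To try both options on a small table, here is a minimal sketch. The sample rows, the Year column, and the output dataset names are made up for illustration. It uses OUT= so that the input dataset stays unchanged, and BY _ALL_ so that fully identical records end up next to each other before NODUPRECS is applied.

```sas
/* Hypothetical sample data: Title is the key, Year is an assumed extra column */
data Movies;
   infile datalines dsd;
   length Title $30;
   input Title $ Year;
   datalines;
Alien,1979
Heat,1995
Alien,1979
Heat,1996
;
run;

/* One observation per Title; the dropped observations go to Movies_Dups */
proc sort data=Movies
          out=Movies_NoDupKey
          dupout=Movies_Dups
          nodupkey;
   by Title;
run;

/* Sort on every variable so identical records are adjacent, then drop them */
proc sort data=Movies
          out=Movies_NoDupRecs
          noduprecs;
   by _all_;
run;
```

With NODUPKEY the two Heat rows count as duplicates because they share a Title, while NODUPRECS keeps both because their Year values differ.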

1.2 `Input()` and `put()` for variable type conversion

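input(char, 4.) or input(char, datetime20.) : Char -> Numeric (or Char, when a character informat such as $4. is used)
put(numeric, 4.) or put(numeric, datetime19.) : Numeric -> Char (put(char, $4.) with a character format also returns Char)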
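
As a minimal sketch of both directions, the DATA step below converts a few made-up values. The variable names and literals are assumptions for illustration only.

```sas
data _null_;
   char_num = '1234';
   char_dt  = '15MAY2019:10:30:00';
   num      = 42;

   /* Char -> Numeric: the informat tells SAS how to read the text */
   n  = input(char_num, 4.);
   dt = input(char_dt, datetime20.);

   /* Numeric -> Char: the format tells SAS how to write the number */
   c1 = put(num, 4.);           /* right-aligned within 4 characters */
   c2 = put(dt, datetime19.);
   c3 = strip(put(num, 8.));    /* strip() removes the leading blanks */

   /* Write the results to the log */
   put n= dt= c1= c2= c3=;
run;
```

The type of the second argument has to match the first: numeric formats go with numeric values in put(), and input() always takes a character source.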

1.3 Merging/Stacking Datasets – Truncated Values

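Stacking multiple datasets into one can be tricky when the same variable has different lengths. A variable takes its length from the first dataset in which it appears, so longer values from later datasets are truncated. To avoid this, you need to:

define the length before the SET statement;
add format _character_ (with no format name) to clear any character formats inherited from the input datasets.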

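data stacked_ds;
   /* Declare lengths before SET so they override the lengths in ds1-ds5 */
   length id $20 age 8 comment $200 ;
   set ds1-ds5;
   /* Remove inherited character formats that could shorten displayed values */
   format _character_ ;
run;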
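
The truncation itself is easy to reproduce. The sketch below uses two hypothetical one-row datasets (ds_short and ds_long are made-up names) in which the same variable has different lengths.

```sas
/* Hypothetical inputs: comment is $5 in one dataset and $30 in the other */
data ds_short;
   length comment $5;
   comment = 'short';
run;

data ds_long;
   length comment $30;
   comment = 'a much longer comment value';
run;

/* comment takes length 5 from ds_short, so the long value is cut off */
data stacked_truncated;
   set ds_short ds_long;
run;

/* Declaring the length before SET keeps the full value */
data stacked_ok;
   length comment $30;
   set ds_short ds_long;
   format _character_;
run;
```

Swapping the order in the SET statement would also avoid truncation here, but an explicit LENGTH statement does not depend on which dataset happens to come first.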