Thursday, February 3, 2022

Generate random names and addresses from SAS

To test data processing systems (e.g., CRM, record linkage), you may need to generate fake people. SAS makes it especially easy to generate any number of fake US residents because it ships with a data set of US zip codes (SASHELP.ZIPCODE) that includes the city and state names.

The system uses four data sets: first names, last names, street names, and US zip codes. Initials are random letters. Because street names and zip codes are drawn independently of each other, the street addresses probably do not exist in the given zip codes.
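
The approach described above can be sketched roughly as follows. This is a minimal illustration and not necessarily the code used for this post: the three small lookup tables (first_names, last_names, street_names), their contents, the seed, the 1 to 9999 house number range, and the use of a middle initial are all assumptions made for the example. Only SASHELP.ZIPCODE and its ZIP, CITY, and STATECODE variables come with SAS.

```sas
/* Hypothetical stand-ins for the name and street tables.       */
/* Real lists would be much longer.                             */
data first_names;
    input first_name :$20.;
    datalines;
James
Mary
Robert
Patricia
;
run;

data last_names;
    input last_name :$20.;
    datalines;
Smith
Johnson
Williams
Brown
;
run;

data street_names;
    input street_name :$20.;
    datalines;
Main
Oak
Maple
Cedar
;
run;

data fake_people;
    call streaminit(20220203);  /* fixed seed so reruns give the same people */
    length middle_initial $1 street_address $40;
    do person_id = 1 to 10;
        /* Pick one random row from each table by observation number. */
        p_first = ceil(rand('uniform') * n_first);
        set first_names point=p_first nobs=n_first;
        p_last = ceil(rand('uniform') * n_last);
        set last_names point=p_last nobs=n_last;
        p_street = ceil(rand('uniform') * n_street);
        set street_names point=p_street nobs=n_street;
        p_zip = ceil(rand('uniform') * n_zip);
        set sashelp.zipcode(keep=zip city statecode) point=p_zip nobs=n_zip;

        /* The initial is one random letter of the alphabet. */
        middle_initial = substr('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                                ceil(rand('uniform') * 26), 1);

        /* House number and street name are unrelated to the zip code. */
        house_number = ceil(rand('uniform') * 9999);
        street_address = catx(' ', house_number, street_name);
        output;
    end;
    stop;  /* required with POINT=, which never reaches end of file */
    keep person_id first_name middle_initial last_name
         street_address city statecode zip;
run;
```

Each source table is sampled independently with POINT=, so any combination of name, street, and zip code can occur. Changing the upper bound of the DO loop changes how many fake people are produced.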

You could extend this in several ways:

  • Add street directions (i.e., N, S, E, W)
  • Add street post type (e.g., Dr., Ct.)
  • Add units (e.g., Apt B, Ste 101)
  • Add post office boxes and private mail boxes
  • Spell out the middle name
  • Add name prefix (e.g., Dr., Mr.)
  • Add name suffix (e.g., Jr., Sr.)
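
As one possible illustration of the first three ideas, the sketch below adds a direction, a street type, and an occasional unit to a street address. The lists of values, the fixed street name "Main", the seed, and the 20% share of addresses that get a unit are assumptions chosen for the example.

```sas
data address_parts;
    call streaminit(1);  /* arbitrary seed for repeatable output */
    length direction $1 street_type $4 unit $10 full_street $60;
    /* Hypothetical value lists; extend as needed. */
    array dirs[4]  $1 _temporary_ ('N' 'S' 'E' 'W');
    array types[4] $4 _temporary_ ('St.' 'Ave.' 'Dr.' 'Ct.');
    do i = 1 to 5;
        direction   = dirs[ceil(rand('uniform') * dim(dirs))];
        street_type = types[ceil(rand('uniform') * dim(types))];
        /* Assumed: about 1 in 5 addresses has an apartment number. */
        if rand('uniform') < 0.2 then
            unit = catx(' ', 'Apt', ceil(rand('uniform') * 300));
        else unit = ' ';
        /* CATX skips blank arguments, so a missing unit leaves no gap. */
        full_street = catx(' ', ceil(rand('uniform') * 9999),
                           direction, 'Main', street_type, unit);
        output;
    end;
    drop i;
run;
```

Name prefixes and suffixes could be handled the same way, with a small array of values and a probability that controls how often one is attached.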