Friday, January 29, 2016

Frequency of individual characters from SAS data set

This script counts the frequencies of individual ASCII characters in a single column in a SAS data set and then prints an easy-to-read report.

My initial motivation was choosing a delimiter. By default, bulk loading data from Netezza to SAS (which is very fast) uses the pipe character (|) as the delimiter. My data set contained values with pipe characters, so I wrote this macro to find characters that never appear in the data and can therefore serve as alternative delimiters.

Another potential use is cracking a message encrypted using a simple letter substitution cipher.

To begin, this code creates an example data set courtesy of William Shakespeare.
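As a rough illustration of the idea, here is a minimal sketch rather than the full macro. The data set names (work.shakespeare, work.chars, work.printable) and variable names (line, ch, code) are assumptions made up for this example. It builds a small example data set, splits each value into one row per character, reports the frequencies, and then lists printable ASCII characters that never occur, which are the candidate delimiters.

```sas
/* Hypothetical example data: four lines from Hamlet, Act 3, Scene 1 */
data work.shakespeare;
    infile datalines truncover;
    input line $char80.;
    datalines;
To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
;
run;

/* Write one row per character; RANK returns the ASCII code on ASCII systems */
data work.chars;
    set work.shakespeare;
    length ch $1;
    do i = 1 to length(line);
        ch = substr(line, i, 1);
        code = rank(ch);
        output;
    end;
    keep ch code;
run;

/* Frequency report, most common characters first.
   MISSING keeps the space character, which SAS otherwise treats as a missing value. */
proc freq data=work.chars order=freq;
    tables ch / missing nocum;
run;

/* All printable, non-space ASCII characters (codes 33 to 126) */
data work.printable;
    do code = 33 to 126;
        ch = byte(code);
        output;
    end;
run;

/* Characters that never occur in the column: candidate delimiters */
proc sql;
    select p.code, p.ch
    from work.printable as p
    where p.code not in (select code from work.chars);
quit;
```

For a real table, replace work.shakespeare and line with your own data set and column. Trailing blanks are ignored because LENGTH stops at the last non-blank character.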

Why "My Documents" Acts Weird in cmd.exe

On modern Windows, My Documents is not a normal folder. It is a legacy junction kept for old software. A junction is an NTFS reparse point ...