Eduonline 24
Published in

Eduonline 24

Can we get SAS Proc Freq with Python?

In this article, we will discuss the SAS Proc Freq procedure and how we can achieve similar results using Python libraries.

Introduction to Proc Freq

The frequency distribution of categorical variables is quite common in descriptive statistics. SAS provides the freq procedure to achieve these stats in a simple way. There are two most common reasons why you need to know how to get frequency distribution in Python:

  • Python is the most commonly used coding language in Data Science.
  • Your team is migrating the code from SAS to Python.

Applications/Resources Used:

  • SAS OnDemand (earlier University Edition) / — For writing SAS Code
  • Kaggle — For writing python code
  • Pandas — For dataframe operations

1. Simple Proc Freq

SAS

Simple SAS Freq Procedure — Code
Output of simple SAS Proc procedure
Simple SAS Freq Procedure — Output

Python

  1. Importing the CSV
  2. Frequencies sorted by labels
  • The method, value_counts() returns a pandas series containing counts of unique values.
  • sort_index() sorts the data based on labels.

3. Convert the Resultant Series to a Dataframe

  • We will use the DataFrame method for this purpose.
  • Dividing data values by their sum and then multiplying by 100 gives us the percentage for each value. We are using the sum() and cumsum() functions to get the sum and cumulative sums of the variables.
  • Rounding the percentages up to two decimal places to match the SAS output
Python Equivalent of SAS Freq Procedure

Note: The Index can be dropped while exporting the DF to excel/CSV.

2. Include the missing values

SAS: By default, missing values are dropped, use the missing option to include them as a group.

Python: By default, the missing values are dropped, to keep missing values in the frequency table, add the dropna parameter and set it to False.

3. Sort the rows from most frequent to least frequent

SAS: By default, there is no order, we can specify the option order=freq to make it descending.

Proc freq with descending order
SAS Proc Freq in Descending Order

Python: Just drop the sort_index() method.

Python Equivalent of SAS Proc Freq in Descending Order

4. Additional options

SAS: Specify nopercent and nocum options for not printing the percentage and cumulative frequency and percentages, respectively.

SAS Proc Freq with Only Frequency

Python: Just drop the last two columns while converting to dataframe.

Python Equivalent of SAS Proc Freq with Only Frequency

5. Creating a Frequency Cross Tabulation

SAS: var1*var2 and dropping additional details to keep it simple.

Cross Tab of Two Vars in SAS

Python: Using pandas crosstab() method.

  • Specifying margins equal to true for adding row and column totals.
  • By default, the column name for these subtotals is “ALL”, we will change it to “Total” using the margins_name method to match SAS output.
Python Cross Tab of Two Vars Equivalent To SAS

6. Frequency Procedure — Multiple Variables

SAS: use the tables method to apply the freq procedure on multiple variables

Python: we can loop through the variable in the list to get the frequency distribution for multiple variables. The function takes 2 arguments:

A. dataframe
B. a list of columns

That’s all guys for SAS Freq in Python, the one thing we can notice here is that a simple proc freq with more details is simpler in SAS as compared to python. Whereas as we go deeper, the Pandas library does the most of the work for us and we just need to utilise the ready-made methods.

Let me know your thoughts in the comment section.

--

--

--

Articles Related to Web Development, Data Science, Programming, Coding, New Technologies, etc.

Recommended from Medium

💙 QUBE Launchpad х AssetFi 💙 AMA Session is coming tomorrow!

“Selling” automated testing to software engineers

Computer Storage Devices

How to Reveal the Secrets of the Human Body — Part 1

Optimizing Django Admin Paginator

Removing jQuery From Your Project With JavaScript: Part 1

Linux Firewall Management

Working with PDF annotations using C#: Widget Annotation

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Pradeep Singh

Pradeep Singh

Web Dev @ psrajput.com | Writer @ eduonline24.in | Analyst @ Genpact

More from Medium

Repetition Practice - basic syntax python

Add Speed Metrics like ‘Average Moving Speed’ to Strava Activity descriptions Using Python

Understanding Iterators in Python

How to get around of the “UnicodeEncodeError: ‘utf-8’ codec can’t encode character …: surrogates…