How to format numeric values in Python — the most essential methods

Andrea Koltai
3 min read · Sep 5, 2019


While working with databases, every analyst encounters raw, unformatted values, but for interpretation or business presentation it is essential to convert them into a more readable format. This is just as true for numerical values as for other data types such as strings. I remember staring at a long number of many digits and finding it a nightmare to grasp its scale at first sight (I worked for a financial institution, and I know that the numbers can be excessively long), or obtaining, as the result of a data manipulation, a float whose real value was disguised by countless decimal digits.

For example, can you guess at least the scale of the following numbers?

19519400000000
18707200000000
0.04341643859048916

Fortunately, Python provides versatile solutions for these cases, probably more than can be covered within a single blog post. My purpose is therefore to highlight the most essential functions and methods that every data analyst should know. So, let's see the magic that can save time, let us focus on the real problem and eventually make data analysis more enjoyable.

Two tools are mainly used to display numeric values: the well-known str.format() method and the newer f-strings. Anastasia Kharina’s blog post on this topic is worth a look for more details.

So, whichever way we choose to format and print numeric values, the tricks are embedded within the tiny curly brackets: {}. The difference between the two options is that f-strings allow us, and in fact require us, to put the variable or the value directly inside the curly brackets, which makes our script more concise. With str.format() the brackets can stay empty, because by default the arguments are filled in by position.

If we leave the curly brackets empty, or put only the variable name into them, the value is printed unchanged. The desired formatting is defined after the colon (‘:’), which separates the numeric value from the format specification inside the brackets:

{ numeric : desired_format_specification }
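As a minimal sketch of the two options, the snippet below formats the first sample number both ways. The variable name gdp_2017 is only illustrative and does not come from the original post.

```python
gdp_2017 = 19519400000000  # first sample number from above

# str.format(): the braces can stay empty; arguments are filled in by position
unformatted = "{}".format(gdp_2017)
with_format_method = "{:,}".format(gdp_2017)

# f-string: the value (or variable) goes directly inside the braces,
# followed by a colon and the format specification
with_f_string = f"{gdp_2017:,}"

print(unformatted)         # 19519400000000
print(with_format_method)  # 19,519,400,000,000
print(with_f_string)       # 19,519,400,000,000
```

Both calls produce the same string; only the place where the value is written differs.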

So, what exactly can be modified?

  • set the precision
  • apply a thousands separator
  • add a + sign to positive numbers
  • fill in the empty spaces with any character
  • set a minimum width (which makes sense with alignment or filling)
  • set alignment (^ for center, > for right, < for left)
  • set the type of our numeric value (f for fixed-point notation, e for exponent notation, % for percentage and d for decimal integers are a few from the long list)
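Each option in the list above can be tried on its own. The following is a small sketch with made-up sample values (1234.5678 and 42 are assumptions chosen for illustration; the percentage is the third sample number from the introduction).

```python
value = 1234.5678
ratio = 0.04341643859048916

examples = {
    "precision": f"{value:.2f}",           # 1234.57
    "thousands separator": f"{value:,.2f}",  # 1,234.57
    "explicit sign": f"{value:+.1f}",      # +1234.6
    "fill, center, width": f"{42:*^10}",   # ****42****
    "right align, width": f"{42:>6}",      # four spaces, then 42
    "exponent notation": f"{value:e}",     # 1.234568e+03
    "percentage": f"{ratio:.3%}",          # 4.342%
    "decimal integer": f"{42:d}",          # 42
}

for name, text in examples.items():
    print(f"{name:<22}{text}")
```

The last line of the sketch uses the same mini-language itself: `<22` left-aligns each label in a field 22 characters wide.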

When combining any of these, we have to be aware that the order of the format specifiers is fixed, as given by the syntax:

[[fill]align][sign][#][0][width][grouping_option][.precision][type]
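To illustrate why the order matters, here is a short sketch that combines several specifiers in the order the syntax prescribes, and then tries the same ingredients in a different order. The particular combination is an arbitrary choice for demonstration.

```python
value = 19519400000000

# fill '*', align '>', sign '+', width 25, grouping ',', precision '.2', type 'f'
combined = f"{value:*>+25,.2f}"
print(combined)  # ***+19,519,400,000,000.00

# Putting the sign after the grouping option breaks the fixed order,
# so Python rejects the specification with a ValueError.
try:
    format(value, ",+.2f")
    wrong_order_accepted = True
except ValueError as err:
    wrong_order_accepted = False
    print("Rejected:", err)
```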

In the following examples I focus on the format specification and leave the place of the numeric value empty. For this purpose I created a data frame with three columns: one for the raw numbers, one for the format specification (the curly brackets) and one for an explanation. Taking the raw numbers and the format specifications as input, I used the apply() method to convert the raw numbers into the desired format in a new Output column. Et voilà!

[Figure: Numeric value formatting for integers]
[Figure: Numeric value formatting for floats]
19,519,400,000,000
18,707,200,000,000
4.342%

As a result, we obtained a more readable format even for these long numbers. Although this is easy to apply to pandas DataFrames, it is recommended only for displaying data, because this formatting method converts the numbers into strings (stored as an object column), so they can no longer be used in calculations.
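The data frame itself appears only as an image in the original post, so the following is a reconstruction under assumptions: the column names Raw, Format and Output and the chosen format strings are guesses modelled on the description above, not the author's exact code.

```python
import pandas as pd

df = pd.DataFrame({
    "Raw": [19519400000000, 18707200000000, 0.04341643859048916],
    "Format": ["{:,.0f}", "{:,.0f}", "{:.3%}"],
})

# Apply each row's own format specification to its raw number
df["Output"] = df.apply(lambda row: row["Format"].format(row["Raw"]), axis=1)

print(df)
print(df["Raw"].dtype)     # numeric column, still usable for calculations
print(df["Output"].dtype)  # no longer a numeric dtype
print(type(df["Output"].iloc[0]))  # each formatted value is a Python str
```

Keeping the raw column next to the formatted one is one simple way to have both a readable display and values that can still be used in calculations.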

What about the Numpy arrays?

The good news is that the little tricks inside the curly brackets are applicable to NumPy arrays as well. We just specify our format and the data type it applies to within the following NumPy function, and subsequent NumPy arrays will be displayed accordingly.

np.set_printoptions(formatter={'int': '{:.2f}'.format})

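To reset our previous specification, we just need to call the same function without a formatter argument. Every call to set_printoptions() resets the formatter, so an empty call clears our custom format (other print options, such as precision, keep their current values).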

np.set_printoptions()

Unlike the pandas approach above, this function modifies only the display of the numeric values and leaves the values themselves unchanged, so we can keep using them in further calculations without losing any information.
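A small sketch of this behaviour, using the two long sample numbers from the introduction: the array name gdp and the thousands-separator format are assumptions chosen for illustration, not the format used in the original screenshot.

```python
import numpy as np

gdp = np.array([19519400000000, 18707200000000])

# Display integer arrays with a thousands separator
np.set_printoptions(formatter={"int": "{:,}".format})
formatted_display = str(gdp)
print(formatted_display)

# An empty call clears the custom formatter again
np.set_printoptions()
default_display = str(gdp)
print(default_display)

# The stored values were never changed, so calculations still work
growth = gdp[0] / gdp[1] - 1
print(f"{growth:.3%}")  # 4.342%
```

For a temporary change, the np.printoptions() context manager applies the same options only inside a with block.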

[Figure: Numeric value formatting for NumPy arrays]
The numbers mentioned above are the GDP values of the US in 2017 and 2016, and the increase within this period.

More on this topic: https://docs.python.org/3/library/string.html#string-formatting
