Python to C++: A Data Scientist’s Journey to Learning a New Language — Hello World

Daniel Benson
Published in Analytics Vidhya
8 min read · Apr 3, 2021

An Introduction

As an aspiring Data Scientist I got my start in the great world of Computer Science by learning the programming language Python. This is a highly user-friendly language with a number of great libraries for program creation, data gathering, data analysis, etc. From there I chose to expand my knowledge by delving into the C family of languages, specifically C++. After doing some research on programming languages I decided upon C++ because of its widespread use in performance-critical software, within the data science community and beyond. In this new series we will explore the differences and commonalities between these two great programming languages with a heavy focus on program engineering.

In today’s introduction issue we will go over some of the basic differences between the Python and C++ languages, including brief descriptions of the two, areas of interest each language is useful for, and what their roles in Data Science are. We will follow this up by creating the ever-popular introductory program where we will gleefully greet the world using both programming languages.

A Brief Descriptive Look at Both Languages

Python, our starter language, is an interpreted, high-level programming language that sits several layers of abstraction above machine code. What this means is that a lot of software runs in the background, which allows for friendlier usage and more forgiveness with inevitable mistakes, and the user does not need to compile a program before running it. This is a great programming language for beginners, as it lets the user focus on the fundamental concepts of computer science without being overburdened by complex syntax. Python’s ease of use and accessibility have made it a popular choice among programmers from many fields, including web developers, game developers, general software developers, data scientists, and data analysts. A major drawback of this ease of use is slower overall runtime. While this is not immediately apparent in smaller projects, larger ones such as major neural network models will suffer more from it.

C++, a language built upon the backbone of the C language, is a compiled, high-level programming language that sits relatively closer to machine code than Python. There is not as much going on in the background, which makes for a less user-friendly and less forgiving language. C++ requires higher language proficiency from the user and a more stringent approach to syntax. While this does make the language more difficult to learn, its pros far outweigh these cons. First and foremost, as a language built upon C, one of the oldest and most widely used programming languages, its importance in the computer science world cannot be overstated. Because it is compiled and sits close to machine code, it performs well in larger programs that would only crawl in an interpreted language like Python. This makes the language useful in applications such as computationally taxing video games, web applications, desktop applications, and other performance-critical software.

Applications Of Each Language Within the Data Science World

The importance of Python within the heavier mathematical fields, especially statistics and data science, is no secret. Python provides a plethora of high-quality mathematics libraries for use in analytical programs. The world of machine learning becomes far more accessible through the use of libraries such as pandas, numpy, scikit-learn, and scipy. For this reason, as well as its ease of use, it is no surprise that Python has become the premier language among Data Scientists. Programmers from this field can focus more on the mathematical, statistical, and analytical aspects without having to spend too much time on syntax and program engineering. As mentioned above, Python is also an interpreted language, meaning that any program, small or large, can be run immediately without a separate compile step, which shortens the cycle of analyzing data, testing machine learning models, etc.

C++, while not well known for widespread use within the Data Science field, certainly still has its uses. After data exploration and model creation have occurred, the model is generally built into a program that makes predictions on new data. Normally this isn’t an issue; however, when working with vast amounts of data (such as big data) and highly complex predictive models (such as deep artificial neural networks), we need a different language for faster processing, the perfect job for the likes of C++.

Our First Program Comparison — A Bilingual Greeting to the World

In this series I will write code from scratch to create simple programs using both the Python and C++ programming languages. I will offer comparisons of the code line by line as well as brief explanations of their differences. For the sake of this series I will be coding within a local Integrated Development Environment, or IDE for short. I prefer to use VS Code as it is the IDE I was taught in and the one I am most comfortable with. It is up to you to decide which IDE works best for you. All completed programs are then run from my terminal’s command line (on macOS). The following link can be used to download VS Code regardless of your operating system of choice: https://code.visualstudio.com. The following link can be used to download Python, which comes with its standard library, regardless of your operating system of choice: https://www.python.org. Lastly, the following link will walk you through the process of setting up an environment for C++ regardless of your operating system of choice: https://www.geeksforgeeks.org/setting-c-development-environment/.

Our first program, a single line of code in Python, allows us to print out the phrase “Hello World!” through the command line:

print("Hello World!")

Simple, clean and easy. The software running behind the scenes gives us access to Python’s built-in print() function, which takes the argument passed to it and prints it out for the user. To run this program we first save it in VS Code, then make sure on the command line that we are in the correct directory; in this case I’ve named the directory python. We use the following command to check our current directory:

$ pwd

and we receive something similar to the following output:

>> /Users/Daniel/Desktop/Medium-Posts/Python to C++ Series/Article 1/python

From here we can run our created program, helloworld.py, using Python 3 by typing the following into the command line:

$ python3 helloworld.py

which will run our program and output the following to the command line:

>> Hello World!

As we would expect, things aren’t as simple when writing the same program in C++.

#include <iostream>
int main()
{
    std::cout << "Hello World!\n";
    return 0;
}

The same program, only one line in Python, is six lines in C++. Let’s break this down. The first line pulls in the standard input and output library so that we are able to use std::cout, C++’s counterpart to Python’s print(). Unlike Python, where print() is available without importing anything, C++ keeps even basic facilities like this in separate header files that must be brought in with an #include directive. This keeps programs lean, although C++’s speed advantage over Python comes mainly from being compiled to machine code ahead of time.

The second line of code in our C++ program is required in every program created using this language. It declares the main() function, the entry point of the program: execution starts here, and everything within this function is compiled and run. The closest Python counterpart is the if __name__ == "__main__": block we include in some of our larger programs, but as we know this is not required for a Python program to run.

The third line, an opening brace, {, is also required whenever we are creating a function in C++. This tells the compiler that this is the beginning of the function body and that everything after the opening brace belongs to the function. This is similar to the colon in Python after naming the function and its parameters. Unlike Python, which relies on indentation to show where a function ends, C++ also requires a symbol marking the end of a function: a closing brace, }. We find this in line 6 of our code, and it tells the compiler that the end of the function has been reached and whatever comes after is not part of that function.

Finally we have lines 4 and 5 of our C++ program, the body of the main() function. Line 4 tells the compiler what the function should do. std is the namespace of the standard library mentioned above, which contains objects such as cout and cin. We will go over these more in a later article; just know for now that it is important to write the namespace followed by two colons, then the name of what we want to use from it. In this case we are using the cout object from the std namespace, which prints whatever follows the double left arrows, <<, to the terminal. The <<, known as the stream insertion operator, sends what follows it to the output stream. We then include what we want to print out, in this case our “Hello World!” string. As in Python, “\n” is the newline character, which starts a new line after we have printed what we want. Unlike Python’s print(), std::cout does not add one for us. Finally, a semi-colon is added to the end of the entire statement. Unlike Python, C++ requires a semi-colon at the end of every statement. It acts as a sort of stop symbol so that the compiler knows where one statement ends and the next begins.

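This brings us to line 5 in our program. Here main() returns 0. This value is passed back to the operating system as the program’s exit status, indicating that main() ran successfully without any error. By convention 0 indicates a successful run, while a non-zero value such as 1 indicates an error. (C++ treats main() specially: if the return statement is left out, reaching the end of main() has the same effect as return 0, but writing it explicitly makes the intent clear.) Again this return statement is ended with a semi-colon.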

Now we finally have our program built and ready to run. But here we run into another major difference between the Python and C++ languages. As mentioned earlier, C++ requires the programmer to compile the program into machine code before the computer will run it. To do this, we first check that we are in the correct directory as we did with the Python program. In this case I named the directory c++:

$ pwd

>> /Users/Daniel/Desktop/Medium-Posts/Python to C++ Series/Article 1/c++

Next, we have to compile our program. We do this using the following command:

$ g++ -o helloworld helloworld.cpp

Here g++ invokes the C++ compiler, -o indicates that we are supplying our own name for the compiled file, helloworld is that name, and helloworld.cpp is the file containing the program that we wish to compile. If everything within our program syntax is correct then we will not be given any output; the cursor will simply move down to the next line in the command line to await our next command. From here our code is now compiled and finally ready to run. We can check that this new compiled file was created by typing the following into the command line and seeing a similar output:

$ ls

>> helloworld     helloworld.cpp

which indicates to us that we have two files in our directory now: the helloworld file, which contains our compiled program, and helloworld.cpp, which contains our C++ source code. To run our program we type the following command into the command line and receive the expected output:

$ ./helloworld

>> Hello World!

Thank you for following along with this introduction. In the next installment we will explore data types and variables, as well as compare the code for printing output and accepting input from the user in both languages. Until then, wonderful reader, and as always, rep it up and happy coding.

Daniel Benson

