An amalgamation of new-age big data perspectives

A candid discussion with real-world data scientists/engineers

Anurag Bejju
SFU Professional Computer Science
Mar 1, 2019

Anurag Bejju, Manan Parasher, Rishabh Singh, Nikitha Ravi

The video interview with four panelists

We are living in an era where tweets, Instagram stories and Netflix shows largely influence our daily lives. In fact, over the last two years, the world has generated more data than ever before. Banking on this phenomenon, tech companies have been investing heavily in the creation of textual and visual forms of data. From cat videos to DIY tutorials, new content is being generated at a very rapid pace. According to a recent Forbes article, that rate is projected to reach 1.7 megabytes per second for every person by 2020. To keep up with such a large stream of data, companies around the world have realized the importance of a robust data strategy and have increased their efforts to acquire talent that can help them use the vast power of big data effectively.

To comprehend current big data market trends and understand the responsibilities of real-world data scientists, we conducted a group discussion with a panel of data scientists and data engineers who have set milestones working for top tech companies. They were part of cohorts 3 and 4 of the SFU Professional Master’s in Big Data program and went on to gain immense domain knowledge about the best practices in this field.

Introducing our four extremely talented panelists: Hiral, Sethuraman, Supreet and Manoj.

Choosing the Big Data Program

Even though we are all part of the same program, everyone comes from a different background and has their own reasons for pursuing Big Data as a field of specialization. We asked the panelists about their motivations for pursuing the SFU big data program.

Sethuraman (aka Sethu), who started as a software engineer at Zoho and worked on one of its data analytics products, Zoho Reports, says, “Initially it was fun working with analytics products and adding a few cool features, but soon I noticed that the trend was gearing more towards Big Data. I started researching more and including some basic functionalities for Machine Learning in our product, but soon I realized that I needed proper education for Data Science. That’s when I decided to pursue a Master’s in Big Data at SFU.”

Education in the changing field of Data Science

Forbes recently published an article on the rapid pace at which new technology is being introduced around the globe.

It stated, “It is a game where the rules and goals are constantly changing and the long-term winners aren’t those who simply try to make it to the next level but the ones who continually adapt.”

We asked the panelists about their strategies for navigating and succeeding in this rapidly changing field of data science.

According to Hiral, the basics of any technology remain the same. “For instance,” she says, “if I want to switch from Tableau to Power BI, the basics of data visualization in reporting remain the same; it is just the UI that changes. That’s what the university should focus on: the fundamentals and a few handpicked technologies. I feel SFU big data did a great job with that, teaching us the fundamentals as well as a few basic tools like Hadoop and Spark, providing us the right balance.”

Skills needed to be a successful data scientist

A crucial part of this program is the co-op, in which students apply the skills they have learned in school and gain exposure to what it is like to work in the field of big data. All of the panelists have successfully completed their co-ops, and some of them also have full-time experience. We asked them about the key skills required to be a successful data scientist.

According to Supreet, the key skills are statistics, basic programming, the curiosity to ask questions and, most importantly, the ability to communicate what you find to other people. Another key aspect that came up is the overhead of solving a problem: whether it is even worth putting in the resources required to solve it.

Data Science used in Real Life

As we reminisced with the panelists about their university life and the struggles they faced as students, we asked them for examples of where they had applied data science to a real-world problem and how that differs from an academic setting.

According to Manoj, one of the biggest problems people face in the industry is that data does not always arrive in order. Manoj says, “I work in the Internet of Things (IoT) domain, where I deal with events. These events happen in their own time, and there is a difference between the event time and the processing time for those events. So how do you compensate for this time difference?

As we know, the world is moving more into streams. Most problems I encountered were related to streaming data in real time. Yet another issue is that most data given to us in school is clean, structured data, whereas in real life most data is unstructured and you have to spend significant time cleaning it and converting it to a usable form.”
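To make the event-time versus processing-time gap concrete, here is a minimal sketch of one common approach: grouping events into event-time windows and using a watermark to decide when a window is complete. This is an illustration only, not what Manoj described using. The function name `window_counts`, its parameters and the sample timestamps are all hypothetical.

```python
def window_counts(event_times, window_size=10, allowed_lateness=5):
    """Count events per tumbling event-time window.

    event_times: event timestamps (integers), listed in the order the
    events ARRIVE, which may differ from the order they happened.
    The watermark trails the largest event time seen so far by
    `allowed_lateness`. A window is finalized once the watermark reaches
    its end; events for a finalized window are dropped as too late.
    """
    open_windows = {}   # window start -> running count
    closed = {}         # window start -> final count
    dropped = 0
    watermark = float("-inf")

    for event_time in event_times:
        start = (event_time // window_size) * window_size
        if start + window_size <= watermark:
            dropped += 1          # window already finalized
            continue
        open_windows[start] = open_windows.get(start, 0) + 1
        watermark = max(watermark, event_time - allowed_lateness)
        # Finalize every open window the watermark has passed.
        for s in sorted(open_windows):
            if s + window_size <= watermark:
                closed[s] = open_windows.pop(s)

    closed.update(open_windows)   # flush what is left at end of stream
    return closed, dropped


# Hypothetical arrival order: the events at times 4 and 8 arrive late.
arrivals = [1, 3, 12, 4, 27, 8]
counts, dropped = window_counts(arrivals)
```

In this sample, the event at time 4 arrives after the event at time 12 but is still counted in the first window because it falls within the allowed lateness. The event at time 8 arrives after that window has been finalized and is dropped. A larger `allowed_lateness` accepts more out-of-order events at the cost of delaying results.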

Sethu offers a different perspective: “Talking from the other end, i.e. the perspective of a user using your product, the first thing to keep in mind is not to make the user wait. You don’t want to make the user wait for a long time, but you also don’t want to disappoint the user with mediocre results. You have to find the fine balance between speed and accuracy.”

Even though we as data scientists are always trying to find that balance, it becomes a more pronounced problem in the industry where you have a user waiting for your results.

Gender equality at the workplace

“Since it is not visible it doesn’t mean it does not exist” — Supreet Takkar

According to a Forbes article, women make up only 24% of the computing workforce, and that number is declining. In fact, four out of ten women are leaving STEM careers despite having engineering and computer science backgrounds.

With two successful female data engineers on this panel, we were able to obtain some insights about gender equality at the workplace. According to Supreet, she never faced gender discrimination during her tenure at Infosys and EA, but she acknowledges that it does exist in some places. She applauded the conscious initiatives taken by employers to make sure equal opportunities are presented to employees irrespective of their caste, creed, gender or religion.

Hiral took this discussion further by speaking passionately about the drive and determination women possess to accomplish their goals and aspirations. She feels that women often have to bear the responsibility of changing their professional goals in order to fulfill their family commitments. This could also be one of the driving factors for women leaving the workforce. She suggested that companies should promote more gender-balanced policies to allow men and women to share professional and personal responsibilities equally.

Opinions on Data Security and Privacy

“No data is truly secure” — Sethuraman

Even though big data provides great opportunities, it also carries a great amount of risk. It has the power to undermine the very foundation of our society. New laws like GDPR make data protection a legal requirement rather than just a moral judgment, but they do not go far enough to stop people from misusing data for personal gain. So, we asked our panelists about the measures they take to ensure they don’t violate the norms of data security, and we got some meaningful inputs. According to Manoj, all organizations should ensure that stricter data policies are enforced and that most of the sensitive data is stored in a secure location. He also talked about user awareness and how users share equal responsibility for protecting their personal information.

The Eureka Moment

After three and a half hours of productive group discussion with four new-age data scientists, we felt better equipped to enter the workforce with the thoughts they shared. Although the topics ranged from the evolution of big data to gender equality and data privacy, the discussion served as a great source for budding data enthusiasts to gauge the direction in which the field of big data is heading.

Our Team

We would like to thank all four amazing panelists (Hiral, Sethuraman, Supreet and Manoj) for taking the time to attend this discussion on a Saturday afternoon.

Anurag Bejju
SFU Professional Computer Science

Data Scientist at Statistics Canada — Data Science Division