As my blog grows and scrolling through it becomes tiresome even for me, I decided to compile a table of contents. More recent articles come first.
The dividing line between self-taught folks and “normal” programmers with a degree has become slightly discriminatory. This article encourages the former to raise their heads and wear the self-taught title with pride.
The perceived lack of theoretical knowledge behind programming languages is often seen as a disadvantage and a hindrance to a developer career. Where does the bias come from? Most probably, it originates from the conviction that every developer career must have a software architect title as its ultimate goal. …
Recently, I worked for a customer who needed a tutorial involving data from a (non-existing) IoT company. I created a test dataset since none of the available open-source ones was an exact match. This inspired me to write another tutorial: the one you are reading.
Although the existing Python Faker package does a good job of creating fake datasets, it does not cover use cases such as time series data.
What are the specifics of time series?
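To give a rough idea of what such a dataset could look like, here is a minimal sketch using only the Python standard library. It is not the code from the tutorial: the function name `fake_sensor_series`, its parameters, the start date, and the sine-shaped daily cycle are all assumptions made for illustration. It only shows two properties that set time series apart from ordinary fake records: timestamps at a fixed interval and values that depend on time.

```python
import math
import random
from datetime import datetime, timedelta


def fake_sensor_series(start, periods, step_minutes=60, base=20.0,
                       daily_amplitude=5.0, noise=0.5, seed=42):
    """Return a list of (timestamp, value) pairs at a fixed interval.

    Hypothetical helper: the names and default values are illustrative only.
    """
    rng = random.Random(seed)  # seeded so the fake data is reproducible
    rows = []
    for i in range(periods):
        ts = start + timedelta(minutes=i * step_minutes)
        # Assumed daily cycle: one full sine wave per 24 hours.
        hour_of_day = ts.hour + ts.minute / 60
        seasonal = daily_amplitude * math.sin(2 * math.pi * hour_of_day / 24)
        # Reading = baseline + time-dependent component + random noise.
        value = base + seasonal + rng.gauss(0, noise)
        rows.append((ts, round(value, 2)))
    return rows


# Two days of hourly readings from an imaginary sensor.
series = fake_sensor_series(datetime(2021, 1, 1), periods=48)
for ts, value in series[:3]:
    print(ts.isoformat(), value)
```

Because the generator is seeded, calling the function twice with the same arguments returns the same rows, which is handy when a tutorial has to be reproducible.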
When you are just learning to program, you may often use existing code snippets kindly shared by more experienced colleagues on StackOverflow or their personal blogs.
The more advanced the code you need to write, the more customization you have to apply to the snippets that you find on the Internet. It is always a pity to do a lot of editing and then realize that your new code won't work, and that you neither saved the original snippet nor bookmarked the blog you copied it from.
In one of my previous blog posts, I gave a few recommendations…
This article could be titled “How to find a good job in Europe”, but I can only offer my experience and the information I’ve gathered as a data science job candidate.
Finding job announcements is not the biggest challenge. I will explain this later. The biggest problem I encountered back in 2011, when I decided to emigrate from Russia to Germany, was that I could not attend any job interviews in person.
Skype interviews were not that common at the time. And for me, it was too expensive to jump on a plane for an appointment at short notice.
In my life, I’ve got to know quite a few people who came home after school, programmed fancy little applications instead of doing their homework, and then ended up working as software developers. They inspired me to learn Python on my own. I want to share my learning methods, which are based on moving from simple memorization through reverse engineering to independent coding.
I also want to draw your attention to the vast amount of open and freely accessible online knowledge (blogs, communities, open-source articles) that you have at your disposal. …
Among other things, scikit-learn is used to teach algorithms to select the best model. RapidMiner enables automated model selection, too.
A few years ago, I had a short career stop at a small AI startup. That job brought me to a new level. After almost four years, I still keep spreading the word about the tools and skills I learned there.
Most of our algorithms were programmed in R, but we used other data science tools as well. One of them was RapidMiner, previously quite an expensive one. It is a niche piece of software that offers a drag-and-drop…
I am totally convinced that everyone can be good (and earn good money!) only in the job they like doing. If you are bored by your tasks and have to force yourself every day, you cannot deliver quality results.
But if data science makes you feel warm and light, then you have chosen the right door to open. How exactly do you know?
Or your dashboard, your pipeline, or whatever you are building. You feel like an artisan looking at his creation and enjoying its perfection.
You feel happier with every line of code that you have added, bringing…
In a business environment, I believe speed has the same priority as accuracy. In fact, my benchmark is that a good data tool should allow you to start deriving insights within an hour. Why so quick? Because business decisions cannot wait longer!
I am often asked to give my opinion on the best data visualization tool. To others’ surprise, I tend to start by pointing to the cost factor.
If you buy a cheap tool, then your business's well-paid data analyst gets busy with the imperfections of the tool instead of doing their actual job, and this may actually result…
By a platform, I mean any software that stores data permanently. Not necessarily a data analytics platform, but a CRM, an ERP (enterprise resource planning), a campaign management platform, etc.
Platform migrations used to be something unique and rare a decade ago. Nowadays, a lot of companies migrate their working processes every couple of years. Sometimes, they do it because better technology has arrived. More often, an old and still well-functioning one gets scheduled for retirement.
The data that the platform has gathered is often used for business analytics reports. Typically, a business platform has an analytics extension…