Last year I joined 90 Seconds with the original goal of building the data team and data analytics products. At the time, however, there was a lack of engineering leadership and management in place. So I was tasked with managing and scaling up the engineering team as the interim director of engineering, a role and a challenge I had not taken on before.
It was undoubtedly one of the most challenging periods in my professional life. Yet I am very happy that I got the opportunity to learn, to make mistakes, and to experience the whole thing. …
At 90 Seconds, our team’s main responsibility is to lead and build data analytics capabilities across the organization. To ensure that we can fulfill this duty, we have to build up the data team, with clear structure, responsibilities, and directions.
One of the first tasks we set ourselves was to define the goal of the team, the different functions, and how we all fit together.
Regardless of where you sit within the data team, you have one goal: to make a positive impact with data.
This can be achieved through:
This is more or less a personal note, and maybe also a guide for people with similarly specced machines who want to try Hackintosh. The whole thing took me about 2 full days to get into a state that I’m comfortable with, i.e. ditching my MacBook Pro for good.
Special thanks to Lich Nguyen for helping me from the start with a lot of newbie questions!
Having been in this field for a while now, I think this is an interesting time to summarize, from my personal experience, the changes of the last few years, the new directions, and what we should do to keep up with the trend in the future.
This blog has two parts: the first one (this one) is about Big Data and Business Intelligence, while the second one will be about Machine Learning and Data Science in general.
A couple of years back, most companies would stick with Enterprise Data Warehouse solutions from Oracle…
This blog has two parts: the first one is about Big Data and Business Intelligence, while the second one (this one) is about Machine Learning and Data Science in general.
In the past few years, Machine Learning has become a lot more accessible and is increasingly in demand:
As someone who has researched and helped shape data organizations within various startups of different stages, sizes, and industries, I find this an interesting challenge, and one that should usually be addressed early on. Making the right decision on this front will not only help companies reduce cost, duplicated work, and turnover, but also help remove dysfunctional company cultures.
With that in mind, I will share some of my views on how structures could be defined in different data organizations. …
In this post, I will share with you a simple process that I have been developing when doing Machine Learning in my workplaces.
Hopefully it will clear up a few misconceptions and pitfalls some of us might have about machine learning in general, or when comparing machine learning in competitions, in textbooks, and in practice.
Most of the time, these problems are clearly presented to us data scientists in textbooks or in Kaggle-like competitions, together with a pre-defined dataset, a baseline result, and an evaluation metric that we will have to follow.
Note: this works for xgboost-0.6a2 and OS X Sierra (10.12.3). I have not tested it elsewhere.
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
export PATH=/usr/local/bin:/usr/local/sbin:$PATH
# upgrade too if you can
brew upgrade
# install Python from brew is highly recommended
brew install python
Note: OS X does come with python (2.7.10) by default at /usr/bin/python, but I would highly recommend brew over this.
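Since both the system Python and the Homebrew one can be installed at the same time, it may help to check which directory the shell finds first. The following is only a minimal sketch: `first_on_path` is a hypothetical helper name (not part of Homebrew or any other tool), and it assumes Homebrew links its binaries into /usr/local/bin, as the PATH export above suggests.

```shell
# Hypothetical helper (not part of Homebrew): print whichever of two
# directories appears first in a colon-separated PATH-style string.
# Usage: first_on_path DIR_A DIR_B PATH_STRING
first_on_path() {
  dir_a=$1
  dir_b=$2
  old_ifs=$IFS
  IFS=:
  set -f                      # avoid glob expansion while splitting
  for entry in $3; do
    if [ "$entry" = "$dir_a" ] || [ "$entry" = "$dir_b" ]; then
      set +f
      IFS=$old_ifs
      printf '%s\n' "$entry"
      return 0
    fi
  done
  set +f
  IFS=$old_ifs
  return 1                    # neither directory is on the path
}

# Assumption: Homebrew binaries live in /usr/local/bin, system ones in /usr/bin.
if winner=$(first_on_path /usr/local/bin /usr/bin "$PATH"); then
  echo "First match on PATH: $winner"
else
  echo "Neither /usr/local/bin nor /usr/bin is on PATH"
fi
```

If the first match is /usr/bin, the system python is likely the one being picked up, and the export line probably has not taken effect in the current shell.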
# install from source code
curl -O http://python-distribute.org/distribute_setup.py
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py
# show current pip version
pip --version
# upgrade pip
pip install --upgrade pip
Note: you can always do sudo easy_install pip like…
- by Dat Le on 23rd Feb 2017
A consistent SQL style helps a lot in code review and development, especially since SQL is the main language we use in the data team at honestbee.
Below are all the rules and conventions:
— by Dat Le on 13th Feb 2017
Like many other businesses, honestbee operates on a supply (shopper bee and deliverer bee) & demand (consumer) model, with our demand side being much more volatile in real time compared to its counterpart.
As our team’s duty is to ensure the best service to our customers, we need a mechanism to flatten our capacity curve for smoother operations. As a solution, we launched the “Dynamic Time Slot Pricing” feature early last week:
Data Science and Engineering at foodpanda