PyCon Sweden finished and I really enjoyed it. It was two days full of really interesting talks and great people mainly from the Stockholm startup scene and from the university KTH. The three main effects that these days had on me were:
- I love Python even more. Yes, that’s hard to believe, since I already loved it so much.
- I’m more interested (I already was before) in coming back to Sweden one day to work or to pursue an MSc in Computer Science or Machine Learning at KTH, since I met people from there with great knowledge, enthusiasm and passion for Machine Learning.
- I got hugely motivated to contribute to the Python community and to try to make the frameworks we use every day more beautiful, easy and fun to use.
I will now give a brief description of the talks and the things the speakers mentioned that I found especially interesting. It won’t be a big review or a detailed description of them; it will be more like a quick summary.
First day
The first day was basically the Data Science day: most of the talks I joined were about it, and I only attended two that were not related to it.
Opening Keynote
The day began with a keynote talk (by Ian Ozsvald) about the importance of Data Science and tips to get started on it. He mentioned some important tools and Python libraries: NLTK 3.0, DataSift, AlchemyAPI, OpenCV, scikit-image, scikit-learn, StatsModels, pandas, IPython, Spyder and segment.io.
He suggested a nice pipeline to get started on solving Data Science problems and develop a solution: visualize, set milestones, K.I.S.S. (Keep It Simple, Stupid), think-hypothesis-test, show results in an IPython notebook and finally engineer a solution. For deploying the solution, he suggested the use of microservices and the use of tools, services and frameworks like: Spyre, Flask, Swagger docs, Git, Docker and Amazon EC2. For scaling it, he suggested the use of Jython/Java, Azure/Amazon ML and Apache Spark.
He also talked about the importance of good data over lots of data (Big Data), and claimed that good data doesn’t exist and that good enough data is data that is consistent and complete.
This talk was quite encouraging for getting on the Machine Learning and Data Science path and prepared everyone for the rest of the talks coming later that day.
From Explicitness to Convention: A Journey from Django to Rails
Honestly, I expected more from this talk, which doesn’t mean that it was bad. It was actually good, just less than what I expected. The speaker talked about the differences between Django and RoR: their different approaches to the MVC pattern, the differences in testing in each of them, logging, patterns, Ruby vs Python, etc. She didn’t talk that much about her experience with that migration, which is what I was expecting. In the end, the answer to the question “Django or Rails?” was: it depends on where you are located and the developers you can hire, which is quite reasonable in my opinion.
Docker and Python at Spotify
More than on Python, the focus of this talk was on Docker, on why Spotify decided to migrate from Debian packages and other tools (like ClusterSSH and Puppet) to it. The benefits of migrating to Docker that the speaker mentioned were the reproducible deploys, easier testing, easier rollbacks, code and configuration synced and in the same repo, etc.
Deep Learning and Deep Data Science
We now arrive at one of the best talks of the day, in my opinion, on one of the hottest topics of the moment. The speaker, Roelof Pieters, a PhD student at KTH, first mentioned some useful Python libraries for working with deep learning: scikit-learn, Caffe, Theano and IPython.
Data Science = Hacker’s art + Statistics + Machine Learning
Then he explained, without getting into too much detail, that deep learning was about the automatic learning of features using mainly artificial neural networks. He also explained the results that the use of deep learning approaches has given in recent years, showing its success in audio recognition and computer vision, but not so much in natural language processing yet.
Hacking the Human Language
Another amazing talk. As the name suggested, this talk was about natural language processing. After starting with some motivating graphs, the speaker mentioned some useful tools: NLTK (the Natural Language Toolkit), word2vec, spaCy, gensim, d3.js, Highcharts, Google Chart API and scikit-learn. Then he explained some parts of NLP: word tokenization, sentence tokenization, stemming, part-of-speech tagging, named entity recognition and sentiment analysis. The rest of the talk focused on the use of vector representations for words (word2vec), in which the relationship between two words is encoded by how their vectors relate to each other. This was one of my favorite talks of the day and got me motivated to learn more about the awesome topic that natural language processing is.
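To make a couple of those steps concrete, here is a minimal sketch using only the standard library. It is not what the speaker showed and it is not how NLTK, spaCy or word2vec work internally: the regular expressions are a naive stand-in for real tokenizers, and the three-dimensional vectors in `TOY_VECTORS` are numbers I made up by hand, not trained embeddings.

```python
import math
import re


def sentence_tokenize(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_tokenize(sentence):
    """Lowercase a sentence and extract runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hand-made toy "embeddings" (invented for illustration, not trained).
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "banana": [0.1, 0.0, 0.9],
}

text = "Python is fun. NLP is fun too!"
sentences = sentence_tokenize(text)
tokens = [word_tokenize(s) for s in sentences]

# Related words should have vectors that point in a similar direction.
royal = cosine_similarity(TOY_VECTORS["king"], TOY_VECTORS["queen"])
fruit = cosine_similarity(TOY_VECTORS["king"], TOY_VECTORS["banana"])
print(sentences, tokens, royal > fruit)
```

With real embeddings the vectors have hundreds of dimensions and are learned from a large corpus, but the comparison step looks much the same.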
IPython: How a Notebook is Changing Science
This talk was interesting, especially since it came from someone without a Computer Science or Software Engineering background. He showed how Python (IPython, matplotlib, numpy, etc.) is a really good alternative to commercial scientific software like MATLAB or Mathematica (something I have believed for quite some years, and the reason why I have never used MATLAB outside of the university; I always end up using the Python scientific stack and, on very rare occasions, Mathematica). He also talked about Jupyter (Julia + Python + R), which he called the future of IPython, and showed that it’s an IPython notebook that works not only with Python, but also with other languages like MATLAB.
Building an Interpreter in RPython
This talk was really good and showed me more about RPython, something I had only heard about, but never used or looked into deeply. The speaker explained how to develop a simple interpreter using RPython, as the name of the talk says. He also explained a little bit about Python bytecode. This talk motivated me to begin a project that I have had in mind for quite some time, but that sadly I have not had time to start, related to compilers and interpreters on the Raspberry Pi.
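To give an idea of what “a simple interpreter” means, here is a minimal sketch of a stack-based interpreter loop. It is written in plain Python rather than RPython, and the instruction set (`PUSH`, `ADD`, `SUB`, `MUL`) is one I made up for illustration, not the one from the talk.

```python
# Opcodes of a made-up stack machine (hypothetical, for illustration only).
PUSH, ADD, SUB, MUL = range(4)


def run(program):
    """Execute a list of (opcode, argument) pairs and return the top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        opcode, arg = program[pc]
        if opcode == PUSH:
            stack.append(arg)
        elif opcode in (ADD, SUB, MUL):
            # Binary operations pop the right operand first, then the left.
            right = stack.pop()
            left = stack.pop()
            if opcode == ADD:
                stack.append(left + right)
            elif opcode == SUB:
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))
        pc += 1
    return stack.pop()


# Bytecode for the expression (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
result = run(program)
print(result)
```

CPython’s own bytecode follows a similar stack-machine idea, and the standard library’s `dis` module can print it for any function (for example `dis.dis(run)`).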
Lightning Talks
These short 5-minute talks were about Pluralsight, Plotly, poliastro and other topics. One great discovery from these talks was the library click, for creating nice command line interfaces for Python programs using decorators.
Second day
This day had more variety in its topics than the previous one. I think I enjoyed the previous day more because of the topics, but this one also had some amazing talks!
Opening Keynote
This talk was about software development ethics. It focused especially on the harassment that many people suffer online, on social networks, and the availability of illegal content on those social networks. The two main cases were Twitter and Snapchat. It was a talk that encouraged developers to make an effort to develop solutions and improve existing applications to make sure that users are not harmed in any way.
One part that I found quite funny (and also alarming) was when the speaker asked how many developers had had access to all their users’ private data. Most of the conference attendees raised their hands. I think that what she wanted to show is that we as developers must make sure that our users’ private data is safe and nobody can have access to it, not even us, since it’s their private data and they have trusted the application with it, not the people behind the application’s development.
Why Django Sucks
The title of this talk is quite controversial and made me think that the talk was going to be about all the bad things that Django has and why people should not use it. But it was not that at all, and it was a great talk. The first thing that the speaker, Emil Stenström, made clear was that he loved Django, but that
You can’t just love something blindly
He started by showing three problems Django has in modern web applications (it’s not 2005 anymore, web applications have evolved). But he didn’t stop there. After that, he proposed realistic solutions that could be implemented in Django to make it a better framework for this time. The situations addressed were:
- Shared templates: don’t render everything on the server or everything on the client. We should be able to process the templates on both sides and Django doesn’t implement a way to do this.
- Server push: Django doesn’t implement (without third party libraries) a way to send notifications to clients, so the clients have to request any information from the server.
- Template components: make adding widgets on templates simpler and more independent, without the need to edit all the separate HTML, CSS and JavaScript and then link to it.
I’m not so familiar with Django (I only used it once on a small project), so I didn’t know about these problems. But I think it was a really good talk and that it prepared me (more or less, since I only know about the problems and possible solutions) for the day that I use Django more and face those issues.
Puppet and Python
Give me an API and I will automate the world
This talk was from Spotify and it was about automation and the use of Puppet (a Ruby tool) from Python, mainly via pypuppetdb (but other tools like puppetdb-stencil, puppetboard and testinfra were also mentioned). The speaker also mentioned that since Puppet is now moving to Clojure and thus to the JVM, it may be interesting to use it with Jython soon.
Embedded Python in Practice
This talk focused on low level aspects of Python and performance. I had never used low level things in Python so I found it quite interesting. The speaker talked about some of the problems he and his team had when using Python for embedded stuff and showed some simple ways to handle some of them. These problems and behaviors were: memory, disk, licensing, clock, recovery, performance, deployments, updates, launching, running and debugging.
How to Build a Python Web Application with Flask and Neo4j
This talk was quite fun, interesting and interactive. The speaker showed a simple Flask example application, but instead of using SQLite with it, she used Neo4j, with py2neo to be able to use it from Python. I had heard many times about graph databases and Neo4j, but I had never seen queries on it or seen it in action. I found it to be really cool and I cannot wait to use it in some application I develop, to compare its performance and scalability against relational databases or other NoSQL databases like MongoDB. The application that she showed can be found here. She also showed GrapheneDB, which allows hosting a Neo4j database, and showed a very nice visualization of the data that was being created in real time for the application by the audience.
GitFS: Building a Filesystem in Python
For me this was the hardest talk of the PyCon, but also a really interesting one. The speaker showed how he and his team developed a FUSE filesystem using Git and Python. I still think I didn’t completely understand many things, which is why I’m not writing a lot about it here. I plan to check their GitHub repo to try to understand it more, because I think that what they did is really interesting.
Python for Humans
Probably my favorite talk of all, an amazing and motivating talk to close the event. The speaker was Kenneth Reitz, the creator of requests and currently a developer at Heroku. He talked about the importance of developing nice and simple to use APIs, following the Zen of Python and of course using urllib2 vs requests as an example. This talk was super motivational and made me want to try to improve some Python standard libraries like os, datetime, subprocess, etc., which are not exactly beautiful and fun to use.
We share a dark past: Perl, Java, PHP, ColdFusion, Classic ASP...
Lightning Talks
The lightning talks this day were really good. The first one was about how we should be careful when making programming languages too close to the English language.
The next one was about the importance of not talking only about accomplishments and success stories at conferences and other events, but also about failures, since the number of ideas that fail is bigger than the number of ideas that succeed.
The third one was about immutability and presented Pyrsistent, a Python library that provides immutable data structures. After it, there was a presentation about Faker, a really cool library that provides a factory of fake data like addresses, names, places, times, credit cards, etc. I remembered this library from seeing it on HN and using it a little bit, but I had forgotten about it. It was good to be reminded of it, since it’s very useful for generating sample test data.
The following talk was about the use of Ansible, NetworkDB and NAPALM at Spotify. The last talk was about the current state of Jython, to show that it was not dead as many people thought and that its development is still very active.
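As a small illustration of the immutability idea, here is a sketch that uses only the standard library. This is not Pyrsistent’s API, just a stand-in with frozen dataclasses and tuples, and the `User` type is a hypothetical example. The point is the same, though: an “update” returns a new object and leaves the original untouched.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class User:  # hypothetical example type
    name: str
    tags: tuple = ()


original = User(name="ada", tags=("admin",))

# "Updating" builds a new object; the original stays as it was.
updated = replace(original, tags=original.tags + ("dev",))

# Assigning to a field of a frozen dataclass raises an error.
try:
    original.name = "bob"
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

print(original, updated, mutation_blocked)
```

Libraries like Pyrsistent go further by sharing structure between the old and new versions, so that such updates stay cheap even for large collections.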
Conclusion
After these two amazing days, as I said on Twitter, I’m really motivated to learn more about machine learning, to continue (or begin) some projects that I have had in mind and to contribute more to open source projects.
This event was not my first experience attending this kind of programming language conference (my first and only one before was Java Day 2014 in Guadalajara, México), but it has been the one that I have enjoyed the most by far (maybe because I don’t like Java and I love Python). I wish we had this sort of Python groups and PyCon conferences in Mexico. They are a great addition to a software development community: they are really encouraging, motivational and educational. Looking forward to EuroPython 2015 this summer!