What I Learnt This Week

Mohamed Mansour
2 min read · Aug 29, 2016

To improve my English, I have decided to write a weekly article about what I learnt during the week, and I welcome your feedback.
Let’s go:

This week I finished a course on Coursera called

Python Access Web Data
Every website is a collection of data (strings, integers, photos, videos, etc.).
So how can we use this data?
One technique for doing so is called

Web Scraping
It’s a technique for extracting information from websites.
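
As a small illustration of the idea, here is a minimal sketch that pulls the links out of a piece of HTML using only the standard library. The HTML string and the `LinkCollector` class name are made up for this example; a real scraper would first download the page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Made-up page content standing in for a downloaded web page
html = '<html><body><a href="https://example.com/a">A</a> <a href="/b">B</a></body></html>'

collector = LinkCollector()
collector.feed(html)
links = collector.links
print(links)
```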

Two data formats come up often on websites:
1) XML “Extensible Markup Language”
It’s a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.

2) JSON “JavaScript Object Notation”
It’s a lightweight data-interchange format that is easy for humans to read and write. It’s based on a subset of the JavaScript programming language.

So, how can you use this in Python?

You just need to import the right libraries.

Then create the required objects from their classes and use them.

For example:

>>> import json
>>> json_string = '{"first_name": "Guido", "last_name": "Rossum"}'
>>> parsed_json = json.loads(json_string)
You can find more about JSON in this documentation.
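
To show what you can do with the parsed result, here is a minimal sketch that extends the example above. The extra `languages` field is made up for illustration.

```python
import json

# Sample data based on the example above; the "languages" list is hypothetical
json_string = '{"first_name": "Guido", "last_name": "Rossum", "languages": ["Python", "ABC"]}'

# json.loads turns a JSON object into a Python dict
parsed = json.loads(json_string)
full_name = parsed["first_name"] + " " + parsed["last_name"]

# A JSON array becomes a Python list
language_count = len(parsed["languages"])

# json.dumps goes the other way: Python object back to JSON text
round_trip = json.dumps(parsed, sort_keys=True)

print(full_name)
print(language_count)
```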

XML works similarly: you start from an XML file or web page.

For example:
>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('country_data.xml')
>>> root = tree.getroot()
You can find more in this documentation.
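
If you don’t have an XML file at hand, you can parse XML from a string instead. This is a minimal sketch; the XML content below is made up and only stands in for what a file like country_data.xml might contain.

```python
import xml.etree.ElementTree as ET

# Made-up XML standing in for the contents of a real file
xml_text = """<data>
    <country name="Alpha"><rank>1</rank></country>
    <country name="Beta"><rank>2</rank></country>
</data>"""

# ET.fromstring returns the root element directly
root = ET.fromstring(xml_text)

# findall returns the direct children with the given tag
names = [country.get("name") for country in root.findall("country")]

# find returns the first matching child; .text is its text content
ranks = {country.get("name"): int(country.find("rank").text)
         for country in root.findall("country")}

print(names)
print(ranks)
```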

But before you start scraping any website, you need to make sure that the website is scrapable.
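
One common signal is the site’s robots.txt file, though this is my assumption about how to check and not the only way (a site’s terms of use matter too). Here is a minimal sketch with the standard library; the rules and URLs below are made up, and a real robots.txt would be fetched from the site.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt rules; a real file lives at <site>/robots.txt
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(robots_lines)  # parse rules from a list of lines, no network needed

# can_fetch(useragent, url) tells you whether the rules allow that URL
can_fetch_public = parser.can_fetch("*", "https://example.com/articles/1")
can_fetch_private = parser.can_fetch("*", "https://example.com/private/data")

print(can_fetch_public)
print(can_fetch_private)
```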

The course covered another topic:

Regular Expressions
When I started my journey as a Python developer, I heard about regular expressions and thought they were complex.

So I stayed away from them.

But I discovered that they are simple, and there are a lot of websites that explain them and help you understand how to use them.
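
To give a taste of how simple they can be, here is a minimal sketch using Python’s `re` module. The sample text and the patterns are my own made-up examples.

```python
import re

# Made-up sample text
text = "Contact alice@example.com or bob@example.org before 2016-08-29."

# \S+ means "one or more non-space characters", so this finds
# anything that looks like something@something
emails = re.findall(r"\S+@\S+", text)

# Parentheses capture groups: here year, month and day
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
year, month, day = match.groups()

print(emails)
print(year, month, day)
```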

All in all, the course was good and I learnt a lot of subjects and techniques.

I recommend this course to anyone who wants to start web development.


Mohamed Mansour

Software Engineer working with highly available B2B systems, mainly in the Fintech and payments industry. (Python, Django, PostgreSQL, Nodejs, MongoDB)