2. Python, Python, Python

Do you want to be a data scientist? Do you want to be a data analyst? Do you want to do anything data related? Well, then you need to know Python. That’s what every book, tutorial and comment says.

Coming from web development and knowing JavaScript, Rails (more than Ruby alone) and PHP, I found Python full of zen. It’s terse and elegant. Sometimes it reads like a haiku. Obviously, learning Python for Data Science is different from learning it for the web, and I can’t compare apples with oranges, but that’s how it feels.

Basic structures:

mylist = ["List item 1", 2, 3.14]
mydict = {"Key 1": "Value 1", 2: 3, "pi": 3.14}
mytuple = (1, 2, 3)
myfunction = len
print(myfunction(mylist))

Loops:

my_list = [1, 2, 3]
for number in my_list:
    print(number)

# this is a list comprehension
list_multiplied_by_2 = [number * 2 for number in my_list]

Reading csv files:

# Open a file and print the first 5 lines
# ("with" closes the file automatically when the block ends)
with open('my_file.csv', 'r') as f:
    data = f.read()
rows = data.split('\n')
print(rows[0:5])

# Open the file with the csv module
import csv
with open("my_file.csv", "r", newline="") as f:
    rows_with_header = list(csv.reader(f))
rows_without_header = rows_with_header[1:]

# Split lines by commas
nested_list = []
for row in rows:
    fields = row.split(",")  # avoid naming a variable "list": it shadows the built-in
    nested_list.append(fields)
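
Splitting on commas by hand works until a field contains a comma itself. Here is a minimal sketch of the difference, assuming a small made-up sample (the `sample` string and its contents are hypothetical and stand in for my_file.csv):

```python
import csv
import io

# Hypothetical sample data standing in for my_file.csv
sample = 'name,quote\nAda,"Hello, world"\nBob,Hi\n'

# Naive approach: split each line on commas
naive_rows = [line.split(",") for line in sample.splitlines()]

# csv module: understands quoted fields
csv_rows = list(csv.reader(io.StringIO(sample)))

print(naive_rows[1])  # ['Ada', '"Hello', ' world"']
print(csv_rows[1])    # ['Ada', 'Hello, world']
```

The naive split cuts the quoted field in two, while `csv.reader` keeps it whole, which is one reason to prefer the csv module for real files.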

Regex:

. any single character (except a newline)
^a starts with "a"
a$ ends with "a"
[bB]it "bit" or "Bit"
frog|toad search for "frog" or "toad"
[1-4] search for 1, 2, 3 or 4
[1-4]{3} search for three characters in a row, each one 1, 2, 3 or 4
# import re module and find match
import re
if re.search("bc", "abcde") is not None:
    print("Match found")
# replace
regex = 'Google|Facebook'
new_str = re.sub(regex, 'Company X', 'Google did something')
# returns a list of the substrings that match
numbers = re.findall('[0-9]{2}', '99 tigers, 12 eagles and 20 flies ')
# returns ['99', '12', '20']
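
To convince myself the patterns in the cheat sheet do what I think they do, a quick check like the following helps. It is only a sketch: the test strings are made up for illustration, and `re.search` is used throughout because it looks for a match anywhere in the string.

```python
import re

# Each tuple: (pattern, text, whether we expect a match)
checks = [
    (r"^a", "apple", True),
    (r"^a", "banana", False),
    (r"a$", "banana", True),
    (r"[bB]it", "Bitcoin", True),
    (r"frog|toad", "a toad sat there", True),
    (r"[1-4]{3}", "code 123", True),
    (r"[1-4]{3}", "code 159", False),
]

# True where the pattern is found somewhere in the text
results = [bool(re.search(pattern, text)) for pattern, text, _ in checks]

for (pattern, text, expected), found in zip(checks, results):
    print(pattern, repr(text), found == expected)
```

Every line should end with True, meaning the pattern behaved as the cheat sheet describes.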

Time:

import time
epoch_time = time.time()
human_time = time.gmtime(epoch_time)
human_time.tm_year
human_time.tm_mon
human_time.tm_mday
# etc...

import datetime
today = datetime.datetime.now()
today.year
today.month
today.microsecond
#etc
#------------------------------
#convert from timestamp to datetime object
datetime.datetime.fromtimestamp( epoch_time )
#------------------------------
#add time
diff = datetime.timedelta(weeks = 1, days = 1)
someday = today + diff
#-----------------------------
# date formatting with strftime
dec31 = datetime.datetime(year = 1999, month = 12, day = 31)
human_dec31 = dec31.strftime("%b %d, %Y")
print(human_dec31) # Dec 31, 1999
#-----------------------------
# parsing with strptime
dec31 = datetime.datetime.strptime("Dec 31, 1999", "%b %d, %Y")
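
Putting the pieces together, here is a small sketch that parses a date string, adds a week and formats the result again. It assumes the default English month names for `%b`; the variable names are just for illustration.

```python
import datetime

fmt = "%b %d, %Y"

# Parse the string into a datetime object
parsed = datetime.datetime.strptime("Dec 31, 1999", fmt)

# Add one week with a timedelta
next_week = parsed + datetime.timedelta(weeks=1)

# Format it back into a human-readable string
formatted = next_week.strftime(fmt)
print(formatted)  # Jan 07, 2000
```

Using the same format string for `strptime` and `strftime` keeps parsing and formatting symmetric.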