FlatIron Bootcamp Phase 2.

henry chung
Jun 24, 2022


My name is Henry. Earlier this year, my boss asked what I wanted to do with all my vacation time. I decided to have a “Summer of George” experience: find something to do this summer that I would enjoy. Yes, I could watch TV, eat chips, and be a couch potato all summer, but I decided to go to New York and attend a data science bootcamp at FlatIron instead. I don’t have a technical background, so this bootcamp could be intimidating. If it turns out to be a disaster, I can join George on the couch for the rest of the summer. I’ll be documenting my experience and struggles in each phase and sharing them with you all.

I will split my bootcamp experience into a technical section and a personal journey section. In the technical section, I will share what I learned from doing this phase’s project. My technical skills are at a beginner level, and there are many blog posts that discuss data science concepts much better and more clearly than I can, so I will share links to the posts I read while studying each topic throughout the phase. In the personal journey section, I will share my struggles and experience.

Technical Section.

Concepts that I think are important for doing linear and multiple linear regression in this phase.

The topics we went through while studying regression are:

Concept section: Key concepts about linear regression.

1) Linear regression: hey, we are building a regression model.

This blog post by Abhigyan covers most of the concepts of linear regression.
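As a beginner-level illustration of what "fitting a line" means, here is a minimal sketch of simple linear regression using the closed-form least-squares formulas. The numbers are made up for illustration and are not taken from the linked post or the phase project.

```python
# Toy data (made up for illustration): y is exactly 2*x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Slope = sum of (x - x_mean)*(y - y_mean) divided by sum of (x - x_mean)^2.
slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum(
    (xi - x_mean) ** 2 for xi in x
)

# The least-squares line always passes through the point (x_mean, y_mean).
intercept = y_mean - slope * x_mean

print(f"slope={slope}, intercept={intercept}")  # slope=2.0, intercept=1.0
```

With real data the points will not sit exactly on a line, and the same formulas give the line that minimizes the sum of squared errors.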

2) OLS analysis

This blog post by Nishigandha Sharma explains how to read the OLS report: https://medium.com/analytics-vidhya/ordinary-least-squared-ols-regression-90942a2fdad5

3) More data cleaning: the one-hot encoding technique for categorical variables.

This blog post by Vincent Tabora gives a clear idea of why and how to use one-hot encoding for categorical variables.
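A minimal sketch of one-hot encoding with pandas is below. The tiny data frame and its column names are made up for illustration. Dropping the first category is one common choice for linear regression, because keeping every dummy column together with an intercept makes the columns perfectly collinear.

```python
import pandas as pd

# Hypothetical data: one numeric column and one categorical column.
df = pd.DataFrame({
    "sqft": [1200, 1800, 950, 2400],
    "condition": ["Average", "Good", "Poor", "Good"],
})

# Turn "condition" into 0/1 columns. drop_first=True drops the first category
# in sorted order ("Average"), which becomes the reference level.
encoded = pd.get_dummies(df, columns=["condition"], drop_first=True, dtype=int)

print(encoded)
# Resulting columns: sqft, condition_Good, condition_Poor
```

A row with 0 in both `condition_Good` and `condition_Poor` is an "Average" row, so no information is lost by dropping that column.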

4) Train-test sets and validation.

We can fully grasp the concept of linear regression and the math behind it, but we also need to understand how to evaluate model performance.

This blog post by Valentina Alto explains what training and test sets are and how validation works.

https://medium.com/analytics-vidhya/training-validation-and-test-set-in-machine-learning-7fab555c1080
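One way to get separate training, validation, and test sets is to call scikit-learn's `train_test_split` twice. This is only a sketch with placeholder data, and the 60/20/20 proportions are an assumption rather than a rule.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 rows, 2 features.
X = np.arange(200).reshape(100, 2)
y = np.arange(100)

# First split: hold out 20% of the rows as the final test set.
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Second split: 25% of the remaining 80 rows becomes the validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42
)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

The model is fit on the training set, tuned by comparing scores on the validation set, and scored once on the test set at the very end.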

How-to section: This is the section where we can find the Python code that will do the magic for you.

1) How to implement a regression model project.

This blog post by Kinder Sham provides a clear roadmap for doing a linear regression project with scikit-learn.

2) How to implement a train-test split in Python

Rinu explains how to implement a train-test split in Python in his post.
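As a rough outline of what such a roadmap can look like in code, here is a sketch that goes from data to evaluation with scikit-learn. It is not taken from the linked post: the housing-style data is synthetic, and the column names (`sqft`, `condition`, `price`) are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Step 1: get the data (synthetic here; a real project would load a file).
rng = np.random.default_rng(0)
n = 200
condition = np.tile(["Average", "Good", "Poor"], n // 3 + 1)[:n]
sqft = rng.uniform(500, 3000, size=n)
bump = pd.Series(condition).map(
    {"Average": 0, "Good": 20_000, "Poor": -20_000}
).to_numpy()
price = 50_000 + 150 * sqft + bump + rng.normal(0, 5_000, size=n)
df = pd.DataFrame({"sqft": sqft, "condition": condition, "price": price})

# Step 2: separate predictors from the target, then split into train and test.
X = df[["sqft", "condition"]]
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 3: preprocessing (one-hot encode the categorical column) plus the model.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(drop="first"), ["condition"])],
    remainder="passthrough",  # keep the numeric column as it is
)
model = Pipeline([("preprocess", preprocess), ("regress", LinearRegression())])

# Step 4: fit on the training data only.
model.fit(X_train, y_train)

# Step 5: evaluate on data the model has not seen.
pred = model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, pred))
r2 = r2_score(y_test, pred)
print(f"test RMSE: {rmse:.0f}, test R-squared: {r2:.3f}")
```

Putting the encoder inside the pipeline means it learns the categories from the training rows only, so nothing from the test set leaks into preprocessing.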
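A minimal sketch of a train-test split with scikit-learn follows. The data is synthetic and the 75/25 split is an assumption. The point is to fit on the training rows only and then compare the score on both sets.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: one feature, with y roughly equal to 3*x + 2 plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=100)

# Hold out 25% of the rows; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training rows only.
model = LinearRegression().fit(X_train, y_train)

# R-squared on both sets; a large gap between them suggests overfitting.
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)
print(f"train R-squared: {train_r2:.3f}, test R-squared: {test_r2:.3f}")
```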

My personal thoughts about this phase:

1) Google Medium or Towards Data Science posts.

There are a lot of new ideas, concepts, and Python code challenges that are brand new to me. I need these resources to help me get an idea of how the concepts and the code work.

2) Learn from my classmates.

My classmates are awesome, and the following GitHub links are for this phase project. Their techniques and approaches are much deeper and better than mine. Looking at their work also cleared up my confusion about some basic concepts, for example how to implement a baseline model: I was using every predictor in my baseline model, as opposed to starting with the strongest predictor and then building up and fine-tuning the model. They also set a good example of how to do a data science project in general, regardless of my data science skill level.
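The baseline idea can be sketched roughly as follows. This is my own illustration with synthetic data and hypothetical column names, not my classmates' code, and it assumes "strongest predictor" means the one with the highest absolute correlation with the target.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data (hypothetical columns) where sqft drives most of the price.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "sqft": rng.uniform(500, 3000, size=n),
    "bedrooms": rng.integers(1, 6, size=n),
    "age": rng.uniform(0, 50, size=n),
})
df["price"] = (
    150 * df["sqft"]
    + 5_000 * df["bedrooms"]
    - 500 * df["age"]
    + rng.normal(0, 10_000, size=n)
)

# Rank predictors by absolute correlation with the target.
corr = df.corr()["price"].drop("price").abs().sort_values(ascending=False)
strongest = corr.index[0]

# Baseline model: only the strongest predictor.
baseline = LinearRegression().fit(df[[strongest]], df["price"])
baseline_r2 = baseline.score(df[[strongest]], df["price"])

# Next iteration: add the remaining predictors and compare.
features = ["sqft", "bedrooms", "age"]
full = LinearRegression().fit(df[features], df["price"])
full_r2 = full.score(df[features], df["price"])

print(f"strongest predictor: {strongest}")
print(f"baseline R-squared: {baseline_r2:.3f}, full R-squared: {full_r2:.3f}")
```

R-squared on the training data never goes down when predictors are added, so a real comparison between the baseline and later models should use a validation or test set.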

3) Better time management.

The bootcamp is highly demanding, so I need better time management to get everything done on time. There is no time to fall behind in the upcoming phases, as the topics get deeper and more difficult.
