Do Simple Web Scraping with Python, Part 2: Installation

Muhammad Al Fatih
4 min read · Apr 4, 2022

--

Installation of Python, PyCharm, and the GitHub Repository

We’ll use Python as our programming language in this project because an open-source web scraping package called “Beautiful Soup 4” is available for it, which lets us extract information from an HTML document. For the IDE, we’ll use PyCharm Community Edition, which makes coding in Python easier and can connect to a GitHub repository.

Python installation:

  1. Visit the official Python website at python.org.
  2. Click the Downloads tab and download the latest version of Python.
  3. After the download finishes, run the installer.
  4. When the installation is finished, you can check it by typing python in the search menu. The result shows the Python version you installed.
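
If you prefer to confirm the installation from code rather than the search menu, a minimal sketch like the one below can be saved to a file and run with the new interpreter. The function names format_version and is_python3 are made up for this illustration; only the standard sys module is used.

```python
import sys


def format_version(version_info=sys.version_info):
    """Return 'major.minor.micro' for the given version tuple."""
    major, minor, micro = version_info[:3]
    return f"{major}.{minor}.{micro}"


def is_python3(version_info=sys.version_info):
    """Return True when the interpreter is Python 3 or newer."""
    return version_info[0] >= 3


if __name__ == "__main__":
    # Prints the version of the interpreter that runs this script.
    print("Python version:", format_version())
    print("Python 3 or newer:", is_python3())
```

If the script prints a version number, Python is installed and reachable.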

PyCharm is an integrated development environment (IDE) used in computer programming, specifically for the Python programming language. It is developed by the Czech company JetBrains (formerly known as IntelliJ). It provides code analysis, a graphical debugger, an integrated unit tester, integration with version control systems (VCSes), and supports web development with Django as well as data science with Anaconda.

— Wikipedia

PyCharm Installation:

  1. Visit the official website of the PyCharm developer at jetbrains.com.
  2. Click the Developer Tools tab and select PyCharm.
  3. Click the Download button, then download the Community Edition.
  4. After the download finishes, run the installer.
  5. Open PyCharm, then click Get from VCS.
  6. If a warning says that Git is not installed, click Download and Install to download Git directly.
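
To double-check that Git is now available on your machine, you could run a short script such as this one. It is only a sketch: find_git and git_version are hypothetical helper names, and it assumes that Git, if installed, is reachable through your PATH.

```python
import shutil
import subprocess


def find_git():
    """Return the full path of the git executable, or None if it is not on PATH."""
    return shutil.which("git")


def git_version():
    """Return the output of 'git --version', or None when Git is unavailable."""
    path = find_git()
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = git_version()
    if version is None:
        print("Git was not found. Install it from PyCharm or from git-scm.com.")
    else:
        print("Found:", version)
```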

GitHub, Inc. is a provider of Internet hosting for software development and version control using Git. It offers the distributed version control and source code management (SCM) functionality of Git, plus its own features. It provides access control and several collaboration features such as bug tracking, feature requests, task management, continuous integration and wikis for every project.

— Wikipedia

GitHub Set-Up:

  1. Visit the official GitHub website at github.com.
  2. If you haven't created a GitHub account yet, sign up and finish the registration (including e-mail verification).
  3. On the first screen after you log in, select Create Organization before setting up your repository. Then set up the organization as you like; it will be the brand under which you store the products you make.

GitHub Repository Set-Up:

  1. After you finish setting up your organization, you’ll be directed to your organization page. Click the Create a new repository button.
  2. Enter any Repository name you like, for example “Indonesian Earthquake Info”. Set the access to Public, then check Add a README file, Add .gitignore, and Choose a license. For the .gitignore option select Python, and for the license select GNU General Public License.
  3. Click Create repository.

Connect PyCharm to Your GitHub Repository:

  1. After you create the repository, you’ll be directed to its page. Click the green Code button, then copy the HTTPS link that is generated. We’ll use this link to clone the project from the GitHub repository.
  2. Open PyCharm, click Get from VCS, then paste the link into the URL text box. If you like, you can change the directory before you clone the project.
  3. When you’ve finished the setup, click the Clone button.
  4. After the project has been cloned, open the .gitignore file and type .idea on the first line. This prevents the .idea folder from being pushed to the repository, since it holds PyCharm’s local project settings rather than project code.
  5. Next, we’ll connect our project to the GitHub repository. Select the Commit tab on the left side of the PyCharm window, check the files to be committed, type a commit message, then click Commit and Push.
  6. If a window opens, click Push. When a warning box appears, select Use Token…, then click Generate.
  7. You’ll be directed to the GitHub website. Re-enter your password, then scroll down and click Generate Token. Copy the generated token, paste it into the token text box in PyCharm, and click Login.
  8. Now you’ve connected your project to your public GitHub repository!
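
The .gitignore edit from the steps above can also be done with a few lines of Python, which is handy if you repeat this setup for several projects. This is a minimal sketch, not part of PyCharm or Git: ensure_ignored is a hypothetical helper, and it assumes the .gitignore file sits in the project root.

```python
from pathlib import Path


def ensure_ignored(gitignore_path, entry=".idea"):
    """Put `entry` on the first line of .gitignore unless it is already listed.

    Returns True if the file was changed, False if the entry was present.
    """
    path = Path(gitignore_path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    # Treat ".idea" and ".idea/" as the same entry.
    wanted = entry.rstrip("/")
    if any(line.strip().rstrip("/") == wanted for line in lines):
        return False
    path.write_text("\n".join([entry, *lines]) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    changed = ensure_ignored(".gitignore")
    print("Added .idea to .gitignore" if changed else ".idea is already ignored")
```

Running the script a second time leaves the file untouched, so it is safe to call it repeatedly.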

Congrats! You’ve installed all the tools we need to begin the project. Next, we’ll set up a virtual environment and install the Beautiful Soup 4 package in our project!

Continue to the Next Article
