Install scikit-learn
Hopefully a happy first step to machine learning
Entering the world of machine learning does not sound easy: there are about 40 algorithms to choose from (Kharwal, 2021), a deprecated Python package named sklearn that exists only to prevent malicious activity (sklearn, 2023) while the real package has a different name (scikit-learn, 2023), and tutorials that sometimes do not give detailed information on how to install scikit-learn from PyPI (Brownlee, 2023). This story presents solid and hopefully clear steps for installing scikit-learn.
sklearn & scikit-learn
The terms sklearn and scikit-learn actually refer to the same package, but it is recommended to install the latter through pip (Myrianthous, 2021), since the latter is then imported as the former (Rakib, 2022). There was also a reminder, in the last month of 2022, that using sklearn rather than scikit-learn with pip would start failing in less than a week (scikit-learn, 2022). The sklearn name on PyPI is deprecated to prevent malicious actors from using it, and in October 2022 it was still downloaded about one fifth as often as scikit-learn (sklearn-pypi-package, 2023).
Installed packages
To see which Python packages are installed on your computer, use pip list as follows.
D:\python>pip list
Package Version
------------------------- ------------
anyio 4.0.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.0
async-lru 2.0.4
attrs 23.1.0
Babel 2.13.0
backcall 0.2.0
beautifulsoup4 4.12.2
bleach 6.1.0
certifi 2023.7.22
cffi 1.16.0
charset-normalizer 3.3.0
colorama 0.4.6
comm 0.1.4
contourpy 1.1.1
cycler 0.12.0
debugpy 1.8.0
decorator 5.1.1
defusedxml 0.7.1
executing 2.0.0
fastjsonschema 2.18.1
fonttools 4.43.0
fqdn 1.5.1
idna 3.4
ipykernel 6.25.2
ipython 8.16.1
isoduration 20.11.0
jedi 0.19.1
Jinja2 3.1.2
json5 0.9.14
jsonpointer 2.4
jsonschema 4.19.1
jsonschema-specifications 2023.7.1
jupyter_client 8.3.1
jupyter_core 5.3.2
jupyter-events 0.7.0
jupyter-lsp 2.2.0
jupyter_server 2.7.3
jupyter_server_terminals 0.4.4
jupyterlab 4.0.6
jupyterlab-pygments 0.2.2
jupyterlab_server 2.25.0
kiwisolver 1.4.5
MarkupSafe 2.1.3
matplotlib 3.8.0
matplotlib-inline 0.1.6
mistune 3.0.2
mpmath 1.3.0
nbclient 0.8.0
nbconvert 7.9.2
nbformat 5.9.2
nest-asyncio 1.5.8
notebook 7.0.4
notebook_shim 0.2.3
numpy 1.26.0
overrides 7.4.0
packaging 23.2
pandas 2.1.3
pandocfilters 1.5.0
parso 0.8.3
pickleshare 0.7.5
Pillow 10.0.1
pip 23.3.1
platformdirs 3.11.0
prometheus-client 0.17.1
prompt-toolkit 3.0.39
psutil 5.9.5
pure-eval 0.2.2
pycparser 2.21
Pygments 2.16.1
pyparsing 3.1.1
python-dateutil 2.8.2
python-json-logger 2.0.7
pytz 2023.3.post1
pywin32 306
pywinpty 2.0.12
PyYAML 6.0.1
pyzmq 25.1.1
referencing 0.30.2
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rpds-py 0.10.4
scipy 1.11.3
Send2Trash 1.8.2
setuptools 68.2.2
six 1.16.0
sniffio 1.3.0
soupsieve 2.5
stack-data 0.6.3
sympy 1.12
terminado 0.17.1
tinycss2 1.2.1
tornado 6.3.3
traitlets 5.11.2
types-python-dateutil 2.8.19.14
tzdata 2023.3
uri-template 1.3.0
urllib3 2.0.6
wcwidth 0.2.8
webcolors 1.13
webencodings 0.5.1
websocket-client 1.6.3
The list shows that the scikit-learn package has not been installed yet. Use pip install scikit-learn to install the package.
D:\python>pip install scikit-learn
Collecting scikit-learn
Downloading scikit_learn-1.3.2-cp312-cp312-win_amd64.whl.metadata (11 kB)
Requirement already satisfied: numpy<2.0,>=1.17.3 in c:\users\sparisoma viridi\appdata\local\programs\python\python312\lib\site-packages (from scikit-learn) (1.26.0)
Requirement already satisfied: scipy>=1.5.0 in c:\users\sparisoma viridi\appdata\local\programs\python\python312\lib\site-packages (from scikit-learn) (1.11.3)
Collecting joblib>=1.1.1 (from scikit-learn)
Downloading joblib-1.3.2-py3-none-any.whl.metadata (5.4 kB)
Collecting threadpoolctl>=2.0.0 (from scikit-learn)
Downloading threadpoolctl-3.2.0-py3-none-any.whl.metadata (10.0 kB)
Downloading scikit_learn-1.3.2-cp312-cp312-win_amd64.whl (9.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.1/9.1 MB 834.1 kB/s eta 0:00:00
Downloading joblib-1.3.2-py3-none-any.whl (302 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 302.2/302.2 kB 668.1 kB/s eta 0:00:00
Downloading threadpoolctl-3.2.0-py3-none-any.whl (15 kB)
Installing collected packages: threadpoolctl, joblib, scikit-learn
Successfully installed joblib-1.3.2 scikit-learn-1.3.2 threadpoolctl-3.2.0
Three packages are installed: joblib-1.3.2, scikit-learn-1.3.2, and threadpoolctl-3.2.0.
D:\python>pip show scikit-learn
Name: scikit-learn
Version: 1.3.2
Summary: A set of python modules for machine learning and data mining
Home-page: http://scikit-learn.org
Author:
Author-email:
License: new BSD
Location: C:\Users\Sparisoma Viridi\AppData\Local\Programs\Python\Python312\Lib\site-packages
Requires: joblib, numpy, scipy, threadpoolctl
Required-by:
Now the command pip show scikit-learn shows that the package has been installed.
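Similar information can be read from inside Python with the standard importlib.metadata module. This is a minimal sketch rather than a replacement for pip show; the function name describe is hypothetical and chosen only for this example.

```python
from importlib.metadata import PackageNotFoundError, requires, version


def describe(dist_name):
    """Return (version, requirements) for a distribution, or None if it is not installed."""
    try:
        # requires() returns None when a distribution declares no requirements
        return version(dist_name), requires(dist_name) or []
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    info = describe("scikit-learn")
    if info is None:
        print("scikit-learn is not installed")
    else:
        installed_version, requirements = info
        print("Version:", installed_version)
        for requirement in requirements:
            print("Requires:", requirement)
```

Note that the requirement list here may be longer than the Requires line of pip show, because it can also contain optional extras with their markers.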
Check version
Using Python, we can check the versions of the installed packages with the following lines of code
# https://machinelearningmastery.com/machine-learning-in-python-step-by-step/
# Check the versions of libraries
# Python version
import sys
print('Python: {}'.format(sys.version))
# scipy
import scipy
print('scipy: {}'.format(scipy.__version__))
# numpy
import numpy
print('numpy: {}'.format(numpy.__version__))
# matplotlib
import matplotlib
print('matplotlib: {}'.format(matplotlib.__version__))
# pandas
import pandas
print('pandas: {}'.format(pandas.__version__))
# scikit-learn
import sklearn
print('sklearn: {}'.format(sklearn.__version__))
that produces
D:\python>py check_version.py
Python: 3.12.0 (tags/v3.12.0:0fb18b0, Oct 2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)]
scipy: 1.11.3
numpy: 1.26.0
matplotlib: 3.8.0
pandas: 2.1.3
sklearn: 1.3.2
as the output.
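The script above stops with an error at the first library that is missing. A variation that keeps going and reports missing libraries could look like the sketch below; the names MODULES and collect_versions are my own assumptions, not part of the cited tutorial.

```python
import importlib
import sys

# Import names of the libraries checked in this story
MODULES = ["scipy", "numpy", "matplotlib", "pandas", "sklearn"]


def collect_versions(module_names):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {"Python": sys.version.split()[0]}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The library is not installed (or failed to import)
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, found in collect_versions(MODULES).items():
        print("{}: {}".format(name, found if found is not None else "not installed"))
```

Returning a dictionary instead of printing directly makes it easier to compare the results from two computers later on.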
Other computer
Since I also have another computer, the following is the installation process on that one, which differs from the one shown above.
PS D:\python> pip install scikit-learn
Defaulting to user installation because normal site-packages is not writeable
Collecting scikit-learn
Downloading scikit_learn-1.3.2-cp311-cp311-win_amd64.whl (9.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.2/9.2 MB 635.7 kB/s eta 0:00:00
Requirement already satisfied: numpy<2.0,>=1.17.3 in c:\program files\python311\lib\site-packages (from scikit-learn) (1.24.2)
Requirement already satisfied: scipy>=1.5.0 in c:\users\62812\appdata\roaming\python\python311\site-packages (from scikit-learn) (1.11.3)
Collecting joblib>=1.1.1
Downloading joblib-1.3.2-py3-none-any.whl (302 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 302.2/302.2 kB 425.0 kB/s eta 0:00:00
Collecting threadpoolctl>=2.0.0
Downloading threadpoolctl-3.2.0-py3-none-any.whl (15 kB)
Installing collected packages: threadpoolctl, joblib, scikit-learn
Successfully installed joblib-1.3.2 scikit-learn-1.3.2 threadpoolctl-3.2.0
[notice] A new release of pip is available: 23.0.1 -> 23.3.1
[notice] To update, run: python.exe -m pip install --upgrade pip
And from check_version.py
PS D:\python\src\import\external\sklearn> py .\check_version.py
Python: 3.11.2 (tags/v3.11.2:878ead1, Feb 7 2023, 16:38:35) [MSC v.1934 64 bit (AMD64)]
scipy: 1.11.3
numpy: 1.24.2
matplotlib: 3.7.0
pandas: 1.5.3
sklearn: 1.3.2
is the result. I hope the examples will work on both computers.
The scikit-learn package and its dependencies have been installed successfully. The next step is to use it to study machine learning, e.g. logistic regression (Johari, 2017).
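As a quick check that the installation really works, a very small logistic regression run can be tried. This is only a sketch under my own assumptions: the iris dataset bundled with scikit-learn, a 25% test split, and the hypothetical function name smoke_test. It is not taken from the cited reference.

```python
def smoke_test():
    """Fit logistic regression on the iris data; return test accuracy, or None without scikit-learn."""
    try:
        from sklearn.datasets import load_iris
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
    except ImportError:
        # scikit-learn is not installed in this environment
        return None

    X, y = load_iris(return_X_y=True)
    # Hold out a quarter of the samples for testing; fixed seed for repeatability
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    # A larger max_iter gives the default solver room to converge on this data
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


if __name__ == "__main__":
    accuracy = smoke_test()
    if accuracy is None:
        print("scikit-learn is not installed")
    else:
        print("Test accuracy: {:.3f}".format(accuracy))
```

If a number between 0 and 1 is printed, then the package and its dependencies can be imported and used.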