Exploring the Top 10 Pokémon Most Similar to Charizard Based on Stats

Chukwunedu Onwuka
INST414: Data Science Techniques
5 min read · Mar 11, 2024

Introduction:

In the vast world of Pokémon battles, trainers often seek out Pokémon that complement their team composition or share similar traits with their favorites. This analysis aims to identify the top 10 Pokémon that most closely resemble Charizard in terms of base stats: HP, Attack, Defense, Special Attack, Special Defense, and Speed. By exploring these similarities, trainers can discover potential alternatives or additions to their teams, enhancing their strategic options in battles.

Data Description:

The dataset used for this analysis is from Kaggle.com, and comprises information on 721 Pokémon, sourced from reputable platforms such as pokemon.com, pokemondb, and bulbapedia. It includes essential attributes such as Pokémon number, name, types (Type 1 and Type 2), and basic stats: HP, Attack, Defense, Special Attack, Special Defense, and Speed.

Fields in the Dataset:

  • ID: Unique identifier for each Pokémon.
  • Name: Name of each Pokémon.
  • Type 1 and Type 2: Pokémon types, determining their strengths and weaknesses against different attack types.
  • Total: Sum of all stats, providing a general indication of a Pokémon’s overall strength.
  • HP: Hit points or health, indicating how much damage a Pokémon can withstand.
  • Attack: Base modifier for physical attacks.
  • Defense: Base damage resistance against physical attacks.
  • Sp. Atk: Special attack, the base modifier for special attacks.
  • Sp. Def: Special defense, the base damage resistance against special attacks.
  • Speed: Determines the order of attacks in battle rounds.
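
As a quick illustration of how these fields relate, here is a minimal sketch of a single record, checking that Total is the sum of the six base stats. The Charizard values are typed in by hand for illustration and should be verified against the dataset; the dictionary layout and the STAT_FIELDS name are assumptions, not the dataset's actual schema.

```python
# One Pokémon record as a plain dictionary (hand-typed, illustrative values).
charizard = {
    "Name": "Charizard",
    "Type 1": "Fire",
    "Type 2": "Flying",
    "HP": 78,
    "Attack": 84,
    "Defense": 78,
    "Sp. Atk": 109,
    "Sp. Def": 85,
    "Speed": 100,
}

# The six base stats that Total is described as summing (assumed field names).
STAT_FIELDS = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]

total = sum(charizard[field] for field in STAT_FIELDS)
print(f"{charizard['Name']} total: {total}")
```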

Relevance to the Question

These attributes are vital for determining similarities between Pokémon, particularly in terms of their base stats. By analyzing these attributes, such as HP, Attack, Defense, etc., we can quantify the similarities between different Pokémon and identify those that closely resemble each other in terms of their statistical profiles.

Data Collection:

The dataset used for this analysis was sourced from Kaggle, a renowned platform for hosting datasets and data science competitions. Kaggle hosts a diverse range of datasets across various domains, including the Pokémon dataset used in this study.

Measuring Similarity between Pokémon:

To determine the similarity between Pokémon based on their attributes, we utilize a feature-based approach that compares all base stats, including HP (Hit Points), Attack, Defense, Special Attack, Special Defense, and Speed. These attributes collectively represent a Pokémon’s combat capabilities and overall strength in battles.

Features for Measuring Similarity:

  1. HP (Hit Points): HP represents a Pokémon’s health or vitality and influences its ability to withstand damage during battles. Pokémon with higher HP values tend to have greater endurance and survivability in combat scenarios.
  2. Attack: Attack denotes a Pokémon’s physical strength and effectiveness in executing physical moves or attacks. Pokémon with higher Attack values excel in dealing damage through physical attacks.
  3. Defense: Defense measures a Pokémon’s resilience against physical attacks and reduces the amount of damage it takes from opponents’ physical moves.
  4. Special Attack: Special Attack denotes a Pokémon’s proficiency in executing special moves or attacks that rely on elemental or special abilities. Pokémon with higher Special Attack values excel in unleashing powerful special attacks to deal significant damage to opponents.
  5. Special Defense: Special Defense measures a Pokémon’s resilience against special attacks and reduces the amount of damage it takes from opponents’ special moves.
  6. Speed: Speed reflects a Pokémon’s agility and determines which Pokémon attacks first during battles. Pokémon with higher Speed values tend to act first and gain a tactical advantage in combat.

Similarity Metric:

To quantify the similarity between Pokémon based on their base stats, we employ a modified Euclidean distance metric. The modification, as the steps below describe, is that the distance is computed on normalized stats rather than raw values. The metric calculates the distance between two Pokémon in a six-dimensional space defined by their base stats, so a smaller distance means a more similar pair.

Calculation of Similarity:

  1. Feature Normalization: Before computing the modified Euclidean distance, the base stats of each Pokémon are normalized to ensure that attributes with larger scales do not disproportionately influence the similarity calculation.
  2. Modified Euclidean Distance Calculation: The modified Euclidean distance is then calculated between the normalized feature vectors of each pair of Pokémon, encompassing all base stats attributes. This distance metric captures the overall difference or similarity between Pokémon in terms of their combat-related attributes.
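
The two steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the five rows are typed in by hand and should be checked against the dataset, and the helper names min_max_normalize and euclidean are hypothetical. Because min-max scaling depends on the minimum and maximum of each column, distances computed on this small sample will differ from those computed on all 721 Pokémon.

```python
import math

# Hand-typed sample rows in the order HP, Attack, Defense, Sp. Atk, Sp. Def, Speed
# (illustrative values; verify against the dataset).
pokemon = {
    "Charizard":  [78, 84, 78, 109, 85, 100],
    "Typhlosion": [78, 84, 78, 109, 85, 100],
    "Blastoise":  [79, 83, 100, 85, 105, 78],
    "Snorlax":    [160, 110, 65, 65, 110, 30],
    "Pikachu":    [35, 55, 40, 50, 50, 90],
}


def min_max_normalize(rows):
    """Rescale every stat column to [0, 1] using that column's min and max."""
    columns = list(zip(*rows.values()))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return {
        name: [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(values, lows, highs)
        ]
        for name, values in rows.items()
    }


def euclidean(a, b):
    """Euclidean distance between two equal-length stat vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


normalized = min_max_normalize(pokemon)
target = normalized["Charizard"]

# Distance from Charizard to every other Pokémon in the sample.
distances = {
    name: euclidean(target, vector)
    for name, vector in normalized.items()
    if name != "Charizard"
}

# Smallest distance first: the most similar Pokémon leads the list.
ranking = sorted(distances, key=distances.get)
for name in ranking:
    print(f"{name}: {distances[name]:.3f}")
```

In this sample Typhlosion lands at distance 0 because its hand-typed stats are identical to Charizard's, which is the extreme case of "most similar" under this metric.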

Interpretation of Similarity Scores:

A lower modified Euclidean distance between two Pokémon indicates a higher degree of similarity in their base stats, suggesting that they possess comparable combat capabilities and strengths. Conversely, a higher modified Euclidean distance signifies greater dissimilarity between Pokémon, indicating distinct combat profiles and attribute distributions.

Relevance to Analysis:

By leveraging the modified Euclidean distance metric and considering all base stats attributes, we can effectively measure the similarity between different Pokémon and identify those that closely resemble each other in terms of their combat-related characteristics. This similarity analysis enhances our understanding of Pokémon attributes and facilitates comparisons between different species in the Pokémon universe.

10 Most Similar Pokémon Based On Stats:

Explanation of the Analysis:

Based on the analysis conducted using the Pokémon dataset, the top 10 Pokémon most similar to Charizard were identified based on their base stats, including HP, Attack, Defense, Special Attack, Special Defense, and Speed. The similarity between Pokémon was measured using a modified Euclidean distance metric, which quantifies the dissimilarity or similarity between two Pokémon based on their combat-related attributes.

The analysis aimed to determine which Pokémon exhibit similar combat capabilities and strengths to Charizard, a well-known Fire/Flying-type Pokémon. By considering all base stats attributes, I was able to identify Pokémon that closely resemble Charizard in terms of their overall combat profile. This information is helpful for Pokémon trainers and enthusiasts seeking to explore alternative Pokémon with similar attributes to Charizard for strategic team-building or competitive battling purposes.

Data Cleanup Process:

The cleaning process involved several steps to ensure the dataset was free from inconsistencies, missing values, or other common issues. These were the steps taken:

  1. Extracting relevant features: I extracted key features like ‘Name’, ‘HP’, ‘Attack’, ‘Defense’, ‘Sp. Atk’, ‘Sp. Def’, and ‘Speed’, which represent the base stats of Pokémon used for combat. Other information was ignored.
  2. Normalizing the features: I scaled the extracted features using Min-Max scaling to ensure equal contribution to similarity calculations, preventing features with larger scales from dominating.
  3. Fitting the nearest neighbors model: I used scikit-learn’s NearestNeighbors class to fit a model to the normalized features. This model finds the k nearest neighbors for each Pokémon based on their normalized feature values.
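
The three steps above might look roughly like the following with scikit-learn. This is a sketch under stated assumptions rather than the notebook's actual code: the six rows are hand-typed stand-ins for the dataset (verify the values before relying on them), k is set to 3 instead of 10 because the sample is so small, and the variable names are my own.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import MinMaxScaler

# Step 1: relevant features only. Columns are
# HP, Attack, Defense, Sp. Atk, Sp. Def, Speed (hand-typed sample rows).
names = ["Charizard", "Typhlosion", "Blastoise", "Venusaur", "Snorlax", "Pikachu"]
stats = np.array(
    [
        [78, 84, 78, 109, 85, 100],
        [78, 84, 78, 109, 85, 100],
        [79, 83, 100, 85, 105, 78],
        [80, 82, 83, 100, 100, 80],
        [160, 110, 65, 65, 110, 30],
        [35, 55, 40, 50, 50, 90],
    ],
    dtype=float,
)

# Step 2: Min-Max scaling so every stat lies in [0, 1].
scaled = MinMaxScaler().fit_transform(stats)

# Step 3: fit the nearest-neighbors model. One extra neighbor is requested
# because the query Pokémon is itself part of the fitted data.
k = 3
model = NearestNeighbors(n_neighbors=k + 1, metric="euclidean").fit(scaled)

target_idx = names.index("Charizard")
distances, indices = model.kneighbors(scaled[[target_idx]])

# Drop the query row itself and keep the k closest remaining Pokémon.
neighbors = [
    (names[int(i)], float(d))
    for d, i in zip(distances[0], indices[0])
    if int(i) != target_idx
][:k]

for name, distance in neighbors:
    print(f"{name}: {distance:.3f}")
```

Filtering by index rather than simply discarding the first result matters here: when another Pokémon has exactly the same stats as the query, both sit at distance 0 and either one may be returned first.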

Limitations and Biases:

  1. Limited Features: The analysis only looks at basic stats like HP, Attack, Defense, etc., ignoring other important traits like types, abilities, and move sets. Adding more features could give a better picture of Pokémon similarity.
  2. Simplified Metric: Using a modified Euclidean distance metric to measure similarity might be too simplistic. Other metrics, such as Jaccard or cosine similarity, could offer different perspectives on Pokémon likeness.
  3. Lack of Context: The analysis doesn’t consider contextual info like Pokémon types and movesets, which are crucial in battles. Ignoring these factors could lead to biased similarity assessments.
  4. Subjectivity: Similarity rankings can be subjective, depending on the chosen features and metrics. Different stakeholders might prioritize different attributes, leading to varied interpretations of Pokémon similarity.
  5. Limited Sample: With only 721 Pokémon in the dataset, the analysis does not capture the full diversity of Pokémon. A more extensive dataset covering all species and variations would provide more accurate similarity assessments.
  6. Data Collection Bias: Biases during data collection, like sampling biases or errors, could affect the accuracy of similarity assessments. Addressing these biases is essential for reliable results.
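
To make the point about alternative metrics concrete, here is a small sketch comparing Euclidean distance with cosine similarity. It uses raw, hand-typed stats (verify them against the dataset) rather than the normalized features used in the analysis, purely to keep the example short. The function names are hypothetical.

```python
import math

# HP, Attack, Defense, Sp. Atk, Sp. Def, Speed (hand-typed, illustrative).
charizard = [78, 84, 78, 109, 85, 100]
charmander = [39, 52, 43, 60, 50, 65]
blastoise = [79, 83, 100, 85, 105, 78]


def euclidean_distance(a, b):
    """Straight-line distance: sensitive to how large the stats are."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Angle-based similarity: compares the shape of the stat spread only."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


for name, other in [("Charmander", charmander), ("Blastoise", blastoise)]:
    print(
        f"{name}: euclidean={euclidean_distance(charizard, other):.2f}, "
        f"cosine={cosine_similarity(charizard, other):.4f}"
    )
```

With these values the two metrics disagree: Blastoise is closer to Charizard by Euclidean distance because its stats are of similar size, while Charmander scores higher on cosine similarity because its stats follow nearly the same proportions at a smaller scale. Which answer is "right" depends on whether overall strength or stat distribution matters more to the trainer.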

GitHub link: https://github.com/ChukwuneduOnwuka/Pokemon-Similarities/blob/main/Similarities.ipynb
