FireDucks: Diving into API Compatibility with Pandas
In earlier posts I introduced FireDucks and its approach to DataFrame manipulation (Blog Article #1), and benchmarked it against Polars, where FireDucks came out ahead on performance (Blog Article #2). One feature has drawn particular interest: its claimed high compatibility with the Pandas API.
FireDucks is a relatively new DataFrame library that focuses on performance. For many users its main appeal is that existing Pandas code can run on it with little or no change, which makes adoption straightforward.
This post looks at where the FireDucks API matches Pandas, where it differs, and what that means for data manipulation workflows.
Similarities for a Familiar Experience:
FireDucks aims to provide a familiar environment for Pandas users by incorporating:
- Core Data Structures: It utilizes Series and DataFrames, mimicking the fundamental building blocks of Pandas data manipulation.
- Common Operations: Almost all the methods in Pandas have counterparts in FireDucks which can be used without any change in syntax. These include:
  - groupby
  - filtering with boolean conditions
  - sort_values
  - head
  - tail
  - merge
  - join
  - at
  - iat
  - loc
  - and many more
There are two ways to run Pandas code under FireDucks:
1. Run the unmodified Pandas script through the import hook:
python -mfireducks.imhook sample.py
2. Change the import statement; no import hook is required in this case:
Use ‘import fireducks.pandas as pd’ instead of ‘import pandas as pd’
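As a minimal sketch of the second option: the only line that changes is the import. The try/except fallback to plain Pandas and the `BACKEND` variable are illustrative additions (not part of FireDucks), included so the script still runs on a machine where FireDucks is not installed.

```python
# Prefer FireDucks when it is installed; otherwise fall back to plain Pandas.
# The fallback and the BACKEND variable are illustrative, not a FireDucks feature.
try:
    import fireducks.pandas as pd
    BACKEND = "fireducks"
except ImportError:
    import pandas as pd
    BACKEND = "pandas"

data = {'category': ['A', 'A', 'B', 'B', 'C'],
        'value': [10, 15, 20, 25, 30]}
df = pd.DataFrame(data)

# Everything below the import is ordinary Pandas code and stays unchanged.
mean_by_category = df.groupby('category')['value'].mean().to_dict()
print(BACKEND, mean_by_category)
```

Keeping the choice of library in one place like this makes it easy to switch back to Pandas if a script hits a behaviour difference.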
Below is a comparison of outputs from Pandas and FireDucks. For FireDucks, the Pandas code was executed as is, using the import hook.
#Sample data
data = {'category': ['A', 'A', 'B', 'B', 'C'],
'value': [10, 15, 20, 25, 30]}
1. #groupby example
#Pandas code
import pandas as pd
#create dataframe from data
df = pd.DataFrame(data)
#category wise mean and standard deviation calculation
grouped_by_category_stats = df.groupby('category').agg({'value': ['mean', 'std']})
#print the result so the two outputs can be compared
print(grouped_by_category_stats)
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
2. #filter example
#Pandas code
import pandas as pd
#create dataframe from data
df = pd.DataFrame(data)
#filtering rows where ‘category’ is ‘B’ and ‘value’ is greater than 20
#(the boolean mask is used to index df, so only matching rows are printed)
print (df[(df['category'] == 'B') & (df['value'] > 20)])
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
3. #sorting example
#Pandas code
import pandas as pd
#create dataframe from data
df = pd.DataFrame(data)
#sorting by ‘value’ column
print (df.sort_values(by='value'))
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
4. #head example
#Pandas code
import pandas as pd
#create dataframe from data
df = pd.DataFrame(data)
#show first three rows
print (df.head(3))
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
5. #tail example
#Pandas code
import pandas as pd
#create dataframe from data
df = pd.DataFrame(data)
#show last three rows
print (df.tail(3))
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
6. #merge example
#Pandas code
import pandas as pd
#Dataframe with ‘ID’ and ‘Name’
df1 = pd.DataFrame({
'ID': [1, 2, 3],
'Name': ['Alice', 'Bob', 'Charlie']
})
#Dataframe with ‘ID’ and ‘Age’
df2 = pd.DataFrame({
'ID': [2, 3, 4],
'Age': [25, 30, 22]
})
# Merging the DataFrames on the 'ID' column
merged_df = pd.merge(df1, df2, on='ID')
# Displaying the merged DataFrame
print("Merged DataFrame:")
print(merged_df)
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
7. #join example
#Pandas code
import pandas as pd
# Creating two sample DataFrames with common indices
df1 = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
}, index=[1, 2, 3])
df2 = pd.DataFrame({
'Age': [25, 30, 22],
}, index=[2, 3, 4])
# Performing an inner join on the indices
joined_df = df1.join(df2, how='inner')
# Displaying the joined DataFrame
print("Joined DataFrame (inner join):")
print(joined_df)
# Performing an outer join on the indices
joined_df = df1.join(df2, how='outer')
# Displaying the joined DataFrame
print("Joined DataFrame (outer join):")
print(joined_df)
# Performing a left join on the indices
joined_df = df1.join(df2, how='left')
# Displaying the joined DataFrame
print("Joined DataFrame (left join):")
print(joined_df)
# Performing a right join on the indices
joined_df = df1.join(df2, how='right')
# Displaying the joined DataFrame
print("Joined DataFrame (right join):")
print(joined_df)
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
8. #at example
#Pandas code
import pandas as pd
#Dataframe
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 22],
}, index=[0, 1, 2])
print("Retrieved element: ", df.at[1, 'Name'])
# Modifying a specific element
df.at[1, 'Name'] = 'Carter'
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
9. #iat example
#Pandas code
import pandas as pd
#Dataframe
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 22],
}, index=[0, 1, 2])
print("Retrieved element: ", df.iat[1, 0])
# Modifying a specific element
df.iat[1, 0] = 'Carter'
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
10. #loc example
#Pandas code
import pandas as pd
#Dataframe
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['max_speed', 'shield'])
print("Slice with labels for row and single label for column:\n", df.loc['cobra':'viper', 'max_speed'])
print("Using boolean list:\n", df.loc[[False, False, True]])
#FireDucks output with following cmdline:
python -mfireducks.imhook sample.py
From the analysis above, it’s evident that FireDucks executes code written for Pandas seamlessly, with matching results across various scenarios. Although a minor variance arises in cases of outer and right joins, where FireDucks outputs “None” instead of “NaN” in the column ‘Name’, its overall compatibility ensures a smooth transition for users. This level of alignment significantly minimizes the need for code adjustments, thereby reducing the learning curve when migrating from Pandas.
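The “None” versus “NaN” difference matters mainly for code that tests for missing values by comparing against a specific marker. A hedged sketch of a check that should not depend on the marker: it relies on the standard Pandas `isna()` and `fillna()` methods, and assumes they behave the same way under FireDucks, which the compatibility results above suggest but which this sketch does not verify.

```python
import pandas as pd  # or: import fireducks.pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie']}, index=[1, 2, 3])
df2 = pd.DataFrame({'Age': [25, 30, 22]}, index=[2, 3, 4])
joined = df1.join(df2, how='outer')

# isna() treats both None and NaN as missing, so this check does not
# depend on which marker the library shows for the unmatched row.
missing_name = joined['Name'].isna().tolist()

# Replace the missing marker explicitly when a uniform display is needed.
names = joined['Name'].fillna('unknown').tolist()

print(missing_name)
print(names)
```

Index 4 exists only in `df2`, so it is the one row whose ‘Name’ is missing after the outer join.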
Advancing Beyond Compatibility: The Distinctive Offerings of FireDucks:
In addition to its compatibility efforts, FireDucks introduces a range of unique offerings that elevate its capabilities. Among these standout features are efficient memory handling and optimized execution models, setting FireDucks apart from its counterpart, Pandas.
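Performance claims are easiest to judge on your own workload. The following is a small, hypothetical timing harness (the function name `time_groupby`, the synthetic data, and the row count are all assumptions for illustration, not anything provided by FireDucks). Run it once with the Pandas import and once with the FireDucks import to compare.

```python
import time

import pandas as pd  # swap for: import fireducks.pandas as pd


def time_groupby(n_rows, repeats=3):
    """Return (best wall-clock seconds, result dict) for a groupby-mean."""
    df = pd.DataFrame({
        'category': [i % 100 for i in range(n_rows)],
        'value': [float(i) for i in range(n_rows)],
    })
    best = float('inf')
    result_dict = {}
    for _ in range(repeats):
        start = time.perf_counter()
        result = df.groupby('category')['value'].mean()
        # Converting to a dict forces the result to be fully computed,
        # which matters if a library defers work until it is needed.
        result_dict = result.to_dict()
        best = min(best, time.perf_counter() - start)
    return best, result_dict


best_seconds, means = time_groupby(10_000)
print(f"best of 3: {best_seconds:.6f} s, groups: {len(means)}")
```

Taking the best of several runs reduces noise from one-off costs such as caching; for a meaningful comparison, use a row count close to your real data.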
FireDucks in Action: Potential Applications in Finance:
- High-Frequency Trading (HFT): HFT is a strategy that relies on very fast decision-making and execution.
  - FireDucks’ fast filtering and aggregation are well suited to analysing real-time market data feeds.
  - Traders can use these capabilities to extract insights from complex datasets as the data arrives, which helps them react to market movements and execute trades with minimal latency.
- Portfolio Optimization and Risk Management: Effective portfolio management depends on analyzing historical data quickly and assessing the associated risks.
  - FireDucks can quickly analyze large amounts of historical price data and calculate risk metrics for various asset allocations.
  - It also supports efficient backtesting of portfolio strategies, reducing compute time and speeding up decisions for investors and traders.
- Fraud Detection:
  - Analyze large volumes of transaction data to identify anomalies that might indicate fraudulent activity.
  - FireDucks’ speed can enable real-time fraud detection, helping to prevent financial losses.
- Algorithmic Trading Strategy Development:
  - Rapidly prototype and test various algorithmic trading strategies on historical data.
  - Faster iteration helps refine and optimize trading algorithms more quickly.
- Quantitative Analysis (Quant) Workflows:
  - Perform complex calculations and feature engineering on financial data sets with ease.
  - Data manipulation tasks can be executed efficiently, allowing quants to focus on extracting insights and building robust models.
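To make the fraud detection item concrete, here is a minimal sketch of anomaly flagging with ordinary Pandas calls. The transaction data, column names, and the threshold of 2 standard deviations are hypothetical, and a per-account z-score is only a simple illustrative rule, not a production fraud model.

```python
import pandas as pd  # or: import fireducks.pandas as pd

# Hypothetical transactions: one unusually large amount on account 'a1'.
tx = pd.DataFrame({
    'account': ['a1'] * 8 + ['a2'] * 8,
    'amount': [20, 22, 19, 21, 20, 21, 19, 500,
               100, 105, 98, 102, 99, 101, 103, 97],
})

# Per-account mean and sample standard deviation, aligned back to each row.
grouped = tx.groupby('account')['amount']
tx['zscore'] = (tx['amount'] - grouped.transform('mean')) / grouped.transform('std')

# Flag amounts more than THRESHOLD standard deviations above the account mean.
THRESHOLD = 2.0
flagged = tx[tx['zscore'] > THRESHOLD]
print(flagged)
```

Because the code uses only groupby, transform, and boolean filtering, it is the kind of script the import hook is meant to run unchanged.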
Conclusion:
FireDucks offers a compelling option for data manipulation, particularly for users familiar with Pandas due to its API compatibility. However, it’s crucial to go beyond just compatibility and evaluate its unique features, performance characteristics, and suitability for your specific needs. By understanding both its similarities and differences with Pandas, you can make an informed decision about whether FireDucks is the right tool for your data science projects.
In short, FireDucks is worth considering for your next project: it lets you keep familiar Pandas code while aiming for faster execution.
References:
- FireDucks Documentation: https://github.com/fireducks-dev
- FireDucks PyPI Page: https://pypi.org/project/fireducks/