April, an amazing month for Data Science

The Deep Hub Editors
Published in The Deep Hub, sent as a newsletter
Apr 15, 2024

The past few weeks have been outstanding for Artificial Intelligence at The Deep Hub! We extend our congratulations to our writers for your amazing contributions and to our readers for your constant support.

Our writers have submitted a diverse range of stories on AI, from stock optimization in the investment sector to the influence of ML in the aviation industry, and even discussions about data and black holes!

Here are our five picks for the week:

1. Engineering an Advanced ELT Pipeline for Optimizing Stock Portfolios — Ty Rawls

In this article, Ty Rawls walks us through the detailed process of constructing an ELT (Extract, Load, Transform) pipeline for optimizing stock portfolios.

Definitely a must-read about ELT and stock portfolio optimization.

ELT stock portfolio optimization pipeline — Image created by Ty Rawls

Rawls explains the architecture of his ELT pipeline, which can use either local or cloud storage (specifically AWS) to store and manage stock data.

He details the data extraction process using Python and financial APIs such as Financial Modeling Prep (FMP) and Yahoo Finance (YFinance).

In addition, the author talks about data loading into PostgreSQL databases, data quality and integrity checks, and the transformation of daily price data into different time intervals using dbt (data build tool).
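To make the resampling step concrete, here is a minimal sketch of rolling daily bars up into weekly ones. The article does this with dbt models; the plain-Python version below is only an illustration of the same idea, and the function name `to_weekly` and all price values are invented for this example.

```python
from datetime import date

# Hypothetical daily bars: (day, open, high, low, close). Values are made up.
daily_bars = [
    (date(2024, 4, 1), 100.0, 103.0, 99.0, 102.0),
    (date(2024, 4, 2), 102.0, 105.0, 101.0, 104.0),
    (date(2024, 4, 3), 104.0, 104.5, 100.0, 101.0),
    (date(2024, 4, 8), 101.0, 106.0, 100.5, 105.5),
    (date(2024, 4, 9), 105.5, 107.0, 104.0, 106.0),
]


def to_weekly(bars):
    """Aggregate daily OHLC bars into one bar per ISO (year, week)."""
    weeks = {}
    for day, o, h, l, c in sorted(bars):
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        if key not in weeks:
            # First trading day of the week sets the open.
            weeks[key] = {"open": o, "high": h, "low": l, "close": c}
        else:
            week = weeks[key]
            week["high"] = max(week["high"], h)
            week["low"] = min(week["low"], l)
            week["close"] = c  # Last trading day seen so far sets the close.
    return weeks


weekly = to_weekly(daily_bars)
for key, bar in weekly.items():
    print(key, bar)
```

The same pattern (first open, max high, min low, last close) applies to monthly or quarterly intervals by changing the grouping key.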

Finally, he reveals how he created a Streamlit app for portfolio optimization, where users can input stock tickers and set parameters for investment scenarios.

Read the article here.

2. RAG and Few-Shot Prompting in Langchain: Implementation — Shivam Sharma

Shivam Sharma’s guide explores the use of Langchain, a popular open-source framework for developing applications with Large Language Models (LLMs).

He focuses on Retrieval Augmented Generation (RAG), a technique that enhances LLMs by giving them access to external data sources, which helps work around the model's token length limits when dealing with large datasets.

Retrieval Augmented Generation (RAG) representation — Image created by Shivam Sharma

Sharma provides a detailed walkthrough of setting up a RAG chatbot for negotiating Software as a Service (SaaS) agreements in a Jupyter Notebook environment, using various libraries and the Langchain framework.
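For readers new to RAG, the core loop (retrieve relevant text, then put it in the prompt) can be sketched without any framework. This is not Langchain code and not the author's implementation: the word-overlap scoring below is a simple stand-in for the embedding-based retrieval a real system would likely use, and the sample clauses and function names are invented.

```python
import re
from collections import Counter

# Hypothetical contract snippets standing in for an external data source.
documents = [
    "The subscription renews automatically every 12 months.",
    "Either party may terminate the agreement with 30 days written notice.",
    "Support is available by email on business days.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc):
    """Count how many query tokens also occur in the document."""
    return sum((Counter(tokenize(query)) & Counter(tokenize(doc))).values())


def retrieve(query, docs, k=1):
    """Return the k documents with the highest overlap score."""
    ranked = sorted(docs, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=1):
    """Place only the retrieved context in the prompt, not the whole corpus."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


question = "How can a party terminate the agreement?"
prompt = build_prompt(question, documents)
print(prompt)
```

Because only the top-ranked snippets reach the prompt, the corpus can grow far beyond what would fit in a single model call.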

Additionally, the article covers the concept of few-shot prompting, which involves providing LLMs with specific examples to guide their output. This technique is shown to be particularly effective in generating precise responses from the LLM.
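As a rough illustration of few-shot prompting, the sketch below assembles a prompt from a handful of labeled examples followed by the new input. The task, the example clauses, and the labels are all hypothetical and are not taken from the article.

```python
# Hypothetical labeled examples that show the model the expected output format.
examples = [
    ("The vendor may change pricing at any time.", "risky"),
    ("Fees are fixed for the initial term.", "favorable"),
]


def few_shot_prompt(shots, new_input):
    """Build a prompt: instruction, worked examples, then the unanswered case."""
    lines = ["Classify each contract clause as 'risky' or 'favorable'.", ""]
    for clause, label in shots:
        lines.append(f"Clause: {clause}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Clause: {new_input}")
    lines.append("Label:")  # Left open for the model to complete.
    return "\n".join(lines)


shot_prompt = few_shot_prompt(examples, "Data is deleted 90 days after termination.")
print(shot_prompt)
```

The examples constrain both the vocabulary of the answer and its format, which is why the technique tends to produce more precise responses.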

Read the article here.

3. Machine Learning and AI in AOCCs: Challenges and Opportunities — Frank Morales

The article discusses the significant potential for artificial intelligence to enhance the aviation industry, specifically in airline operation control centers (AOCCs).

Frank Morales Aguilera will show you how AI technologies can improve operational efficiency, reduce costs, and boost customer satisfaction by addressing various challenges such as flight delays, crew management, and maintenance operations.

AI systems in the aviation industry — Image created by Frank Morales

Key points include:

  1. Flight Delays: AI can predict and mitigate flight delays by analyzing diverse data sources like weather forecasts, air traffic control, and maintenance records.
  2. Crew Management: Data science helps optimize crew scheduling by analyzing availability, flight schedules, and other relevant factors, reducing administrative overhead and ensuring proper staffing.
  3. Maintenance Operations: AI algorithms analyze data from aircraft sensors and maintenance records to identify issues before they become significant problems.
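
The maintenance point can be illustrated with a deliberately simple stand-in: flagging sensor readings that sit far from the rest using a z-score. This is not the method described in the article, and production systems would use far richer models. The readings, the threshold, and the function name are assumptions made for this sketch.

```python
import statistics

# Hypothetical temperature readings from one aircraft sensor (values made up).
readings = [70.0, 71.0, 69.0, 70.0, 72.0, 71.0, 95.0]


def flag_anomalies(values, threshold=2.0):
    """Return indices of readings more than `threshold` std devs from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # All readings identical, nothing stands out.
    return [i for i, v in enumerate(values) if abs(v - mean) / spread > threshold]


flagged = flag_anomalies(readings)
print(flagged)
```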

Read the article here.

4. A Deep Dive into Evaluation in Azure Prompt Flow — Shahzeb Naveed

LLM-as-a-judge — Image created by Shahzeb Naveed with Adobe Firefly

If you're an Azure enthusiast, you might find this article from Shahzeb Naveed interesting, as it guides you through the Azure Prompt Flow system.

Shahzeb Naveed explores the functionalities of Azure Machine Learning Prompt Flow, using the “Chat with Wikipedia” demo. The article provides a detailed tutorial on setting up OpenAI credentials, creating and configuring flows, and preparing and uploading a dataset formatted as .jsonl for evaluation.

Some of the key aspects he discusses:
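
For readers unfamiliar with the .jsonl format, it is simply one JSON object per line. The sketch below shows how such an evaluation dataset could be written and read back with the standard library; the field names `question` and `ground_truth` are assumptions for illustration and may differ from what the article or Prompt Flow expects.

```python
import json

# Hypothetical evaluation rows; field names are illustrative only.
rows = [
    {"question": "What is the capital of France?", "ground_truth": "Paris"},
    {"question": "Who wrote Hamlet?", "ground_truth": "William Shakespeare"},
]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records) + "\n"


def from_jsonl(text):
    """Parse JSON Lines text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


jsonl_text = to_jsonl(rows)
print(jsonl_text)
```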

  1. Setup and data preparation
  2. Evaluation metrics
  3. Execution of evaluation
  4. Custom metrics

The article is quite technical and aimed at users familiar with Azure Machine Learning. However, even if you know nothing about Azure, you can benefit from it by learning how cloud platforms are building their prompt systems.

Read the article here.

5. If Data Had Mass, the Earth Would Become a Black Hole: How Artificial Intelligence Enables Knowledge to Escape — Everton Gomede

Black hole representation — Image retrieved from Everton Gomede’s post

To finish this week's newsletter, we'll leave you with a more philosophical article from Everton Gomede, PhD.

This piece is a creative exploration into the metaphorical implications of data possessing mass and how AI helps manage its overwhelming influx.

The article presents a hypothetical situation where, if data had physical mass, the massive amount of data generated could theoretically turn Earth into a black hole of information.

The article also includes a Python script using numpy and matplotlib to simulate a dynamic system where data points with varying “mass” influence each other, illustrating how AI might manage and interpret vast datasets in a visual format.
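
To give a flavor of what such a simulation involves, here is a minimal sketch of one update step for points that attract each other in proportion to their "mass". It is not the author's script: it uses only the standard library rather than numpy and matplotlib, omits plotting, and the constants, field names, and starting values are all assumptions.

```python
import math


def step(points, dt=0.01, g=1.0, eps=1e-3):
    """Advance all points by one time step under mutual attraction.

    Each point is a dict with keys x, y, vx, vy, mass. `eps` softens the
    force so that very close points do not produce huge accelerations.
    """
    accels = []
    for i, p in enumerate(points):
        ax = ay = 0.0
        for j, q in enumerate(points):
            if i == j:
                continue
            dx, dy = q["x"] - p["x"], q["y"] - p["y"]
            dist2 = dx * dx + dy * dy + eps
            dist = math.sqrt(dist2)
            pull = g * q["mass"] / dist2  # Heavier neighbors pull harder.
            ax += pull * dx / dist
            ay += pull * dy / dist
        accels.append((ax, ay))
    # Update velocities and positions only after all forces are computed.
    for p, (ax, ay) in zip(points, accels):
        p["vx"] += ax * dt
        p["vy"] += ay * dt
        p["x"] += p["vx"] * dt
        p["y"] += p["vy"] * dt


# Two hypothetical data points: a light one and a heavy one.
points = [
    {"x": 0.0, "y": 0.0, "vx": 0.0, "vy": 0.0, "mass": 1.0},
    {"x": 1.0, "y": 0.0, "vx": 0.0, "vy": 0.0, "mass": 10.0},
]
step(points)
print(points)
```

After one step the light point has moved further toward the heavy one than the reverse, which is the "influence" effect the article visualizes.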

Read the article here.

Thank you for your support! Do you have anything to contribute? Make sure to send us your piece and we will read it.

Until next week,

The Deep Hub authors.


The editors of The Deep Hub. Exchanging ideas and empowering your knowledge. https://medium.com/thedeephub