Generating Synthetic Data for LLMs, Deploying Open-Source LLMs, Trustworthy AI Practices, and London’s AI Scene

ODSC - Open Data Science

May 23, 2024


Generate Synthetic Data to Test LLM Applications

This post walks through synthetic data in the context of LLM testing and evaluation.

Podcast: Training and Deploying Open-Source LLMs with Dr. Jon Krohn

In this episode, Jon Krohn delves into the lifecycle of open-source large language models, from transformer architecture to production deployment.

Podcast: Embedding Trustworthy Practices Across the AI Lifecycle with Vrushali Sawant

In the latest episode of the ODSC Ai X Podcast, we explore trustworthy AI with Vrushali Sawant, including the important principles that underpin it.

AI-Driven Solutions to Battle Spear Phishing Attacks

Spear phishing can be hard for humans to detect, so AI is an ideal alternative. Here are a few ways businesses can use AI to stop spear phishing attacks.

Exploring the Vulnerability of Language Models to Poisoning Attacks

This article discusses a paper, “Poisoning Language Models During Instruction Tuning,” and summarizes key findings on the topic.

Webinar: Improving RAG outcomes with your unique enterprise data

May 30th, 2024

In this webinar, Snorkel AI co-founder and CEO Alex Ratner will share his insights into emerging AI practices and the future of enterprise adoption. He will be joined by principal research scientist Chris Glaze and generative AI product lead Marty Moesta, who will discuss the latest research in RAG tuning and demonstrate how to apply it via AI data development techniques.

Industry, Opinion, Career Advice

AI Resilience: Upskilling in an AI Dominant Environment

At this year’s ODSC East, Leondra Gonzalez gave a compelling talk on navigating a career in data science given recent technological advances in AI.

ODSC East 2024 Keynote: Mozilla’s Abeba Birhane on the Social and Ethical Implications of Generative AI

Watch this ODSC East 2024 keynote by Mozilla’s Abeba Birhane as she delves into generative AI’s social and ethical implications.

Why London is a Powerhouse in Artificial Intelligence

Here are a few reasons why you should consider learning more about London’s AI scene, including AI companies, research institutions, and startups.

Join us at ODSC Europe to get hands-on with the cutting-edge AI tools and techniques that transform how we work and connect.

Take deep dives into generative AI, LLMs, RAG, prompt engineering, and more with the leading experts in the field to build practical, implementable skills.

Register now for 70% off!

Data Science & AI News

ODSC’s AI Weekly Recap: Week of May 17th

This week’s AI Weekly Recap is all about Microsoft’s staff relocation, US/China AI meeting, Elon’s AI skepticism, and more AI news. Sign up here to get this as a newsletter every Friday morning.

IMF Chief Sees AI Impacting Labor Like a “Tsunami”

AI is poised to dramatically impact the global labor market, akin to a “tsunami,” according to the managing director of the International Monetary Fund.

Google Introduces Generative AI Into Search and More

Google has announced that its search engine will be powered by a new custom Gemini model, promising to streamline the search process.

OpenAI Introduces GPT-4o to the World

OpenAI announced the release of GPT-4o, a new GPT model that promises to seamlessly integrate text, audio, image, and video inputs and outputs.

Breaking: OpenAI Disbands Team Focused on Long-Term AI Risk

OpenAI has disbanded its team focused on the long-term risks of artificial intelligence just one year after the group was announced.

AI Godfather Sees Need for Universal Basic Income Due to AI

Professor Geoffrey Hinton, the computer scientist known as the “godfather of artificial intelligence,” has called for the establishment of a universal basic income.

ODSC Highlights

9 Reasons Why Your Boss Wants You at ODSC Europe 2024

Between hands-on training and great networking opportunities, here are a few reasons why your boss wants you at ODSC Europe 2024.

The ODSC Europe 2024 and ODSC West 2024 Call for Speakers

Want to share your research, case studies, and expertise? Learn more here about how you can speak at ODSC Europe or ODSC West this year!

Announcing ODSC West 2024 and the New AI for Robotics Track

We’re happy to announce ODSC West 2024, coming to San Francisco this October 29th-31st! Learn more about what to expect from the event here, including info on our first-ever AI for Robotics track.

New Podcast Episode: HPCC — Open-Source Platform High-Performance Computing on Large-Scale Data with Bob Foreman

Join us for a deep dive into the HPCC project to discover how it simplifies complex data analysis at scale and why it’s an ideal tool for students, startups, or companies exploring large-scale, data-intensive computing or running a proof of concept for it. Spotify | SoundCloud | Apple

Video of the Week: Winning The Room: Creating And Delivering An Effective Data-Driven Presentation

This video offers straightforward strategies and practical tips to enhance your presentation skills, ensuring your data not only informs but also engages and persuades your audience. Learn how to clarify, simplify, and present complex information without sacrificing accuracy, utilizing powerful visuals to make your data storytelling memorable and impactful.

Upcoming Webinars and Meetups:

How to Reduce LLM Costs by 98%: Practical, Scalable Solutions

Wed, May 29, 2024 12:00 PM — 1:00 PM EDT

In this webinar, we will cover advanced techniques for leveraging the full power of Large Language Models (LLMs) while adhering to a reasonable budget. We work with many data scientists who have successfully demonstrated the potential of GenAI technology, only to find themselves encountering significant obstacles when it comes to cost and efficiency at production scale.

Time-Series Databases for AI: Enabling Trend Analysis, Anomaly Detection, and Accurate Predictions

Thu, May 30, 2024 12:00 PM — 1:00 PM EDT

This session will consider why it’s essential to look at time-series databases when working with ML and AI, how they differ from other databases, and factors such as scalability, data ingestion and storage capabilities, advanced analytics support, and integration capabilities.

Designing AI for Trust — How To Create Value While Setting The Right Guardrails

Tue, Jun 4, 2024 12:00 PM — 1:00 PM EDT

With a focus on business and technical leaders responsible for bringing AI solutions to life, we will draw from best practices in designing and deploying AI solutions across mission-critical sectors such as healthcare, energy, and financial services, where trust is critical.

Secure LLM App Deployments — Strategies and Tactics

Wed, Jun 12, 2024 12:00 PM — 1:00 PM EDT

In this session, you’ll discover how to improve the security of Large Language Models (LLMs). We’ll show you how to use red-teaming techniques from cybersecurity to identify and evaluate vulnerabilities in LLM applications, ensuring their safety and reliability. Additionally, you’ll learn how Giskard’s tools can be integrated into your workflow for automatic vulnerability detection, allowing you to scale your security efforts for generative AI.

Hybrid Intelligence: Accelerating End-to-End Financial Processing and Reporting with Pre-Trained and Custom ML Models

Tue, Jun 18, 2024 12:00 PM — 1:00 PM EDT

Speed and accuracy of processing and reporting are paramount in the financial services industry, yet the sector has been slow to fully embrace emerging technologies, such as artificial intelligence, to significantly enhance these processes. This webinar introduces an innovative approach to overcoming these obstacles.


