Milvus — The Unstructured Olympics of the Mind — AI? Data?

4 min read · Aug 9, 2024

Milvus, Vector Database, Python, SDK, Zilliz, Paris, Olympics 2024, Data

The first set of data we want is the information-heavy Wikipedia page on the Paris Summer Olympics of 2024.

This is pretty easy. We grab the whole page, vectorize it, and store a summary as a VARCHAR field along with the title. We could grab every Olympics page in Wikipedia if we want to expand and do some analytics. These are things to think about when you are turning our current Bronze Demo into a potential Gold Demo.
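Here is a minimal sketch of what one record for that page could look like. The field names (`title`, `summary`, `vector`), the summary length limit, and the `toy_embed` function are all assumptions for illustration. `toy_embed` is only a stand-in so the example runs without a model; a real demo would swap in an actual embedding model.

```python
from typing import Callable, Dict, List

# Assumed size limit for the summary field; not a value from the demo.
MAX_SUMMARY_CHARS = 2000


def toy_embed(text: str, dim: int = 8) -> List[float]:
    """Stand-in embedding (NOT a real model): folds character codes into
    `dim` buckets and scales them so the values sum to 1."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch)
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def build_page_record(
    title: str,
    summary: str,
    full_text: str,
    embed: Callable[[str], List[float]] = toy_embed,
) -> Dict[str, object]:
    """Shape one Wikipedia page into a row: title, trimmed summary, vector."""
    return {
        "title": title,
        "summary": summary[:MAX_SUMMARY_CHARS],  # keep within the assumed limit
        "vector": embed(full_text),              # vectorize the whole page text
    }


record = build_page_record(
    "2024 Summer Olympics",
    "Placeholder summary text.",
    "Placeholder page text.",
)
print(record["title"], len(record["vector"]))
```

Passing the embedding function in as a parameter keeps the record shaping separate from the model choice, which makes it easier to move from Bronze to Gold later.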

I do like the idea of organizing a hackathon for building three levels of cool demos with Milvus and awesome open source AI tools. We have space in Princeton and virtually, so if you are interested, comment or reach out.

Check out the Paris Summer Olympics 2024 website!

Paris 2024 Olympics - Latest News, Schedules & Results

Welcome to the Paris 2024 Summer Olympic Games website. Follow the world's top athletes as they go for gold in France…

olympics.com

Okay, so in this very simple demo we use the Python wikipediaapi library to read our page in English as HTML. This gives us a little bit of data. For the next level we should chunk, parse, and pull out the pieces that will help feed an LLM of our choice. We will most likely run an open source model on Ollama. I have llama3 loaded locally, so I will probably use that.
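For the chunking step, one simple option is a fixed-size word window with some overlap. This is a sketch under assumptions, not what the demo currently does: the function name `chunk_words` and the default sizes are made up for illustration, and the input is assumed to be plain text already pulled from the page.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into chunks of up to `size` words, where each chunk
    repeats the last `overlap` words of the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

The overlap means a sentence that straddles a boundary still shows up whole in at least one chunk, at the cost of storing a few words twice.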

Gold!!!!

We need the medals, but unfortunately we can't get a live feed. I got a download from Kaggle instead (I'll have to update this in the next round when we connect to some models and deep learning code). So this data is in flux, with more medals to come before the end of the Games.

So we load from a CSV, build a sentence to encode and store, add some filter fields, and include a good chunk of JSON for good measure.
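A rough sketch of that load step, using only the standard library. The column names (`country_code`, `country`, `gold`, `silver`, `bronze`) and the two sample rows are placeholders I made up so the example runs; they are not the real Kaggle columns or real medal counts, so adjust them to match the actual download.

```python
import csv
import io
from typing import Dict, List

# Placeholder rows with hypothetical columns; NOT real medal data.
SAMPLE_CSV = """country_code,country,gold,silver,bronze
AAA,Exampleland,3,2,1
BBB,Samplestan,0,1,4
"""


def row_to_record(row: Dict[str, str]) -> Dict[str, object]:
    """Turn one CSV row into a sentence to encode, plus filter fields and
    the raw row kept as a dict that could go into a JSON field."""
    gold, silver, bronze = int(row["gold"]), int(row["silver"]), int(row["bronze"])
    total = gold + silver + bronze
    sentence = (
        f"{row['country']} ({row['country_code']}) has won {gold} gold, "
        f"{silver} silver and {bronze} bronze medals, {total} in total."
    )
    return {
        "sentence": sentence,                 # text to embed and store
        "country_code": row["country_code"],  # scalar filter field
        "gold": gold,                         # scalar filter field
        "total": total,                       # scalar filter field
        "details": dict(row),                 # raw row for the JSON field
    }


def load_records(csv_text: str) -> List[Dict[str, object]]:
    """Parse CSV text and convert every row into a record."""
    return [row_to_record(r) for r in csv.DictReader(io.StringIO(csv_text))]


for rec in load_records(SAMPLE_CSV):
    print(rec["sentence"])
```

Keeping the numeric medal counts as their own fields, rather than only inside the sentence, is what makes them usable as filters at query time.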

It’s so easy and super fast to query.
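To give a feel for the filter side of a query, here is a small helper that builds a Milvus-style boolean expression string. The helper name `medal_filter` and the field names `country_code` and `gold` are assumptions for illustration; the resulting string is the kind of value that could be passed as the `filter` argument of a pymilvus `MilvusClient.search` call, alongside the query vector.

```python
import json
from typing import Iterable, Optional


def medal_filter(country_codes: Optional[Iterable[str]] = None, min_gold: int = 0) -> str:
    """Build a filter expression such as:
    country_code in ["AAA", "BBB"] and gold >= 5
    Returns an empty string when no condition is requested."""
    parts = []
    if country_codes:
        # json.dumps adds the double quotes and escapes each value
        quoted = ", ".join(json.dumps(code) for code in country_codes)
        parts.append(f"country_code in [{quoted}]")
    if min_gold > 0:
        parts.append(f"gold >= {int(min_gold)}")
    return " and ".join(parts)


print(medal_filter(["AAA", "BBB"], min_gold=5))
# country_code in ["AAA", "BBB"] and gold >= 5
```

Combining a scalar filter like this with a vector search narrows the candidates to the rows you care about before ranking by similarity.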

Next up, we’ll combine it with a model for Olympics fun.

SOURCE CODE

GitHub - tspannhw/AIM-Olympics-2024: Paris Olympics

Paris Olympics. Contribute to tspannhw/AIM-Olympics-2024 development by creating an account on GitHub.

github.com

RESOURCES

GitHub - tspannhw/AIM-Milvus-KB: Knowledge Base for Milvus

Knowledge Base for Milvus. Contribute to tspannhw/AIM-Milvus-KB development by creating an account on GitHub.

github.com

GitHub - tspannhw/AIM-Milvus-DotNet: Milvus - C# - .NET

Milvus - C# - .NET. Contribute to tspannhw/AIM-Milvus-DotNet development by creating an account on GitHub.

github.com

Paris 2024 Olympic Summer Games

Medals, results & events datasets

www.kaggle.com

2024 Summer Olympics - Wikipedia

The 2024 Summer Olympics, officially the Games of the XXXIII Olympiad and branded as Paris 2024, is an international…

en.wikipedia.org

GEODATA - Evènements (EN)

Geodata - Liste des événements de Paris 2024 (EN)

data.paris2024.org

https://data.paris2024.org/api/explore/v2.1/catalog/datasets/paris-2024-sites-de-competition/records?limit=20

Olympic Summer & Winter Games, 1896-2022

Medals & Results & Athletes from Athens 1896 to Beijing 2022

www.kaggle.com

GitHub - tspannhw/FLaNK-python-processors: Many processors

Many processors. Contribute to tspannhw/FLaNK-python-processors development by creating an account on GitHub.

github.com

REAL-WORLD EVENTS

Aug 13, 2024: Unstructured Data Meetup NYC

Unstructured Data Meetup New York · Luma

This is an in-person event! Registration and photo identification is required to get in. Topic: Connecting your…

lu.ma

Aug 15, 2024: AI Camp NYC

AI Meetup (NYC): AI, GenAI, LLMs and ML

AICamp: Learn and practice AI/ML from anywhere any time with webinars, workshops and courses.

www.aicamp.ai

Sept 24, 2024: Unstructured Data Meetup NYC

Unstructured Data Meetup New York · Luma

This is an in-person event! Registration is required to get in. Topic: Connecting your unstructured data with…

lu.ma

WEBINAR

Unstructured Data Processing from Cloud to Edge

Join us for a webinar on why you should add a Cloud Native Vector Database to your Data and AI platform

zilliz.com

Unstructured Data Processing from Cloud to Edge Webinar

Unstructured Data Processing from Cloud to Edge Webinar — Download as a PDF or view online for free

www.slideshare.net

Star Us On GitHub and Join Our Discord!

If you liked this blog post, consider starring Milvus on GitHub, and feel free to join our Discord! 💙

GitHub — milvus-io/milvus: A cloud-native vector database, storage for next generation AI…

A cloud-native vector database, storage for next generation AI applications — milvus-io/milvus

github.com

Get Milvused!

Vector database — Milvus

Milvus is a powerful vector database tailored for processing and searching extensive vector data. It stands out for its…

milvus.io

Read my Newsletter every week!

AIM Weekly 17 June 2024

17-June-2024

medium.com

For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here:

Zilliz

Zilliz is a leading vector database company for production-ready AI. Built by the engineers who created Milvus, the…

www.youtube.com

https://www.linkedin.com/company/zilliz/

https://www.linkedin.com/in/timothyspann/

Join the Milvus Discord Server!

Check out the Milvus community on Discord — hang out with 1734 other members and enjoy free voice and text chat.

discord.com

https://milvusio.medium.com

Open Source Vector Databases

Open Source Vector Databases

www.opensourcevectordb.cloud

Written by Tim Spann