#6 Lesson — Part 3 | Learn to Code with Microsoft M-Language: Real World Project — Pull & Clean up Data from Meetup.com via API

SPRDsheet
10 min readMar 29, 2018

--

Let’s begin with the end in mind. This is going to be the end result of today’s lesson. A neat collection of tech venues in different cities in the USA.

You are going to learn to automate the process of cleaning up data coming from Meetup.com through their API servivce. Let’s get started.

First, select API Service. If you need a refresher, go back to the previous part of this lesson.

Hit Try in console. We are going to collect tech venues from Pittsburgh (within 50 miles radius). Select:

  • zip: 15106
  • radius: 50
  • category: 34
  • clear page field

In case of Meetup.com API you need to figure out how things work through a lot of trial and error. How do you know how they measure radius? Just try it out. How do you know if country returns any results? Or just crashes the API Console? Just try it out.

Although Meetup.com shares a lot of valuable data, their API service is far from perfect, and so is their documentation. Just recently, Wework, the co-working space giant valued at BN20 USD, bought Meetup.com, and they aim to make that service more appealing to business users like you and me.

Together with a befriended database developer, we are going to write a blog post about our experiences with Meetup.com APIs and how to enhance them for a business user. I will keep you posted on this.

For now, let’s go back to the Meetup.com API Console.

Next, hit Show Response and grab your Signed URL.

Open new Excel Workbook. On the Data Ribbon, hit From Web in the Get & Transform Data section. Enter the Meetup.com AIP URL.

Next, hit To Table on the Transform Ribbon.

Select the default settings regarding delimiter and extra column handling. Just hit OK.

Click the double arrowed Expand icon.

Uncheck Use original column name as prefix and hit OK.

This API call gave us loads of data: 144 rows, each representing a different tech event group. A lot is going on in Pittsburgh! Plus 27 columns on every single event. That’s quite a bit.

For now, let’ just keep the most basic data. Hold down the control key and click on the following columns:

score, status, created, city, country, state, join_mode, visibility, lat, lon, organizer, who, group_photo, key_photo, timezone, category, meta_category, pro_network

Right click on the name of any of the selected columns and hit Remove Columns. We’ve now boiled down our data to 9 columns.

Let’s now clean up the data we have. M-Language is built for that. First, let’s capitalize the first letters of each column name. Let’s also change localized_country_name to Country and localized_location to City & State. Finally, under Query Settings > Properties > Name let’s change the name to PittsburghTechEvents. Now, let’s take a closer look at the Description and Next Event columns, which both still look a little bit off.

Let’s start with the Description column. It contains a lot of HTML tags. We need to remove those one by one, and I will show you only once how to do that. Just go ahead and remove the remaining ones by yourself. First, let’s remove the <p> tag. Select the Description column, it should be highlighted in green. On the Home Ribbon, click the Replace Values icon.

Enter <p> in the Value To Find field, leave the Replace With field blank and hit OK.

Click on any cell in the Description column to see what other HTML tags you need to remove and repeat the procedure as described above until your data looks clean and neat. In case you want to replace a tag with a space, just hit the space tab in the Replace With field.

Make sure that you always select the entire column when replacing values. Click on a cell to preview its content. After some wrestling with unnecessary HTML tags, your Description column should look like this.

The APPLIED STEPS section displays each individual modification you made. In case you accidentally removed some text you didn’t want to, just click on the corresponding step and hit the X icon left to it. This will undo that step.

Last but not least, let’s see what we can do with the Next Event column. Click on the double arrowed Expand icon.

Keep the default settings and hit OK.

You should see this.

Let’s go ahead and change the column names:

  • id.1 > Event ID
  • name_1 > Event Name
  • yes_rsvp_count > Yes RSVP Count

Next, select the time and utc_offset columns, right click on one of the column names and hit Remove Columns. But … wait a second! You might want to know the venue’s date. In fact, that’s a very valuable piece of information. In order to retreive that piece of data, you need to convert the time stamps into a date format. If you want to learn how to do this, please download Reza Rad’s free e-book “Power BI from Rookie to Rockstar”. You can find the chapter “Meetup Data Source for Power BI” in Reza’s book on page 158, or on his blog.

We’re done with cleaning up data. Now hit the lower part of the Close & Load icon. Select Close & Load to …

Select Only Create Connection and hit OK.

You will see an empty Excel Sheet and the query we worked on shows up on the right hand side.

Before we move on, let’s save the file as TechEventsUSA. Now, let’s do some real magic. Ready for that? Under Queries & Connections right click PittsburghTechEvents and select Edit. Next, hit the Advanced Editor. You’ll see all the steps we performed as code. By now, you should be able to read and understand at least some of the code that has been autogenerated for us.

“Where’s the magic?!” you’re asking.

KnowYourMeme

Imagine, you want to find tech venues in all major cities in the US. For example, in case your company wants figure out which ones to sponsor or send speakers to. Would you want to do all the tedious work over and over again? Nope. Let’s write a function that will reduce the amount of work that we just did to … only one click. Let’s go ahead and do it! First, select and copy the entire PittsburghTechEvents code in the Advanced Editor. Next, hit done.

Then, on the Home Ribbon select New Source.

Select Other Sources > Blank Query. On the Home Ribbon hit Advanced Editor. Remove the original code and next paste the code we’ve just copied from PittsburghTechEvents.

Now, let’s convert that code into a function. It’s easy. First, write this line of code above the Let expression.

(URLMeetupAPI) =>

Next, carefully remove the entire URL (including quotation marks) left to the Source identifier. Then, write this.

URLMeetupAPI

Your first three lines of code should now look like this.

(URLMeetupAPI) =>let
Source = Json.Document(Web.Contents(URLMeetupAPI)),

Here’s how the entire code within the Advanced Editor should look like.

Next, hit Done. Look what we got! We just converted the PittsburghTechEvents Query into a function that extracts & transforms data by just entering the API Service URL in the paramter field. Next, you just hit Invoke and … let the function do all the tedious work for you.

Bear in mind that this function only works on exactly the same API Service! If you want to gather different data from Meetup.com throuth another API Service, you would need to write that as a separate function. But — hey! You already know how to do that.

Before we test our function, let’s do some final tweaking.

Under Query Settings > Properties > Name change Query1 to FindGroups. Next, on the left hand side click on the right arrow above Queries. Click on FindVenues and move that function to the top, above the PittsburghTechEvents query.

Now, let’s find all tech venues within a 50 miles radius of Cleveland. Go back to the API Console and perform the exact same steps as we did in the beginning of this lesson. With only one little exception: Select a Cleveland ZIP code, for example 44101. The Meetup.com site might have some hiccups. If you’re lucky, you will get the API URL the first time you hit Show Response. If not, just give it another try. With Meetup.com APIs, it’s like dialing a landline number in the former Soviet Union. Doesn’t always work the first time you try.

Once you get the Signed URL, enter it under Enter Parameter and hit Invoke.

After just a few seconds, you should get a cleaned up, full list of tech venues in a 50 miles radius around Cleveland.

Our FindGroups function performed exactly the same steps we did before by hand in our PittsburghTechEvents query. Let’s go ahead and change the name from Invoked Function to ClevelandTechEvents.

On the Home Ribbon, hit the lower part of the Close & Load icon and select Close & Load to …, next select Only Create Connection. You should now see your Excel Sheet and Queries & Connections on the right hand side.

Next, let’s load our collection of Pittsburgh Tech Events query data into an Excel Sheet. Right click PittsburghTechEvents under Queries & Connections. Select Load To … Next, select Table and select cell A1 in your Excel Sheet.

Here we go!

Hit save … were done!

You just learned how to pull data from Meetup.com through their API, clean up with M-Language and automate that process. That’s a lot. Congrats!

This is my entire course

--

--

SPRDsheet

All things spreadsheet. Microsoft Excel and beyond. SPRDsheet is about sharing learnings amongst peers, regular folks who love to learn & work.