MyData DevLog #3: Initial data storage and capture plans

Like most people these holidays, my week was mostly spent recovering from Christmas dinner. Lovely bit of turkey with all the trimmings. However, I was really keen on figuring out how I can store my health data in the cloud, and in the future, upload it there directly rather than just updating an Excel file every week.

Due to the holiday season, this won’t be a day-by-day journal like my previous entries. This week was split into two tasks: data storage and modelling.

Data storage

Currently, I have my data stored in Excel spreadsheets (XLSX format) with appropriate headers and columns. I’ve also been storing this data in CSV format (just so I could play around with it in Jupyter). My initial idea was to convert the CSV data into JSON and then store it in Cosmos DB.

That didn’t work out so well for me. I only have a small amount of data, and I’d rather not spend the little money I have on an Azure resource that I don’t need right now. The simpler option was an Azure SQL instance, which is almost free at the volume of data I have right now (not even 5 MB worth! Do I even lift?).

However, that raised another problem: how do I migrate my data from Excel to SQL Server? If you’re a seasoned data professional, this probably sounds like an easy task. But as a newbie, I got a little lost.

Attempt 1 — CSV to SQL

An initial idea I had was to take the CSV files I had and import them into SQL Server. This was just a case of using the Import and Export Data wizard that comes with SQL Server and storing each CSV file as a table within a locally hosted database.

This worked, but each column was imported as an nvarchar type. Not particularly useful, especially if I want to perform some BI work on the data in the future. I did look at writing some simple T-SQL statements to alter the tables, but this seemed like a tedious task. There has to be a better way of importing the data into SQL Server without doing too much heavy lifting.
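For the record, this is the kind of clean-up I was talking about. The table and column names here are made up, but the shape is right:

    -- Re-type a column that the wizard imported as nvarchar
    -- (dbo.WeightLog, WeightKg and MeasurementDate are hypothetical names)
    ALTER TABLE dbo.WeightLog
        ALTER COLUMN WeightKg DECIMAL(5, 2) NOT NULL;

    -- The date column needs the same treatment
    ALTER TABLE dbo.WeightLog
        ALTER COLUMN MeasurementDate DATE NOT NULL;

One statement per column, per table, which is why I went looking for a better option.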

Attempt 2 — XLS to SQL

This option worked out a little better. I could keep the columns’ data types from Excel without having to write pointless T-SQL. However, this process was a little more complex than I anticipated. Instead of using the Import and Export wizard (which for some reason wouldn’t allow me to export my XLSX files to SQL Server), I had to use Integration Services (SSIS). Again, another tool I had no experience with.

It was pretty simple. It was just a case of setting up a data flow that took the Excel file and exported it to my local SQL database. I created an SSIS package for each table just for simplicity. There’s probably an easier and more efficient way to do this, but I’ll look into it in the future.

This worked! Not only did I manage to get all my data in there, it also kept the data types that I specified in Excel. The slight problem with this solution was that I had to use the XLS file type (Excel 97–2003) instead of XLSX (Excel 2016). My hope for the future is to enter my data via a web portal and then export it in whatever file type I desire (XLS, CSV, JSON, etc.). But for now, I have my data in a SQL database.
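(Side note for future me: I’ve read that T-SQL’s OPENROWSET with the ACE OLE DB provider can query XLSX files directly, which might sidestep both SSIS and the XLS limitation. A rough sketch, assuming the provider is installed and ad hoc distributed queries are enabled; the file path, sheet name and table name are made up:)

    -- One-time server setup (needs admin rights)
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
    RECONFIGURE;

    -- Pull a worksheet straight into a new table
    SELECT *
    INTO dbo.WeightLog
    FROM OPENROWSET(
        'Microsoft.ACE.OLEDB.12.0',
        'Excel 12.0 Xml;HDR=YES;Database=C:\Data\health.xlsx',
        'SELECT * FROM [Weight$]'
    );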

Now it was just a simple case of deploying my local database to Azure. Thankfully, SSMS makes this a pretty simple task. I just had to create an empty server in Azure, target it, and deploy! This method is actually pretty cost effective since I only have a small amount of data. Just to make sure everything worked, I connected to the server in SSMS, and it all works!!
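If you want to double-check a deployment like this yourself, a couple of quick queries against the Azure database should do it (the table name is hypothetical):

    -- Confirm the rows survived the trip to Azure
    SELECT COUNT(*) AS RowsUploaded FROM dbo.WeightLog;
    SELECT TOP (5) * FROM dbo.WeightLog ORDER BY MeasurementDate DESC;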

My initial plan of using a mobile app to log my data didn’t seem like a good one. I’m looking at an ASP.NET application instead, so I can upload data to my database once and then export it in whatever format I need. This is still an open question, though.

Data Modelling

This past week, I was still trying to pin down an MVP for my data model. I’m keen to see the changes in my weight over a period of time; however, I have gone through cycles of losing weight and then maintaining it, so in the future I’ll need to get a scale or find an alternative method to measure my body fat percentage. For the moment, I’ve created a simple table that stores the following data (a rough T-SQL sketch follows the list):

  • Date of measurement
  • Weight in stone
  • Weight in kg
  • Weight in lbs
  • Loss/gain per week in lbs
  • Lbs to target
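As T-SQL, that might look something like this; a sketch of what I have in mind rather than a final schema:

    CREATE TABLE dbo.WeightLog (
        MeasurementDate DATE          NOT NULL PRIMARY KEY,
        WeightStone     DECIMAL(5, 2) NOT NULL,
        WeightKg        DECIMAL(5, 2) NOT NULL,
        WeightLbs       DECIMAL(5, 2) NOT NULL,
        WeeklyChangeLbs DECIMAL(5, 2) NULL,  -- loss/gain vs. last week
        LbsToTarget     DECIMAL(5, 2) NULL   -- distance from the target weight
    );

Strictly speaking, the three weight columns are redundant (1 stone = 14 lb and 1 kg ≈ 2.205 lb), so two of them could become computed columns later.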

Currently, my target weight is 210 pounds. But in the future I’ll have to change this column to body fat percentage. I’m not going to store this table in SQL Server just yet. Once I’ve moved back to New Zealand, I’ll look into how I can measure body fat.

I’ve also decided to log my food intake. I’ll be able to use this data to see how my weight changes with calorie deficit and intake over time. I’ll also be able to track my macros over time and see how they affect weight loss. I’m not going to store this data in SQL Server just yet either (I have no data yet!). So this is the simple table that I’ve got at the moment (again, a rough sketch follows the list):

  • Date
  • Calories in
  • Fat (g)
  • Fiber (g)
  • Carbs (g)
  • Sodium (mg)
  • Protein (g)
  • Water (ml)
  • Water Target Intake achieved
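And the equivalent sketch for the food log, with the same caveat that the names and types are provisional:

    CREATE TABLE dbo.FoodLog (
        LogDate             DATE          NOT NULL PRIMARY KEY,
        CaloriesIn          INT           NOT NULL,
        FatG                DECIMAL(6, 1) NULL,
        FiberG              DECIMAL(6, 1) NULL,
        CarbsG              DECIMAL(6, 1) NULL,
        SodiumMg            DECIMAL(7, 1) NULL,
        ProteinG            DECIMAL(6, 1) NULL,
        WaterMl             INT           NULL,
        WaterTargetAchieved BIT           NULL   -- hit the daily water goal?
    );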

I’m aiming to drink 2.2 litres of water each day, but I haven’t always been consistent with this, so it’ll be interesting to see if that affects my weight. Right now, my data isn’t quite relational. I’ll need to spend some time in the future figuring out whether linking these tables makes sense and how I can do it.
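My hunch is that the date could act as the shared key between the two tables. Assuming the hypothetical schemas sketched above, a join like this would let me line each weigh-in up against that day’s intake:

    -- Compare each weigh-in with the same day's food log
    SELECT w.MeasurementDate,
           w.WeightLbs,
           w.WeeklyChangeLbs,
           f.CaloriesIn,
           f.WaterMl
    FROM dbo.WeightLog AS w
    JOIN dbo.FoodLog AS f
        ON f.LogDate = w.MeasurementDate;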

Good things

  • For the meantime, I’ve got an MVP for my data. Of course, in the future I’ll have to expand and reshape the data to answer new questions of interest (maybe for new devices), but for now this works for me.
  • I’ve got a way of storing my data in the cloud!

Bad things

  • Data entry is still a mundane process. I’ll need to figure out a way of making it simpler in the future.

Heading into the new year, I’m in a pretty good place. I’m interested to see how this data holds up to analysis in the future. So for the next couple of months, it’s just a case of collecting data and then seeing how it all fits together. This will be an iterative process, so it’ll be interesting to see how the data evolves over time.

Next week, I’m heading to Holland for a bit of a holiday, so I’m mostly just going to be collecting data. I may do some initial planning on how to improve the data entry process, but I won’t be working on the project as much as I would have liked.

Regardless, I hope you all enjoyed New Year’s, and I hope you all have a successful and happy 2018!
