Non-Higher Ed Pathways Into Data Science (FAQ 005)
In the last article, we covered paths to a data science career through the higher education system: bachelor’s degrees, minors, concentrations, Master’s degrees, and certificate programs. In this article, we’ll take a different path and focus on routes that don’t involve the higher education system, mainly bootcamps and online learning platforms. These options have their own benefits and drawbacks, but having a full picture of what is available should help you decide which path is right for you.
Data Science Bootcamps
Coding bootcamps have become all the rage as technology jobs have become the hot new career of the past 10–15 years. The goal of a coding bootcamp is usually to take someone with little to no computer programming experience and prepare them for a career in software engineering. With the advent of big data, data science bootcamps have seen the same spike. Bootcamps are good options for those who want a shorter program that teaches a breadth of skills in a fast-paced environment. They are also more likely than a Master’s or other higher education program to mimic the project process of a data science workflow. Data science bootcamps are offered by companies like Galvanize, Springboard, and Metis, to name a few.
Considerations
Compared to programs in higher education (covered in my previous article), bootcamps are more flexible and targeted. In a bootcamp, your projects are more likely to be something that you’re personally interested in because you get to pick the project and learn the tools that you want to learn. Higher education tends to be much more prescriptive about homework assignments. Bootcamps offer full-time and part-time options to fit your availability. If you’re dead set on getting into data science, the full-time option is the better choice, but if you’re just dipping your toes in the water, a part-time bootcamp is a great choice. Degrees in AI/DS are often measured in months or years (Master’s programs can be up to 2 years long if taken part-time), whereas bootcamps are generally measured in weeks, with “graduation” usually coming within 6 months. This could give students the opportunity to complete a bootcamp during a school break, or pharmacists the chance to take a leave of absence if they can swing it. At a minimum, the short duration of a bootcamp means you can be back up and running in a new position quickly if you decide to switch careers. You can also decide whether an in-person or online experience would work best for your situation.
The pricing for bootcamps is highly dependent on the program and can require a lot of planning. From quick research, programs can range from $7,500 to $17,000, which would likely require significant savings or other funding/financing mechanisms. Because these programs are large investments, it doesn’t hurt to (again, sorry to be a broken record across articles!) connect with data scientists in your area to see whether the bootcamps available to you teach the skills you need. This kills two birds with one stone: you get started with networking for future jobs and you make sure your money is well spent. It’s also helpful to look into the backgrounds and experiences of the individuals who teach in the bootcamp. If they have years of experience working on data science teams that have had a large impact within their organizations, you can be more confident that the program is a good use of money.
+’s (advantages)
One advantage applies very specifically to a bootcamp-type offering called the Insight Data Science Fellowship (closer to a bootcamp in length than to a fellowship): the program is built for people who hold a doctorate degree. It calls out MDs and PhDs, but I reached out in the past and confirmed that healthcare professionals with doctorates (like PharmDs) are also welcome. Bootcamps also tend to be much more “with the times” in the material they teach than most educational programs. A few pharmacists I’ve talked with who have done data science bootcamps learned tools like Flask and computer vision algorithms that are not often taught within a Master’s program. Tools like these will continue to get more popular, so learning them in a bootcamp is extremely advantageous. Bootcamps also provide career services, such as interview coaching, that can be incredibly helpful in landing your first job after “graduation”. A neat experience I was recently invited to be a part of was a “demo day” at a data science bootcamp. The invitation came from a pharmacist I met through LinkedIn who was doing the bootcamp. It was a great opportunity for bootcamp participants to share what they learned and get more confident with presentation skills. Presenting your work to clinicians and administration is one of the most important parts of a data science job, so these skills are absolutely critical to build.
-’s (disadvantages)
There are a couple of disadvantages to pursuing a bootcamp experience, as there are with any of the options I’m sharing in this article or the higher education article. Specific to bootcamps is the perception that some practicing software engineers and data scientists have of those programs, primarily that they aren’t long enough or in-depth enough to teach the necessary skills for the field. I agree that there is no way you can learn everything needed for a job from a bootcamp, but a bootcamp gets you off to a great start. The extra skills usually come with time and more practice, which a data science job would expose you to. I cannot tell you how many data science and programming concepts I’ve learned on the job that I had never come into contact with before. Once you have a good foundation (like one provided by a bootcamp), you really are much better prepared to pick up the other necessary skills on the job. You can certainly still contribute to the success of the team, regardless of what the skeptics say about your bootcamp experience.
The other disadvantage that I see is how poorly bootcamp projects may mimic the projects you would work on in a data science job. This is especially difficult in healthcare, where there are not many datasets that are representative of what data looks like in my job at a health system. If you pursue a data science bootcamp, be thoughtful about the dataset you choose for your project and make it one that you could share with a data scientist who works where you would like to work. Ideally, the project would be similar enough to translate to the company’s own data. You can get an idea of what datasets might be available by networking (again!) with data scientists to understand what their data looks like and to get suggestions about similar datasets that are out there. During our projects, we also engage with our stakeholders to ask them about the tools they currently use and what they would like to see in a new tool. This would be hard to replicate in a bootcamp, but interacting with instructors and fellow bootcampers and networking can help you build those communication skills.
Online Learning Platforms / MOOCs
If you’ve heard me on any of the pharmacy podcasts I’ve done (here, here, or here), you’ve probably heard me recommend getting your start in data science through online learning platforms like DataCamp. DataCamp should 100% be sponsoring me with how much I talk about the platform. The great thing about these online learning platforms and MOOCs (Massive Open Online Courses) is that most are reasonably priced and there are a number of different ones that could be just right for your needs. Bootcamps and educational degrees tend to be more rigid, one-size-fits-all offerings, but online courses can play a number of different roles. I still use DataCamp today, so these platforms can supplement even real-world work experience!
Popular online learning platforms are DataCamp, edX, Coursera, Udacity, Udemy, Kaggle, and DataQuest.
Considerations
Online courses are great for those who want to move at their own pace and don’t want to wait around for the instructor to cover the next topic. The most popular online class platforms have all of their lessons, including videos and practice problems, pre-recorded for students to progress through at their own pace. They are also ideal if you want to try out this new field and are unsure whether data science is a good career choice, since they require little time or financial commitment up front. Some of the programs will also give you an idea of how long each individual course or track (collection of courses) is so that you can map out how long you’ll need to spend learning the information.
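To make that mapping-out step concrete, here is a minimal Python sketch of how you might estimate a finish date from the hour estimates a platform lists. The course names and hours below are made-up placeholders, not figures from any real platform, so swap in the numbers from the track you are actually considering.

```python
import math


def weeks_to_finish(course_hours, hours_per_week):
    """Return the number of whole weeks needed to finish every course."""
    if hours_per_week <= 0:
        raise ValueError("hours_per_week must be positive")
    total_hours = sum(course_hours.values())
    # Round up: a partly used week still counts as a week on the calendar.
    return math.ceil(total_hours / hours_per_week)


# Hypothetical track: names and hour estimates are placeholders.
example_track = {
    "Intro to Python": 4,
    "Data Manipulation": 4,
    "Data Visualization": 4,
    "Intro to Machine Learning": 6,
}

if __name__ == "__main__":
    for pace in (2, 5, 10):
        weeks = weeks_to_finish(example_track, pace)
        print(f"{pace} hours/week -> {weeks} weeks")
```

Running it at a few different paces is a quick way to see whether a track fits around a full-time job or a family schedule before you commit to it.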
Within the group of online platforms listed above, there is a wide range of pricing and of further assistance (like job placement) on offer. We’ll get to those in the +’s and -’s sections, but it’s important to take them into account as you decide which path to take within the online learning platform domain. The platforms even vary in whether their learning paths are open-ended or fixed. For example, Udacity offers its classes in what it calls Nanodegrees. You choose and complete a Nanodegree, which is a set number of courses with a project at the end. Most students will then stop using the platform unless they plan to take a Nanodegree in another subject like web design. In contrast, a site like DataCamp offers a subscription that covers any classes you want to take, regardless of whether the topic is in a certain learning path. DataCamp offers tracks (collections of individual classes that steer you towards a specific job role), but you are always free to take individual classes outside of the track. Udacity Nanodegrees tend to be static or only released intermittently, whereas DataCamp often releases new classes on a wide variety of topics.
+’s
The main advantage of an online learning platform is its self-paced nature. This is great for busy professionals like pharmacists who may not be able to commit to a bootcamp or a graduate degree; taking a self-paced class is much more feasible. It’s also great for those with families or other time commitments. One of my favorite things about online platforms is the speed at which some of them (or their class creators) create new content. Bootcamps and higher education options typically rely on a few teachers who curate and teach the content. Platforms like DataCamp, Coursera, and Udacity recruit, or have a process designed to recruit, top educators and data scientists to teach classes in their areas of expertise. This means that the platforms can put out many new pieces of content in a single month. The wide variety of teachers also brings a wide variety of course topics. There are so many topics to learn in data science (git/version control, Docker, machine learning, software engineering concepts, visualization, etc.) that no one teacher could possibly create content for all of them. Online platforms also have practice problems and quick knowledge checks built into their programs so you can reinforce concepts. You’re able to go back later and redo these exercises to make sure you’ve retained the material over time. Online learning platforms are also generally on the less expensive end of the spectrum, especially considering the price of higher education these days.
Overall, online learning platforms tend to allow quick, bite-sized learning opportunities and are often reasonably priced, making them ideal for students, those just starting out, and those not sure that data science is right for them. They are also great for working data scientists who want to learn new technologies.
-’s
Online learning platforms are limited in the amount of career direction or job placement they can offer. To my knowledge, the only platform that has career services is Udacity. Udacity offers resume and LinkedIn profile reviews, GitHub reviews for portfolio projects, and career coaching services to its Nanodegree participants. This comes at a much higher price than the other platforms, which is Udacity’s main drawback. A Nanodegree runs around $400 per month, which includes the career services part of the program. DataCamp, in comparison, is $25/mo without career support services. When choosing a platform, you should consider that part of the equation.
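To see how that monthly gap adds up, here is a small Python sketch that totals each subscription over a few study durations. The monthly prices are the approximate figures quoted above; the durations are my own assumptions for illustration, so plug in however long you expect to study.

```python
# Approximate monthly prices quoted in this article (USD).
MONTHLY_PRICE_USD = {
    "Udacity Nanodegree": 400,  # includes career services
    "DataCamp": 25,             # no career support services
}


def total_cost(monthly_price, months):
    """Total subscription cost for a given number of months."""
    if months < 0:
        raise ValueError("months cannot be negative")
    return monthly_price * months


def cost_table(months):
    """Map each platform name to its total cost over `months` months."""
    return {
        name: total_cost(price, months)
        for name, price in MONTHLY_PRICE_USD.items()
    }


if __name__ == "__main__":
    # Study durations are assumptions for illustration only.
    for months in (3, 6, 12):
        for name, cost in cost_table(months).items():
            print(f"{months:>2} months of {name}: ${cost:,}")
```

At six months, for example, that works out to $2,400 versus $150, so the real question is whether the career services are worth the difference to you.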
Depending on the platform, you also might need to be more hands-on with planning your learning. Coursera, edX, and Udemy offer limited structure in terms of where to start and what skills to build next; these platforms mostly offer one-off courses on specific topics. DataCamp and Udacity have learning paths built into their platforms. Udacity’s Nanodegree is a highly curated learning path that you can’t deviate from because those classes are the only ones available to you. DataCamp, in my opinion, has a great combination of structured learning paths and one-off courses. DataCamp offers Career Paths like “Data Scientist with Python” or “Machine Learning Scientist with Python”, which are made up of 23 classes. You’re still able to take one-off courses like Scala or version control with your paid subscription, though. When choosing a platform, consider how much help you need in designing a learning plan, or take to Google and see if you can find advice on what to learn in what order. There are a ton of resources out there!
The last major disadvantage of the platforms is the lack of real-world projects that they offer. If you’ve been paying attention, I’ve pretty much mentioned this for every option. Online learning platforms are great at using basic data sets to teach you the basics (I’m looking at you, iris data set), but this data is not like something you’d see in a data science job. It’s often too simplistic, especially for health data science. I’d encourage you again to go out and find more complex data sets and work on projects with those after you learn the basics with the easy, clean data sets. Options for health data science include Synthea and MIMIC. Learn on the iris data set, then grow your expertise on Synthea/MIMIC.
EXTRA: If you’re interested in checking out DataCamp specifically after this article, you can use this affiliate link to get a discount on your subscription.
One More Thing…
I wanted to briefly include one miscellaneous option that straddles the line between the higher education path and the bootcamp or online learning platform path. That is the open-source Data Science Master’s program, of which I’ve come across two. The original one I came across years ago is linked here and a more recent one that I was informed of (shout-out to Brian Henderson) is linked here. These programs pull together resources from higher education lectures as well as online resources like Udacity, Udemy, and others. They offer the best of both worlds and cost little, if anything at all. You’ll still need to work on networking and work through a project to increase your chances of getting a position in data science, but they should give you a more straightforward learning path.
Wrap Up
In this article, we identified the non-higher education options that are available to learn data science skills. We covered bootcamps and online learning resources as available options. These programs allow you to generally upskill faster than higher education options like a Master’s degree or certificate. They also tend to cost less than the higher education options. Bootcamps are great for immersive experiences when you’re more sure than not that data science is going to be the career for you. Bootcamps generally have you learn using up-to-date frameworks that higher education tends to lack, depending on your professors. Online learning platforms are best for truly progressing at your own rate and most are much cheaper than any of the other options that were discussed in this or the higher education article.
Going through a bootcamp or online learning platform should not excuse you from working on portfolio projects or networking with data scientists in your desired industry. Projects from bootcamps and online learning platforms oftentimes are not the same types of problems that you’ll need to solve in the real world, so you should make sure that you have projects that are useful to your future employer. An example from my own industry (healthcare) is that many health systems are not actively working on computer vision problems like supplementing radiologists or dermatologists with state-of-the-art algorithms. The most common model at my health system is a Random Forest. It’s important to align your projects with something that will be useful to your organization. Networking with data scientists in your area is probably one of the easiest ways to accelerate your job search. If a hiring data science team leader has seen your work or understands your passion for data science, they are much more willing to take a chance on you and hire you. You don’t have to become their best friend, but a conversation or two will go a long way. Projects and networking are crucial to your success in getting into data science.
What other content would you like to see me write about? Let me know in the comments or connect with me on LinkedIn!
-Dalton