Mastering EDA & Visualization and Probability & Statistics in the MSDS Boot Camp: Insights from Professor Wang

Aline A
Published in USF-Data Science
4 min read · Jul 21, 2023

We previously had the opportunity to explore one of the three MSDS boot camp courses. Today, we are excited to have Shan Wang guide us through the EDA & Visualization and Probability & Statistics courses. She will give an overview of each course, discuss its main challenges, and offer advice on how to excel in it. We will also learn a bit about Shan’s background and discover her favorite weekend activity.

Q: Professor Wang, tell us a bit more about your Probability and Statistics course.

A: Probability and Statistics reviews probability theory and statistical inference. The topics include, but are not limited to, random variables, distribution functions, joint distributions, the central limit theorem, maximum likelihood estimation, confidence intervals, and hypothesis testing.
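To make one of these topics concrete, here is a minimal simulation sketch of the central limit theorem. It is an illustration only and is not taken from the course materials; the function name `sample_means`, the choice of an Exponential(1) distribution, and the sample sizes are all assumptions made for this example. The theorem predicts that means of 50 draws should cluster around 1 with a standard error near 1/sqrt(50), and that roughly 95% of them should fall within 1.96 standard errors of 1.

```python
import math
import random
import statistics


def sample_means(n, trials, seed=0):
    """Return `trials` sample means, each computed from n Exponential(1) draws.

    Hypothetical helper for illustration; Exponential(1) has mean 1 and
    standard deviation 1, and is clearly non-normal (right-skewed).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(trials):
        draws = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(draws))
    return means


n = 50
means = sample_means(n=n, trials=2000)

center = statistics.fmean(means)   # CLT: should be close to 1
spread = statistics.stdev(means)   # CLT: should be close to 1 / sqrt(n)
standard_error = 1 / math.sqrt(n)

# Share of sample means within 1.96 standard errors of the true mean;
# the normal approximation predicts roughly 0.95.
coverage = sum(abs(m - 1.0) <= 1.96 * standard_error for m in means) / len(means)

print(f"center={center:.3f} spread={spread:.3f} "
      f"theory={standard_error:.3f} coverage={coverage:.3f}")
```

Changing `n` shows how the approximation tightens or loosens: small samples from a skewed distribution give noticeably less symmetric sample means.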

Q: What are the main challenges?

A: Learning probability and statistics in five weeks is challenging. The subject is intricate and requires a solid grasp of mathematical concepts and logical reasoning. The compressed timeline leaves little room for in-depth exploration, which can make it hard to absorb the wide range of theories and methods.

Q: How can students succeed in this course?

A: Make a consistent study plan based on the assignment and quiz schedule, and:

  1. Create a study and support group: Forming a study group can be highly beneficial. Collaborate with your peers to share knowledge, discuss concepts, and solve problems together. This group can provide valuable support and motivation throughout your studies.
  2. Utilize recommended texts: Study the recommended reading materials thoroughly and work through the examples and questions they provide. This will deepen your understanding and help reinforce key concepts.
  3. Seek assistance when needed: Don’t hesitate to contact me whenever you have questions or doubts. I am here to support you and provide clarification whenever necessary.

By implementing these strategies, you can optimize your learning process and make the most of your educational journey.

Q: Professor Wang, tell us a bit more about your EDA and Visualization course.

A: This course uses Python to provide a thorough introduction to exploring and visualizing data using statistics, charts, graphics, etc. Other topics include advanced Python functionality and using statistical measures to communicate data concepts.

Q: What are the main challenges?

A: Although the basics of using Python functions are easy to pick up, the main challenges are:

  • Writing code efficiently.
  • Building a consistent plan for exploring data, since every dataset is different.
  • Choosing the right plot for the data and presenting correct, useful information.
  • Developing a good sense of visualization design.
  • Thinking about how to communicate your results to a broader audience.
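One way to approach the "consistent plan" challenge is to run the same first-pass profile on every column before plotting anything. The sketch below is an illustration under stated assumptions, not the course's method: `profile_column`, `profile_table`, the toy data, and the rule of thumb that numeric columns start with a histogram and categorical columns with a bar chart are all hypothetical choices made for this example. It uses only the Python standard library.

```python
from collections import Counter
import statistics


def profile_column(values):
    """Summarize one column given as a list; None marks a missing value.

    Hypothetical helper: the plot suggestion is a simple rule of thumb,
    not a substitute for judgment about the data.
    """
    present = [v for v in values if v is not None]
    summary = {"n": len(values), "missing": len(values) - len(present)}

    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        summary["kind"] = "numeric"
        summary["min"] = min(present)
        summary["median"] = statistics.median(present)
        summary["max"] = max(present)
        summary["suggested_plot"] = "histogram"
    else:
        summary["kind"] = "categorical"
        summary["top"] = Counter(present).most_common(3)
        summary["suggested_plot"] = "bar chart"
    return summary


def profile_table(table):
    """Apply the same profile to every column of a dict-of-lists table."""
    return {name: profile_column(column) for name, column in table.items()}


# Made-up toy data, for illustration only.
toy = {
    "age": [34, 29, None, 41, 29],
    "city": ["SF", "Oakland", "SF", None, "SF"],
}
report = profile_table(toy)
for name, summary in report.items():
    print(name, summary)
```

Running the same profile on each new dataset gives a repeatable starting point; the choice of final plots and the message for the audience still depend on the questions being asked.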

Q: How can students succeed in this course?

A: Here are some recommendations to maximize your learning and productivity:

  1. Summarize lectures in your own words: After attending each lecture, take the time to summarize the key points in your own words. This will help solidify your understanding and allow you to recall the information more effectively.
  2. Create your own coding questions: Challenge yourself by writing your own questions and coding the solutions independently. This practice will enhance your coding skills and improve your problem-solving abilities.
  3. Plan your EDA and visualization: Before delving into data analysis and visualization, create a plan outlining the steps you will take. This ensures a structured and organized approach to your analysis, leading to more insightful and meaningful results.
  4. Maintain well-organized code: As you work on each question or task, make an effort to organize your code neatly and efficiently. This practice will help you better understand and navigate your codebase, making it easier to revisit and review your work later.

By implementing these strategies, you can enhance your learning experience, improve your coding proficiency, and ensure a systematic and effective approach to data analysis and visualization.

Shan was born and raised in China. She pursued her undergraduate studies in mathematics at Fudan University in Shanghai. After completing her undergraduate degree, she came to the US for graduate studies and received a Ph.D. in Statistics from Purdue University.

While her background is primarily rooted in theory, her research interests have taken a captivating turn toward biostatistics and bioinformatics. She is drawn to the practical applications of statistics and data science, especially when it involves analyzing medical data and uncovering valuable insights.

Besides her academic pursuits, Shan is a mother to a 1.5-year-old and embraces the challenges and magic of motherhood. Her biggest “hobby” is caring for her child, and her free-time activities are kid-centered. They enjoy exploring nature around the Bay Area together. Some amazing destinations in the past few weeks have included Lake Tahoe and Henry Cowell Redwoods State Park.

Aline A
USF-Data Science

Program Assistant at University of San Francisco | MS, Data Science