Apologies, I don’t have a Windows computer to test on. However, you should be able to do the same thing by replacing the Mac-specific parts:
open command to the Windows…
Thanks for your response! Yes, I definitely see some truth in your first comment on data versus infrastructure tools, and also that Docker/K8s have nothing inherently to do with data science.
However, I would say these engineering tools might be necessary to the data scientist who really delivers value within an…
Since the OCR seems to have some difficulty extracting these texts accurately, they’d be tough to label in the raw version and then replace with something generic like [Redaction]. Fortunately, there’s already a hand-corrected version out there: see Mark Harwood’s response to this story on labeling entities using the hand-corrected, post-OCR version.
It was meant more as a brief how-to for those looking to find pertinent info in the report (or their own docs) than as a full-fledged data science project. But yes, it is currently lacking in the ‘science’ aspect of data science. I plan to follow up with some actual analysis and NLP when I have the time; I just wanted to get it out there in a timely fashion. Thanks for your critique.
“Train” is both the verb used to describe the process of teaching a machine learning algorithm and a noun often used to refer to the “training data”. Train-test split is the process of splitting your data into two parts: one to train your model on (the “training” data) and another, out-of-sample set to evaluate your model on (the “test” data).
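If it helps, here is a minimal sketch of that split in plain Python. The function name `split_train_test` and its `test_fraction` and `seed` parameters are my own illustrative choices, not anything from the original post; in practice you would more likely reach for scikit-learn's `train_test_split` from `sklearn.model_selection`.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle rows and split them into (train, test) lists.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    # Copy first so the caller's data is left untouched, then shuffle
    # with a seeded generator so the split is reproducible.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    # Hold out the first n_test shuffled rows as the out-of-sample test set.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(20))
train, test = split_train_test(data, test_fraction=0.25, seed=42)
print(len(train), len(test))  # 15 5
```

The model only ever sees `train` while fitting; `test` stays held out so the evaluation reflects data the model has not seen.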
Awesome, thanks. I remember using TopHat and Bowtie. Databricks, the company founded by the creators of Apache Spark, recently released a unified genomics platform with Regeneron that brings a lot of the disparate tools together in a distributed environment, if you’re interested. I haven’t been able to play around with it yet: https://databricks.com/product/genomics.
Yeah, I love Anaconda but was avoiding anything too Python-oriented for the sake of originality. I’m curious whether you have recs on bioinformatics packages, though; I remember using Homebrew to install some RNA-seq CLIs back when I did more bio stuff.