21 Feb 2019 Quick Bytes – Machine Learning in the life sciences – some caveats
Machine learning is a popular topic in the life sciences these days. A related question cropped up during a panel discussion at Bio-IT World 2018, and BioTeam’s VP of Consulting, Ari Berman, raised some interesting points and an important caveat for those looking to leverage machine learning in their work.
The original footage was shot as part of the BioTeam Town Hall at Bio-IT World 2018 and is used with the permission of CHI/Bio-IT World.
A.I. and Machine Learning.
I’m on a bent trying to demystify [AI and ML] and make it so that it’s a little more tractable because otherwise it becomes like probabilistic statistics where people choose the test that gives them the best p-value versus what’s actually going to work for their data.
And so that’s a general problem also with machine learning. The interesting thing about machine learning, if I could segue into another thing: in order for a deep learning neural network to work right, you must have well-curated data.
How many people curate their data in this room? That’s what I thought! Yeah. So that means that you actually have to manage your data. You have to know what you have. You have to understand it on a level beyond “the dude who was here seven years ago wrote it in a notebook somewhere that’s under a desk”. That’s not going to work. Or “I think I understand this complex directory hierarchy and there’s an Excel file somewhere that tells me about it”.
If you throw junk into a machine learning algorithm, you’re going to get junk out, just like any other mathematical model on the planet…