When reading blogs and news about AI, and especially machine learning applications, I don’t see many applications in biology. Predicting market parameters, building recommendation systems, or developing autonomous vehicles are common topics in the blogs of the AI sphere, but biology seems less present. Is biology less interesting or less bankable for the machine learning community? I don’t think so.
Healthcare: the tip of the iceberg
Actually, biology is at least mentioned, under the “healthcare” category. This category often refers to health applications of connected devices (Internet of Things, or IoT). Depending on the source, it may also cover biotech activities such as drug discovery, genetic analysis of diseases, and other domains more or less linked to human health or wellness.
Is healthcare bankable? According to CB Insights, it is! And although the claim that it is the “hottest AI category for deal in 2016” can always be questioned, it is obviously a very dynamic market.
Moreover, healthcare is only the tip of the iceberg. Machine learning applications in biology are much wider than IoT and biotech start-ups.
One domain where machine learning can bring a lot to biology is agriculture and livestock. Many companies and private institutes try to produce new varieties of plants (cereals, vegetables, fruits…) with enhanced characteristics (better yield, faster growth, resistances…) compared to their “natural” counterparts, in order to fight climate change or feed our ever-growing population. These breeding programs increasingly rely on “genomic selection” to select the most promising lines. Genomic selection is not GMO (although both could be combined). The idea behind genomic selection is to grow many lines, find the best ones, and cross them to produce new varieties. To reduce the cost of such programs, breeders try to predict the most promising lines and, guess what? They use machine learning approaches. Their features can be markers (positions in the plant genome related to the desired characteristic), which are more affordable thanks to the drastic drop in sequencing costs, and/or environmental features (climate, characteristics of the field…). The same is more or less true for livestock, with one huge difference: breeding programs are much longer because of the length of the reproduction cycle, so the need for such tools is even greater.
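To make the idea of predicting promising lines from markers more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated (allele counts of 0, 1 or 2 at 20 hypothetical markers, three of which drive the trait), and the model is a plain ridge regression fitted by gradient descent, which is only one of many approaches used for this kind of prediction. Real breeding programs work with far more markers and more careful validation.

```python
import random

def simulate_lines(n_lines, n_markers, rng):
    # Hypothetical data: allele counts (0, 1 or 2) per marker;
    # three randomly chosen markers drive the trait, plus some noise.
    effects = [0.0] * n_markers
    for j in rng.sample(range(n_markers), 3):
        effects[j] = rng.choice([-1.0, 1.0])
    genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_markers)]
                 for _ in range(n_lines)]
    traits = [sum(g * e for g, e in zip(row, effects)) + rng.gauss(0, 0.3)
              for row in genotypes]
    return genotypes, traits

def fit_ridge(X, y, penalty=1.0, lr=0.05, epochs=500):
    """Ridge regression fitted by gradient descent on centered markers."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    w = [0.0] * p
    for _ in range(epochs):
        # Residuals are computed once per epoch, so all weights move together.
        residuals = [y[i] - y_mean - sum(w[j] * Xc[i][j] for j in range(p))
                     for i in range(n)]
        for j in range(p):
            grad = (-2.0 * sum(residuals[i] * Xc[i][j] for i in range(n))
                    + 2.0 * penalty * w[j]) / n
            w[j] -= lr * grad
    return w, col_means, y_mean

def predict(model, X):
    w, col_means, y_mean = model
    return [y_mean + sum(wj * (xj - mj)
                         for wj, xj, mj in zip(w, row, col_means))
            for row in X]

def correlation(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(0)
genotypes, traits = simulate_lines(120, 20, rng)
# Lines already grown and measured (training) vs. lines only genotyped.
train_X, train_y = genotypes[:80], traits[:80]
new_X, new_y = genotypes[80:], traits[80:]

model = fit_ridge(train_X, train_y)
scores = predict(model, new_X)
accuracy = correlation(scores, new_y)
# Indices (within new_X) of the five lines with the highest predicted trait.
best_lines = sorted(range(len(scores)),
                    key=lambda i: scores[i], reverse=True)[:5]
print(f"prediction accuracy (correlation): {accuracy:.2f}")
print("most promising untested lines:", best_lines)
```

The point of the sketch is the workflow rather than the model: train on lines that were grown and measured, score lines that were only genotyped, and grow only the best-ranked ones.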
Although some big companies (e.g. Monsanto) may already have efficient tools for plants, most organizations are still running research programs to develop even better algorithms (I am working on such questions myself). For livestock, there are already some good tools, but development continues and there is room for new methods.
Agriculture and livestock are one example of the need for machine learning in biology, but the same reasoning easily extends to other domains.
Genomics and machine learning: a key to the future
Deciphering genomic information will most probably shape the future of humankind, and machine learning is applicable to far more genomics problems than just agriculture and livestock. Of course, the “healthcare” category of AI applications mentioned above encompasses some of these applications, but not all of them.
In this article published in Nature Reviews Genetics, Libbrecht and Noble give some examples of machine learning applications in genetics and genomics. From predicting a gene’s function to finding all the genes involved in a metabolic pathway, the spectrum is wide and can lead to many discoveries, from curing some diseases to optimizing our food for well-being. Genomics is also not restricted to humans. Think about what we can do with bacteria or viruses. Many industrial processes use bacteria to carry out some transformation. Using machine learning as a decision aid could help to efficiently “reprogram” a bacterial genome and turn it into a tiny factory that multiplies by itself. The same is true for viruses, which can become genomic tools.
It is not only theoretical
Entrepreneurs may feel that these possibilities are quite far from their need to develop products. Maybe this is why the AI business world only considers a few applications in biology, under the “healthcare” category. In fact, plenty of biological discoveries made through machine learning could quickly be applied to a particular problem and thus become a service or a product, thanks to the quantity of data available (and not only from IoT).
As for products, developing tools able to predict something from biomarkers could be the beginning of a business, and plenty of public data are already available to create such tools. As for services, producing biological data is becoming cheaper and cheaper, and the analysis is now the main cost, as shown in the graph below.
The graph is already old, but it has proven more and more true. The downstream analyses are potential targets for machine learning applications. A classical question is to take a population A which has a special characteristic (let’s say a disease) and a control population B, then sequence all individuals and look for the genomic differences between the populations as potential markers of interest. This sounds like a task for unsupervised learning (clustering), followed by extracting the most relevant features that explain this clustering, doesn’t it? What about offering a service that does this task more efficiently than the current methods?
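As a rough illustration of the comparison between the two populations, here is a minimal sketch of the simplest possible baseline: rank each marker by how much its average genotype differs between population A and population B. It uses the known group labels directly, so it is not the clustering approach mentioned above, only a starting point against which such a method could be compared. The data, the allele frequencies, and the three "causal" marker indices are all made up for the example.

```python
import random

def simulate_population(n_people, allele_freqs, rng):
    # Each genotype is the number of copies (0, 1 or 2) of the variant allele,
    # drawn as two independent alleles with the given frequency.
    return [[(rng.random() < f) + (rng.random() < f) for f in allele_freqs]
            for _ in range(n_people)]

def mean_genotypes(population):
    n = len(population)
    return [sum(person[j] for person in population) / n
            for j in range(len(population[0]))]

def rank_markers(cases, controls):
    """Rank markers by absolute difference in mean genotype between groups."""
    case_means = mean_genotypes(cases)
    control_means = mean_genotypes(controls)
    diffs = [abs(a - b) for a, b in zip(case_means, control_means)]
    order = sorted(range(len(diffs)), key=lambda j: diffs[j], reverse=True)
    return order, diffs

rng = random.Random(1)
n_markers = 15
causal = [2, 7, 11]  # hypothetical markers linked to the disease
freq_a = [0.9 if j in causal else 0.5 for j in range(n_markers)]
freq_b = [0.1 if j in causal else 0.5 for j in range(n_markers)]

population_a = simulate_population(30, freq_a, rng)  # individuals with the disease
population_b = simulate_population(30, freq_b, rng)  # controls

ranking, diffs = rank_markers(population_a, population_b)
top_markers = sorted(ranking[:3])
print("markers with the largest difference:", top_markers)
```

On real data the signal is usually much weaker than in this simulation, so a proper analysis would also need statistical tests, correction for the number of markers tested, and some control for population structure.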
Although biology, under the “healthcare” label, is one of the growing application fields of machine learning, there are still plenty of applications (and more to come) that may require machine learning and could become a source of business. So yes, biology is bankable in the AI world, and could be even more so!