Digital tests guide efforts to predict lung cancer

Jake Kantrowitz, Ania Tassinari and a lab member meet for their research project.

By Linjing Cheng



When the Human Genome Project brought computation into the world of medical research in the 1990s, Dr. Avrum Spira enrolled in Boston University’s first class offered in bioinformatics.

Today, he runs one of the three biocomputation labs at his alma mater, developing an easier method of predicting lung cancer. Meanwhile, the algorithmic revolution, with its power in data mining, has spread across biotech companies and some of the best-funded labs in medical schools.

Spira has a theory that if cells down in the lung are mutating, then somewhere up in the airway, nasal cells are changing too. And nasal cells are easier to swab, which makes those changes easier to detect.

Smoking is one of the triggers of lung cancer. However, a smoker will only develop lung cancer if he or she carries oncogenes. The lab’s current work aims to identify those oncogenes.

The model would allow people in their 20s to go to a clinic and have their noses swabbed with a Q-tip to see whether they have inherited lung cancer genes. The process would be easier and cheaper than current testing such as a biopsy. Because a nasal swab is no hassle, the researchers say, it could become a routine screening before anything goes wrong. Today, doctors wouldn’t suggest a biopsy unless something already appears to be wrong.

Supporting such a theory requires massive gene-data analysis. The Spira-Lenberg lab members don’t work on lab mice or real patient cell samples. They order human gene samples, digitize the gene expressions, analyze the raw gene data on computers, and identify biomarkers of early lesions that will later develop into lung cancer.

They turn patients’ cell samples into data, and algorithms drive every step of the work: generating, analyzing and managing that data.

Jake Kantrowitz, a medical and doctoral student who directs the lab work, said that he connects his computer remotely to supercomputers in a cold room in western Massachusetts. It takes the supercomputers a whole day to sort through the thousands of genes.

“A standard microarray has 10 to 20 thousand genes — a gene expression of that many genes. So doing statistics on even one of those will take a fair bit of time by hand, so we leverage the computational power of the machines we work with, to use those to do massive computation on matrices,” Kantrowitz said. “We have essentially bought into a large academic cluster of computers that are hosted out at Mt. Holyoke in western Massachusetts. And we have at least a couple of computers that have 16 cores on them each. And they have at least a few gigs on them each.”
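The lab’s actual pipeline isn’t shown in the article, but the kind of per-gene statistics Kantrowitz describes can be sketched in a few lines. The example below is purely illustrative — the matrix, group labels and numbers are invented, not the lab’s data — and computes a Welch’s t-statistic for every gene (row) of a toy expression matrix, the same matrix-wide computation a cluster would run across 10,000 to 20,000 genes.

```python
import numpy as np

def per_gene_t_stats(expr, group_a, group_b):
    """Welch's t-statistic for each gene (row) between two sample groups.

    expr:    (n_genes, n_samples) expression matrix
    group_a: column indices of samples in the first group
    group_b: column indices of samples in the second group
    """
    a, b = expr[:, group_a], expr[:, group_b]
    mean_diff = a.mean(axis=1) - b.mean(axis=1)
    # Sample variance (ddof=1) of each group, scaled by group size
    var_a = a.var(axis=1, ddof=1) / a.shape[1]
    var_b = b.var(axis=1, ddof=1) / b.shape[1]
    return mean_diff / np.sqrt(var_a + var_b)

# Toy matrix: 3 genes x 6 samples (hypothetical groups of 3 samples each)
expr = np.array([
    [5.0, 5.2, 4.9, 8.1, 8.0, 7.9],   # clearly higher in group B
    [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],   # no real difference
    [9.0, 9.1, 8.8, 6.0, 6.2, 5.9],   # clearly higher in group A
])
t = per_gene_t_stats(expr, group_a=[0, 1, 2], group_b=[3, 4, 5])
```

In a real analysis `expr` would have tens of thousands of rows, which is why the lab offloads the job to a computing cluster rather than doing the statistics gene by gene.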

The normal process of detecting cancer today still relies on an X-ray showing a visible lump. In many cases, that is already too late. Studying genomics and gene expression profiling is a way to understand diseases earlier and treat them proactively.

“Lung cancers are detected pretty late. Partially that comes from the fact that computed tomography is not available everywhere,” said lab member Ania Tassinari. “The biomarker that we are developing focuses on the nasal cells in your airway, so that we can perform a simple procedure called a bronchoscopy. Compared to a tumor biopsy, it’s a much less invasive procedure.”

But as medical researchers begin to rely on computers to crunch massive amounts of data — work once done painstakingly by lab assistants — questions are being raised about how far computation will go in replacing scientists. An Oxford study concludes that many high-end science jobs will be replaced by extensive computational work in the next 20 years.

However, 25 years after the first applications of biocomputation, scientists have realized that it doesn’t solve every problem so quickly. The research progresses slowly.

Sean Corbett is new to the lab and holds a bachelor’s degree in computer science. He believes doctors will never be completely replaced by computers, because the process requires “human doctors to operate tests on real people,” though parts of it will change slowly.

“There is no push in medicine per se to increase doctors’ awareness of computational ability,” Kantrowitz said. “Science research is a rewarding but painstaking process. Biocomputation is yet one way to solve any big disease since the human genome project.”

Benjamin Blum worked for five years at a biotech company, engineering novel protein therapeutics. He saw the industry quickly adopt big data and software systems.

The invention of the biomarker method also set a new business standard for companies involved in medicine.

Traditionally, a drug that works for more than 20 percent of the population gets approved for release. Blum said that biotech companies now follow a new standard: alongside the drug, you also develop a tool that picks out the 20 percent of the patient population it works for.

That tool is the biomarker, and computation generates the data behind it. Computation lets scientists use the biomarker to target a drug precisely at the patients it can help, so it isn’t wasted on the other 80 percent.
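As a rough illustration of what Blum describes — all patient records, scores and the threshold below are invented for the example, not an actual clinical rule — a biomarker acts as a filter that selects the subset of patients a drug should target:

```python
# Hypothetical biomarker-based stratification: each patient gets a score
# from the biomarker test, and only those above a calibrated cutoff are
# predicted to respond to the drug.
patients = [
    {"id": "p1", "biomarker_score": 0.91},
    {"id": "p2", "biomarker_score": 0.12},
    {"id": "p3", "biomarker_score": 0.77},
    {"id": "p4", "biomarker_score": 0.34},
    {"id": "p5", "biomarker_score": 0.05},
]

THRESHOLD = 0.75  # assumed cutoff, calibrated alongside the drug trial

# Select only the patients the biomarker predicts will respond
responders = [p["id"] for p in patients if p["biomarker_score"] >= THRESHOLD]
```

The drug is then prescribed only to `responders`, which is the business shift Blum describes: the test and the therapy are developed and sold together.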

Kantrowitz said that the idea of software developing drugs sounds scary, but as far as he can see, these companies will need more human staff for the new kinds of research work. “I can see we are working on jobs that need at least 50 people.”



Linjing Cheng is a second-year journalism graduate student at Emerson College. She is a published writer in English and Chinese. She has interned at, among others, China Central Television International, WERS, and The Somerville Times. You can find her works at