Strategic Grants
Mapping of the spruce genome
Umeå Plant Science Centre, a research center operated in collaboration between Umeå University and the Swedish University of Agricultural Sciences, and SciLifeLab.
Some of the contributing researchers:
Umeå Plant Science Centre
Ove Nilsson
Pär Ingvarsson
Stefan Jansson
Nathaniel Street
SciLifeLab
Björn Nystedt
Joakim Lundeberg
Grant in SEK:
SEK 75 million
The project’s main aim has been to map the spruce genome in order to understand the biology and evolution of the spruce.
“Even before the dinosaurs roamed the coniferous forests, conifers came up with something so successful that they still dominate. Since then, all flowering plants have undergone enormous changes, while the conifers have hardly developed at all,” says Ove Nilsson, Professor of Forest Genetics and Plant Physiology at the Swedish University of Agricultural Sciences and the Umeå Plant Science Centre.
It is a bit of a mystery why the conifers have retained their dominant position in spite of this.
“In purely theoretical terms, it should be a disadvantage to not develop at the same rate as the other plants have done.”
Tailor-made trees
Another objective in the project has been to improve and hasten the cultivation of spruce and pine.
“Since we have the entire genome sequence, we have complete control of every gene and can connect each one to various characteristics. For example, the trees can be cultivated to grow faster and become more resistant to diseases. We hope that all Swedish tree cultivation will change after this.”
According to the researchers, this also opens up the possibility of tailor-making trees for various uses. One could imagine one type for paper pulp, another for boards and a third for new plastic materials or fuels.
The mapping of the spruce genome has been a gigantic process in many ways. The DNA consists of 20 billion base pairs. It is the four nucleotides abbreviated A, C, G, and T that together form the genetic material, the DNA. These four letters can be said to constitute the alphabet of life. It is the order of the nucleotides that actually lays the foundation for what an organism can do, an order that enables an almost infinite number of different combinations.
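The "almost infinite number of different combinations" can be made concrete with a small calculation: with four possible letters at each position, a sequence of n nucleotides can be written in 4 to the power of n ways. The sketch below is only an illustration of that arithmetic; the function name count_sequences is invented here and does not come from the project.

```python
# Illustration only: how many different DNA sequences of a given length exist
# when each position can hold one of the four letters A, C, G or T.
ALPHABET = "ACGT"


def count_sequences(length: int) -> int:
    """Return the number of distinct sequences of `length` nucleotides."""
    return len(ALPHABET) ** length


# The count grows very quickly with the length of the sequence.
for n in (1, 2, 10, 100):
    print(n, count_sequences(n))
```

Even a stretch of 100 letters already allows far more combinations than could ever be written out, and the spruce genome is 20 billion letters long.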
“The problem is that only small parts can be read at a time, which results in a giant puzzle of random DNA segments. This requires a computer with an enormous amount of RAM. A computer with this capacity was not available in Scandinavia when we submitted the application for a project grant to the Knut and Alice Wallenberg Foundation,” explains Ove Nilsson.
The greatest challenge was to put together all the puzzle pieces so that the letters were in the right sequence. In the case of the spruce, it involved more than 2 trillion puzzle pieces that had to be put together into a whole. To make the enormous amount of information involved a bit more understandable, Ove Nilsson uses the Bible.
“Imagine how many letters there are in the Bible. Then imagine how many Bibles you would have to line up to cover the distance between Stockholm and Uppsala, and the total number of letters they contain. This gives an idea of how much data is involved.”
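The puzzle described above, where short overlapping fragments must be put together in the right order, can be sketched in a few lines of code. This is a toy illustration under simplifying assumptions (error-free fragments, no repeats, a handful of reads), not the method the spruce project used; the function names overlap and greedy_assemble and the example fragments are made up for this sketch. It repeatedly merges the two fragments that share the longest overlap.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the shared part is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of fragments until none overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        # Compare every ordered pair of fragments (this is the costly part).
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining fragments overlap
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up fragments that overlap by three letters each.
reads = ["ACGTAC", "TACGGA", "GGATTC"]
print(greedy_assemble(reads))  # ['ACGTACGGATTC']
```

Comparing every fragment with every other one is manageable for three pieces but not for the numbers involved in the spruce genome, which hints at why the real task called for a computer with an enormous amount of memory.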
Technical development a prerequisite
When SciLifeLab was established, it became technically possible to sequence the genome.
SciLifeLab is a collaboration between KTH Royal Institute of Technology, Stockholm University, Karolinska Institutet and Uppsala University, which provides access to entirely new technology and new platforms for genomics and bioinformatics.
The project started in the summer of 2010, and as early as 2013 the spruce genome had been mapped.
“Technical development is moving at a dizzying pace. We also came to drive that development, which will benefit human genome research as well.”
The Köttsjö spruce
The spruce that the researchers sequenced originates from a tree in Köttsjö in the Jämtland area of Sweden. Since some sprigs were grafted from the Köttsjö spruce in 1959, it has had millions of offspring in the Swedish forests. The sequencing shows that the spruce has at least 29,000 different genes. This is only slightly more than humans have, but the spruce’s genetic material is nonetheless seven times larger than ours.
One of the project’s goals was to provide an answer to why the genome is so large.
“We were fairly certain that it was not because the spruce has more genes. Instead, the genome is filled with short DNA sequences that are repeated. They are the remains of ‘transposons’, pieces of DNA that can replicate themselves and jump around in the genome. Most plants have mechanisms to limit the spread of these transposons, something that the conifers appear to lack,” explains Ove Nilsson.
Saved by extensive pollen distribution
According to Ove Nilsson, the transposons probably spread so much in the spruce that a limit has been reached for how large a genome can become.
“Every chromosome is as large as the human genome. The spruce genome also contains many pseudogenes, a kind of physical remnant of dead genes.”
A genome filled with irrelevant genes should be a major disadvantage, but for some reason it does not affect the conifers.
“What probably counteracts the negative effects is the fact that spruce and pine can spread their pollen up to 1,000 kilometers. This creates a large population whose members can interbreed, which makes it less dangerous to carry non-functional genes. However, if two closely related spruces are crossed, or seeds from the same spruce are used, the result is seriously ill trees,” says Ove Nilsson.
Text Carina Dahlberg
Translation Semantix
Photo Johan Gunséus and Magnus Bergström
Facts: Genome
Nearly all living cells contain a genome, genetic material, hereditary information encoded in DNA. Even viruses have a genome that consists of either DNA or RNA.
The genome contains all the information the cells need to produce the components required for them to function, namely various kinds of RNA and proteins.
Genes are the parts of a DNA molecule that contain genetic information, while other parts of the molecule have more structural purposes or are involved in the regulation of the genes.
Every gene constitutes a DNA sequence, a series of base pairs, which gives rise to a certain protein. DNA sequencing is a process used to find out the order of the nitrogenous bases that build up the genetic code. DNA consists of four different nitrogenous bases that are designated A, C, G, and T.
The work of identifying the full human DNA sequence was completed in the Human Genome Project at the beginning of the 2000s.
Researchers at UPSC were involved in the international project that mapped the poplar genome in 2006. The poplar was then the third plant to have its genome mapped, after Arabidopsis (thale cress) and rice.