Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    CpG islands are high GC content DNA elements that surround the majority of transcriptional start sites in eukaryotes. Here, the authors analyse over 200 genomic data sets to provide new insight into global CpG islands-dependent regulatory mechanisms in differentiated and pluripotent stem cells.

    • Samuel Beck
    • , Bum-Kyu Lee
    •  & Jonghwan Kim
  • Article |

    Despite our growing understanding of their complexity, different types of RNA are still classified using technical rather than functional criteria. Andersson et al. show that categorization of RNAs based on stability and direction of transcription is an effective means of functional classification.

    • Robin Andersson
    • , Peter Refsing Andersen
    •  & Albin Sandelin
  • Article |

    The development of software tools to analyse large mass spectrometry data sets lags behind the increase in diversity of the data. Here the authors develop MS-GF+, a database search tool that outperforms other popular tools in identifying peptides from a variety of data sets.

    • Sangtae Kim
    •  & Pavel A. Pevzner
  • Article
    | Open Access

    No experimental evidence exists for intra-helical motion of DNA at the μs timescale, which has been attributed to technical difficulties in observing motion in this time range. Here, the authors demonstrate, using extensive molecular dynamics simulations and experimental analysis, that such motion is effectively absent from a B-DNA duplex.

    • Rodrigo Galindo-Murillo
    • , Daniel R. Roe
    •  & Thomas E. Cheatham III
  • Article
    | Open Access

    Metastasizing tumour cells undergo epithelial-to-mesenchymal transition. Using both bioinformatic and in vivo approaches, Chanrion et al. identify combined Notch activation and p53 inactivation as a potent inducer of this transition, and apply this to create a highly metastatic tumour model in mice.

    • Maia Chanrion
    • , Inna Kuperstein
    •  & Sylvie Robine
  • Article
    | Open Access

    Linear mixed models (LMMs) provide a powerful method for studying genotype–phenotype associations. Here the authors present an LMM application that estimates an optimal transformation from observed data and increases the accuracy of heritability estimation and phenotype prediction.

    • Nicolo Fusi
    • , Christoph Lippert
    •  & Oliver Stegle
  • Article |

    The functional consequences of naturally occurring variation in ribosomal DNA (rDNA) copy number are poorly understood. Here the authors estimate rDNA copy number and mitochondrial DNA abundance in humans using whole-genome short-read DNA sequencing and characterize global regulatory mechanisms for cellular homeostasis and adaptation.

    • John G. Gibbons
    • , Alan T. Branco
    •  & Bernardo Lemos
  • Article
    | Open Access

    Common methods to detect adenosine-to-inosine RNA editing sites rely on mapping short RNA reads to the genome while allowing only a limited number of mismatches. Here, Porath et al. present a novel RNA-seq based approach to identify hyper-edited reads that significantly expands the RNA editome.

    • Hagit T. Porath
    • , Shai Carmi
    •  & Erez Y. Levanon
  • Article
    | Open Access

    Metagenomic studies of microbial communities often report DNA sequences from unidentified viruses. Here, Dutilh et al. analyse metagenomic data to reveal the complete genome of an abundant, ubiquitous virus from human faeces, and predict that the virus infects bacteria of the Bacteroides group.

    • Bas E. Dutilh
    • , Noriko Cassman
    •  & Robert A. Edwards
  • Article
    | Open Access

    Intestinal microbes can have important effects on our health. Here, the authors analyse the gut microbiota composition in 1,000 western adults and find that certain bacteria are either abundant or nearly absent, and that these alternative states are associated with ageing and overweight.

    • Leo Lahti
    • , Jarkko Salojärvi
    •  & Willem M. de Vos
  • Article
    | Open Access

    Some viruses are spherical particles in which protein components are organized with well-defined icosahedral and local symmetries. Here, Gipson et al. describe a unique arrangement of proteins, breaking all expected local symmetries, in particles of a marine bacterial virus.

    • Preeti Gipson
    • , Matthew L. Baker
    •  & Wah Chiu
  • Article
    | Open Access

    Dyslipidemia and obesity have a high prevalence in populations with Amerindian backgrounds, such as Mexican–Americans. Here, the authors design an approach to identify Amerindian risk genes in Mexicans and identify five genomic loci, including RORA and SIK3, that may contribute to the risk of dyslipidemia and obesity in Amerindian populations.

    • Arthur Ko
    • , Rita M. Cantor
    •  & Päivi Pajukanta
  • Article |

    Analyses of genome and transcriptome data are unable to accurately predict protein levels and function in tumour samples. Here, the authors carry out a comprehensive protein analysis in 3,467 samples from The Cancer Genome Atlas, providing a resource to study the prognostic and therapeutic potential of tumour proteins.

    • Rehan Akbani
    • , Patrick Kwok Shing Ng
    •  & Gordon B. Mills
  • Article |

    Model-based part design is a key step in synthetic biology. Here, the authors report a method for tuning nucleosome architecture in order to strengthen native promoters and facilitate synthetic promoter design in yeast.

    • Kathleen A. Curran
    • , Nathan C. Crook
    •  & Hal S. Alper
  • Article
    | Open Access

    Misfolded protein accumulation is a hallmark of many neurodegenerative diseases. Here Budrikis et al. model protein aggregation in the endoplasmic reticulum and show that it is the result of a non-equilibrium phase transition caused by tipping the balance from the rates of protein production to degradation.

    • Zoe Budrikis
    • , Giulio Costantini
    •  & Stefano Zapperi
  • Article |

    The enzyme butyrylcholinesterase (BChE) can metabolize cocaine, albeit at relatively low speeds. Here the authors use computational methods to define mutations that increase BChE-mediated cocaine hydrolysis, achieving a catalytic activity comparable to that of one of the fastest naturally occurring enzymes.

    • Fang Zheng
    • , Liu Xue
    •  & Chang-Guo Zhan
  • Article
    | Open Access

    Gene expression is highly variable between tissues, and changes during development and with age. Here, the authors provide a comprehensive RNA-Seq analysis of the rat transcriptome, spanning eleven organs, four developmental stages and both sexes.

    • Ying Yu
    • , James C. Fuscoe
    •  & Charles Wang
  • Article
    | Open Access

    mRNA transport contributes to the proper localization of its cognate proteins. Here the authors report a correlation between the physicochemical properties of mRNAs and their cognate proteins, suggesting that these properties of mRNAs can predict the subcellular localization of their cognate proteins.

    • Anton A. Polyansky
    • , Mario Hlevnjak
    •  & Bojan Zagrovic
  • Article |

    Predicting the dynamics and disorder of a protein is a computationally complex task that, until now, has depended on prior knowledge of protein structure. Cilia et al. develop a tool to rapidly predict protein backbone dynamics based on sequence alone.

    • Elisa Cilia
    • , Rita Pancsa
    •  & Wim F. Vranken
  • Article |

    Non-small cell lung cancers (NSCLC) that harbour mutations in KRas can be separated into KRas-dependent and -independent subsets. By analysing transcriptome, proteome and phosphoproteome data from NSCLC cell lines, Balbin et al. show that KRas-dependent cell lines activate the Lck pathway.

    • O. Alejandro Balbin
    • , John R. Prensner
    •  & Arul M. Chinnaiyan
  • Article
    | Open Access

    FGFR2 gene variation is associated with breast cancer risk but the molecular mechanism is unknown. Fletcher et al. provide a link between FGFR2 signalling and breast cancer susceptibility by demonstrating that FGFR2 signalling activates the ERα transcriptional network, which drives transcription of risk genes.

    • Michael N. C. Fletcher
    • , Mauro A. A. Castro
    •  & Kerstin B. Meyer
  • Article |

    Mutually exclusive splicing of genes is a mechanism for generating proteome diversity. Here Kollmar et al. determine the mutually exclusive spliced exome of Drosophila melanogaster and reveal insights into its evolutionary history within the Drosophila group.

    • Klas Hatje
    •  & Martin Kollmar
  • Article
    | Open Access

    Dynamic changes in T cell repertoire underlie immune responses during infection, allergy, autoimmunity and cancer. Here, Li et al. present a workflow for high throughput sequencing and analysis of T cell receptor sequences, and use it to monitor the T cell response to influenza vaccination in a human patient.

    • Shuo Li
    • , Marie-Paule Lefranc
    •  & Eric J. Gowans
  • Article |

    Sequencing whole microbial genomes has become standard practice and methods to examine their phylogenetic relationships need to match the increasing demand. Segata et al. present a new computational pipeline that allows fast and accurate taxonomic assignment of microbial species.

    • Nicola Segata
    • , Daniela Börnigen
    •  & Curtis Huttenhower
  • Article
    | Open Access

    Biological network data are often incomplete, which makes it difficult to determine interaction motifs within such data sets. Here Tran et al. present a new method to count motif numbers in large networks from noisy and incomplete biological data.

    • Ngoc Hieu Tran
    • , Kwok Pui Choi
    •  & Louxin Zhang
  • Article
    | Open Access

    Systematic large-scale analysis of embryonic development requires the processing of large amounts of microscopy data. Here Schmid et al. solve this problem by developing a high-speed imaging system that projects zebrafish embryos onto a ‘world map’ in real time, revealing characteristic migration patterns in the early endoderm.

    • Benjamin Schmid
    • , Gopi Shah
    •  & Jan Huisken
  • Article
    | Open Access

    The comprehensive bioanalysis of proteins usually requires multi-step surface and mobile phase measurements. Here, the authors use chips functionalized with dynamically actuated nanolevers—DNA strands that can be switched in an electric field—to obtain motional dynamic measurements of proteins on a chip.

    • Andreas Langer
    • , Paul A. Hampel
    •  & Ulrich Rant
  • Article
    | Open Access

    Cell lines are widely used in cancer research to study tumour biology. Here Domcke et al. compare genomic data from ovarian cancer cell lines with those from clinical ovarian tumour samples and identify cell lines that most closely resemble the genomic features of high-grade serous ovarian cancer.

    • Silvia Domcke
    • , Rileen Sinha
    •  & Nikolaus Schultz
  • Article |

    The identification of hosts of blood-sucking insects is important for studying ecological factors that affect pathogen distribution. Önder et al. report a proteomics-based methodology for the analysis of blood remnants in ticks that identifies the host species on which a tick fed up to 6 months earlier.

    • Özlem Önder
    • , Wenguang Shao
    •  & Dustin Brisson
  • Article
    | Open Access

    The rice sheath blight pathogen, Rhizoctonia solani, is an important fungal pathogen that can devastate rice and maize crops. Zheng and colleagues sequence and assemble the R. solani AG1 IA genome, the first to be sequenced from the Rhizoctonia genus, using Illumina sequencing technology.

    • Aiping Zheng
    • , Runmao Lin
    •  & Ping Li
  • Article
    | Open Access

    To describe the biochemical composition of an organism, multiple data sets must be combined, and this information can then be used for in silico analysis. By combining metabolism and transcription data, Lerman et al. discovered new regulons and improved the gene annotation for the simple organism Thermotoga maritima.

    • Joshua A. Lerman
    • , Daniel R. Hyduke
    •  & Bernhard O. Palsson