Publications
Total: 250
Data sharing ethics toolkit: The Human Cell Atlas.
The commitment of the human cell atlas to humanity.
GENCODE 2025: reference gene annotation for human and mouse.
GENCODE: massively expanding the lncRNA catalog through capture long-read RNA sequencing. (preprint)
Long non-coding RNAs involved in
The Catalan initiative for the Earth BioGenome Project: contributing local data to global biodiversity genomics.
CapTrap-seq: a platform-agnostic and quantitative approach for high-fidelity full-length RNA sequencing.
Systematic assessment of long-read RNA-seq methods for transcript identification and quantification.
Studying relative RNA localization From nucleus to the cytosol. (preprint)
An Insertion Within SIRPβ1 Shows a Dual Effect Over Alzheimer's Disease Cognitive Decline Altering the Microglial Response.
Integration of transcription regulation and functional genomic data reveals lncRNA SNHG6's role in hematopoietic differentiation and leukemia.
Molecular signatures of alternative reproductive strategies in a facultatively social hover wasp.
An encyclopedia of enhancer-gene regulatory interactions in the human genome. (preprint)
A fast non-parametric test of association for multiple traits.
The status of the human gene catalogue.
The hologenome of Daphnia magna reveals possible DNA methylation and microbiome-mediated evolution of the host genome.
Genome annotation: From human genetics to biodiversity genomics.
Systematic assessment of long-read RNA-seq methods for transcript identification and quantification. (preprint)
CapTrap-Seq: A platform-agnostic and quantitative approach for high-fidelity full-length RNA transcript sequencing. (preprint)
The ENCODE4 long-read RNA-seq collection reveals distinct classes of transcript structure diversity. (preprint)
Putting hornets on the genomic map.
The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models.
The status of the human gene catalogue. (preprint; arXiv, 2023)
Functional Evolution of Clustered Aquaporin Genes Reveals Insights into the Oceanic Success of Teleost Eggs.
RNAget: an API to securely retrieve RNA quantifications.
The landscape of expression and alternative splicing variation across human traits.
Day-night and seasonal variation of human gene expression across tissues.
GENCODE: reference annotation for the human and mouse genomes in 2023.
Brain transcriptomic profiling reveals common alterations across neurodegenerative and psychiatric disorders.
Genetically predicted telomere length and its relationship with neurodegenerative diseases and life expectancy.
Paired guide RNA CRISPR-Cas9 screening for protein-coding genes and lncRNAs involved in transdifferentiation of human B-cells to macrophages.
How to ensure the Human Cell Atlas benefits humanity.
Author Correction: Expanded encyclopaedias of DNA elements in the human and mouse genomes.
Genomic and functional conservation of lncRNAs: lessons from flies.
GA4GH: International policies and standards for data sharing across genomic research and healthcare.
The European Genome-phenome Archive in 2021.
Multivariate Analysis and Modelling of multiple Brain endOphenotypes: Let's MAMBO!
FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow.
Perivascular spaces are associated with tau pathophysiology and synaptic dysfunction in early Alzheimer's continuum.
Enhancers with tissue-specific activity are enriched in intronic regions.
Genetic Influences on Hippocampal Subfields: An Emerging Area of Neuroscience Research.
Genetic Predisposition to Alzheimer's Disease Is Associated with Enlargement of Perivascular Spaces in Centrum Semiovale Region.
The genomic basis of evolutionary differentiation among honey bees.
Conserved long-range base pairings are associated with pre-mRNA processing of human genes.
Day-night and seasonal variation of human gene expression across tissues. (preprint)
Identification and analysis of splicing quantitative trait loci across multiple tissues in the human genome.
bsAS, an antisense long non-coding RNA, essential for correct wing development through regulation of blistered/DSRF isoform usage.
Annotation of Full-Length Long Noncoding RNAs with Capture Long-Read Sequencing (CLS).
GENCODE 2021.
PyHIST: A Histological Image Segmentation Tool.
The abundance of the long intergenic non-coding RNA 01087 differentiates between luminal and triple-negative breast cancers and predicts patient outcome.
Correction to: The genome sequence of the grape phylloxera provides insights into the evolution, adaptation, and invasion routes of an iconic pest.
Cell type-specific genetic regulation of gene expression across human tissues.
The impact of sex on gene expression across human tissues.
Publisher Correction: The tuatara genome reveals ancient features of amniote evolution.
Effect of BDNF Val66Met on hippocampal subfields volumes and compensatory interaction with APOE-ε4 in middle-age cognitively unimpaired individuals from the ALFA study.
The tuatara genome reveals ancient features of amniote evolution.
A limited set of transcriptional programs define major cell types.
Expanded encyclopaedias of DNA elements in the human and mouse genomes.
Functional annotation of human long noncoding RNAs via molecular phenotyping.
The genome sequence of the grape phylloxera provides insights into the evolution, adaptation, and invasion routes of an iconic pest.
The rate and spectrum of mosaic mutations during embryogenesis revealed by RNA sequencing of 49 tissues.
Gene duplications, divergence and recombination shape adaptive evolution of the fish ectoparasite Gyrodactylus bullatarudis.
NEAT1 Long Isoform Is Highly Expressed in Chronic Lymphocytic Leukemia Irrespectively of Cytogenetic Groups or Clinical Outcome.
Enteric infection induces Lark-mediated intron retention at the 5' end of Drosophila genes.
Dynamic changes in intron retention are tightly associated with regulation of splicing factors and proliferative activity during B-cell development.
Re-annotation of 191 developmental and epileptic encephalopathy-associated genes unmasks de novo variants in
Dynamics of microRNA expression during mouse prenatal development.
Processive Recoding and Metazoan Evolution of Selenoprotein P: Up to 132 UGAs in Molluscs.
Integrative transcriptomic analysis suggests new autoregulatory splicing events coupled with nonsense-mediated mRNA decay.
Recent advances in functional genome analysis.
Damage-responsive elements in
GENCODE reference annotation for the human and mouse genomes.
Using geneid to Identify Genes.
Modified penetrance of coding variants by cis-regulatory variation contributes to disease risk.
ggsashimi: Sashimi plot revised for browser- and annotation-independent splicing visualization.
Towards a complete map of the human long non-coding RNA transcriptome.
The reference epigenome and regulatory chromatin landscape of chronic lymphocytic leukemia.
Expression of the transcribed ultraconserved region 70 and the related long non-coding RNA AC092652.2-202 has prognostic value in Chronic Lymphocytic Leukaemia.
Comparative transcriptomics across 14 Drosophila species reveals signatures of longevity.
The effects of death and post-mortem cold ischemia on human tissue transcriptomes.
The discovery potential of RNA processing profiles.
High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.
Data Resources for Human Functional Genomics.
Selenoprofiles: A Computational Pipeline for Annotation of Selenoproteins.
Comparative transcriptomics in human and mouse.
LncATLAS database for subcellular localization of long noncoding RNAs.
Brain Transcriptome Sequencing of a Natural Model of Alzheimer's Disease.
Genomic history of the origin and domestication of common bean unveils its closest sister species.
Scalable Design of Paired CRISPR Guide RNAs for Genomic Deletion.
Computational identification of the selenocysteine tRNA (tRNASec) in genomes.
Ten Simple Rules on How to Organize a Scientific Retreat.
Discovery of Cancer Driver Long Noncoding RNAs across 1112 Tumour Genomes: New Candidates and Distinguishing Features.
ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data.
Extreme genomic erosion after recurrent demographic bottlenecks in the highly endangered Iberian lynx.
Human selenoprotein P and S variant mRNAs with different numbers of SECIS elements and inferences from mutant mice of the roles of multiple SECIS elements.
The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease.
Genetic Drivers of Epigenetic and Transcriptional Variation in Human Immune Cells.
Selenoprotein Gene Nomenclature.
Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing.
Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq).
Lokiarchaeota Marks the Transition between the Archaeal and Eukaryotic Selenocysteine Encoding Systems.
Gene-specific patterns of expression variation across organs and species.
Cytoplasmic long noncoding RNAs are frequently bound to and degraded at ribosomes in human cells.
Erratum to: Promoter-like epigenetic signatures in exons displaying cell type-specific splicing.
Erratum to: 'DECKO: Single-oligo, dual-CRISPR deletion of genomic elements including long non-coding RNAs'.
Whole genome sequencing of turbot (Scophthalmus maximus; Pleuronectiformes): a fish adapted to demersal life.
Genome and transcriptome analysis of the Mesoamerican common bean and the role of gene duplications in establishing tissue and temporal specialization of genes.
Spatiotemporal Control of Forkhead Binding to DNA Regulates the Meiotic Gene Expression Program.
Active transcription without histone modifications.
Corrigendum: Domains of genome-wide gene expression dysregulation in Down's syndrome.
Genome of Rhodnius prolixus, an insect vector of Chagas disease, reveals unique adaptations to hematophagy and parasite infection.
Promoter-like epigenetic signatures in exons displaying cell type-specific splicing.
DECKO: Single-oligo, dual-CRISPR deletion of genomic elements including long non-coding RNAs.
Molecular signatures of plastic phenotypes in two eusocial insect species with simple societies.
CARMEN, a human super enhancer-associated long noncoding RNA controlling cardiac specification, differentiation and homeostasis.
Identification of a selenium-dependent glutathione peroxidase in the blood-sucking insect Rhodnius prolixus.
Absence of canonical marks of active chromatin in developmentally regulated genes.
Evolution of selenophosphate synthetases: emergence and relocation of function through independent duplications and recurrent subfunctionalization.
Comparison of GENCODE and RefSeq gene annotation and the impact of reference geneset on variant effect prediction.
Human genomics. Effect of predicted protein-truncating genetic variants on the human transcriptome.
Human genomics. The human transcriptome across tissues and individuals.
The genomes of two key bumblebee species with primitive eusocial organization.
Role of six single nucleotide polymorphisms, risk factors in coronary disease, in OLR1 alternative splicing.
Genomic analysis of a migratory divide reveals candidate genes for migration and implicates selective sweeps in generating islands of differentiation.
Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression.
RNA. Prescribing splicing.
The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.
A comparative encyclopedia of DNA elements in the mouse genome.
Reply to Brunet and Doolittle: Both selected effect and causal role elements can influence human biology and disease.
Transcriptional diversity during lineage commitment of human blood progenitors.
Comparative analysis of the transcriptome across distant species.
Identification of genetic variants associated with alternative splicing using sQTLseekeR.
The RIDL hypothesis: transposable elements as functional domains of long noncoding RNAs.
Genome-wide profiling of the cardiac transcriptome after myocardial infarction identifies novel heart-specific long non-coding RNAs.
Defining functional DNA elements in the human genome.
Domains of genome-wide gene expression dysregulation in Down's syndrome.
Finding the missing honey bee genes: lessons learned from a genome upgrade.
ASPic-GeneID: a lightweight pipeline for gene prediction and alternative isoforms detection.
Transcriptome characterization by RNA sequencing identifies a major molecular and clinical subdivision in chronic lymphocytic leukemia.
SelenoDB 2.0: annotation of selenoprotein genes in animals and their genetic diversity in humans.
Assessment of transcript reconstruction methods for RNA-seq.
Systematic evaluation of spliced alignment programs for RNA-seq data.
Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories.
Transcriptome and genome sequencing uncovers functional variation in humans.
Topoisomerase II regulates yeast genes with singular chromatin architectures.
SECISearch3 and Seblastian: new tools for prediction of SECIS elements and selenoproteins.
Unravelling the hidden DNA structural/physical code provides novel insights on promoter location.
Transcriptome analyses of primitively eusocial wasps reveal novel insights into the evolution of sociality and the origin of alternative phenotypes.
CPEB1 coordinates alternative 3'-UTR formation with translational regulation.
Grape RNA-Seq analysis pipeline environment.
Intron-centric estimation of alternative splicing from RNA-seq data.
The GEM mapper: fast, accurate and versatile alignment by filtration.
Modelling and simulating generic RNA-Seq experiments with the flux simulator.
The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression.
GENCODE: the reference human genome annotation for The ENCODE Project.
Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome.
Understanding transcriptional regulation by integrative analysis of transcription factor binding data.
Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs.
Landscape of transcription in human cells.
Modeling gene expression using chromatin features in various cellular contexts.
An encyclopedia of mouse DNA elements (Mouse ENCODE).
The genome of melon (Cucumis melo L.).
Chimeras taking shape: potential functions of proteins encoded by chimeric RNA transcripts.
Composition and evolution of the vertebrate and mammalian selenoproteomes.
BLUEPRINT to decode the epigenetic signature written in blood.
The Long Non-Coding RNAs: A New (P)layer in the "Dark Matter".
Fast computation and applications of genome mappability.
Evidence for transcript networks composed of chimeric RNAs in human cells.
Estimation of alternative splicing variability in human populations.
Interplay between BRCA1 and RHAMM regulates epithelial apicobasal polarization and may influence risk of breast cancer.
Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia.
Genome-wide CTCF distribution in vertebrates defines equivalent sites that aid the identification of disease-associated genes.
The origins, evolution, and functional potential of alternative splicing in vertebrates.
[Long non-coding RNAs with enhancer-like function in human cells].
Structural constraints revealed in consistent nucleosome positions in the genome of S. cerevisiae.
Sequencing of Culex quinquefasciatus establishes a platform for mosquito comparative genomics.
Long noncoding RNAs with enhancer-like function in human cells.
Reshaping the gut microbiome with bacterial transplantation and antibiotic intake.
Insights into evolution of multicellular fungi from the assembled chromosomes of the mushroom Coprinopsis cinerea (Coprinus cinereus).
International network of cancer genome projects.
From chromatin to splicing: RNA-processing as a total artwork.
Transcriptome genetics using second generation sequencing in a Caucasian population.
The histone variant macroH2A is an epigenetic regulator of key developmental genes.
Nucleosome positioning as a determinant of exon recognition.
Variation in novel exons (RACEfrags) of the MECP2 gene in Rett syndrome patients and controls.
Low exchangeability of selenocysteine, the 21st amino acid, in vertebrate proteins.
The human CD6 gene is transcriptionally regulated by RUNX and Ets transcription factors in T cells.
The genome sequence of taurine cattle: a window to ruminant biology and evolution.
CROC: finding chromosomal clusters in eukaryotic genomes.
Hnf1alpha (MODY3) controls tissue-specific transcriptional programs and exerts opposed effects on cell growth in pancreatic islets and liver.
Identifying protein-coding genes in genomic sequences.
A short motif in Drosophila SECIS Binding Protein 2 provides differential binding affinity to SECIS RNA hairpins.
Functional targets of the monogenic diabetes transcription factors HNF-1alpha and HNF-4alpha are highly conserved between mice and humans.
SECISaln, a web-based tool for the creation of structure-based alignments of eukaryotic SECIS elements.
ASTD: The Alternative Splicing and Transcript Diversity database.
Conserved chromosomal clustering of genes governed by chromatin regulators in Drosophila.
Relaxation of selective constraints causes independent selenoprotein extinction in insect genomes.
A general definition and nomenclature for alternative splicing events.
A comparison of random sequence reads versus 16S rDNA sequences for estimating the biodiversity of a metagenomic library.
Efficient targeted transcript discovery via array-based normalization of RACE libraries.
SNP and haplotype mapping for genetic analysis in the rat.
Using geneid to identify genes.
In silico meets in vivo.
A combinatorial code for CPE-mediated translational control.
Interoperability with Moby 1.0--it's better than sharing your toothbrush!
SelenoDB 1.0: a database of selenoprotein genes, proteins and SECIS elements.
Evolution of genes and genomes on the Drosophila phylogeny.
Islands of euchromatin-like sequence and expressed polymorphic sequences within the short arm of human chromosome 21.
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.
Structured RNAs in the ENCODE selected regions of the human genome.
Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution.
Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions.
The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci.
BioMoby web services to support clustering of co-regulated genes based on similarity of promoter configurations.
Multiple non-collinear TF-map alignments of promoter regions.
The implications of alternative splicing in the ENCODE protein complement.
Improving gene annotation using peptide mass spectrometry.
Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia.
GENCODE: producing a reference annotation for ENCODE.
EGASP: the human ENCODE Genome Annotation Assessment Project.
EGASP: Introduction.
Unweaving the meanings of messenger RNA sequences.
Transcription factor map alignment of promoter regions.
Mutation patterns of amino acid tandem repeats in the human proteome.
ABS: a database of Annotated regulatory Binding Sites from orthologous promoters.
Tandem chimerism as a means to increase protein complexity in the human genome.
Diversity and functional plasticity of eukaryotic selenoproteins: identification and characterization of the SelJ family.
Regulation of Fas alternative splicing by antagonistic effects of TIA-1 and PTB on exon definition.
EGASP: collaboration through competition to find human genes.
Gene finding in the chicken genome.
Nematode selenoproteome: the use of the selenocysteine insertion system to decode one codon in an animal genome?
Comparative gene finding in chicken indicates that we are closing in on the set of multi-exonic widely expressed human genes.
Are splicing mutations the most frequent cause of hereditary disease?
Gene organization features in A/T-rich organisms.
Comparison of splice sites in mammals and chicken.
Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype.
Clustering of genes coding for DNA binding proteins in a region of atypical evolution of the human genome.
Splice site identification by idlBNs.
Recent advances in gene structure prediction.
Comparative analysis of amino acid repeats in rodents and humans.
Genome sequence of the Brown Norway rat yields insights into mammalian evolution.
Reconsidering the evolution of eukaryotic selenoproteins: a novel nonmammalian family with scattered phylogenetic distribution.
gff2aplot: Plotting sequence comparisons.
Characterization of mammalian selenoproteomes.
Transcriptional network controlled by the trithorax-group gene ash2 in Drosophila melanogaster.
Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes.
Comparative gene prediction in human and mouse.
Initial sequencing and comparative analysis of the mouse genome.
The genome sequence of the malaria mosquito Anopheles gambiae.