Using the Human Ageing Genomic Resources

There are several tools and databases available at HAGR. Its key resources can be divided into gene-centric databases (GenAge, GenDR and the LongevityMap, the first of which is divided into human genes and genes in model organisms) plus the animal species database AnAge. At the root of all these resources is manual curation and data quality. What follows is a succinct user manual for each of them, followed by an overview of the other major features in HAGR.

The wonderful folks at OpenHelix also made an excellent video introducing HAGR.

GenAge: The Ageing Gene Database

GenAge has two main resources: a database of ageing- and longevity-associated genes identified in model organisms and a database of human ageing genes. Both databases are manually curated. The model organisms database has more genes, yet the human database is more detailed and contains substantially more data and information for each gene.

The dataset of genes associated with longevity or ageing in model organisms is essentially a list of genes with one or a few key references, a brief description of the phenotype or effects of genetic manipulations of the gene, human homologues of the gene, and a few external links. For a gene to be featured, its association with ageing and/or longevity must be unambiguous, and hence most genes were selected based on genetic manipulations and not mere correlations, such as a gene's upregulation with age, in which causality is impossible to determine.

Genes in model organisms can be classified as follows:

In cases where conflicting results were observed, or the data were not sufficient to draw a definite conclusion, our policy is to keep all observations and annotate the genes as Unclear and Unannotated, respectively.

Genes associated with longevity or ageing from model organisms can be useful as a reference and educational resource for researchers. This dataset also serves as the primary resource for deriving the list of genes that may be associated with human ageing. Typically, the best genes from model organisms serve as the basis for deriving the dataset of putative human ageing-related genes, as further described below.

The human dataset in GenAge is a curated database of genes that may regulate human ageing or that at least might be considerably associated with the human ageing phenotype. It is a functional genomics database designed to provide up-to-date information in the context of ageing and molecular genetics.

Because our focus is on the fundamental ageing process, what some authors call senescence, and not just age-related pathologies, the human dataset features primarily genes related to biological ageing rather than genes that only affect longevity by having an impact on overall health. This is an important point because longevity can be influenced by factors unrelated to ageing, and the distinction is crucial, albeit often difficult. (For those interested in genes associated with human longevity, please refer to the LongevityMap). Likewise, a gene being differentially expressed during ageing is not by itself proof that this gene is causally involved in the ageing process. Nonetheless, for researchers studying transcriptional changes with age, also available are genes commonly differentially expressed during mammalian ageing which were identified by performing a meta-analysis of ageing microarray data.

Given the above considerations, when using the human dataset, you should not expect to find genes solely associated with a given age-related pathology but rather genes that can regulate the ageing process as a whole or at least multiple aspects of the ageing phenotype. As mentioned above, genes in the human dataset are by and large selected based on findings in model organisms, and thus they must be classified as putative, not proven, cases of genes associated with human ageing. Below we explain how genes are selected for inclusion in the human dataset.

Each gene in the human dataset was selected after an extensive review of the literature. We identified genes associated with ageing in model organisms as well as those that may directly modulate ageing in mammals, including humans. Of course, genes related to ageing in model systems may or may not be related to human ageing, and so we reviewed the literature concerning human and mouse homologues of genes identified in lower organisms. Each gene was selected or excluded based on its association with ageing in the different model systems, with priority being given to organisms biologically and evolutionarily more closely related to humans. Because our focus is on the genetic basis of human ageing, we do not provide an in-depth description of ageing in model systems but rather incorporate the information gathered from multiple models to gather clues about the genetics of human ageing.

Initially, we grouped genes associated with organismal ageing to obtain functional groups. These are groups of genes that share similar functions or are associated with similar pathways. Identifying the largest groups and those most strongly associated with ageing allowed us to select a number of other genes for inclusion in the human dataset due to their association with other genes or pathways previously linked to ageing. Information from several other databases was also evaluated and, in some cases, integrated into GenAge. Several genes only indirectly linked to ageing are featured as we prefer false positives to false negatives; while users can ignore entries they consider irrelevant, false negatives can impact on research conducted using GenAge.

In each human gene entry the main reason for inclusion in the database is given. The following criteria are used:

You can view a list of all human genes in GenAge with their respective major reason for selection.

Searching the human dataset can be done from its main page or from its query page. GenAge symbols follow the HUGO Gene Nomenclature Committee (HGNC), but genes' common names and frequently-used aliases can also be searched. For instance, searching for "ghbp" would retrieve the growth hormone receptor (GHR) because "GHBP" is a common alias of GHR. The search engine of GenAge is case-insensitive. At present the only wildcard accepted in searches is "*": for example, "abc*" will match "abc", "abcd", "abcde", etc. Lastly, you can view the whole list of genes on one page.
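The search behaviour described above can be sketched in a few lines of Python. This is only an illustration and not GenAge's actual search engine: the GENES table, the search function and the choice to match whole names (rather than substrings) are assumptions made for the example.

```python
import re

# Hypothetical in-memory table: HGNC symbol -> aliases (illustrative only).
GENES = {
    "GHR": ["GHBP", "growth hormone receptor"],
    "TP53": ["P53"],
}

def compile_query(query):
    """Turn a query into a case-insensitive regex where "*" is the only wildcard."""
    # Escape every literal piece so that no other character acts as a wildcard.
    pieces = [re.escape(piece) for piece in query.split("*")]
    return re.compile(".*".join(pieces), re.IGNORECASE)

def search(query, genes=GENES):
    """Return the symbols whose symbol or any alias matches the whole query."""
    pattern = compile_query(query)
    return sorted(
        symbol
        for symbol, aliases in genes.items()
        if any(pattern.fullmatch(name) for name in [symbol, *aliases])
    )

print(search("ghbp"))  # an alias lookup, matched case-insensitively
print(search("tp*"))   # a prefix search using the "*" wildcard
```

Under these assumptions, "ghbp" finds GHR through its alias, while "gh" on its own finds nothing because no wildcard was given.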

Once you find an entry of interest, GenAge can be used as a reference database, particularly for human genes. A wealth of biological data is provided for each entry, including annotation in the context of gerontology. For example, in GHR's entry, apart from basic biological information, the potential link between GHR and ageing is described, complete with bibliographical references: GHR gene products have been linked to ageing in mice but not in humans, even though mutations in the human gene have been described, and the findings in mice were the major reason for GHR's inclusion in the human dataset. Importantly, each entry in the human dataset represents not merely a summary of the bibliography but a manually curated summary which aims for accuracy and relevance in the context of ageing.

Of course, GenAge has its limits and our aim is to include the most relevant information, but not all the data available. If further, more detailed information is required then a selection of bibliographic references is available in each entry including hyperlinks to PubMed abstracts. Hyperlinks to top databases such as OMIM are also included to help researchers quickly locate additional information within other or broader biological scopes.

Arguably, the human dataset offers an overall view of what is presently known about the genetics of human ageing. To investigate the human dataset as a whole the most adequate tool is its browser. Using the browser it is possible to, among other things, retrieve only those entries that pass certain criteria related to the selection process or annotation. You may choose different criteria by holding down the CTRL key. If you choose one sole criterion, the resulting list will be ordered accordingly. Users can also download the database and analyze it locally.

The human dataset in GenAge can be helpful in more classical genetic studies of ageing and longevity. For example, if a given chromosomal region is identified, it is possible to look up which genes are present in that region. Although GenAge is not a bibliographic database, the bibliographic references in the human dataset can be a useful resource. References are hosted in our LibAge database.

Linking to GenAge

Human Gene

Linking to GenAge is straightforward. To create a link to a human gene please use (preferred):

http://genomics.senescence.info/genes/entry.php?hgnc=x

And replace x by the HGNC symbol. Or (deprecated):

http://genomics.senescence.info/genes/entry.php?id=XXXX

And replace XXXX by the appropriate HAGRID. Using HGNC, however, is preferable because HAGRID numbers are not static and might thus change in future releases.

For example, to link to TP53 please use (preferred):

http://genomics.senescence.info/genes/entry.php?hgnc=tp53

Or (deprecated)

http://genomics.senescence.info/genes/entry.php?id=0006

Model Organism

To create a link to a gene from a model organism please use (preferred):

http://genomics.senescence.info/genes/details.php?gene=x&organism=y

And replace x by the gene symbol and y by the organism's name as used in GenAge (the example below uses the species name, elegans, for Caenorhabditis elegans). Or (deprecated):

http://genomics.senescence.info/genes/details.php?id=x

And replace x by the appropriate ID. Using the gene symbol, however, is preferable because ID numbers are not static.

For example, to link to age-1 please use (preferred):

http://genomics.senescence.info/genes/details.php?gene=age-1&organism=elegans

Or (deprecated)

http://genomics.senescence.info/genes/details.php?id=7
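The preferred GenAge link formats above can be generated programmatically. The following is a minimal Python sketch; the function names are hypothetical, and only the URL patterns themselves come from this page.

```python
from urllib.parse import urlencode

GENAGE_BASE = "http://genomics.senescence.info/genes"

def human_gene_url(hgnc_symbol):
    """Build the preferred link to a human gene entry from its HGNC symbol."""
    return f"{GENAGE_BASE}/entry.php?" + urlencode({"hgnc": hgnc_symbol})

def model_gene_url(gene_symbol, organism):
    """Build the preferred link to a model organism gene entry."""
    query = urlencode({"gene": gene_symbol, "organism": organism})
    return f"{GENAGE_BASE}/details.php?{query}"

print(human_gene_url("tp53"))
print(model_gene_url("age-1", "elegans"))
```

Using urlencode keeps the links valid even if a symbol contains characters that need escaping in a URL.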

AnAge: The Animal Ageing and Longevity Database

AnAge is a database of longevity, ageing, and life history in extant species that uses the same engine as GenAge. You can query AnAge as described above for GenAge. The only wildcard allowed is "*", and you can search the database by keywords that include the organism's common name or taxonomic classification. Given the larger number of entries in AnAge, you cannot view all entries on a single page.

The most important trait in AnAge is maximum longevity (also called maximum lifespan) because it is the most widely used parameter for comparing rate of ageing between species. Maximum longevity is estimated from record longevity. Of course, many factors can bias longevity records, such as population size and whether animals are kept in captivity or not. Because we want maximum longevity to be a reliable term for comparisons between species, we try to minimize these problems. Briefly, we make a great effort to obtain the original source of each longevity record and verify its authenticity; anecdotes are not used to estimate maximum longevity, though they are mentioned in the observations section; and species whose maximum longevity is suspected of being significantly underestimated generally have a maximum longevity classified as "not yet available". In addition, whether the maximum longevity of a given species comes from a specimen in the wild or in captivity is indicated for the vast majority of species.

All species have an estimate of sample size to allow researchers performing comparative longevity studies to minimize the bias of sample size on longevity records. For longevity records obtained from species in captivity, estimates of sample size were obtained from the International Species Information System (ISIS). Estimates of wild-derived records were typically obtained from the sources of the longevity data, such as banding studies in birds. Sample sizes reflect differences in orders of magnitude in the number of specimens for each species and are classified as 'tiny' (fewer than 10 specimens), 'small' (10-100), 'medium' (100-1000), 'large' (over 1000) and 'huge'. Human beings are the only species with a sample size classified as 'huge' and this classifier was included to mark the special status of the human species in this context.
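As an illustration, the sample size classes above can be expressed as a small Python function. The function name is hypothetical, and because the ranges given above share their end points (10-100 and 100-1000), the exact handling of the boundaries here is an assumption. The 'huge' class is not derived from a count, since it is reserved for humans.

```python
def sample_size_class(n_specimens):
    """Map an approximate number of specimens to an AnAge-style size class.

    Boundary handling is an assumption: each class includes its lower bound.
    'huge' is never returned because it is reserved for Homo sapiens.
    """
    if n_specimens < 10:
        return "tiny"
    if n_specimens < 100:
        return "small"
    if n_specimens < 1000:
        return "medium"
    return "large"

for n in (3, 50, 400, 25000):
    print(n, sample_size_class(n))
```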

Each entry has a qualifier of the confidence placed in the longevity data. This qualifier is based on the reliability of the original reference from which maximum longevity was obtained, sample size, whether a given species has been studied and reproduces in captivity, and whether there are any conflicting reports. Confidence in the longevity data is hence classified as: 'low' (only used for species without an established maximum longevity in AnAge), 'questionable', 'acceptable' and 'high'.

Entries in AnAge can be useful for researchers to learn more about the ageing process of a particular species. Species with unique ageing phenotypes or of special interest to gerontologists, such as species with negligible senescence and commonly used model organisms, are included. Apart from longevity, observations about physiological and pathological changes with age in animals are featured where available. Although demographic measurements of ageing are included in AnAge, these require detailed animal studies which are rarely available and thus represent only a small fraction of the data in AnAge. If possible, we determined the mortality rate doubling time (MRDT) for a given species using the Gompertz equation, as described elsewhere.
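To make the MRDT concrete: the Gompertz model is commonly written as m(t) = A·exp(G·t), where m(t) is the mortality rate at age t, so the mortality rate doubles every MRDT = ln(2)/G. The Python sketch below estimates G by a least-squares fit of ln(m) against age. It is an illustration only, and the procedure actually used for AnAge may differ; the function name and the data are hypothetical (the rates are synthetic, generated to double every 8 time units).

```python
import math

def gompertz_mrdt(ages, mortality_rates):
    """Estimate MRDT from a least-squares fit of ln(rate) = ln(A) + G * age."""
    log_rates = [math.log(rate) for rate in mortality_rates]
    n = len(ages)
    mean_age = sum(ages) / n
    mean_log = sum(log_rates) / n
    sxx = sum((age - mean_age) ** 2 for age in ages)
    sxy = sum(
        (age - mean_age) * (log_rate - mean_log)
        for age, log_rate in zip(ages, log_rates)
    )
    g = sxy / sxx            # Gompertz slope G
    return math.log(2) / g   # time needed for the mortality rate to double

# Synthetic mortality rates that double every 8 time units (not real data).
ages = [30, 40, 50, 60, 70]
rates = [0.001 * 2 ** ((age - 30) / 8) for age in ages]
mrdt = gompertz_mrdt(ages, rates)
print(round(mrdt, 3))
```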

Typical values of major life history traits such as adult body size and age at sexual maturity are also featured in AnAge, at least for most mammals. Estimates of metabolic rates, such as resting or basal metabolic rate, are also featured for some species. Nonetheless, while we try to consult the original source regarding longevity records (as described above), for other life history traits and metabolic rates we usually rely on reviews and large-scale datasets. We do try to minimize errors, however, and observed discrepancies (e.g., between male and female ages at sexual maturity or between inter-litter interval and litters per year) reflect inconsistencies in these large-scale data sources.

For mammals, also included is the maximum longevity (tmax) residual, expressed as a percentage of the expected maximum longevity calculated from the adult body size (M) and derived from the mammalian allometric equation: tmax = 4.88 × M^0.153. This is useful to identify species that live longer than expected for their body size. Cetaceans were excluded because we have less confidence in their longevity records, obtained from studies in the wild often using indirect methods, than in those from other mammalian taxa.
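The residual can be computed directly from the allometric equation, tmax = 4.88 × M^0.153. The Python sketch below is illustrative: the function names are hypothetical, the units (M in grams, tmax in years) are an assumption not stated on this page, and the example values are invented rather than taken from AnAge.

```python
def expected_tmax(adult_mass):
    """Expected maximum longevity from adult body mass via tmax = 4.88 * M^0.153.

    Assumes mass in grams and longevity in years (units are an assumption).
    """
    return 4.88 * adult_mass ** 0.153

def tmax_residual_percent(observed_tmax, adult_mass):
    """Observed maximum longevity as a percentage of the expected value."""
    return 100.0 * observed_tmax / expected_tmax(adult_mass)

# Hypothetical mammal: 1000 g adult mass, 28-year longevity record.
residual = tmax_residual_percent(28.0, 1000.0)
print(f"{residual:.0f}% of the expected maximum longevity")
```

A value above 100% indicates a species that lives longer than expected for its body size.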

Included for some mammals and birds are growth rates. These values represent postnatal growth rate and are expressed in days^-1. They were calculated by fitting empirical data taken from published growth curves to sigmoidal growth functions and are considered appropriate for comparative analyses within the same taxonomic class. Please be aware, however, that growth rates for mammals were derived from the Gompertz function while growth rates for birds were derived from the logistic function, so comparisons between the two classes need to take this into account.

Again, like GenAge, AnAge has its limits and external hyperlinks point users to further sources of information such as the websites of the Tree of Life and Animal Diversity Web.

AnAge's browser works differently from GenAge's browser, though, as it allows users to browse through the taxa in AnAge. Once a given set of entries has been selected it can be surveyed to gather simple descriptive statistics. Again, the species of interest can be selected with a mouse click and the average value for the parameter of interest can be calculated. For instance, to search for turtles one would simply type "testudines", the order to which turtles belong. The list of turtles in AnAge would then be displayed and ordered according to taxonomy or longevity, and one could then obtain the averages and standard deviations for longevity, adult body size, age at sexual maturity, or other major traits present in AnAge. As in GenAge, it is easy to navigate between the different tools in AnAge due to extensive cross-links between the different pages.

Often, additional information relevant to a particular species can be found in the higher taxa of that species. Therefore, users are encouraged to refer to the observations related to their species of interest as well as the taxa it belongs to.

Because our ultimate aim is to help understand human ageing, priority is given to species evolutionarily closer to humans. Though there is a special focus on mammals and mammalian entries tend to include more information, other taxa are also represented, including some non-animal species. Species are classified according to: Kingdom, Phylum, Class, Order, Family, Genus, and Species. The taxonomy of AnAge follows that of the Integrated Taxonomic Information System (ITIS).

The bibliography of AnAge is also available for search, working exactly as described above for GenAge.

Linking to AnAge

Linking to AnAge is also simple. Just use the species' name (preferred):

http://genomics.senescence.info/species/entry.php?species=x_y

And replace x by the organism's genus and y by the organism's species. This method is preferable because if for some reason an entry happens to be deleted you will get an error rather than a different entry. You may also create links using the HAGRID numbers (deprecated):

http://genomics.senescence.info/species/entry.php?id=XXXXX

And replace XXXXX by the appropriate HAGRID. Using species' names, however, is preferable because HAGRID numbers may change in future releases of the database.

For example, to link to the human species Homo sapiens use (preferred):

http://genomics.senescence.info/species/entry.php?species=homo_sapiens

Or (deprecated)

http://genomics.senescence.info/species/entry.php?id=03116
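The preferred AnAge link format can be built from a genus and species name, as in this minimal Python sketch. The function name is hypothetical, and lower-casing the name is an assumption based on the Homo sapiens example above.

```python
from urllib.parse import quote

ANAGE_BASE = "http://genomics.senescence.info/species"

def anage_species_url(genus, species):
    """Build the preferred AnAge link from a genus and species name."""
    # Join as genus_species; lower-casing follows the example on this page.
    name = f"{genus}_{species}".lower()
    return f"{ANAGE_BASE}/entry.php?species={quote(name)}"

print(anage_species_url("Homo", "sapiens"))
```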

GenDR: The Dietary Restriction Gene Database

GenDR is an informational resource on genes related to dietary restriction (DR), the most powerful non-genetic intervention shown to extend lifespan in a wide range of model organisms. Specifically, GenDR includes a manually curated list of DR-essential genes ("Gene manipulations"), and a conserved molecular signature of DR generated from a meta-analysis of DR-differentially expressed genes in mammals ("Gene expression").

DR-essential genes are genes that, when genetically manipulated (e.g. by loss of function through mutation, transposition, or knockdown, or by gain of function such as overexpression of additional transgenes), interfere with the ability of DR to extend lifespan. In the ideal case, a genetic alteration of a DR-essential gene does not change lifespan under ad libitum (AL) feeding but completely cancels out lifespan extension by a restricted diet. The selection criteria follow the same logic as GenAge but focus specifically on those genes that are necessary for the modulation of the ageing process by diet (and which are, by definition, not necessarily also ageing genes). A focus on genes from gene manipulation experiments means that the selection procedure for including genes related to DR is more objective and unbiased, just as for GenAge. Nonetheless, as there are multiple forms of DR regimen, most of which can be applied at several levels of severity, genes were included if there is evidence that they interfere with at least one kind of DR regimen or strength of restriction. This includes, by definition, a shift in the response to the food concentration at which lifespan is extended (e.g. chico). We sought to use as broad a definition as possible to select DR-essential genes, and to adhere to it consistently, in order to incorporate genes that are DR-essential under a diverse range of experimental settings.

Each record in GenDR on a DR-essential gene contains comments (i.e. observations) about literature-based evidence and the reason(s) for inferring an association with DR. References are also cited. In the case of conflicting reports for a given gene, our policy is to still include this gene in GenDR but mention all the conflicting studies. Specifically for yeast, genes that are DR-essential in either a replicative or a chronological lifespan assay were included. Both the specific DR regimen and the ageing model in which a gene was found to be essential for DR-induced lifespan extension are annotated in GenDR. The intention behind using a broad definition of DR-essential genes, rather than a more specific one, is to make it possible to focus on mechanisms universal to DR and therefore more likely to be relevant to human ageing. Homologs of DR-essential genes in other species, such as other model organisms, as well as in humans, were retrieved via the HomoloGene or InParanoid databases. Genes can be viewed by organism, individual genes can be searched, and in a matrix view (specifically developed for GenDR) all genes, with or without homologs, can be displayed at once. DR-essential genes for each organism can also be downloaded.

In a meta-analysis of DR using microarray profiles from mammals, a common transcriptional signature of DR-differentially expressed genes was derived. These genes consistently change their expression level in multiple tissues, across multiple species, and between different experimental platforms. Individual genes can be searched, and for every gene its enrichment in over- or under-expression is indicated.

Linking to GenDR

Links to DR-essential genes can be created in two ways: by providing the genus and species name to list all genes of an organism, or by providing the Entrez gene ID to link to an individual gene. For example:

http://genomics.senescence.info/diet/search_mut.php?organism=Caenorhabditis%20elegans

links to all nematode DR-essential genes.

http://genomics.senescence.info/diet/details.php?id=14600

links to Ghr.
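Both GenDR link formats can be generated as in the following minimal Python sketch. The function names are hypothetical; only the URL patterns come from this page. Note that the space between genus and species has to be percent-encoded as %20.

```python
from urllib.parse import quote

GENDR_BASE = "http://genomics.senescence.info/diet"

def gendr_organism_url(genus, species):
    """Link to all DR-essential genes of an organism (genus and species name)."""
    # quote() encodes the separating space as %20, as in the example above.
    organism = quote(f"{genus} {species}")
    return f"{GENDR_BASE}/search_mut.php?organism={organism}"

def gendr_gene_url(entrez_gene_id):
    """Link to an individual DR-essential gene by its Entrez gene ID."""
    return f"{GENDR_BASE}/details.php?id={int(entrez_gene_id)}"

print(gendr_organism_url("Caenorhabditis", "elegans"))
print(gendr_gene_url(14600))
```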

LongevityMap: Human Longevity Genetic Variants

The LongevityMap is a database of human genetic variants associated with longevity. Genes, variants (e.g., SNPs) and chromosomal locations are featured, reflecting the diversity of human genetic association studies of longevity. All of these can be queried to find entries of interest.

We followed the high standards and rigorous procedures of GenAge to develop the LongevityMap. Succinctly, all entries in the LongevityMap were manually curated from the literature. Studies were selected following a rigorous literature survey that excluded studies in cohorts of unhealthy individuals at baseline. Although the LongevityMap, like GenAge, is an inclusive database in which both large and small studies are included, details on study design are provided for each entry, including population and sample size. Negative results are also included in the LongevityMap to provide visitors with as much information as possible regarding each gene and variant previously studied in the context of longevity. Each entry refers to a specific observation from a study and is flagged according to whether or not the results were statistically significant. Large-scale studies often have multiple entries in the LongevityMap. As in GenAge, our policy concerning controversial results is to detail the facts concerning the controversy and let users reach their own opinions.

Linking to LongevityMap

Because the LongevityMap has a large diversity of data, creating links is not as simple as for our other genetic databases. It is possible, however, to link to a specific gene included in the database:

http://genomics.senescence.info/longevity/gene.php?id=x

And replace x by the HGNC symbol.

For example, to link to APOE:

http://genomics.senescence.info/longevity/gene.php?id=APOE

DrugAge: Database of Ageing-Related Drugs

The DrugAge database contains an extensive compilation of drugs, compounds and supplements (including natural products and nutraceuticals) with anti-ageing properties that extend longevity in model organisms. Drugs, species, dosage and (for some entries) gender are featured. All these can be browsed or queried for entries of interest.

We followed the high standards and rigorous procedures of our other databases to develop DrugAge. Succinctly, all entries in DrugAge were manually curated from the literature. Studies were selected following a rigorous literature survey. Our focus is on drugs/compounds potentially impacting on ageing, and therefore drugs/compounds extending lifespan in disease-prone animals (e.g., cancer models) or harmful conditions are excluded. Negative results are only included for selected relevant entries (i.e., drugs of particular interest to the field of ageing, like drugs shown in major studies to extend longevity). Each entry refers to a specific observation from a given study, and therefore there could be multiple conflicting entries for a given compound. As in GenAge, our policy concerning conflicting results is to detail the facts and let users reach their own opinions.

CellAge: Database of Cell Senescence Genes

The CellAge database contains a collection of human genes associated with cellular senescence, the process in which normally proliferating cells undergo irreversible cell-cycle arrest and cease dividing. Cell senescence plays a role in tumour suppression and is thought to contribute to the ageing process.

As with all our databases, the data was manually curated using results reported in the scientific literature. Studies were selected following a review of the literature available for each gene. Selection was based on gene manipulation experiments (gene knockout, gene knockdown, partial or full loss-of-function mutations, over-expression or drug-modulation), which cause cells to induce, inhibit or reverse cellular senescence. Focusing on genes from genetic manipulation experiments ensured the selection process was more objective and unbiased.

The type of senescence (replicative, stress-induced, oncogene-induced), cell type/line and manipulation methods have been recorded and can help the user browse or query for entries of interest. The CellAge database includes data both from primary cells as well as from immortalized cell lines and cancer cell lines, the latter being marked as such for cancer specific analyses.

Each record contains observations about the evidence, often including further experimental details and a summary of the results. Where reported, common markers of senescence such as increased SA-Beta-galactosidase activity, a decrease in BrdU incorporation or changes in morphology are described. Ideally, entries only describe the results from the one relevant piece of literature that best meets the selection criteria; however, in some cases results from multiple publications are detailed to best illustrate the link to senescence.

Additional Features in HAGR

In HAGR you will also find a collection of bioinformatic tools, in particular a Perl toolkit entitled the Ageing Research Computational Tools (ARCT). Documentation and examples are included in ARCT's distribution and it is recommended that you read them before installing ARCT or any of its modules. A method to estimate the rate of ageing of a given population is also available together with an SPSS script to perform the calculations.

HAGR uses an accession number called HAGRID, which stands for Human Ageing Genomic Resources ID, not the gamekeeper in Harry Potter. HAGRID numbers can be used to search and retrieve entries from most of the HAGR databases. As mentioned above, however, HAGRID numbers might change and hence their usage should be kept to a minimum. These IDs are mostly used by curators and developers and are not intended for third-party use.

We recommend that users read our scientific strategy to understand the goals of HAGR and better use our resources. Also, for both legal and practical reasons, HAGR is part of senescence.info, an informational website about the biology of ageing that is an excellent introduction to the topic. Included in HAGR are also extensive links.

Feedback is always welcome. If you have any problem, suggestion, idea, etc., please feel free to contact us.