Administrative Core

Overview

PI: Michael A. Province

The LLFS Administrative Core is the hub of daily communication between the field sites (and their staff), the coordinating center at Washington University in St. Louis, and project investigators. The Administrative Core facilitates communication and coordination among Project 1, Core B Phenotyping, Core C Biospecimen, and Core D Analysis Methods. The Administrative Core is responsible for tracking study progress, overseeing the budget, and handling subcontracts. In addition, the Administrative Core hosts all LLFS conference calls for all the committees and cores using GoToMeeting and designated conference call lines. The Administrative Core handles the logistics of in-person steering committee meetings and Observational Study Monitoring Board (OSMB) meetings, and it coordinates and plans the yearly LLFS training workshops. As the central communication and information hub for LLFS, the Administrative Core will implement and oversee data entry via Research Electronic Data Capture (REDCap) for Visit 3. As an extension of this, the Administrative Core performs quality control/quality assurance (QC/QA), harmonization, and cleaning of the entered data to prepare interim and final cleaned analysis datasets for release on the investigator website. The Administrative Core is also responsible for upkeep of the investigator and public LLFS websites. All of these responsibilities are integral to the successful collaborations that come from LLFS and will continue under the following specific aims:

Aim 1: Organize, facilitate, and support all communication for the LLFS, including committee conference calls, coordination of cores, project, and study sites, LLFS internal and external websites, liaison with the single IRB, budgeting, and meeting planning.

Aim 2: Implement and oversee Project 1 data collection, integration, phenotypic harmonization, data cleaning, and data distribution.

Aim 3: Share data with the outside scientific community and implement training workshops for both new and existing LLFS investigators.

Ancillary Studies Committee

The LLFS encourages collaboration and invites investigators to propose and conduct ancillary studies. Ancillary studies typically utilize LLFS basic resources and data and generate additional data either gathered directly from LLFS subjects or from stored biospecimens. Such studies enhance the value of the LLFS and ensure a continued interest in the LLFS from the diverse group of investigators who are critical to the success of the study as a whole. To protect the integrity of the LLFS, ancillary studies are reviewed and approved first by the Ancillary Studies Committee and then by the Executive Committee before their inception. OSMB approval of ancillary studies is needed only for proposals involving participant burden. In general, ancillary studies require external (non-LLFS) funding, since no costs for an ancillary study are budgeted by the parent study. A total of 17 ancillary study proposals have been approved in LLFS, of which two are completed, seven are in progress, two are awaiting funding decisions, and six were abandoned due to lack of funding or feasibility. The Ancillary Studies Committee has conference calls as needed.

Budgets

While the LLFS has a detailed budget plan, any complex project such as this one may experience unexpected events, missed timelines, or delays that may require re-budgeting to meet study goals and aims. For example, if Visit 3 recruitment goals are not met as planned at a particular field site (due to unexpectedly high mortality in the probands or higher than expected refusal rates in grandchildren), another site may experience lower mortality or lower refusal rates and be able to make up the difference. It is relatively easy to re-budget between subcontracts of the single U19 grant. It was much more difficult to accomplish this under the previous LLFS funding mechanism of 5 linked U01s (one for each FC plus the DMCC). A key function of the Administrative Core is to monitor progress on all aspects of the study, in Project 1 and all 4 Cores, and to make budgetary adjustments as needed to achieve the specific aims of the U19. Ms. Cherie Moore is the Business Manager for the Department of Genetics at Washington University and will handle budgetary issues for this U19. She has had extensive experience with complex research projects with multiple subcontracts over many years.

Data Management

The Data Management Coordinating Center (DMCC) receives all de-identified data from the Field Centers via the Research Electronic Data Capture (REDCap) system for further quality control (QC), reporting, and generation of analysis files. The DMCC implemented an online training session and a data entry certification test for all Field Center staff who use REDCap for LLFS. Throughout data entry, the DMCC downloads data from REDCap into SAS datasets and runs QC programs monthly. QC programs check for correct skip patterns and consistency of information across forms, incorporate data from prior visits and annual follow-up when necessary, and verify completeness of data entry.
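
The kind of check these QC programs perform can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the DMCC's SAS code; the field names (`ever_smoked`, `cigarettes_per_day`, and so on) and the single skip rule are hypothetical stand-ins for items on an actual LLFS form.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical required fields for one data collection form.
REQUIRED_FIELDS = ["participant_id", "visit_date", "ever_smoked"]

# Hypothetical skip rules: (gate field, trigger value, dependent field).
# The dependent field should be answered only when the gate field
# holds the trigger value, and left blank otherwise.
SKIP_RULES = [
    ("ever_smoked", "yes", "cigarettes_per_day"),
]


def is_blank(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as not entered."""
    return value is None or str(value).strip() == ""


def edit_checks(record: Record) -> List[str]:
    """Return the list of edit checks raised by one form record."""
    issues: List[str] = []

    # Completeness: every required field must be entered.
    for field in REQUIRED_FIELDS:
        if is_blank(record.get(field)):
            issues.append(f"missing required field: {field}")

    # Skip patterns: dependent answers must agree with the gate answer.
    for gate, trigger, dependent in SKIP_RULES:
        gate_value = record.get(gate)
        if is_blank(gate_value):
            continue  # a blank gate is reported by the completeness check
        dependent_blank = is_blank(record.get(dependent))
        if gate_value == trigger and dependent_blank:
            issues.append(
                f"skip pattern: {dependent} expected because {gate} = {trigger}"
            )
        elif gate_value != trigger and not dependent_blank:
            issues.append(
                f"skip pattern: {dependent} should be blank because {gate} = {gate_value}"
            )
    return issues
```

Each returned string corresponds to one item a field site would either correct in REDCap or confirm as valid. Cross-form consistency checks would follow the same pattern, with rules that compare fields drawn from two forms.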

Data checks have been incorporated into the data entry in REDCap. During Visit 2 data collection, QC reports were distributed monthly to each field site. These reports were a comprehensive list of ‘edit checks’ for the field staff to compare against the original records, so that they could correct potential errors in REDCap as soon as possible or mark the questioned data point as valid. Throughout active data collection and data cleaning, we hold bi-weekly Data Management Committee meetings with representation from all field sites and the DMCC. These meetings review any ongoing data entry issues and identify solutions, which are then implemented as soon as possible for smooth data entry. This process led to high-quality data from Visit 2, so we intend to implement the same process for Visit 3 phenotypic data.

Every 6 months the DMCC will distribute an interim cleaned version of the cumulative data to all LLFS investigators. The cleaning process for Visit 3 and annual follow-up data involves intensive interaction between the DMCC and key Field Center staff, through the Phenotyping Core, over the preceding month. For omic data, it involves interaction with Biospecimen Core investigators. Derived variables for analysis will be updated every 6 months, and a data dictionary that includes the formulas and scripts used to create them will be distributed. A documentation “Codebook” will give a complete inventory of each dataset and an index providing, for each variable, its label, format, and selected univariate statistics. Documentation will be distributed to LLFS investigators through our secure LLFS website. As with prior visits, the cleaned, de-identified Visit 3 data and documentation will be posted to dbGaP at the same time as they are released to LLFS investigators. Further edits and newly derived variables created during distributed analysis will be referred back to the master DMCC database to be incorporated into future data releases. Additionally, we will look to TOPMed as a model of data distribution (types of data files shared, platforms, etc.) so that harmonization in future studies can be easily achieved and analysis scripts for one study can work on multiple studies.
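
As an illustration of what a codebook index entry might contain, the following minimal Python sketch derives a label, a format, and a few univariate statistics for each variable in a small dataset. The function names, the choice of statistics, and the two-way numeric/character format are assumptions made for the example and do not describe the DMCC's actual codebook programs.

```python
import statistics
from typing import Any, Dict, List


def codebook_entry(name: str, label: str, values: List[Any]) -> Dict[str, Any]:
    """Summarize one variable: label, format, and univariate statistics."""
    present = [v for v in values if v is not None]
    entry: Dict[str, Any] = {
        "variable": name,
        "label": label,
        "n": len(present),
        "n_missing": len(values) - len(present),
    }
    # A variable is treated as numeric only if every non-missing value is a number.
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        entry["format"] = "numeric"
        entry["mean"] = statistics.fmean(present)
        entry["min"] = min(present)
        entry["max"] = max(present)
    else:
        entry["format"] = "character"
        entry["n_distinct"] = len(set(present))
    return entry


def build_codebook(
    rows: List[Dict[str, Any]], labels: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Build one codebook entry per variable, ordered by variable name."""
    names = sorted({key for row in rows for key in row})
    return [
        codebook_entry(name, labels.get(name, ""), [row.get(name) for row in rows])
        for name in names
    ]
```

Calling `build_codebook(rows, labels)` on a list of per-participant dictionaries returns one entry per variable, which could then be written out alongside each data release.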

LLFS Logistics and Website

Meeting Logistical Support

The Administrative Core is responsible for all in-person meetings and conference calls. This includes the bi-annual steering committee meetings, annual OSMB meeting, yearly training workshops, and any other in-person meetings that occur. The Project Manager, Leanne Kniepkamp, secures meeting spaces, arranges the meeting space (meals, technology, phone lines), aids with hotel/travel for investigators, and prepares the meeting agendas and materials. Ms. Kniepkamp works with Committee Chairs to distribute minutes after each meeting (via study website), and tracks action items.

Website

The Administrative Core is responsible for maintaining and updating the internal (investigator) LLFS website. We utilize CONFLUENCE software from Atlassian to manage the site. CONFLUENCE is a Java-based application that supports complex project management with many subcomponents. It is used extensively at Washington University to support many online activities, including teaching, financial, regulatory compliance, organization, etc. The internal LLFS website is password protected, with each LLFS investigator having their own sign-in credentials. This website organizes all LLFS committees and Cores, including documenting the call details, distributing presentations, and keeping all meeting minutes and agendas, action items, and progress. Additionally, the study-specific calendar is housed on the site. Ancillary study proposals and proposed manuscripts are also maintained here, and publications and proposals are tracked on the website. The current version of the Manual of Procedures (MOP) is posted on the site, as are data collection forms. The internal LLFS website is also used to distribute phenotypic data to LLFS PIs and collaborators. The Administrative Core also maintains the external (public) LLFS website. Originally this site was used for study recruitment; it is currently being updated by the DMCC to highlight the key findings from LLFS, promote recruitment of the grandchildren, promote opportunities for collaborative studies, and provide links to shared data on dbGaP.

Publications and Presentations (P&P) Committee

The P&P Committee manages the internal review of meeting abstracts and paper proposals and also discusses and encourages development of priority papers. One or two internal reviewers, and a statistical reviewer when needed, are assigned to review and provide feedback on papers at the proposal and final manuscript stages. The P&P Committee consists of representatives from each field center, each Core, and the NIA. Papers will be referred to the Analysis Methods Core for statistical review when they present methodological issues beyond the expertise of the P&P Committee. The P&P Committee will continue to meet bi-monthly to ensure timely review of papers and paper proposals. The P&P Committee has reviewed a total of 162 proposals since its inception (60 publications, 102 active proposals). These proposals have produced key results and publications that support the pursuit of our U19 Specific Aims, some of which we highlight below.

Defining Healthy Aging Phenotypes. One of the major findings of LLFS has been the discovery of considerable familial phenotypic heterogeneity. Different families show different healthy aging characteristics and profiles in key pathways of healthy aging (cognitive, metabolic, inflammatory, physical function, pulmonary, etc.). One major aspect of our research to date has thus focused on identifying and further characterizing these healthy aging phenotypes (HAPs), defining their heritability and relation to exceptional longevity (EL), and identifying biomarkers and genetic underpinnings of these HAPs. Newman et al. (1) found that diabetes, chronic pulmonary disease, and peripheral artery disease were less common, pulse pressure and triglycerides were lower, HDL-C levels were higher, and perceptual speed task and gait speed were better in LLFS probands and offspring compared with similarly aged persons in the CHS and FHS cohorts. Age-specific comparisons showed differences that would be consistent with a higher peak, later onset of decline, or slower rate of change across age in LLFS participants. Barral et al. (2) defined a cognitive endophenotype based on exceptional episodic memory performance (EEM) in the proband generation and found that LLFS relatives in the proband generation from EEM families showed better episodic memory performance than those from non-EEM families. In a related analysis, Singh et al. used factor analysis to construct a set of endophenotypes on 28 traits representing 5 domains (cognitive, cardiovascular, metabolic, physical, and pulmonary) and assessed their relationship to mortality. The most dominant endophenotype, primarily reflecting physical activity and pulmonary domains, was significantly associated with mortality and attenuated the association of age with mortality by 24.1%; it may represent a major underlying phenotype related to aging.

Biomarker Findings. Several key analyses have examined biomarkers associated with HAPs. Barral et al. (3) found that families with exceptional cognition (EC) showed a better metabolic profile (β = -0.63, SE = 0.23, p = .006) and physical/pulmonary function than non-EC families, which was related to obesity in an age-dependent fashion. The prevalence of obesity in EC families was significantly lower compared with non-EC families (38% vs 51%, p = .015) among family members < 80 years. EC families also showed better physical/pulmonary function than non-EC families (β = 0.51, SE = 0.25, p = .042). Sebastiani et al. (4) examined 19 blood biomarkers, comprising standard hematological markers, lipid biomarkers, and markers of inflammation and frailty, in both LLFS generations, yielding 26 biomarker signatures that correlated with longitudinal changes in physiological functions and incident risk of cancer, cardiovascular disease, type 2 diabetes, and mortality. Signature 2 (characterized by lower than average creatinine and cystatin C values, lower than average biomarkers of inflammation, and elevated albumin) was associated with significantly lower mortality and morbidity and with better physical function relative to the most common biomarker signature in LLFS. Nine other signatures were associated with less successful aging, characterized by higher risks for frailty, morbidity, and mortality. This analysis shows that various biomarker signatures exist, and their significant associations with physical function, morbidity, and mortality suggest that these patterns represent differences in biological aging.

Genetic Findings. We have also identified several linkage peaks for key HAPs that are highly heterogeneous, with exceptionally large HLODs, each driven by small numbers of different families, suggesting that there are multiple novel, rare, lineage-specific variants driving healthy aging profiles and their longitudinal trajectories in different families for different phenotypes. Several analyses have shown strong linkage to metabolism-related HAPs. An et al. (5) examined GWA with HbA1c in non-diabetic participants and confirmed two known loci at GCK rs730497 (or rs2908282) and HK1 rs17476364 (p < 5e-8). Of 25 suggestive loci, one known (G6PC2 rs560887) and one novel locus (OR10R3P/SPTA1 rs12041363) were replicated in other cohorts. Feitosa et al. (6) identified novel variants near NLRP1 (17p13) that were associated with an increase of HDL levels at genome-wide significant levels. Additionally, several CETP (16q21) and ZNF259-APOA5-A4-C3-A1 (11q23.3) variants associated with HDL were found, replicating those previously reported in the literature. A possible regulatory variant upstream of NLRP1 that is associated with HDL in these elderly subjects may also contribute to EL and health. For the exceptional episodic memory (EEM) trait, association analysis identified SNPs nominally associated with EEM in a 40-megabase window encompassing the linkage peak (7). Replication in one cohort identified a set of 26 SNPs associated with episodic memory. Meta-analysis of the 26 SNPs found SNPs rs9321334 and rs6902875 to be nominally significantly associated with episodic memory. Haplotype analysis incorporating the 2 SNPs flanking rs6902875 (rs9321334 and rs4897574) revealed that the A-A-C haplotype, in a region that harbors the monooxygenase dopamine β-hydroxylase-like 1 gene (MOXD1), was significantly associated with episodic memory performance (P = 2.4 × 10^-5).
Druley et al. (8) performed multiplexed, custom hybridization capture sequencing to identify the functional variants in 464 candidate genes for longevity or the major diseases of aging in all 4,953 LLFS individuals. Variants were analyzed individually or as a group across an entire gene for association with aging phenotypes using family-based tests. Significant associations with three genes and nine single variants were found. Two different algorithms for single-gene associations identified three genes with an enrichment of variation that was significantly associated with three phenotypes (GSK3B with the Healthy Aging Index, NOTCH1 with diastolic blood pressure, and TP53 with serum HDL). Most notably, a novel variant in the 3’ UTR of OBFC1 that was significantly associated with exceptional survival was found in 13 individuals from six pedigrees. OBFC1 (chromosome 10) is involved in telomere maintenance and falls within a linkage peak from an analysis of telomere length in LLFS families (Lee et al., 2014) (9); another of our studies also placed OBFC1 within a linkage peak in specific LLFS families (10). Finally, Stevenson et al. (11) found that the LLFS families are not distinguished by lower rates of disease-associated variants. They built a genetic risk score for each of four age-related disease groups (Alzheimer’s disease, cardiovascular disease and stroke, type 2 diabetes, and various cancers) and compared the distribution of these scores between older participants of the LLFS, their offspring, and their spouses. The analyses showed no significant differences in the distribution of the genetic risk scores for these groups.
These findings support the hypothesis that the survival and healthy aging advantages that we observe in LLFS families are attributable to protective or resistance variants.

In summary, results of our studies to date support the hypothesis that there are heterogeneous familial protective mechanisms leading to healthy aging and exceptional longevity, distinct from a reduction in common disease-related variants.

Quality Assurance (QA), Quality Control (QC), and Data Harmonization

Quality control efforts focus on complete data collection that is as accurate as possible. Completeness of our data entry process (using the Research Electronic Data Capture, REDCap, system) and completeness of the examination data from Visit 2 were 100% and >95%, respectively. When individual forms/procedures are refused by subjects, we create an entry in our database documenting this refusal (and the reasons why).

Quality Assurance (QA) and Quality Control (QC) are shared responsibilities of all investigators, guided by the Phenotyping Core and the Biospecimen Core, with the DMCC playing a central role. As in Visit 1, for Visit 2 the staff from each Field Center participated in an initial central training session. A training and QC Liaison at each center was responsible for maintaining measurement and training standards, including training new staff in the event of staff turnover and re-certifying existing staff. As in Visits 1 and 2, the exams occurred over ~3 years, so certification and annual re-certification were required for standardized and reproducible administration of questionnaires, cognitive batteries, physical performance measures, blood pressure, anthropometric measures, spirometry, and data entry, as well as the new carotid ultrasound studies. For all measures, the certification process, which will be repeated prior to Visit 3, involves (a) centralized training sessions, followed by (b) observation of the staff member for all key steps (checklist) while s/he performs the procedures and (c) assessment and feedback. These procedures will take place in conjunction with the first in-person Steering Committee meeting of this U19. A weekly DMCC report provided all field centers with up-to-date QC statistics on the latest data, including completeness/missing data, reliability statistics, and protocol drift. These reports will be used to identify problems/concerns requiring further inquiry and/or prompt action.
The QA and QC of the new OMIC data is a collaboration between the Biospecimen Core and the Analysis Methods Core and is discussed in the Biospecimen Core Research Strategy.

Harmonization of Phenotype Data. We harmonized phenotypes between LLFS and FHS for Visit 2 by comparing the two studies' original questionnaires and protocols. We added some questionnaires collected in FHS (for example, the physical activity index) and expanded the cognitive data collection. We also added use of a digital pen and digital recording to parallel FHS’s e-metrics data collection and analysis efforts. We will continue to work together to harmonize data collection for Visit 3, including adding the FHS dietary questionnaire. The Administrative Core will lead the harmonization efforts with the Baltimore Longitudinal Study of Aging (BLSA) and the Longevity Consortium-New England Centenarian Study.

Harmonization of Genotype/Sequencing Data. For common variants, we will utilize imputation to achieve a common denominator of SNPs. LLFS and FHS have previously imputed to both 1000 Genomes reference haplotypes and the Haplotype Reference Consortium panel. Most of the proposed replication, comparison, and consortia studies have already imputed SNPs. There are still important details to attend to (e.g., making sure that the same coded alleles are used and that there is no “strand flipping”), but we are very familiar with these challenges from our meta-analysis consortia activities. For rare SNPs, imputation is more problematic, as the reference panels are not sufficiently large to properly catalogue these, much less tag all of the haplotypes. Complementary sequencing would need to be performed in the replication study, and we would be analyzing for replication of the gene (or region) as a unit, not replication of any particular variant. FHS was a first-tier study in TOPMed and has whole genome sequencing on over 4,000 participants for three generations. FHS also has existing 450K DNA methylation array data, transcriptomics (both array and RNA-seq), metabolomics, and aptamer-based proteomics that can be used to replicate findings in LLFS. The Longevity Consortium (LC) plans to use similar platforms for the generation of omics data on samples from MrOS, SOF, and the New England Centenarian Study. The BLSA is currently enrolling individuals 80 years and older who are extremely well-functioning and have exceptional health. This new aspect of the BLSA has been named The IDEAL (Insight into the Determinants of Exceptional Aging and Longevity) Study. Historically, the BLSA has been the gold-standard reference for aging-related changes in multiple systems, and, like FHS and the LC, it is generating multiple omics data from these subjects.
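
The coded-allele and strand-flipping checks mentioned above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the pipeline used by LLFS or its consortia: the function name `harmonize` and the status labels are hypothetical, only biallelic SNPs are handled, and palindromic SNPs (A/T or C/G) are simply flagged as ambiguous, whereas in practice they are often resolved with allele frequency information.

```python
from typing import Optional, Tuple

# Watson-Crick complements, used to detect alleles reported on the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def harmonize(
    ref_coded: str, ref_other: str, coded: str, other: str, beta: float
) -> Tuple[str, Optional[float]]:
    """Align one study's effect estimate to the reference coded allele.

    Returns (status, aligned_beta). aligned_beta is None when the SNP
    cannot be aligned safely from the allele codes alone.
    """
    alleles = (ref_coded, ref_other, coded, other)
    if any(a not in COMPLEMENT for a in alleles):
        return "unsupported", None  # indels and other non-SNP codes

    # Palindromic SNPs look the same on both strands, so a strand flip
    # cannot be told apart from an allele swap.
    if COMPLEMENT[ref_coded] == ref_other:
        return "ambiguous_palindromic", None

    if (coded, other) == (ref_coded, ref_other):
        return "match", beta
    if (coded, other) == (ref_other, ref_coded):
        return "swapped", -beta  # same strand, opposite coded allele

    flipped = (COMPLEMENT[coded], COMPLEMENT[other])
    if flipped == (ref_coded, ref_other):
        return "strand_flip", beta
    if flipped == (ref_other, ref_coded):
        return "strand_flip_swapped", -beta

    return "mismatch", None  # alleles do not correspond on either strand
```

For example, a study reporting an effect of 0.2 for allele G (other allele A) against a reference coded as A/G would be returned as `("swapped", -0.2)`, so that all studies entering a meta-analysis refer to the same coded allele.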