General Questions


Study Design Questions


Data Structure Questions


Technical Questions


Weights Questions


General Questions


What data are available to the public?

Currently, baseline, one-year, three-year, five-year, and nine-year core follow-up data are available to the public through the Office of Population Research data archive. Three-year, five-year, and nine-year in-home data are also available for a subset of core respondents. Medical records data and geographic identifiers (contextual data, city/state of residence, stratum/psu) are available to the public via a restricted use contract.

How can I access Fragile Families data for my analysis?

You must first visit the Office of Population Research data archive and register to receive the data.

Can I get access to city identifiers?

Geographic identifiers are only available through a restricted use contractual agreement. See the Fragile Families contract data page for more information.

What is the best way to view variable frequencies?

If you want to review frequencies before downloading the data, please review the codebooks available on the documentation page. Frequencies for variables are presented in the same order as the questions were asked in the survey instrument with constructed variables following the appropriate sections.

Where do I send questions about the data, procedures, problems, etc.?

Please email all questions about the data to ffdatahelp@opr.princeton.edu.

Can I distribute the data from the Fragile Families Public Use Files to my colleagues, even though they have not personally registered on the public use web site?

We ask that all users personally register in order to access the data files.

Why am I required to give contact information to register?

The Fragile Families Study receives funding from a number of different sources. We want to be able to provide our funders with information about data usage, such as the number of data users and what the data are being used for. Your contact information will not be used unless you ask to receive mailings about the data, study, etc.

How should I cite the Fragile Families study?

We request that users cite the substantial funding from the Eunice Kennedy Shriver National Institute of Child Health & Human Development in their publications with the following statement: "The authors thank the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) through grants R01HD36916, R01HD39135, and R01HD40421, as well as a consortium of private foundations for their support of the Fragile Families and Child Wellbeing Study."

How is the Fragile Families study sponsored and funded?

The Fragile Families and Child Wellbeing Study has been supported by a number of foundations and agencies. A list of those who funded the core study is available on the study website. Information on collaborative study funding is available on the collaborative studies page.

Study Design Questions


How did you choose your cities and hospitals?

A detailed description of our sample design is contained in Reichman et al. (2001), "The Fragile Families and Child Wellbeing Study: Sample and Design," Children and Youth Services Review, Vol. 23, No. 4/5. A brief summary and additional details on data collection and hospital protocols are included in the "Introduction to the Fragile Families Public Use Data".

How did you decide which mothers to interview when you were in the hospital?

Sampling Mothers - Mothers of new babies were sampled at each hospital from maternity ward lists. Once sampled, mothers were asked to complete a screening instrument to determine marital status and eligibility for participation in the study. Quotas were set at each hospital for the number of unmarried and married births, based on the sample cities’ 1996/1997 unmarried birth rates. If a mother's marital status category had already reached its quota, the case was coded “over quota” and the mother was not interviewed. Mothers’ eligibility was determined based on the analytic goals, logistical constraints, and design of the study, including the need to interview both a mother and father of a child who would be residing with at least one of those parents. Thus, for instance, mothers whose babies would be adopted were considered “ineligible” and were not interviewed.

Sampling Fathers - Once a mother had been determined to be eligible, and had given her signed consent for participation, the baby’s father was also asked to participate in the study.

See the Guide to the Public Use Files (section VII) and the sample design paper.
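The screening logic described above can be illustrated with a minimal sketch. The quota numbers, the function name screen_mother, and the status labels other than "over quota" and "ineligible" are hypothetical and chosen only for illustration; the actual field procedures are documented in the sample design paper.

```python
# Illustrative sketch only: quota sizes and names are hypothetical,
# not the values used in the study.
def screen_mother(marital_status, eligible, counts, quotas):
    """Return the screening outcome for one sampled mother.

    counts tracks how many mothers have been accepted so far
    in each marital status category at this hospital.
    """
    if not eligible:
        return "ineligible"
    if counts[marital_status] >= quotas[marital_status]:
        return "over quota"
    counts[marital_status] += 1
    return "interview"


quotas = {"unmarried": 2, "married": 1}   # hypothetical hospital quotas
counts = {"unmarried": 0, "married": 0}

outcomes = [
    screen_mother("unmarried", True, counts, quotas),
    screen_mother("married", True, counts, quotas),
    screen_mother("married", True, counts, quotas),    # quota already filled
    screen_mother("unmarried", False, counts, quotas), # e.g. baby to be adopted
]
print(outcomes)
```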

Are the Fragile Families data nationally representative?

When the national weights are applied, the data from 16 of the 20 cities are representative of births in the 77 U.S. cities with populations over 200,000. See the weights documentation, the Sample Design paper, and the Introduction to the Fragile Families Public Use Data for extensive discussions of the weights and samples.

What are the response rates for each follow-up?

See Section IV of the Introduction to the Fragile Families Public Use Data.

Data Structure Questions


How are the data files structured?

The data are structured as one record per child. Mother and father data are in separate files. There are records for all 4,898 mothers and fathers at each wave, regardless of whether they were interviewed. Mother and father data can be merged using the IDNUM variable. Flag variables (e.g., CF1FINT, CM2MINT, CF2FINT) indicate whether or not a mother/father was interviewed at a given wave (all mothers were interviewed at baseline, so there is no CM1MINT variable). Cases not interviewed are coded as -9 "Not in wave" on all other variables. There are also flag variables (e.g., CM1FINT, CM2FINT, and CF2MINT) on the mothers' and fathers' records that indicate whether the corresponding mother or father was interviewed at the time of the follow-up.
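As a minimal sketch of the merge described above, the following Python example joins mother and father records on IDNUM and then keeps families whose father was interviewed at baseline. The records, the m_example and f_example variables, and the assumption that a flag value of 1 means "interviewed" are hypothetical; check the codebooks for the actual coding.

```python
# Hypothetical records; the real files have one record per child.
mothers = [
    {"IDNUM": "0001", "CM1FINT": 1, "m_example": 3},
    {"IDNUM": "0002", "CM1FINT": 0, "m_example": 2},
]
fathers = [
    {"IDNUM": "0001", "CF1FINT": 1, "f_example": 5},
    {"IDNUM": "0002", "CF1FINT": 0, "f_example": -9},  # -9 = "Not in wave"
]


def merge_on_idnum(mother_rows, father_rows):
    """One-to-one merge of mother and father records keyed on IDNUM."""
    fathers_by_id = {row["IDNUM"]: row for row in father_rows}
    merged = []
    for m in mother_rows:
        f = fathers_by_id.get(m["IDNUM"], {})
        merged.append({**m, **f})  # combine both parents' variables
    return merged


couples = merge_on_idnum(mothers, fathers)

# Assumption: a flag value of 1 means the father was interviewed.
father_interviewed = [r for r in couples if r.get("CF1FINT") == 1]
print(len(couples), [r["IDNUM"] for r in father_interviewed])
```

Users working in SAS, SPSS, or Stata would use that package's own merge command on IDNUM instead.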

How can I tell if a question was asked in a given wave?

Questionnaires are available on the documentation page of the data users site. There is also a questionnaire map showing the section in which each concept was measured in the mother and father surveys across the first three waves.

Where are the interview date variables?

On each of the baseline files (mothers' and fathers') there are two variables you should use to find out when the respondent was interviewed. M1INTMON / F1INTMON represent the month of interview, and M1INTYR / F1INTYR represent the year of interview. There is also a constructed variable (CM1TDIFF) that is found in the mothers' files and can be used to check the time gap between parent interviews. There are corresponding variables at all waves.
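The sketch below shows one way to combine the month and year variables into a single date and to compute the gap in months between the two parents' interviews. It assumes the month is coded 1-12 and the year as a four-digit number, and that any other value is a missing code; these coding assumptions and the function names are illustrative, and this is not necessarily how CM1TDIFF was constructed.

```python
from datetime import date


def interview_month(record, mon_var, yr_var):
    """Return the first day of the interview month, or None if missing.

    Assumes months are coded 1-12 and years are four-digit values;
    anything else (e.g. negative missing codes) yields None.
    """
    mon, yr = record.get(mon_var), record.get(yr_var)
    if mon is None or yr is None or not 1 <= mon <= 12 or yr <= 0:
        return None
    return date(yr, mon, 1)


def months_between(first, second):
    """Whole months from `first` to `second` (negative if second is earlier)."""
    return (second.year - first.year) * 12 + (second.month - first.month)


# Hypothetical values for one family
mother = {"M1INTMON": 3, "M1INTYR": 1999}
father = {"F1INTMON": 5, "F1INTYR": 1999}

m_date = interview_month(mother, "M1INTMON", "M1INTYR")
f_date = interview_month(father, "F1INTMON", "F1INTYR")
gap = months_between(m_date, f_date)
print(m_date, f_date, gap)
```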

What are the identifiers on the file?

There are three identifiers on the file for merging and sorting. MOTHID is the mother’s identifier. FATHID is the father’s identifier. IDNUM is the family identifier. The identifiers remain fixed throughout the waves. The IDNUM is a 4-character string variable. The MOTHID consists of the 4-character IDNUM with an additional "0" at the end. The FATHID consists of the 4-character IDNUM with an additional "1" at the end. Each MOTHID or FATHID variable name is followed by a "1," "2," or "3" indicating the baseline, one-year, or three-year follow-up (e.g., MOTHID1 and FATHID1 for baseline; MOTHID2 and FATHID2 for the one-year follow-up).
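The relationship between the identifiers can be sketched as follows. The helper function names are hypothetical; the suffix rules ("0" for mothers, "1" for fathers, wave number appended to the variable name) are taken from the description above.

```python
def mother_id(idnum):
    """MOTHID value: the 4-character IDNUM followed by "0"."""
    return idnum + "0"


def father_id(idnum):
    """FATHID value: the 4-character IDNUM followed by "1"."""
    return idnum + "1"


def family_id(parent_id):
    """Recover the IDNUM by dropping the final parent digit."""
    return parent_id[:-1]


def id_variable_name(prefix, wave):
    """Variable name for a given wave, e.g. MOTHID1 for baseline."""
    return f"{prefix}{wave}"


idnum = "0042"  # hypothetical family identifier, stored as a string
print(mother_id(idnum), father_id(idnum), id_variable_name("MOTHID", 2))
```

Keeping IDNUM as a string preserves any leading zeros, which a numeric conversion would drop.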

How do I know if a case was interviewed in a given wave?

Flag variables (e.g., CF1FINT, CM2MINT, CF2FINT) indicate whether or not a mother/father was interviewed at a given wave (all mothers were interviewed at baseline, so there is no CM1MINT variable). Cases not interviewed are coded as -9 "Not in wave" on all other variables. Flag variables (e.g., CM1FINT, CM2FINT, and CF2MINT) on the mothers' and fathers' records indicate whether the corresponding mother or father was interviewed at the time of the follow-up. The CM2SAMP/CF2SAMP and CM3SAMP/CF3SAMP variables provide information about the status of the case at the one-year and three-year follow-ups. Information such as mother/father/child death between waves, nonresponse, and changes in eligibility is coded in these variables.

What do -5 and -6 mean?

"-5" in the data file means the person was not asked a given question because that question was not on the version of the questionnaire used at the time of the interview. "-6" means the respondent was skipped past a question that was not appropriate for them to answer.
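Before computing statistics, users will usually want to set these codes to missing. The following minimal sketch handles only the three codes described in this FAQ (-5, -6, and -9); the example variable name and values are hypothetical, and the codebooks should be consulted for any other negative codes.

```python
# Codes described in this FAQ; other codes may exist in the codebooks.
NOT_ANSWERED_CODES = {
    -9: "Not in wave",
    -6: "Skipped (question not appropriate for respondent)",
    -5: "Not asked (not on this questionnaire version)",
}


def to_missing(value):
    """Return None for the codes above; otherwise return the value unchanged."""
    return None if value in NOT_ANSWERED_CODES else value


def mean_of_valid(values):
    """Mean over values that are not one of the codes above."""
    valid = [v for v in (to_missing(x) for x in values) if v is not None]
    return sum(valid) / len(valid) if valid else None


example_var = [2, -6, 4, -9, -5, 3]  # hypothetical responses
print([to_missing(v) for v in example_var], mean_of_valid(example_var))
```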

Technical Questions


I am having trouble opening zip files with WinZip.

If you are having trouble downloading the files simply by clicking on them (please select "Save" and not "Open"), try right-clicking on the file and selecting “Save Target As.” We recommend using the WinZip Classic interface to open the zip files you downloaded. You may also want to check with your IT department to make sure you have an up-to-date copy of WinZip.

In what formats are the data available (e.g., SAS, SPSS, Stata)?

The data are available in SAS, SPSS, and Stata (for Windows) formats. If users need data in other formats, we suggest using a file conversion program such as StatTransfer or DBMS/Copy.

I get an error when I try to open SAS files in Windows.

The formats are permanently attached to the variables in each data set, so please use the SAS code included in the zip files to read the formats. Alternatively, users can use the NOFMTERR option when reading in the data.

Weights Questions


How are the weights constructed?

The weights were constructed to adjust for sample design (probability of selection), non-response at baseline, and attrition on observed characteristics over the waves. For a brief introduction to using the weights, please read Fragile Families & Child Wellbeing Study: A Brief Guide to Using the Mother, Father, and Couple Weights for Core Telephone Surveys Waves 1-4. For a detailed account of how the weights were constructed, please read Fragile Families & Child Wellbeing Study: Methodology for Constructing Mother, Father, and Couple Weights for Core Telephone Surveys Waves 1-4.

Why do the national sample flags and city sample flags have different sample sizes than the weights variables?

There are valid weights for (1) interviewed cases and (2) cases in which we determined that the parent or child had died, or that the child had been adopted or was living with neither parent. Because the cases involving adoption or living with neither parent have little or no interview data, they are coded as "no" in the national sample flags (and interview flags). Data users can, however, estimate the proportion of children/parents who died, etc., by applying the weights to the interview sample flags.
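A weighted proportion of this kind can be sketched as follows. The variable names (weight, died) and the values are hypothetical and stand in for the actual weight and status variables described in the weights documentation. The sketch produces a point estimate only and does not address variance estimation.

```python
def weighted_proportion(rows, flag_key, weight_key):
    """Share of total weight carried by rows whose flag equals 1."""
    total = sum(r[weight_key] for r in rows)
    if total == 0:
        raise ValueError("total weight is zero")
    flagged = sum(r[weight_key] for r in rows if r[flag_key] == 1)
    return flagged / total


# Hypothetical cases with valid weights; "died" = 1 marks a death
cases = [
    {"weight": 10.0, "died": 1},
    {"weight": 30.0, "died": 0},
    {"weight": 60.0, "died": 0},
]

share_died = weighted_proportion(cases, "died", "weight")
print(share_died)
```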