A five-question women’s agency index created using machine learning and qualitative interviews
In this post, J-PAL Gender sector chair Seema Jayachandran summarizes her recent paper with Monica Biradavolu and Jan Cooper.
Women’s agency, or their ability to make and act on their choices for their lives, is an important concept in research and policy related to gender equality. Many policies aim to increase women’s agency, which could be a means for them to improve their health, economic security, and decision-making power within their household and community. In addition, increasing women’s agency is typically viewed as an end in itself.
Unlike physical characteristics such as height, women’s agency is a psychological construct, and it cannot be directly observed. For these reasons, it is challenging to measure quantitatively.
While there are several validated long survey measures of women’s agency, in many cases, researchers seek a short measure. The goal of this project was to design a validated short measure of women’s agency, suitable for north India and perhaps applicable elsewhere, through an innovative method for survey module design. The new method combines machine learning and semi-structured interviews, and we refer to it as MASI.
Richer ways to capture women’s agency, such as semi-structured interviews or real-stakes choice experiments, are not practical for most large studies. They are time-intensive, skill-intensive, logistically complex, or expensive. While these techniques can provide in-depth, nuanced data, a shorter, simpler survey module would allow more researchers to measure women’s agency across various contexts, particularly if agency is a secondary focus of the study. In our project, we used these richer ways to measure women’s agency as a “gold standard” to guide the choice of the best five quantitative, or close-ended, survey questions to use.
Preparing the long questionnaire and semi-structured interview guide
We identified the five best questions for measuring women’s agency—specifically agency within her household—from a large set of candidate questions. We constructed the set of candidate questions by combining close-ended questions from longer, existing surveys of women’s agency, and removing redundant questions. Questions were drawn from the Demographic and Health Surveys, Relative Autonomy Index, a J-PAL toolkit on measuring women’s agency, and the Sexual Relationship Power Scale. In total, we included 63 questions.
Next, we developed an interview guide for the semi-structured interviews. The questions covered six domains of women’s agency: education, fertility, mobility, health, employment, and household expenditures. Trained qualitative researchers conducted the interviews. We then applied qualitative coding methods to score each woman in each agency domain. We averaged these scores to arrive at an overall agency score. In the data analysis, we used this qualitative score as the benchmark against which we assessed the candidate quantitative questions.
The study took place in Kurukshetra district in Haryana, India, a context with sizable gender gaps. For example, both overall in Haryana and within our sample, the female employment rate is below 20 percent.
Our sample of 209 women from 21 villages completed a semi-structured interview and close-ended survey module between February and May 2019. The average age of study participants was 30, and they had on average 10 years of education. All women in the study were married and had a child under the age of 10. This allowed us to compare women’s answers to similar questions, for example about their relationships with their husbands or decisions over children’s health. Note that this choice of a sample means that the measure of women’s agency is appropriate for partnered women with children, and not adolescents or other groups.
Data analysis: Narrowing down 63 questions to a five-question module
Which survey questions best predicted a woman’s “true” agency, as measured by her qualitative score? To find out, we used two statistical techniques to select the best ones. The primary method is LASSO stability selection. By running many LASSO regressions on random subsets of the data, this algorithm selects the five questions that correspond most closely with the qualitative score. (As a brief primer, LASSO is a “supervised machine learning” technique that differs from a standard regression in that the estimator sets some coefficients to zero to keep the model from over-fitting the data. From a full set of explanatory variables, only a subset is selected for inclusion in the statistical model.) The proposed measure of women’s agency combines the five questions into an index by normalizing each to have a standard deviation of 1 and then averaging them.
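For readers who want to see the mechanics, here is a minimal sketch of the stability-selection idea in plain Python. It is an illustration under stated assumptions, not the code used in the paper: the penalty value, the use of random half-samples, the number of subsamples, and the rule of ranking questions by how often they receive a non-zero coefficient are all choices made for this example, and the function names and the synthetic data are hypothetical.

```python
import random
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (all zeros if constant)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; anything within the penalty becomes 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_fit(columns, y, penalty, max_sweeps=200, tol=1e-8):
    """Coordinate-descent LASSO: minimise (1/2n)*sum((y - Xb)^2) + penalty*sum(|b|)."""
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)
    for _ in range(max_sweeps):
        largest_change = 0.0
        for j, col in enumerate(columns):
            scale = sum(c * c for c in col) / n  # 1 if standardized, 0 if constant
            if scale == 0:
                continue
            # Covariance of column j with the residual, adding back its own fit.
            rho = sum(c * r for c, r in zip(col, residual)) / n + scale * coefs[j]
            new_coef = soft_threshold(rho, penalty) / scale
            change = new_coef - coefs[j]
            if change != 0.0:
                residual = [r - change * c for r, c in zip(residual, col)]
                coefs[j] = new_coef
                largest_change = max(largest_change, abs(change))
        if largest_change < tol:
            break
    return coefs


def stability_selection(questions, target, k=5, penalty=0.2, n_subsamples=50, seed=0):
    """Rank questions by how often LASSO keeps them across random half-samples.

    questions: dict of question name -> list of responses (one per respondent).
    target: list of benchmark scores (one per respondent).
    The penalty, half-sample size, and top-k rule are assumptions of this sketch.
    """
    rng = random.Random(seed)
    names = list(questions)
    n = len(target)
    counts = {name: 0 for name in names}
    for _ in range(n_subsamples):
        rows = rng.sample(range(n), n // 2)  # random half of the respondents
        columns = [standardize([questions[name][i] for i in rows]) for name in names]
        y = standardize([target[i] for i in rows])
        for name, coef in zip(names, lasso_fit(columns, y, penalty)):
            if coef != 0.0:
                counts[name] += 1
    frequency = {name: counts[name] / n_subsamples for name in names}
    ranked = sorted(names, key=lambda name: -frequency[name])
    return ranked[:k], frequency


# Synthetic illustration (not study data): only q0, q1, and q2 drive the score.
rng = random.Random(1)
n_women = 150
demo_questions = {f"q{j}": [rng.gauss(0, 1) for _ in range(n_women)] for j in range(8)}
demo_score = [
    demo_questions["q0"][i] + demo_questions["q1"][i] + demo_questions["q2"][i]
    + rng.gauss(0, 0.5)
    for i in range(n_women)
]
selected, frequency = stability_selection(demo_questions, demo_score, k=3)
print(selected)
```

In real data, each question would first need to be coded so that higher values mean more agency, and the penalty would normally be tuned rather than fixed.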
To understand how sensitive the selected questions were to the statistical method, we also used a second method, called backward sequential selection. This method starts with the full list of survey questions and iteratively drops the one question that causes the smallest decrease in explanatory power over women’s agency when the included questions are combined into an index. The procedure stops when five questions remain in the index. These statistical methods for variable selection are similar to standard machine learning techniques except that the number of questions chosen is constrained to be five.
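The backward procedure described above can be sketched in a few lines of Python. This is a hedged illustration rather than the paper's implementation: it assumes the index is the simple average of standardized responses, measures explanatory power as the R-squared from regressing the benchmark score on that single index (the squared correlation), and uses hypothetical function names and synthetic data.

```python
import random
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (all zeros if constant)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


def build_index(questions, names):
    """Average the standardized responses to the named questions, per respondent."""
    columns = [standardize(questions[name]) for name in names]
    return [statistics.fmean(row) for row in zip(*columns)]


def r_squared(index, target):
    """R-squared from regressing target on one index (the squared correlation)."""
    mean_x, mean_y = statistics.fmean(index), statistics.fmean(target)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(index, target))
    sxx = sum((x - mean_x) ** 2 for x in index)
    syy = sum((y - mean_y) ** 2 for y in target)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def backward_selection(questions, target, k=5):
    """Drop one question at a time until k remain.

    At each step, remove the question whose removal leaves the index with the
    highest R-squared, i.e. the smallest loss in explanatory power.
    """
    kept = list(questions)
    dropped = []
    while len(kept) > k:
        to_drop = max(
            kept,
            key=lambda name: r_squared(
                build_index(questions, [q for q in kept if q != name]), target
            ),
        )
        kept.remove(to_drop)
        dropped.append(to_drop)
    return kept, dropped


# Synthetic illustration (not study data): only q0, q1, and q2 drive the score.
rng = random.Random(1)
n_women = 150
demo_questions = {f"q{j}": [rng.gauss(0, 1) for _ in range(n_women)] for j in range(8)}
demo_score = [
    demo_questions["q0"][i] + demo_questions["q1"][i] + demo_questions["q2"][i]
    + rng.gauss(0, 0.5)
    for i in range(n_women)
]
kept, dropped = backward_selection(demo_questions, demo_score, k=3)
print(kept, dropped)
```

As with any equal-weight index, the sketch assumes every question is already coded so that higher values indicate more agency.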
Both LASSO stability selection and backward selection arrive at an index of women’s agency that is strongly correlated with the qualitative score derived from the semi-structured interviews. Table 1 reports the best set of survey questions for measuring women’s agency under each method, chosen based on their correspondence with the qualitative score. LASSO stability selection (column 1) is the preferred statistical technique. The two methods select similar sets of five questions, and each resulting index corresponds quite closely to the qualitative score, with a correlation over 0.5.
| Survey question | (1) LASSO stability selection | (2) Backward sequential selection |
|---|---|---|
| Opinion heard when expensive item like a bicycle or cow is purchased? | 1 | 2 |
| Need permission from other household members to buy clothing for self? | 2 | 1 |
| Allowed to buy things in the market without asking partner? | 3 | |
| Permitted to visit women in other neighborhoods to talk with them? | 4 | 4 |
| Who do you consult with for decisions regarding your children’s health care? | 5 | |
| Allowed to go alone to meet your friends for any reason? | | 3 |
| Who in household decides to pay school fees for a relative from your side of family? | | 5 |
| Five-Question Index R² | 0.289 | 0.287 |
Notes: The table lists the top five survey questions selected. The numbers in the cells in columns (1) and (2) indicate the selection order, with 1 referring to the best or most predictive question.
Three out of the five questions selected by LASSO stability selection (among 63 candidates) were also chosen by backward selection. When we consider ten-question versions of the modules, seven of the chosen questions overlap between the methods. This suggests that the results are quite robust to the specific method used. Moreover, the fourth- through tenth-best questions perform reasonably similarly, so the biggest gains come from two things: identifying and using the best three questions, and identifying the best ten questions so that the rest of the module can be drawn from that set.
The five-question index is much more correlated with the “truth” than if one chose five questions randomly. More strikingly, the five-question index has more explanatory power than indices constructed from all 63 candidate questions, by averaging them or using principal component analysis to combine them. When we used the methods to choose the best N-question module, the performance of the LASSO stability index peaked at N=15 questions. Thus, when deciding whether to include three or ten questions, there is a tradeoff between survey length and quality of the measure, but after a certain point, adding more questions is not helpful. What is key is to identify the best survey questions, those that are information-rich, rather than adding more.
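One way to picture these comparisons is the sketch below, which contrasts a chosen set of questions with randomly drawn sets of the same size and with a simple average of all candidates. It is an assumption-laden illustration on synthetic data, not the paper's analysis: the function names are hypothetical, R-squared is computed as the squared correlation between the index and the benchmark score, and the principal component comparison mentioned above is left out.

```python
import random
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (all zeros if constant)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


def build_index(questions, names):
    """Average the standardized responses to the named questions, per respondent."""
    columns = [standardize(questions[name]) for name in names]
    return [statistics.fmean(row) for row in zip(*columns)]


def r_squared(index, target):
    """R-squared from regressing target on one index (the squared correlation)."""
    mean_x, mean_y = statistics.fmean(index), statistics.fmean(target)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(index, target))
    sxx = sum((x - mean_x) ** 2 for x in index)
    syy = sum((y - mean_y) ** 2 for y in target)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def benchmark_index(questions, target, chosen, n_draws=200, seed=0):
    """Compare a chosen question set with random sets and with all questions.

    Returns (R2 of chosen index, mean R2 over random same-size indices,
    R2 of the index that averages every candidate question).
    """
    rng = random.Random(seed)
    names = list(questions)
    chosen_r2 = r_squared(build_index(questions, chosen), target)
    random_r2 = [
        r_squared(build_index(questions, rng.sample(names, len(chosen))), target)
        for _ in range(n_draws)
    ]
    all_r2 = r_squared(build_index(questions, names), target)
    return chosen_r2, statistics.fmean(random_r2), all_r2


# Synthetic illustration (not study data): only q0, q1, and q2 drive the score.
rng = random.Random(1)
n_women = 150
demo_questions = {f"q{j}": [rng.gauss(0, 1) for _ in range(n_women)] for j in range(8)}
demo_score = [
    demo_questions["q0"][i] + demo_questions["q1"][i] + demo_questions["q2"][i]
    + rng.gauss(0, 0.5)
    for i in range(n_women)
]
chosen_r2, random_mean_r2, all_r2 = benchmark_index(
    demo_questions, demo_score, chosen=["q0", "q1", "q2"]
)
print(chosen_r2, random_mean_r2, all_r2)
```

In this toy setup the uninformative questions dilute the all-question average, which is the same intuition behind preferring a few information-rich questions over a longer module.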
Interestingly, none of the general questions that ask a woman to assess her overall agency or perception of her power were selected. The top three questions chosen by LASSO stability selection ask about the woman’s role in specific purchase decisions: large household purchases, clothing for the woman, and items in the market. The other two questions concern her physical mobility (specifically, whether she is permitted to visit women in other neighborhoods) and decisions about her children’s health care.
Real-stakes choice experiment inadequate for measuring women’s agency
In addition to the qualitative interviews, we tested a “lab-in-the-field” game to measure women’s agency over household income. The game entailed a series of real-stakes choices a woman makes between money for herself or her husband. A potential advantage of a real-stakes choice is that it might be less susceptible to bias from respondents giving disingenuous answers they deem to be socially desirable.
The game did not work effectively in our study. The premise of this game is that a woman with less agency will more often choose money for herself because she would have no influence over how money given to her husband is spent. The key assumption is that women with low agency want more agency. However, some women with very low agency never want money for themselves, out of a belief that money is men’s domain or out of fear of their husband’s reaction. This made behavior in the lab game a noisy measure of women’s agency. While we do not draw general conclusions about the effectiveness of lab games versus qualitative interviews, in our study, the qualitative approach proved superior. Another advantage of the qualitative approach is that it covers domains of agency beyond financial agency.
Implications and Recommendations
Benefits of the module
The five-question module of women’s agency, validated against semi-structured interviews, is a valuable new resource for measuring women’s agency. Some of the questions, for example on the ability to influence household purchases, seem fairly universal and conceivably would apply in contexts outside north India. Others, such as those about the ability to visit friends, might be more relevant in India than in contexts with fewer restrictions on women’s mobility. Two of the questions are specific to married women, or women with children, the population for whom we designed the measure.
A natural direction for future work is to replicate the study to create short modules appropriate for other contexts and to assess the extent to which the same questions are selected elsewhere. One could also apply our method to design a “universal” module based on how robustly it predicts qualitative interview scores across multiple contexts. Widespread adoption of a common five-question module would allow for better comparisons of women’s agency across data sets; researchers, of course, could add many other questions tailored to their needs.
Another next step is to apply the five-question measure in impact evaluations to assess if it captures changes in agency that come about through policy interventions.
Benefits of the methodology
The study introduces a novel, mixed-methods way of developing a survey measure: using statistical methods to choose quantitative questions benchmarked against a qualitative measure. Many concepts, such as financial insecurity, cultural assimilation, and trust in authority, are best measured with open-ended questions, yet there is a practical need for close-ended measures of them. Future research could apply this new method, called MASI, to create survey modules for such nuanced constructs.
- Using qualitative interviews as a statistical benchmark can be valuable when designing short modules to measure complex concepts like women’s agency.
- More specific survey questions were more correlated with women’s agency as expressed in the qualitative interviews than were women’s overall assessments of their power.
- For measuring women’s agency through surveys, the key is to choose the right survey questions. Our analysis finds that after using the best 15 questions, adding more questions does not improve the measure. (With fewer than 15 questions, there is a tradeoff between a shorter survey module and a richer measure.)
- Lab games or other measures that assume that women with low agency have a desire to increase their agency might not work well in some contexts, such as north India.
Read the full paper for more information on this study and a list of all references.