Many thanks to M D Madhusudan, Akshay S Dinesh and Nandini Velho for fundamental conceptual help when I was first conceptualising this work in 2017-18, and for continuous support in various aspects of this work as it continues to unfold. The full list of people involved appears either as co-authors or in the acknowledgements of the THETA Protocol (link).
As part of the Towards Health Equity and Transformative Action on Tribal Health (THETA) project, we have generated a fairly large and complex dataset that includes settlement-, household- and individual-level data across five sites (protected areas) in four states in southern, central and north-eastern India. The data spans demography, livelihoods, household conditions, maternal and child health, nutrition, non-communicable diseases, and health system interactions.
The full dataset is archived on Figshare. But to enable ongoing engagement by others while I continue making sense of the full dataset, I thought of trying out a way of letting people easily query and engage with the data. These are my notes on that effort.
The key is to allow meaningful exploration without flattening the three levels of the data (individual, household and village/settlement).
I’ve now set up a first working version of a THETA-specific public data exploration app using Streamlit, hosted on Streamlit Community Cloud:
👉 https://theta-data-explorer.streamlit.app/
At this stage, the app is barely set up…still ironing out the kinks and testing. Hopefully the data will start speaking…and perhaps even tell a story…a story that perhaps answers the question posed here in the protocol.
Right now, the app allows a user to:
- Load structured tabular data through a browser
- Filter variables interactively (categorical, numeric, boolean)
- Subset the data without writing any code
- Download filtered slices for further analysis
- See the deployment and code transparently via GitHub
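To make the filtering and download steps concrete, here is a minimal sketch in plain Python (standard library only). It is not the app's actual code. Apart from deidentified_village, the column names (age, sex, has_ration_card), the sample rows, and the helpers apply_filters and to_csv are hypothetical and only there for illustration.

```python
import csv
import io

# Hypothetical rows: only the column name deidentified_village comes from the notes.
rows = [
    {"deidentified_village": "V01", "age": 34, "sex": "F", "has_ration_card": True},
    {"deidentified_village": "V01", "age": 61, "sex": "M", "has_ration_card": False},
    {"deidentified_village": "V02", "age": 8, "sex": "F", "has_ration_card": True},
    {"deidentified_village": "V02", "age": 45, "sex": "F", "has_ration_card": True},
]


def apply_filters(rows, categorical=None, numeric=None, boolean=None):
    """Keep rows that pass every filter.

    categorical: {column: set of allowed values}
    numeric:     {column: (low, high)}, inclusive range
    boolean:     {column: required True/False value}
    """
    categorical = categorical or {}
    numeric = numeric or {}
    boolean = boolean or {}
    kept = []
    for row in rows:
        if any(row[col] not in allowed for col, allowed in categorical.items()):
            continue
        if any(not (low <= row[col] <= high) for col, (low, high) in numeric.items()):
            continue
        if any(row[col] != wanted for col, wanted in boolean.items()):
            continue
        kept.append(row)
    return kept


def to_csv(rows):
    """Serialise a filtered slice as CSV text, e.g. for a download button."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Adult women aged 18 to 60 with a (hypothetical) ration card flag set.
subset = apply_filters(
    rows,
    categorical={"sex": {"F"}},
    numeric={"age": (18, 60)},
    boolean={"has_ration_card": True},
)
csv_text = to_csv(subset)
```

In a Streamlit app the three filter dictionaries would typically be filled from widgets, and the CSV text handed to a download button. That wiring is left out here so the sketch runs without any dependencies.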
The data is structured across three linked levels:
- Settlement-level data. Unit of analysis: village / settlement. Key: deidentified_village
- Household-level data. Unit of analysis: household. Keys: deidentified_village, fulcrum_id_parent
- Individual-level data. Unit of analysis: person. Keys: fulcrum_id_people, fulcrum_id_parent
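The keys listed above are enough to move between levels without merging everything into one wide table. The sketch below shows one way to do that with plain Python lookups. It assumes that fulcrum_id_parent on an individual row matches fulcrum_id_parent on exactly one household row, and that deidentified_village is unique per settlement. The sample values, the extra columns (site, household_size, age) and the helper names index_by and context_for are hypothetical.

```python
# Hypothetical sample rows; only the key column names come from the notes.
settlements = [
    {"deidentified_village": "V01", "site": "Site A"},
    {"deidentified_village": "V02", "site": "Site B"},
]
households = [
    {"fulcrum_id_parent": "H1", "deidentified_village": "V01", "household_size": 4},
    {"fulcrum_id_parent": "H2", "deidentified_village": "V01", "household_size": 2},
    {"fulcrum_id_parent": "H3", "deidentified_village": "V02", "household_size": 5},
]
individuals = [
    {"fulcrum_id_people": "P1", "fulcrum_id_parent": "H1", "age": 34},
    {"fulcrum_id_people": "P2", "fulcrum_id_parent": "H1", "age": 6},
    {"fulcrum_id_people": "P3", "fulcrum_id_parent": "H3", "age": 52},
]


def index_by(rows, key):
    """Build {key value: row}, failing loudly if the key is not unique."""
    index = {}
    for row in rows:
        value = row[key]
        if value in index:
            raise ValueError(f"duplicate {key}: {value!r}")
        index[value] = row
    return index


settlement_by_id = index_by(settlements, "deidentified_village")
household_by_id = index_by(households, "fulcrum_id_parent")


def context_for(person):
    """Return a person with their household and settlement, kept as separate levels."""
    household = household_by_id[person["fulcrum_id_parent"]]
    settlement = settlement_by_id[household["deidentified_village"]]
    return {"person": person, "household": household, "settlement": settlement}


# Individuals whose household key has no match would show up here.
orphans = [p for p in individuals if p["fulcrum_id_parent"] not in household_by_id]

linked = context_for(individuals[2])
```

Keeping the three levels as separate nested records, rather than one merged row, avoids repeating household and settlement values on every person and makes it harder to accidentally count a household once per member.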
The next iteration of the THETA data explorer will reflect this structure much more explicitly.
- Separate views for each dataset: the app will be organised into three sections (likely as tabs):
  - Settlements – village-level characteristics and context
  - Households – socio-economic conditions and household-level variables
  - Individuals – MCH, nutrition, NCDs, behaviours, anthropometry
- Multi-level exploration: it should be possible to:
  - explore households within a selected settlement, or
  - explore individuals within selected households.
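The drill-down described here could be sketched as two grouped lookups, one per step down the hierarchy. This is only an illustration of the idea in plain Python, not the planned implementation. The sample rows and the helper names group_by, households_in and individuals_in are hypothetical, and it assumes fulcrum_id_parent on an individual row identifies that person's household.

```python
from collections import defaultdict

# Hypothetical sample rows; only the key column names come from the notes.
households = [
    {"fulcrum_id_parent": "H1", "deidentified_village": "V01"},
    {"fulcrum_id_parent": "H2", "deidentified_village": "V01"},
    {"fulcrum_id_parent": "H3", "deidentified_village": "V02"},
]
individuals = [
    {"fulcrum_id_people": "P1", "fulcrum_id_parent": "H1"},
    {"fulcrum_id_people": "P2", "fulcrum_id_parent": "H1"},
    {"fulcrum_id_people": "P3", "fulcrum_id_parent": "H2"},
    {"fulcrum_id_people": "P4", "fulcrum_id_parent": "H3"},
]


def group_by(rows, key):
    """Group rows into {key value: [rows]}, preserving input order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


households_by_village = group_by(households, "deidentified_village")
individuals_by_household = group_by(individuals, "fulcrum_id_parent")


def households_in(village_id):
    """Households within one selected settlement (empty list if none)."""
    return households_by_village.get(village_id, [])


def individuals_in(household_ids):
    """Individuals within the selected households, in selection order."""
    people = []
    for household_id in household_ids:
        people.extend(individuals_by_household.get(household_id, []))
    return people


# Step 1: pick a settlement. Step 2: look at the people in its households.
selected_households = households_in("V01")
selected_people = individuals_in(
    [h["fulcrum_id_parent"] for h in selected_households]
)
```

In the app, the village and household selections would presumably come from widgets in the Settlements and Households views, with each selection narrowing what the next view shows.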
Current status
- ✅ Public Streamlit app deployed
- ✅ Reproducible GitHub → Streamlit deployment
- 🔄 Work in progress on dataset-specific views …
Last updated: 2025-12-30 16:03