Spaces: jiyachachan/fp2

jiyachachan committed on Dec 8, 2024

Commit

c1fe312

verified

1 Parent(s): c2ad736

Rename pages/Global_map.py to pages/Child Mortality VS GDP.py

Files changed (2)

pages/Child Mortality VS GDP.py +71 -0
pages/Global_map.py +0 -120

pages/Child Mortality VS GDP.py ADDED

@@ -0,0 +1,71 @@

+import pandas as pd
+import altair as alt
+import streamlit as st
+# Load the data
+child_mortality = pd.read_csv("https://huggingface.co/spaces/jiyachachan/fp2/resolve/main/child_mortality_0_5_year_olds_dying_per_1000_born.csv")  # Wide format: one row per country, one column per year
+gdp_per_capita = pd.read_csv("https://huggingface.co/spaces/jiyachachan/fp2/resolve/main/gdp_pcap.csv")    # Wide format: one row per country, one column per year
+# Melt datasets to tidy format
+child_mortality = child_mortality.melt(id_vars=["country"], var_name="year", value_name="child_mortality")
+gdp_per_capita = gdp_per_capita.melt(id_vars=["country"], var_name="year", value_name="gdp_per_capita")
+# Merge the datasets
+merged_data = pd.merge(child_mortality, gdp_per_capita, on=["country", "year"])
+merged_data["year"] = merged_data["year"].astype(int)  # Ensure 'year' is an integer
+# Drop rows with missing or undefined country values
+merged_data = merged_data.dropna(subset=["country"])
+merged_data = merged_data[merged_data["country"] != "undefined"]
+# Convert gdp_per_capita and child_mortality to numeric
+merged_data["gdp_per_capita"] = pd.to_numeric(merged_data["gdp_per_capita"], errors="coerce")
+merged_data["child_mortality"] = pd.to_numeric(merged_data["child_mortality"], errors="coerce")
+# Drop rows with missing or invalid data
+merged_data = merged_data.dropna(subset=["gdp_per_capita", "child_mortality"])
+# Streamlit app
+st.title("Interactive Visualization: GDP vs. Child Mortality")
+st.text(" ")
+st.text("The dataset represents global development indicators related to child mortality and GDP per capita for multiple countries over several years. Each row corresponds to a unique country-year combination, with the key fields being country (categorical, representing the country name), year (integer, indicating the year of data collection), child_mortality (numeric, showing the number of children under five dying per 1,000 live births), and gdp_per_capita (numeric, representing GDP per capita in constant 2017 international dollars). The dataset spans a wide range of years and countries, making it suitable for temporal and regional analyses. Missing values are present in some fields, particularly for earlier years or less-developed countries, and were handled during the data cleaning process. The values in child_mortality range from 2.24 to 756.0, while gdp_per_capita spans from $354.00 to $10,000.00, reflecting significant disparities in economic and health outcomes across countries and regions.")
+st.text(" ")
+# Filter data for a specific year
+year = st.slider("Select Year", min_value=int(merged_data["year"].min()), max_value=int(merged_data["year"].max()), value=2020)
+filtered_data = merged_data[merged_data["year"] == year]
+# Select number of countries to display
+num_countries = st.slider("Select Number of Countries to Display", min_value=5, max_value=50, value=10, step=5)
+# Get top N countries by GDP per capita
+top_countries = filtered_data.nlargest(num_countries, "gdp_per_capita")
+# Create scatter plot (the regression line is layered on top further down)
+scatter_plot = alt.Chart(top_countries).mark_circle(size=60).encode(
+    x=alt.X("gdp_per_capita:Q", scale=alt.Scale(type="log"), title="GDP per Capita (Log Scale)"),
+    y=alt.Y("child_mortality:Q", title="Child Mortality (per 1,000 live births)"),
+    color=alt.Color("country:N"),
+    tooltip=["country", "gdp_per_capita", "child_mortality"]
+).properties(
+    title=f"Relationship Between GDP Per Capita and Child Mortality ({year})",
+    width=800,
+    height=500
+)
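+# Add regression line; a constant color encoding replaces the per-country one, which would otherwise override the mark color
+regression_line = scatter_plot.encode(color=alt.value("red")).transform_regression(
+    "gdp_per_capita", "child_mortality", method="log"  # fits y = a + b*ln(x), a straight line on the log-scaled x-axis
+).mark_line()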
+# Combine scatter plot and regression line
+final_chart = scatter_plot + regression_line
+# Display chart in Streamlit
+st.altair_chart(final_chart, use_container_width=True)
+st.text("To build the observatory, I began by preparing the dataset, which involved merging child mortality and GDP per capita data based on common fields: country and year. I ensured that the data was cleaned and formatted correctly, converting numerical fields like child_mortality and gdp_per_capita to numeric types and handling missing values by dropping rows with invalid entries. Once the data was ready, I created initial static visualizations using Altair to explore the relationship between GDP per capita and child mortality. The chart shows the relationship between GDP per capita and child mortality rates, highlighting an inverse trend where higher GDP per capita generally corresponds to lower child mortality. Building on this foundation, I added interactivity through Streamlit, allowing users to dynamically filter the dataset by year and select the number of countries to display. To enhance the visual analysis, I overlaid a regression line on the scatter plot, which provides a clear representation of trends. The app's functionality was refined iteratively, incorporating sliders for user interaction and tooltips for exploring country-specific data points.")
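
The preparation steps described in the closing text (reshape, merge on country and year, coerce to numeric, drop invalid rows) can be sketched on a tiny in-memory table. The values below are made up, and `parse_abbreviated_number` and `to_long` are hypothetical helpers, not part of this commit. The sketch also assumes, without confirmation from the diff, that the GDP file may abbreviate large values with a `k` suffix; if so, `pd.to_numeric(..., errors="coerce")` alone would turn those cells into NaN and the later `dropna` would discard them silently.

```python
import pandas as pd

# Tiny in-memory stand-ins for the two wide CSVs (values are made up for illustration).
child_mortality_wide = pd.DataFrame(
    {"country": ["A", "B"], "2019": [30.5, 4.1], "2020": [29.0, 4.0]}
)
gdp_wide = pd.DataFrame(
    {"country": ["A", "B"], "2019": ["950", "41.2k"], "2020": ["980", "42k"]}
)


def parse_abbreviated_number(value):
    """Hypothetical helper: read '42k' as 42000.0; return NaN if the text is not numeric."""
    text = str(value).strip().lower()
    multiplier = 1.0
    if text.endswith("k"):
        multiplier, text = 1_000.0, text[:-1]
    return pd.to_numeric(text, errors="coerce") * multiplier


def to_long(wide, value_name):
    """Hypothetical helper: reshape one row per country into one row per (country, year)."""
    tidy = wide.melt(id_vars=["country"], var_name="year", value_name=value_name)
    tidy["year"] = tidy["year"].astype(int)
    return tidy


mortality_long = to_long(child_mortality_wide, "child_mortality")
gdp_long = to_long(gdp_wide, "gdp_per_capita")
gdp_long["gdp_per_capita"] = gdp_long["gdp_per_capita"].map(parse_abbreviated_number)

# Inner join keeps only country-year pairs present in both tables.
merged = mortality_long.merge(gdp_long, on=["country", "year"])
merged = merged.dropna(subset=["gdp_per_capita", "child_mortality"])
print(merged)
```

Comparing the row count before and after `dropna` is a cheap way to see how many country-year pairs the coercion discarded.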

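Because the chart puts GDP per capita on a log axis, a trend that looks straight there has the form y = a + b·ln(x). The sketch below fits that form with NumPy on made-up numbers, to show what a regression line on this chart would represent; the data points are illustrative assumptions, not values from the dataset. In Altair the same form is requested with `method="log"` in `transform_regression`.

```python
import numpy as np

# Made-up (GDP per capita, child mortality) pairs for illustration only.
gdp = np.array([500.0, 1_000.0, 5_000.0, 20_000.0, 60_000.0])
mortality = np.array([120.0, 90.0, 40.0, 10.0, 4.0])

# Ordinary least squares on ln(gdp): mortality = intercept + slope * ln(gdp)
slope, intercept = np.polyfit(np.log(gdp), mortality, deg=1)
predicted = intercept + slope * np.log(gdp)

print(f"slope per unit ln(GDP): {slope:.2f}, intercept: {intercept:.2f}")
```

A negative slope corresponds to the inverse trend the page text describes: higher GDP per capita goes with lower child mortality.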
pages/Global_map.py DELETED

@@ -1,120 +0,0 @@
-# Hugging Face's logo
-# Hugging Face
-# Search models, datasets, users...
-# Models
-# Datasets
-# Spaces
-# Posts
-# Docs
-# Enterprise
-# Pricing
-# Spaces:
-# SmeetPatel
-# /
-# FP1
-# like
-# 0
-# App
-# Files
-# Community
-# Settings
-# FP1
-# /
-# app.py
-# SmeetPatel's picture
-# SmeetPatel
-# Update app.py
-# 9c77701
-# verified
-# less than a minute ago
-# raw
-# Copy download link
-# history
-# blame
-# edit
-# delete
-# 3.59 kB
-import os
-import subprocess
-import sys
-# Install plotly if not already installed
-try:
-    import plotly
-except ImportError:
-    subprocess.check_call([sys.executable, "-m", "pip", "install", "plotly"])
-# import subprocess
-# import sys
-import pandas as pd
-import streamlit as st
-import plotly.express as px
-# Load Dataset
-# Example: 'country' column for country names, other columns for years
-data = pd.read_csv("https://huggingface.co/spaces/jiyachachan/fp2/resolve/main/child_mortality_0_5_year_olds_dying_per_1000_born.csv")
-# Melt the data to long format for easier filtering
-data_melted = data.melt(id_vars=["country"], var_name="year", value_name="mortality_rate")
-data_melted["year"] = pd.to_numeric(data_melted["year"])
-# Streamlit App
-st.title("Global Child Mortality Rate (per 1000 children born)")
-st.write("""This interactive visualization provides an insightful overview of child mortality rates (number of deaths per 1,000 live births) across countries for a selected year.
-        The data highlights disparities in healthcare, socioeconomic conditions, and development across the globe, making it a valuable tool for understanding global health challenges.""")
-# Add year selection
-# years = sorted(data_melted["year"].unique())  # Extract unique years from the dataset
-# selected_year = st.selectbox("Select Year", years)
-# Add year selection with a slider
-min_year = int(data_melted["year"].min())
-max_year = int(data_melted["year"].max())
-st.subheader("Child Mortality Trends around the Globe")
-st.write("""
-This chart reveals how child mortality rates have changed over the years.
-It shows how today's developed countries have successfully reduced the rate, while underdeveloped countries still face challenges in curbing child mortality.
-We can use the trends in the graph to understand the factors that might be responsible for high or low mortality.
-This can help policymakers in developing and underdeveloped countries design data-driven policies to reduce child mortality.
-""")
-selected_year = st.slider("Select Year", min_value=min_year, max_value=max_year, value = 2024, step = 5)
-# Filter data for the selected year
-filtered_data = data_melted[data_melted["year"] == selected_year]
-# Create the map
-fig = px.choropleth(
-    filtered_data,
-    locations="country",  # Country names or ISO 3166-1 Alpha-3 codes
-    locationmode="country names",  # Use 'ISO-3' if you have country codes
-    color="mortality_rate",
-    title=f"Child Mortality Rate in {selected_year}",
-    color_continuous_scale=px.colors.sequential.OrRd,  # Customize the color scale
-)
-# Display the map
-st.plotly_chart(fig)
-st.write("""I began by acquiring a dataset on child mortality rates, with countries as rows and years as columns. The dataset contained child mortality rates as the number of deaths per 1,000 live births.
-To make the dataset suitable for visualization, I transformed it into a long format using pandas.melt(), creating three columns: country, year, and mortality_rate. This step allowed for efficient filtering and visualization.
-I chose a choropleth map because it effectively communicates regional differences using a color gradient. Each country is color-coded based on its mortality rate for a selected year, offering immediate visual insights.
-I implemented a slider widget for year selection, enabling users to dynamically explore mortality rates over time.
-This required ensuring that the year column was properly formatted as numeric data, and filtering the dataset based on the slider’s value.""")
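
The reshape-and-filter flow described in the removed page can be sketched without Streamlit or Plotly. The table is made up, and `snap_to_step` is a hypothetical helper, not part of the removed file. It addresses one detail worth checking when a slider combines a fixed default with a step: the default should sit on the slider's step grid and inside the data's year range, otherwise the filter may select a year with no rows.

```python
import pandas as pd

# Made-up wide table standing in for the child mortality CSV.
wide = pd.DataFrame(
    {"country": ["A", "B"], "2015": [40.0, 5.0], "2020": [35.0, 4.5], "2025": [31.0, 4.2]}
)

long_format = wide.melt(id_vars=["country"], var_name="year", value_name="mortality_rate")
long_format["year"] = pd.to_numeric(long_format["year"])  # column labels arrive as strings


def snap_to_step(preferred, min_year, max_year, step):
    """Hypothetical helper: nearest grid value (min_year, min_year + step, ...) to `preferred`."""
    clamped = min(max(preferred, min_year), max_year)
    snapped = min_year + round((clamped - min_year) / step) * step
    last_grid_value = min_year + ((max_year - min_year) // step) * step
    return min(snapped, last_grid_value)


min_year = int(long_format["year"].min())
max_year = int(long_format["year"].max())
default_year = snap_to_step(2024, min_year, max_year, step=5)

# Same filter the page applied to the slider's value.
selected = long_format[long_format["year"] == default_year]
print(default_year, len(selected))
```

The result of `snap_to_step` could then be passed as the slider's `value` argument in place of a hard-coded year.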