
Agriculture Data Processing with Python

Roches Chaulo

Updated: Nov 26, 2024


















In this project I cleaned, processed, and visualized agricultural datasets using Python and pandas, and identified key trends that can inform productivity improvements and sales forecasting.

This project has two files:

crop_production.csv











Sales_Data
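The loading step is not shown in this post. A minimal sketch of how both files might be read into the `production` and `sales` data frames used below, assuming both are comma-separated. The helper name `load_frames`, the sample rows, the season labels, and the `.csv` extension on the sales file are my assumptions, not taken from the project:

```python
import io

import pandas as pd


def load_frames(production_source, sales_source):
    """Read both datasets; each source may be a file path or an open buffer."""
    production = pd.read_csv(production_source)
    sales = pd.read_csv(sales_source)
    return production, sales


# Made-up rows standing in for the real files, so the sketch runs anywhere.
production_csv = io.StringIO(
    "Crop,Season,Yield_per_Hectare\n"
    "Maize,Wet,4.2\n"
    "Soybean,Dry,1.8\n"
)
sales_csv = io.StringIO(
    "Month,Sales_Volume\n"
    "4,120\n"
    "6,150\n"
)

production, sales = load_frames(production_csv, sales_csv)
print(production.shape, sales.shape)

# With the real files this would look like (the sales file name is a guess):
# production, sales = load_frames("crop_production.csv", "Sales_Data.csv")
```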










Cleaning Data

I start by printing summary details of the sales and production data.

This shows me what the data looks like and how many rows and columns each data frame has.
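The original screenshot for this step is not available, so here is a rough sketch of one common way to get such a summary in pandas, using `info()` and `shape`. The small `sales` frame is made-up stand-in data:

```python
import pandas as pd

# Made-up stand-in for the sales data frame.
sales = pd.DataFrame({
    "Month": [4, 6, 6],
    "Sales_Volume": [120.0, 150.0, None],
})

# Column names, dtypes, and non-null counts, printed to stdout.
sales.info()

# shape is a (rows, columns) tuple.
rows, columns = sales.shape
print(f"{rows} rows, {columns} columns")
```

The same two calls work unchanged on the production data frame.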

I also look at quick statistics for the numerical columns in both data frames:

sales.describe()

production.describe()

I then count the missing values in each column of sales and production:

sales.isnull().sum()

production.isnull().sum()
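To make the output of this check concrete, here is a small sketch on made-up data. `isnull().sum()` gives a count per column, and `isnull().mean()` gives the share of rows that are missing, which can be easier to judge on a large file:

```python
import pandas as pd

# Made-up stand-in data with a few gaps.
sales = pd.DataFrame({
    "Month": [4, 6, 6, None],
    "Sales_Volume": [120.0, 150.0, None, None],
})

# Number of missing values in each column.
missing_per_column = sales.isnull().sum()

# Fraction of rows missing in each column (True counts as 1).
missing_share = sales.isnull().mean()

print(missing_per_column)
print(missing_share)
```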


I then check for duplicates in both data frames:

sales.duplicated()

production.duplicated()


I now remove duplicates and assign the results back, so that the analysis below runs on the cleaned data frames:

sales = sales.drop_duplicates()

production = production.drop_duplicates()

production

sales
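A short sketch of what the duplicate check and removal do, on made-up data. `duplicated()` returns a boolean Series that marks every repeat of an earlier row, so summing it gives the number of duplicate rows. The names `duplicate_mask` and `deduplicated` are only for illustration:

```python
import pandas as pd

# Made-up stand-in data; the third row repeats the second.
sales = pd.DataFrame({
    "Month": [4, 6, 6, 7],
    "Sales_Volume": [120, 150, 150, 90],
})

# True for each row that repeats an earlier row.
duplicate_mask = sales.duplicated()
n_duplicates = int(duplicate_mask.sum())

# Keep the first occurrence of each row and renumber the index.
deduplicated = sales.drop_duplicates().reset_index(drop=True)

print(n_duplicates, len(deduplicated))
```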


Exploratory Data Analysis (EDA)

  1. Top crops by average yield:

top_crops = production.groupby("Crop")["Yield_per_Hectare"].mean().sort_values(ascending=False)

top_crops
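Here is the same group-and-sort step as a tiny worked example. The yield values are made up purely to show the mechanics and are not results from the dataset:

```python
import pandas as pd

# Made-up stand-in data: two Maize rows, one Wheat, one Soybean.
production = pd.DataFrame({
    "Crop": ["Maize", "Maize", "Wheat", "Soybean"],
    "Yield_per_Hectare": [4.0, 5.0, 3.0, 2.0],
})

# Average yield per crop, highest first.
top_crops = (
    production.groupby("Crop")["Yield_per_Hectare"]
    .mean()
    .sort_values(ascending=False)
)

print(top_crops)
```

Because the result is sorted, `top_crops.head(3)` would return the three best crops directly.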









  2. Monthly trends in sales volume:

monthly_sales = sales.groupby("Month")["Sales_Volume"].sum()

monthly_sales
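One thing to watch with this grouping is that a month with no sales rows simply disappears from the result. A sketch of how the totals could be padded to a full year, assuming `Month` holds the numbers 1 to 12. The data and the name `full_year` are made up:

```python
import pandas as pd

# Made-up stand-in data: sales only in months 4 and 6.
sales = pd.DataFrame({
    "Month": [4, 4, 6],
    "Sales_Volume": [100, 20, 150],
})

# Total sales volume per month that appears in the data.
monthly_sales = sales.groupby("Month")["Sales_Volume"].sum()

# Pad to all twelve months, treating absent months as zero sales.
full_year = monthly_sales.reindex(range(1, 13), fill_value=0)

print(full_year)
```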

















  3. Seasonal trends:

seasons_yield = production.groupby(['Season','Crop'])["Yield_per_Hectare"].mean().unstack()

seasons_yield
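To show what `unstack()` does here, a small sketch on made-up data (the season labels are my assumption). Grouping by two columns gives a Series with a two-level index, and `unstack()` moves the last level, `Crop`, into the columns. The result has one row per season and one column per crop, with `NaN` where a crop has no rows for that season:

```python
import pandas as pd

# Made-up stand-in data; Soybean has no Dry-season rows.
production = pd.DataFrame({
    "Season": ["Wet", "Wet", "Dry", "Dry"],
    "Crop": ["Maize", "Soybean", "Maize", "Maize"],
    "Yield_per_Hectare": [5.0, 2.0, 3.0, 4.0],
})

# Rows: Season. Columns: Crop. Values: mean yield.
seasons_yield = (
    production.groupby(["Season", "Crop"])["Yield_per_Hectare"]
    .mean()
    .unstack()
)

print(seasons_yield)
```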










Visualization

  1. Bar chart of average yield per crop:

import matplotlib.pyplot as plt

top_crops.plot(kind="bar", figsize=(8, 5), colormap="Set2")

plt.ylabel("Yield (tons per hectare)", labelpad=10)

plt.xlabel("Crop", labelpad=10)

plt.title("Average Yield per Crop", y=1.02)
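For anyone running this outside a notebook, here is a self-contained sketch of the same bar chart using an explicit figure and axes. The values in `top_crops` are made up, and the `Agg` backend line is only there so the script runs without a display:

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; not needed inside a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the averages computed earlier.
top_crops = pd.Series(
    {"Maize": 4.5, "Wheat": 3.0, "Soybean": 2.0},
    name="Yield_per_Hectare",
)
top_crops.index.name = "Crop"

fig, ax = plt.subplots(figsize=(8, 5))
top_crops.plot(kind="bar", ax=ax, colormap="Set2")

ax.set_xlabel("Crop", labelpad=10)
ax.set_ylabel("Yield (tons per hectare)", labelpad=10)
ax.set_title("Average Yield per Crop", y=1.02)
fig.tight_layout()

# Call fig.savefig("some_name.png") or plt.show() to see the result.
```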















  2. Line chart of monthly sales:

This line plot shows the total sales volume for each month, which makes any seasonal patterns in sales easy to spot.

monthly_sales.plot(kind="line", figsize=(6, 4),marker="o")

plt.ylabel("Sales Volume (tons)",labelpad=10)

plt.xlabel("Month",labelpad=10)

plt.title("Monthly Sales Volume",y=1.02)

plt.grid(True)

plt.xticks(range(1,13))












  3. Seasonal yield trends for all the crops:

# seasons_yield has one row per season, so the x-axis shows seasons
# and the legend shows crops.
seasons_yield.plot(kind='bar', figsize=(8, 5), colormap="Set2")

plt.xlabel('Season', labelpad=10)

plt.ylabel('Yield (tonnes per hectare)', labelpad=10)

plt.title('Seasonal Yield per Hectare', y=1.02)

plt.xticks(rotation=45)

plt.tight_layout()

plt.show()
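A self-contained sketch of this grouped bar chart on made-up data, to show how pandas lays it out: each row of the data frame (a season) becomes a group on the x-axis, and each column (a crop) becomes one bar per group plus a legend entry. The season labels and values are assumptions for illustration:

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; not needed inside a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the Season x Crop table built earlier.
seasons_yield = pd.DataFrame(
    {"Maize": [3.5, 5.0], "Soybean": [1.5, 2.0]},
    index=pd.Index(["Dry", "Wet"], name="Season"),
)
seasons_yield.columns.name = "Crop"

ax = seasons_yield.plot(kind="bar", figsize=(8, 5), colormap="Set2")
ax.set_xlabel("Season", labelpad=10)
ax.set_ylabel("Yield (tonnes per hectare)", labelpad=10)
ax.set_title("Seasonal Yield per Hectare", y=1.02)
plt.xticks(rotation=45)
plt.tight_layout()

# Read back what ended up on the axis and in the legend.
tick_labels = [tick.get_text() for tick in ax.get_xticklabels()]
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(tick_labels, legend_labels)
```

To put crops on the x-axis instead, plotting the transposed table `seasons_yield.T` would swap the two roles.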


Insights and Report

After the analysis and visualization, two main findings stand out:

  1. Maize had the highest average yield per hectare and Soybean had the lowest.
  2. Sales volume was highest in months 6 and 4.
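Findings like these can also be read off in code rather than from the charts. A minimal sketch using `idxmax`, `idxmin`, and `nlargest`; the numbers are made up and merely chosen to mirror the conclusions above, they are not the project's actual figures:

```python
import pandas as pd

# Made-up stand-ins for the two summaries computed during the EDA.
top_crops = pd.Series({"Maize": 4.5, "Wheat": 3.0, "Soybean": 2.0})
monthly_sales = pd.Series({4: 140, 5: 90, 6: 150})

# Index labels of the largest and smallest average yield.
best_crop = top_crops.idxmax()
worst_crop = top_crops.idxmin()

# The two months with the highest total sales volume, largest first.
top_months = monthly_sales.nlargest(2).index.tolist()

print(best_crop, worst_crop, top_months)
```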
