Air Quality Analysis Using Machine Learning

Sai Prakash Jallu
3 min read · Oct 13, 2019

AIM

The goal of this project is to train a machine learning model that can analyse air quality from measurements of some of the gases present in the air.

Working through this project also shows how air pollution affects the environment.

OVERVIEW

This project uses a regression model to predict air pollution levels.

DATASET

The air quality dataset contains measurements of several dangerous gases present in our environment, together with temperature, humidity, and similar readings.

IMPORT PACKAGES

Packages used in this project

Load Data

First we load the .csv file and check that the data is actually present in the dataset.

Then let’s look at the information about the dataset.
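The loading step could look roughly like the sketch below. It reads a tiny in-memory excerpt so that it runs on its own; the column names (`CO(GT)`, `T`, `RH`) and the values are hypothetical stand-ins, not taken from the real file.

```python
import io
import pandas as pd

# Hypothetical three-row excerpt standing in for the real .csv file.
csv_text = """CO(GT),T,RH
2.6,13.6,48.9
2.0,13.3,47.7
2.2,11.9,54.0
"""

# For a file on disk, pass its path instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())  # first rows, to confirm the data was loaded
df.info()         # column names, dtypes and non-null counts
```

`head()` answers the "is the data there" question, and `info()` gives the per-column summary.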

Preprocessing of data

Checking null values in the dataset

The figure above shows how many null values are present in each column of the data set.
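One common way to produce such a per-column count is `isnull().sum()`. This is a small sketch on made-up data; the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up frame with one missing gas reading.
df = pd.DataFrame({
    "CO(GT)": [2.6, np.nan, 2.2],
    "T": [13.6, 13.3, 11.9],
})

# isnull() marks each missing cell, sum() counts them per column.
null_counts = df.isnull().sum()
print(null_counts)
```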

Description of the data set

As you can see, the min row shows a value of -200 for every feature. These values are placeholders rather than real measurements, so data cleaning is important for this data set. We replace them with NaN.
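A minimal sketch of this replacement, assuming -200 is used only as a placeholder and never as a real reading. The data and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up readings where -200 marks a missing measurement.
df = pd.DataFrame({
    "CO(GT)": [2.6, -200.0, 2.2],
    "T": [13.6, 13.3, -200.0],
})

print(df.describe().loc["min"])  # the min row exposes the -200 placeholder

# Turn every -200 into NaN so pandas treats it as missing.
df = df.replace(-200, np.nan)
print(df.isnull().sum())
```

After the replacement, `describe()` skips those cells, so the min row reflects real readings only.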

As you can see, the NHMC(GT) feature is of no use for developing the model.

So we drop the NHMC(GT) column and replace the -200 values in the remaining columns with NaN.
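Dropping the column and replacing the placeholders can be done in one chain. In this sketch the share of -200 entries is printed first as one possible way to judge a column; that criterion is an assumption, and the data and the `CO(GT)` column are made up.

```python
import numpy as np
import pandas as pd

# Made-up data; the NHMC(GT) name is spelled as in the article.
df = pd.DataFrame({
    "NHMC(GT)": [150.0, -200.0, -200.0, -200.0],
    "CO(GT)": [2.6, 2.0, -200.0, 2.2],
})

# Share of placeholder values per column (assumed criterion for dropping).
missing_share = (df == -200).mean()
print(missing_share)

# Drop the column, then mark the remaining -200 values as NaN.
df = df.drop(columns=["NHMC(GT)"]).replace(-200, np.nan)
print(df)
```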

Removing outliers during preprocessing is also important for getting accurate results from our model.
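The article does not say which outlier rule it uses. As an illustration only, here is a sketch of the widely used 1.5 × IQR rule, which marks values far outside the interquartile range as NaN; the column name `T` and the readings are made up.

```python
import pandas as pd

# Made-up temperature readings with one obvious outlier (95.0).
df = pd.DataFrame({"T": [12.0, 13.0, 14.0, 13.5, 12.5, 95.0]})

q1 = df["T"].quantile(0.25)
q3 = df["T"].quantile(0.75)
iqr = q3 - q1

# Keep values inside [q1 - 1.5*IQR, q3 + 1.5*IQR]; others become NaN.
in_range = df["T"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = df.copy()
cleaned["T"] = df["T"].where(in_range)
print(cleaned)
```

Marking outliers as NaN instead of deleting the rows keeps the row count unchanged, so they can be filled in the same pass as the other missing values.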

Finally, fill the NaN values using the ‘ffill’ (forward fill) method.
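Forward fill copies the last valid value downwards into each gap. A small sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Made-up series of readings with two consecutive gaps.
df = pd.DataFrame({"CO(GT)": [2.6, np.nan, np.nan, 2.2]})

# Each NaN takes the most recent non-missing value above it.
filled = df.ffill()
print(filled)
```

Note that a NaN in the very first row has no earlier value to copy, so it would stay NaN after a forward fill.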

Check null values

Data cleaning is now complete for our data set. Let’s look at the description of the data again to check whether any errors remain.

Preprocessing of the data is complete.

Linear Regression

1. Taking X and Y arrays

2. Train_Test_Split

3. Training our Model
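The three steps above can be sketched with scikit-learn as follows. The feature columns, the `target` column, the 70/30 split and the synthetic data are all assumptions made for illustration; the article does not state which columns or split ratio it uses.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: the target is an exact linear function of the features.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "T": rng.uniform(0, 30, 100),
    "RH": rng.uniform(20, 80, 100),
})
df["target"] = 2.0 * df["T"] + 0.5 * df["RH"] + 1.0

# 1. X holds the features, y the value to predict.
X = df[["T", "RH"]]
y = df["target"]

# 2. Hold out 30% of the rows for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# 3. Fit the linear regression on the training rows only.
model = LinearRegression()
model.fit(X_train, y_train)
print(model.intercept_, model.coef_)
```

Because the synthetic target is exactly linear, the fitted coefficients come out at the values used to generate it, which makes the sketch easy to sanity-check.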

Model Evaluation

Prediction Model

Linear regression Score

MSE, MAE, RMSE
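As a rough sketch of how these numbers can be computed, the example below applies `sklearn.metrics` to a handful of made-up true and predicted values. In a real run the predictions would come from `model.predict(X_test)`. The score reported by `LinearRegression.score` is the R² value, computed here with `r2_score`.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up true values and predictions, for illustration only.
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

mse = mean_squared_error(y_true, y_pred)   # mean of squared errors
mae = mean_absolute_error(y_true, y_pred)  # mean of absolute errors
rmse = np.sqrt(mse)                        # same units as the target
r2 = r2_score(y_true, y_pred)              # 1.0 is a perfect fit

print(f"MSE={mse:.4f} MAE={mae:.4f} RMSE={rmse:.4f} R2={r2:.4f}")
```

RMSE is often the easiest of the three errors to read because it is expressed in the same units as the predicted quantity.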

CONCLUSION

Based on the
