A1 - Data Wrangling-1
✅ Tested and working as intended.
Pre-requisites
- Install required libraries (pandas and numpy):
pip install pandas numpy
- Save the dataset iris.csv in the same directory as this Jupyter notebook.
Code blocks
- Import libraries:
import pandas as pd
import numpy as np
- Load the dataset from a CSV file into a pandas DataFrame:
df = pd.read_csv('iris.csv')
df.describe()  # Summary statistics of the numeric columns
- Print first and last 5 rows:
print("First 5 rows:\n", df.head())
print("Last 5 rows:\n", df.tail())
- Flag duplicated rows (True for each row that repeats an earlier one):
df.duplicated()
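Since df.duplicated() only returns a True/False mask, a minimal sketch of counting and removing the flagged rows may help. The toy DataFrame below is hypothetical and stands in for iris.csv; only its column names are taken from this README.

```python
import pandas as pd

# Hypothetical toy data standing in for iris.csv; the third row repeats the first.
toy = pd.DataFrame({
    "sepal.length": [5.1, 4.9, 5.1],
    "variety": ["Setosa", "Setosa", "Setosa"],
})

dup_mask = toy.duplicated()       # True for rows that repeat an earlier row
dup_count = int(dup_mask.sum())   # number of duplicate rows
deduped = toy.drop_duplicates()   # keeps the first occurrence of each row

print(dup_mask.tolist())
print(dup_count)
print(len(deduped))
```

drop_duplicates() returns a new DataFrame, so toy itself is left unchanged.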
- Show which values are null (True/False for each cell):
df.isnull()
- Print summary of DataFrame:
df.info()
- Print shape, i.e. (rows, columns):
df.shape
- Print null (True/False) values in the sepal.length column:
df["sepal.length"].isnull()
- Delete/drop the petal.length column:
y = df.drop(["petal.length"], axis=1)  # axis=1 drops a column; axis=0 drops a row
print(y)
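Note that drop returns a new DataFrame rather than modifying the original. The sketch below illustrates this on a hypothetical two-row frame (not the real dataset), using the equivalent columns= keyword in place of axis=1.

```python
import pandas as pd

# Hypothetical toy data with two of the iris column names.
toy = pd.DataFrame({
    "sepal.length": [5.1, 4.9],
    "petal.length": [1.4, 1.4],
})

# columns=[...] is equivalent to passing the labels with axis=1.
y = toy.drop(columns=["petal.length"])

print(list(y.columns))    # the returned copy no longer has petal.length
print(list(toy.columns))  # the original frame still has both columns
```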
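- In the variety column, replace Setosa with 0 and Virginica with 1:
df['variety'] = df['variety'].replace(['Setosa', 'Virginica'], [0, 1])  # assign back; inplace=True on a column selection may not update df in recent pandas
print(df)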
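The replacement only touches the two listed labels, so any other value in the column is left as it is. A small sketch on a hypothetical Series follows; it assumes the column also contains Versicolor, and it sets dtype=object explicitly to mimic how text columns are typically loaded in pandas 1.x and 2.x.

```python
import pandas as pd

# Hypothetical stand-in for the variety column; dtype=object is an assumption.
variety = pd.Series(["Setosa", "Versicolor", "Virginica"], dtype=object)

# Replace the two listed labels and leave everything else untouched.
encoded = variety.replace(["Setosa", "Virginica"], [0, 1])

print(encoded.tolist())
```

Because one label remains a string, the resulting column holds mixed types; mapping every label to a number would be needed for a fully numeric column.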
- Print the number of null values in each column:
df.isnull().sum()
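To see what the per-column count looks like, here is a minimal sketch on a hypothetical frame with two missing values planted on purpose (the real iris.csv may contain none).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one NaN in each numeric column.
toy = pd.DataFrame({
    "sepal.length": [5.1, np.nan, 4.7],
    "petal.length": [1.4, 1.4, np.nan],
    "variety": ["Setosa", "Setosa", "Setosa"],
})

null_counts = toy.isnull().sum()       # one count per column
total_nulls = int(null_counts.sum())   # overall number of missing cells

print(null_counts.to_dict())
print(total_nulls)
```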