Pandas is a powerful, widely used open-source Python library that provides high-level data structures and tools for fast, convenient data analysis and manipulation. It is especially well suited to structured data, such as tables, spreadsheets, and the results of SQL queries. Built on top of NumPy, Pandas supports rich data operations, including convenient indexing, grouping, filtering, merging, and time series handling.
pip install pandas
import pandas as pd
A Series is a one-dimensional labeled array capable of holding any data type.
import pandas as pd
s = pd.Series([10, 20, 30, 40])
print(s)
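The Series above gets a default integer index (0, 1, 2, ...). As a minimal sketch of what "labeled" means in practice, the example below attaches custom labels; the product names and prices are hypothetical values chosen only for illustration.

```python
import pandas as pd

# Hypothetical labels and values, chosen only for illustration
prices = pd.Series([100, 40, 60], index=['Apple', 'Banana', 'Mango'])

by_label = prices['Banana']    # Look up a value by its index label
by_position = prices.iloc[0]   # Look up a value by integer position
print(by_label, by_position)
```

With custom labels, lookups by label and by position are distinct operations, which matters later when selecting rows from a DataFrame.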
A DataFrame is a two-dimensional labeled data structure with columns of potentially different types.
data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'Salary': [50000, 60000, 70000]
}
df = pd.DataFrame(data)
print(df)
data = {
'Product': ['Apple', 'Banana', 'Mango'],
'Price': [100, 40, 60]
}
df = pd.DataFrame(data)
print(df)
data = [['Tom', 25], ['Jerry', 30]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)
df = pd.read_csv('data.csv')
print(df.head())
print(df.head()) # First 5 rows
print(df.tail()) # Last 5 rows
print(df.shape)
print(df.info())
print(df.describe())
print(df['Name'])
print(df.loc[0]) # Access by label
print(df.iloc[0]) # Access by position
print(df[1:3]) # Rows at positions 1 and 2
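With the default integer index, loc[0] and iloc[0] return the same row, so the difference between label and position is easy to miss. The sketch below uses a hypothetical frame, df_demo, with a non-default index (10, 20, 30) to make the distinction visible.

```python
import pandas as pd

# Hypothetical frame with a non-default index, for illustration only
df_demo = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=[10, 20, 30],
)

row_by_label = df_demo.loc[10]       # Row whose index label is 10
row_by_position = df_demo.iloc[0]    # First row, whatever its label

label_slice = df_demo.loc[10:20]     # Label slices include both endpoints
position_slice = df_demo.iloc[0:1]   # Position slices exclude the stop

print(label_slice)
print(position_slice)
```

Note that df_demo.loc[0] would raise a KeyError here, because no row carries the label 0.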
df['Tax'] = df['Salary'] * 0.1
print(df)
df.rename(columns={'Salary': 'Income'}, inplace=True)
print(df)
df.drop('Tax', axis=1, inplace=True) # Drop column
df.drop(1, axis=0, inplace=True) # Drop row
print(df[df['Age'] > 28])
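Filters like the one above can be combined. The following sketch rebuilds the Name/Age/Salary data from earlier under the hypothetical name people and applies two conditions at once.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
})

# Each condition needs its own parentheses; use & for "and", | for "or"
mask = (people['Age'] > 28) & (people['Salary'] < 70000)
filtered = people[mask]
print(filtered)
```

Python's plain and/or keywords do not work element-wise on Series, which is why the bitwise operators are used here.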
df['Age'] = df['Age'].astype(float)
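print(df.isnull()) # True where a value is missing
print(df.isnull().sum()) # Count of missing values per column
df.fillna(value=0, inplace=True) # Replace missing values with 0
df.dropna(inplace=True) # Alternative: drop rows that contain missing values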
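Filling and dropping are alternative strategies: once fillna has replaced every missing value, dropna has nothing left to remove. A small sketch comparing the two on a hypothetical frame, df_missing, without modifying it in place:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column
df_missing = pd.DataFrame({
    'Age': [25, np.nan, 35],
    'Income': [50000, 60000, np.nan],
})

missing_per_column = df_missing.isnull().sum()  # Missing count per column
filled = df_missing.fillna(0)                   # New frame, NaN replaced by 0
dropped = df_missing.dropna()                   # New frame, incomplete rows removed

print(missing_per_column)
print(filled)
print(dropped)
```

Omitting inplace=True returns new objects, so the original data stays available for comparison.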
grouped = df.groupby('Age')
print(grouped['Income'].mean())
print(df.agg({'Income': ['sum', 'mean'], 'Age': ['min', 'max']}))
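Grouping is easier to follow when the key repeats across rows. The sketch below assumes a hypothetical Dept column (not part of the data used above) so that each group contains more than one row.

```python
import pandas as pd

# Hypothetical data: 'Dept' is an assumed column used only for this example
staff = pd.DataFrame({
    'Dept': ['HR', 'IT', 'IT', 'HR'],
    'Income': [50000, 60000, 70000, 55000],
})

# One mean per department
mean_income = staff.groupby('Dept')['Income'].mean()

# Several aggregations per department at once
summary = staff.groupby('Dept')['Income'].agg(['min', 'max'])

print(mean_income)
print(summary)
```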
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
result = pd.concat([df1, df2])
print(result)
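By default, concat keeps each frame's original index, so the result above contains the labels 0 and 1 twice. A short sketch of the difference ignore_index=True makes, using the hypothetical names df_a and df_b for the same data:

```python
import pandas as pd

df_a = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df_b = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

stacked = pd.concat([df_a, df_b])                        # Index: 0, 1, 0, 1
renumbered = pd.concat([df_a, df_b], ignore_index=True)  # Index: 0, 1, 2, 3

print(stacked)
print(renumbered)
```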
df1 = pd.DataFrame({'key': ['K0', 'K1'], 'A': ['A0', 'A1']})
df2 = pd.DataFrame({'key': ['K0', 'K1'], 'B': ['B0', 'B1']})
merged = pd.merge(df1, df2, on='key')
print(merged)
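In the merge above both frames share the same keys, so the join type does not matter. The sketch below changes one key (K2 is a hypothetical value) to show how the how parameter decides which rows survive.

```python
import pandas as pd

left = pd.DataFrame({'key': ['K0', 'K1'], 'A': ['A0', 'A1']})
right = pd.DataFrame({'key': ['K0', 'K2'], 'B': ['B0', 'B2']})

# Default how='inner': only keys present in both frames
inner = pd.merge(left, right, on='key')

# how='outer': all keys from either frame, NaN where a side has no match
outer = pd.merge(left, right, on='key', how='outer')

print(inner)
print(outer)
```

The how parameter also accepts 'left' and 'right' to keep every key from one side only.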
df1 = df1.set_index('key')
df2 = df2.set_index('key')
joined = df1.join(df2)
print(joined)
df.sort_values(by='Age', ascending=False, inplace=True)
print(df)
df.sort_index(inplace=True)
df['Rank'] = df['Income'].rank()
print(df)
df['JoinDate'] = pd.to_datetime(['2023-01-01', '2023-02-15', '2023-03-01']) # Assumes df has exactly three rows
df['Year'] = df['JoinDate'].dt.year
df['Month'] = df['JoinDate'].dt.month
print(df[df['JoinDate'] > '2023-01-31'])
df = pd.read_csv('data.csv')
df.to_csv('output.csv', index=False)
df = pd.read_excel('data.xlsx') # Needs an Excel engine such as openpyxl installed
df.to_excel('output.xlsx', index=False)
pivot = df.pivot_table(values='Income', index='Age', aggfunc='mean')
print(pivot)
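A pivot table can also spread a second key across the columns. The following is a minimal sketch on hypothetical sales data; the Region and Product columns and all values are assumptions made for this example.

```python
import pandas as pd

# Hypothetical sales data, for illustration only
sales = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South'],
    'Product': ['Apple', 'Banana', 'Apple', 'Banana'],
    'Price': [100, 40, 110, 45],
})

# Rows are regions, columns are products, cells hold the mean price
price_table = sales.pivot_table(
    values='Price', index='Region', columns='Product', aggfunc='mean'
)
print(price_table)
```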
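import matplotlib.pyplot as plt # pandas plotting relies on matplotlib
df.plot(x='Age', y='Income', kind='line')
df.plot(x='Name', y='Income', kind='bar')
df['Age'].plot(kind='hist')
plt.show() # Display the figures when running as a script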
arrays = [['bar', 'bar', 'baz'], ['one', 'two', 'three']]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'A': [1, 2, 3]}, index=index)
print(df)
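df_unstacked = df.unstack() # Move the innermost index level ('second') into the columns
df_stacked = df_unstacked.stack() # Move that level back into the row index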
df['RollingMean'] = df['Income'].rolling(window=2).mean()
print(df)
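A rolling window of size 2 has no complete window at the first row, so that position comes out as NaN. The sketch below shows this on a small hypothetical Series, along with the min_periods option that relaxes the requirement.

```python
import pandas as pd

# Hypothetical income values, for illustration only
income = pd.Series([50000, 60000, 70000])

# Mean of each value and the one before it; the first entry is NaN
rolling_mean = income.rolling(window=2).mean()

# min_periods=1 allows a result as soon as one value is available
partial_mean = income.rolling(window=2, min_periods=1).mean()

print(rolling_mean)
print(partial_mean)
```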
Pandas is a cornerstone tool in the Python data science ecosystem. It provides intuitive, flexible, and efficient methods for data analysis and preprocessing, enabling rapid insights and modeling. With features such as powerful indexing, missing data handling, groupby and aggregation, file I/O, and time series support, Pandas is indispensable for both beginners and professionals working with real-world data.
Learning Pandas thoroughly equips you to handle almost any kind of tabular data manipulation or analysis task in Python. As you continue exploring, combining Pandas with libraries like NumPy, matplotlib, seaborn, and scikit-learn will further expand your data science capabilities.
Copyrights © 2024 letsupdateskills All rights reserved