Pandas provides an `apply()` method that runs a custom function on each value in a column (a Series), or on each row or column of a DataFrame. It is a flexible way to transform data when no built-in operation does what you need.
                                
# Function to double the value
def double(x):
  return 2 * x
# Apply this function to each value in a column
df['column1'] = df['column1'].apply(double)
# You can also use lambda functions for quick operations
df['column2'] = df['column2'].apply(lambda x: 3 * x)
# Apply a function across rows to create a new column
df['newColumn'] = df.apply(lambda row: row['column1'] * 1.5 + row['column2'], axis=1)
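
As a self-contained sketch of the calls above, the snippet below builds a small DataFrame and applies a function element-wise and row-wise. The column values are made up for illustration. For simple arithmetic like this, a vectorized expression such as `df['column1'] * 2` gives the same values and is usually faster than `apply()`.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [10, 20, 30]})

def double(x):
    return 2 * x

# Element-wise: Series.apply calls double() once per value
doubled = df['column1'].apply(double)

# Row-wise: axis=1 passes each row to the function as a Series
combined = df.apply(lambda row: row['column1'] * 1.5 + row['column2'], axis=1)

# Vectorized equivalents, computed without a Python-level loop
doubled_vec = df['column1'] * 2
combined_vec = df['column1'] * 1.5 + df['column2']

print(doubled.tolist())
print(combined.tolist())
```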
                              
You can add a new column to a DataFrame by assigning to a new column name. The assigned value can be a list with one entry per row, a single value repeated for every row, or the result of a calculation on existing columns.
                                
# Add a new column with specific values (the list length must match the number of rows)
df['newColumn'] = [1, 2, 3, 4]
# Set all values in a new column to the same value
df['newColumn'] = 1
# Create a new column by calculating from an existing column
df['newColumn'] = df['oldColumn'] * 5
                              
You can create a Pandas DataFrame from a dictionary, from a list of lists, or by reading a file such as a CSV.
                                
import pandas as pd
# Create DataFrame from a dictionary
data = {'name': ['Anthony', 'Maria'], 'age': [30, 28]}
df = pd.DataFrame(data)
# Create DataFrame from a list of lists
data = [['Tom', 20], ['Jack', 30], ['Meera', 25]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
# Create DataFrame by reading a CSV file
df = pd.read_csv('students.csv')
                              
A Pandas DataFrame is a two-dimensional, table-like structure with labeled rows and columns. Pandas is conventionally imported under the alias `pd`.
                                
import pandas as pd
                              
Pandas allows you to group rows by the values in one or more columns and then apply an aggregate function, such as a mean or a sum, to each group. This is a common way to summarize data.
                                
# Create a DataFrame
df = pd.DataFrame([
  ['Amy', 'Assignment 1', 75],
  ['Amy', 'Assignment 2', 35],
  ['Bob', 'Assignment 1', 99],
  ['Bob', 'Assignment 2', 35]
], columns=['Name', 'Assignment', 'Grade'])
# Group by 'Name' and calculate the average grade
df.groupby('Name')['Grade'].mean()
# Output (a Series indexed by Name):
# | Name | Grade |
# | ---  | ---   |
# | Amy  | 55.0  |
# | Bob  | 67.0  |
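
To compute several statistics per group in one call, `agg()` accepts a list of function names. The following is a minimal sketch using the same sample grades as above. The variable names `summary` and `by_assignment` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([
    ['Amy', 'Assignment 1', 75],
    ['Amy', 'Assignment 2', 35],
    ['Bob', 'Assignment 1', 99],
    ['Bob', 'Assignment 2', 35]
], columns=['Name', 'Assignment', 'Grade'])

# One row per student, one column per statistic
summary = df.groupby('Name')['Grade'].agg(['mean', 'max', 'count'])
print(summary)

# Grouping by a different column works the same way
by_assignment = df.groupby('Assignment')['Grade'].mean()
print(by_assignment)
```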
                              
Pandas provides methods to calculate summary statistics such as the mean, standard deviation, and median of a column. These give a quick picture of how the values are distributed.
                                
# Calculate different statistics for a column
df['columnName'].mean()    # Average
df['columnName'].std()     # Standard deviation
df['columnName'].median()  # Median
df['columnName'].max()     # Maximum value
df['columnName'].min()     # Minimum value
df['columnName'].count()   # Count of non-missing values
df['columnName'].nunique() # Number of unique values
df['columnName'].unique()  # Array of unique values
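
The sketch below runs a few of these methods on a small made-up column that contains one missing value, to show how missing data is treated. It also uses `describe()`, which returns several of these statistics at once. The sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample values; None is stored as NaN (a missing value)
df = pd.DataFrame({'columnName': [4, 8, 8, None, 10]})
col = df['columnName']

print(col.mean())     # NaN is skipped by default
print(col.count())    # counts non-missing values only
print(col.nunique())  # distinct non-missing values
print(col.unique())   # NumPy array; NaN appears as one of the values

# count, mean, std, min, quartiles, and max in a single Series
stats = col.describe()
print(stats)
```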
                              
In practice, related data is often split across multiple tables, as in a relational database, so that each piece of information is stored only once.
Pandas can combine such tables with a merge, which joins rows that share a value in a key column. This is useful when you need to integrate information from different sources.
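
A minimal sketch of a merge, assuming two made-up tables that share a `student_id` key column (the table names, column names, and values are hypothetical):

```python
import pandas as pd

# Hypothetical tables, for illustration only
students = pd.DataFrame({'student_id': [1, 2, 3],
                         'name': ['Amy', 'Bob', 'Cara']})
grades = pd.DataFrame({'student_id': [1, 2, 2, 4],
                       'grade': [75, 99, 35, 60]})

# Inner merge (the default): keep only keys present in both tables
inner = pd.merge(students, grades, on='student_id')

# Left merge: keep every student; a student with no grade gets NaN
left = pd.merge(students, grades, on='student_id', how='left')

print(inner)
print(left)
```

The `how` parameter also accepts `'right'` and `'outer'`, which keep all rows from the right table or from both tables respectively.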