sanyog-chavhan/Basic-Statistics-using-Python

Exploring Basic Statistical Concepts with Python

This repository hosts a Google Colab notebook that provides a hands-on introduction to fundamental statistical concepts using Python. Using a real dataset and clear explanations, it guides you through key measures: mean, median, mode, correlation, and standard deviation.

Table of Contents

  1. Dataset Description
  2. Exploring Statistical Concepts
  3. How to Use the Notebook

Dataset Description

The dataset is from https://www.baseball-reference.com/

The variables are as follows:

1. Team
2. League
3. Year
4. Runs Scored (RS)
5. Runs Allowed (RA)
6. Wins (W)
7. On-Base Percentage (OBP)
8. Slugging Percentage (SLG)
9. Batting Average (BA)
10. Playoffs (binary)
11. RankSeason
12. RankPlayoffs
13. Games Played (G)
14. Opponent On-Base Percentage (OOBP)
15. Opponent Slugging Percentage (OSLG)

The dataset contains yearly statistics for baseball teams across different leagues.

Exploring Statistical Concepts

Mean:

The mean is the arithmetic average of a group of numbers. It's calculated by adding up all the numbers and then dividing by how many there are. It gives us an idea of what a "typical" value in the group might be, although a few extreme values can pull it up or down.
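As a minimal sketch of this calculation, the snippet below computes the mean by hand and with Python's standard `statistics` module. The `runs_scored` values are made up for illustration and are not taken from the dataset.

```python
from statistics import fmean

# Hypothetical runs-scored values for five team seasons (illustrative only).
runs_scored = [700, 750, 800, 650, 720]

# Mean by definition: sum of the values divided by how many there are.
manual_mean = sum(runs_scored) / len(runs_scored)

# The same result using the standard library.
library_mean = fmean(runs_scored)

print(manual_mean, library_mean)
```

Both approaches give the same number; the library call is simply more convenient for longer lists.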

Median:

The median is the middle value in a list of numbers when they're arranged in order. If there's an odd number of values, it's the one right in the middle. If there's an even number of values, it's the average of the two middle numbers. Because it depends only on the middle of the ordered list, the median is much less affected by extreme values than the mean.
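The odd and even cases, and the median's resistance to extreme values, can be sketched with the standard `statistics` module. The win totals below are made-up numbers used only for illustration.

```python
from statistics import mean, median

# Hypothetical win totals (illustrative only).
wins_odd = [81, 95, 70, 88, 102]   # five values: the median is the middle one
wins_even = [81, 95, 70, 88]       # four values: average of the two middle ones

median_odd = median(wins_odd)      # sorted: 70, 81, 88, 95, 102
median_even = median(wins_even)    # sorted: 70, 81, 88, 95

# Replace the largest value with an extreme one: the median stays put,
# while the mean shifts noticeably.
wins_outlier = [81, 95, 70, 88, 1000]
median_outlier = median(wins_outlier)
mean_before = mean(wins_odd)
mean_after = mean(wins_outlier)

print(median_odd, median_even, median_outlier)
print(mean_before, mean_after)
```

Note that `median` sorts the data internally, so the input list does not need to be ordered first.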

Mode:

The mode is the value that appears most often in a set. Unlike the mean and median, it also works for non-numeric data such as category labels. A set can have more than one mode if several values tie for the highest count.
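A small sketch of finding the mode, assuming Python 3.8 or later for `statistics.multimode`. The league labels and numbers are made up for illustration.

```python
from statistics import mode, multimode

# Hypothetical league labels (illustrative only); the mode works on text too.
leagues = ["AL", "NL", "AL", "NL", "AL"]
most_common_league = mode(leagues)

# When several values tie for most frequent, multimode returns all of them
# in the order they first appear.
tied_values = [1, 1, 2, 2, 3]
all_modes = multimode(tied_values)

print(most_common_league, all_modes)
```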

Correlation:

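Correlation measures how two variables change together. If they tend to increase or decrease at the same time, they have a positive correlation. If one goes up while the other goes down, they have a negative correlation. If there's no clear linear pattern, the correlation is close to zero. The most common measure, the Pearson correlation coefficient, ranges from -1 to 1. Correlation helps us understand how strongly two sets of data are related, though it does not by itself show that one causes the other.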
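To make the idea concrete, here is a minimal sketch of the Pearson correlation coefficient written out by hand. The helper name `pearson` and all of the values are hypothetical, chosen so that the relationships are perfectly linear; real data will rarely be this clean.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either sequence is constant.
    return cov / sqrt(var_x * var_y)

# Made-up values (illustrative only).
runs_scored = [650, 700, 750, 800, 850]
runs_allowed = [850, 800, 750, 700, 650]
wins = [70, 78, 86, 94, 102]

r_positive = pearson(runs_scored, wins)    # both rise together
r_negative = pearson(runs_allowed, wins)   # one rises as the other falls

print(r_positive, r_negative)
```

In practice a library routine would usually be preferred over a hand-written helper; the explicit version is shown only to expose the arithmetic.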

Standard Deviation:

The standard deviation shows us how much the numbers in a group spread out from the average. If the standard deviation is small, the numbers are close to the average. If it's big, the numbers are more spread out. It gives us a sense of how varied the data is and whether it's clustered around the mean or spread widely.
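The contrast between clustered and spread-out data can be sketched with the standard `statistics` module. Both lists below are made up and share the same mean, so only their spread differs.

```python
from statistics import mean, pstdev, stdev

# Two hypothetical groups with the same mean (illustrative only).
clustered = [98, 99, 100, 101, 102]
spread_out = [60, 80, 100, 120, 140]

# pstdev treats the data as the whole population (divides by n).
sd_clustered = pstdev(clustered)
sd_spread = pstdev(spread_out)

# stdev treats the data as a sample (divides by n - 1), so it is slightly larger.
sample_sd_clustered = stdev(clustered)

print(mean(clustered), mean(spread_out))
print(sd_clustered, sd_spread, sample_sd_clustered)
```

Which version to use depends on whether the data is the full population or a sample drawn from it.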

How to Use the Notebook

Follow these steps to start exploring the statistical concepts using the provided Google Colab notebook:

  1. Clone or Download: Clone this repository to your local machine using Git or download the repository as a ZIP file.

  2. Open in Google Colab: You can open the notebook directly by clicking the "Open in Colab" button at the top of the notebook file.

  3. Upload to Google Colab: Alternatively, upload the downloaded notebook (notebook.ipynb) and the CSV file to your Google Drive. Then open the notebook with Google Colab by right-clicking the file and selecting "Open with" > "Google Colaboratory".

  4. Interact with the Notebook: Once the notebook is open, you can follow along with the code cells. Execute the code cells by clicking the "Run" button or using the keyboard shortcut (Shift + Enter).

  5. Explore and Learn: Engage with the interactive exercises and explanations to grasp the concepts of mean, median, mode, correlation, and standard deviation.

Feel free to experiment, modify the code, and further explore statistical measures on your own!
