diff --git a/guide/english/data-science-tools/pandas/index.md b/guide/english/data-science-tools/pandas/index.md index 10604a09758..42a5183c98a 100644 --- a/guide/english/data-science-tools/pandas/index.md +++ b/guide/english/data-science-tools/pandas/index.md @@ -14,17 +14,17 @@ import pandas as pd ``` ## Data frames -A data frame consists of a number of rows and columns. Each column represents a feature of the data set, and so, has a name and a data type. Each row represents a data point through associated feature values. The pandas library allows you to manipulate the data in a data frame in various ways. pandas has a lot of possibilities, so the following is merely scratching the surface to give you a feel for the library. +A data frame consists of a number of rows and columns. Each column represents a feature of the data set, and so has a name and a data type. Each row represents a data point through associated feature values. The pandas library allows you to manipulate the data in a data frame in various ways. pandas has a lot of possibilities, so the following is merely scratching the surface to give a feel for the library. ## Series -Series is the basic data-type in pandas. A Series is very similar to an array (NumPy array, in fact it is built on top of the NumPy array object). A Series can have axis labels, as it can be indexed by a label with no number indexing for the location of data. It can hold any valid Python Object like List, Dictionary etc. +Series is the basic data-type in pandas. A Series is very similar to an array (NumPy array) (in fact it is built on top of the NumPy array object). A Series can have axis labels, as it can be indexed by a label with no number indexing for the location of data. It can hold any valid Python Object like List, Dictionary, etc. ## Loading data from a csv file A `.csv` file is a *comma separated value* file. A very common way to store data. To load such data into a pandas data frame use the `read_csv` method: ```python df = pd.read_csv(file_path) ``` -Here, `file_path` can be a local path to a csv file on you computer, or a url pointing to one. The column names may be included in the csv file, or the may be passed as an argument. For more on this, and much more, take a look at the [documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html?highlight=read_csv#pandas.read_csv). +Here, `file_path` can be a local path to a csv file on you computer, or an url pointing to one. The column names may be included in the csv file, or may be passed as an argument. For more on this, and much more, take a look at the [documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html?highlight=read_csv#pandas.read_csv). ## Getting an overview of a data frame To show the first few rows of a data frame, the `head` method is useful (once more this should sound familiar to R programmers): @@ -33,7 +33,7 @@ df.head() ``` This will show the first 5 rows of the data frame. -To show more than first 5 rows simply put the number of rows you want to print out inside the `head` method. +To show more than the first 5 rows simply put the number of rows you want to print out inside the `head` method. ```python df.head(10) ```