pandas.iloc vs pandas.loc
DataFrame.loc vs DataFrame.iloc
DataFrame.loc
Access a group of rows and columns by label(s) or boolean array.
.loc[] is primarily label based, but may also be used with a boolean array.
Select rows and columns by labels
df.loc[rows, columns]
Inputs
- A single label (e.g. 5 or 'a'). Note that 5 is interpreted as a label of the index, never as an integer position along the index.
df.loc['country']  # the row labeled 'country'; use df.loc[:, 'country'] for a column
- A list or array of labels (e.g. ['a', 'b', 'c'])
df.loc[['country', 'points']]
  - Note: df.loc[['a', 'b']] ≠ df.loc['a', 'b']
    - df.loc['a', 'b'] → 'a' is the row label, 'b' is the column label.
    - df.loc[['a', 'b']] → ['a', 'b'] is a list of row labels.
- A slice object with labels (e.g. 'a':'f'). Note: contrary to usual Python slices, both the start and the stop are included.
df.loc['country':'price']
- A boolean array of the same length as the axis being sliced (e.g. [True, False, True])
df.loc[[True, False, True]]
- An alignable boolean Series. The index of the key will be aligned before masking
df.loc[pd.Series([True, False, False], index=['country', 'points', 'price'])]
- An alignable index. The index of the returned selection will be the input.
- A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above)
df.loc[lambda df: df['points'] == 30]
- Conditional
df.loc[df['points'] > 60]
df.loc[df['points'] > 60, ['country']]
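The input forms listed above can be tried side by side. This is a minimal sketch on a hypothetical sample frame: the row labels 'a', 'b', 'c' and the values are assumptions made for illustration, while the column names mirror the ones used in the examples.

```python
import pandas as pd

# Hypothetical sample data: rows are labeled 'a', 'b', 'c';
# 'country', 'points' and 'price' are columns.
df = pd.DataFrame(
    {
        "country": ["Italy", "France", "Spain"],
        "points": [87, 30, 65],
        "price": [20.0, 15.0, 12.5],
    },
    index=["a", "b", "c"],
)

single = df.loc["a"]                        # single label -> one row as a Series
pair = df.loc["a", "points"]                # row label, column label -> one value
rows = df.loc[["a", "b"]]                   # list of row labels -> DataFrame
label_slice = df.loc["a":"b"]               # label slice: 'a' and 'b' both included
col_slice = df.loc[:, "country":"points"]   # label slices work on columns too
mask = df.loc[[True, False, True]]          # boolean list, one flag per row

# A boolean Series is aligned on its index before masking,
# so the order of its labels does not matter.
aligned = df.loc[pd.Series([True, False, False], index=["c", "a", "b"])]

by_callable = df.loc[lambda d: d["points"] == 30]      # callable returning a mask
conditional = df.loc[df["points"] > 60, ["country"]]   # condition plus column list
```

Passing the column selection as a list (['country']) keeps the result a DataFrame; passing the bare label 'country' would return a Series instead.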
You can also set the value for all items matching the list of labels
df.loc[0, ['country']] = 'Italy'
Or set the value for an entire row or column
df.loc[1] = 40
df.loc[:, 'points'] = 40
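A short sketch of these assignments, assuming hypothetical frames with the default integer index (so 0, 1, 2 are labels here, not positions). The all-numeric frame named scores is an assumption added so that assigning 40 to a whole row does not mix numbers into a text column.

```python
import pandas as pd

# Hypothetical data with the default integer index, so 0, 1, 2 are labels.
df = pd.DataFrame(
    {"country": ["Spain", "France", "Portugal"], "points": [87, 30, 65]}
)

# Set the selected cell; a string value must be quoted.
df.loc[0, ["country"]] = "Italy"

# Set only the rows that match a condition.
df.loc[df["points"] > 60, "points"] = 100

# Set an entire row or an entire column of an all-numeric frame.
scores = pd.DataFrame({"points": [87, 30, 65], "price": [20, 15, 12]})
scores.loc[1] = 40            # every column of the row labeled 1
scores.loc[:, "price"] = 10   # every row of the 'price' column
```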
Error
- KeyError: if any items are not found
- IndexingError: if an indexed key is passed and its index is unalignable to the frame index
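The KeyError case can be reproduced with a minimal sketch on a hypothetical two-row frame. Only KeyError is shown here; the use of reindex as a tolerant alternative is a suggestion, not something the notes above prescribe.

```python
import pandas as pd

# Hypothetical frame with row labels 'a' and 'b'.
df = pd.DataFrame({"points": [87, 30]}, index=["a", "b"])

try:
    df.loc["z"]              # 'z' is not an index label
    missing_label = False
except KeyError:
    missing_label = True

try:
    df.loc[["a", "z"]]       # a list containing any missing label also fails
    missing_in_list = False
except KeyError:
    missing_in_list = True

# reindex tolerates missing labels and fills the absent rows with NaN.
tolerant = df.reindex(["a", "z"])
```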
DataFrame.iloc
Purely integer-location based indexing for selection by position.
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
Select rows and columns by integer position
df.iloc[rows, columns]
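The difference between labels and positions is easiest to see when the index holds integers that do not match the positions. The sketch below assumes a hypothetical frame with the labels 10, 20, 30.

```python
import pandas as pd

# Hypothetical frame whose integer labels differ from the positions 0, 1, 2.
df = pd.DataFrame({"points": [87, 30, 65]}, index=[10, 20, 30])

by_label = df.loc[10, "points"]    # label 10 -> the first row
by_position = df.iloc[0, 0]        # position 0 -> the same row

label_slice = df.loc[10:20]        # label slice: the stop label 20 is included
position_slice = df.iloc[0:1]      # position slice: the stop position 1 is excluded
```

With this index, df.loc[0] would raise a KeyError because no row is labeled 0, while df.iloc[0] returns the first row.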
Inputs
- An integer
df.iloc[0]
- A list or array of integers
df.iloc[[0, 1]]
- A slice object with ints
df.iloc[:3]
df.iloc[1:6]
- A boolean array
df.iloc[[True, True, False]]
- A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above). This is useful in method chains, when you do not have a reference to the calling object but would like to base your selection on some value.
df.iloc[lambda x: x.index % 2 == 0]
df.iloc[:, lambda df: [0, 2]]
  - The x passed to the lambda is the DataFrame being sliced. The first example selects the rows whose index label is even.
- A tuple of row and column indexes. The tuple elements consist of one of the above inputs
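The positional inputs above can be sketched on a hypothetical four-row frame with the default integer index; the data values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data with the default index 0..3.
df = pd.DataFrame(
    {
        "country": ["Italy", "France", "Spain", "Portugal"],
        "points": [87, 30, 65, 90],
        "price": [20.0, 15.0, 12.5, 30.0],
    }
)

first_row = df.iloc[0]                        # integer -> one row as a Series
two_rows = df.iloc[[0, 1]]                    # list of integers
head3 = df.iloc[:3]                           # slice: positions 0, 1, 2
mask = df.iloc[[True, True, False, False]]    # boolean list matching the row count
even_rows = df.iloc[lambda x: x.index % 2 == 0]   # callable returning a boolean array
cols_0_2 = df.iloc[:, lambda d: [0, 2]]       # callable returning column positions
cell = df.iloc[1, 2]                          # (row, column) tuple -> one value
block = df.iloc[[0, 2], 0:2]                  # tuple mixing a list and a slice
```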
Error
- IndexError: if a requested indexer is out of bounds, except slice indexers which allow out of bounds indexing
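A minimal sketch of the out-of-bounds behavior, assuming a hypothetical three-row frame: a single integer past the end raises, while a slice is quietly clipped.

```python
import pandas as pd

# Hypothetical three-row frame: valid positions are 0, 1, 2.
df = pd.DataFrame({"points": [87, 30, 65]})

try:
    df.iloc[5]               # position 5 does not exist
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

clipped = df.iloc[1:10]      # slice is clipped to the rows that exist
empty = df.iloc[5:10]        # slice entirely out of range -> empty DataFrame
```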