pandas.iloc vs pandas.loc

DataFrame.loc vs DataFrame.iloc

DataFrame.loc

Access a group of rows and columns by label(s) or a boolean array. .loc[] is primarily label based, but may also be used with a boolean array.

Select rows and columns by labels

df.loc[rows, columns]

Inputs

  • A single label (e.g. 5 or 'a')
    • Note: 5 is interpreted as a label of the index, never as an integer position along the index.
      df.loc['country']
      
  • A list or array of labels (e.g. ['a', 'b', 'c'])
    df.loc[['country', 'points']]
    
    • Note: df.loc[['a', 'b']] ≠ df.loc['a', 'b']
      • df.loc['a', 'b'] → 'a' is the row label, 'b' is the column label.
      • df.loc[['a', 'b']] → ['a', 'b'] is a list of row labels.
  • A slice object with labels (e.g. 'a':'f')
    • Note, contrary to usual Python slices, both the start and the stop are included.
      df.loc['country':'price']
      
  • A boolean array of the same length as the axis being sliced (e.g. [True, False, True])
    df.loc[[True, False, True]]
    
  • An alignable boolean Series. The index of the key will be aligned before masking.
    df.loc[pd.Series([True, False, False],
                     index=['country', 'points', 'price'])]
    
  • An alignable index. The index of the returned selection will be the input.
  • A callable function with one argument (the calling Series or DataFrame) and that returns valid output for indexing (one of the above)
    df.loc[lambda df: df['points'] == 30]
    
  • A conditional expression that evaluates to a boolean Series
    df.loc[df['points'] > 60]
    
    df.loc[df['points'] > 60, ['country']]
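
The input types above can be tried together on a small frame. This is a minimal sketch: the sample data and the row labels 'a' to 'd' are made up for illustration and are not taken from the original examples.

```python
import pandas as pd

# Hypothetical sample data; the row labels 'a'..'d' are chosen for illustration.
df = pd.DataFrame(
    {
        "country": ["Italy", "France", "Spain", "Chile"],
        "points": [87, 55, 91, 30],
        "price": [20.0, 15.0, 35.0, 12.0],
    },
    index=["a", "b", "c", "d"],
)

# Single label: returns the row labelled 'a' as a Series.
row_a = df.loc["a"]

# List of labels: returns a DataFrame with rows 'a' and 'b'.
rows_ab = df.loc[["a", "b"]]

# Row label plus column label: returns a single value.
country_a = df.loc["a", "country"]

# Label slice: both endpoints are included, so this yields rows 'a', 'b', 'c'.
rows_a_to_c = df.loc["a":"c"]

# Boolean condition combined with a list of column labels.
high_scorers = df.loc[df["points"] > 60, ["country"]]

# Callable: receives the DataFrame and returns a boolean mask.
exactly_30 = df.loc[lambda d: d["points"] == 30]
```

Note that `df.loc["a"]` looks up a row label. To pick a column by label with `.loc`, pass it in the second position, as in `df.loc[:, "country"]`.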
    

You can also set the value for all items matching a list of labels:

df.loc[0, ['country']] = 'Italy'

Or set the value for an entire row or column:

df.loc[1] = 40
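
The assignment forms can be sketched end to end as follows. The frame here is hypothetical and all-numeric, so that writing one scalar across a whole row does not mix types within a column.

```python
import pandas as pd

# Hypothetical numeric data with the default RangeIndex (row labels 0, 1, 2).
df = pd.DataFrame({"points": [87, 55, 91], "price": [20, 15, 35]})

# Set an entire column to one value.
df.loc[:, "price"] = 10

# Set a single cell by row label and column label.
df.loc[0, "price"] = 25

# Set one column for every row that matches a condition.
df.loc[df["points"] > 90, "points"] = 90

# Set every column of the row labelled 1 to the same scalar.
df.loc[1] = 40
```

After these four assignments, `points` holds 87, 40, 90 and `price` holds 25, 40, 10.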

Error

  • KeyError: if any items are not found
  • IndexingError: if an indexed key is passed and its index is unalignable to the frame index
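
Both errors can be reproduced with a small sketch. The frame and labels are hypothetical, and `pd.errors.IndexingError` is assumed to be available at that path, which holds for recent pandas releases (1.5 and later).

```python
import pandas as pd

# Hypothetical frame with row labels 'a' and 'b'.
df = pd.DataFrame({"points": [87, 55]}, index=["a", "b"])

# A label that is not in the index raises KeyError.
try:
    df.loc["z"]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

# A list that contains any missing label also raises KeyError.
try:
    df.loc[["a", "z"]]
    missing_in_list_raised = False
except KeyError:
    missing_in_list_raised = True

# A boolean Series whose index cannot be aligned to the frame index
# raises IndexingError.
mask = pd.Series([True, False], index=["x", "y"])
try:
    df.loc[mask]
    unalignable_raised = False
except pd.errors.IndexingError:
    unalignable_raised = True
```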

DataFrame.iloc

Purely integer-location based indexing for selection by position. .iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.

Select rows and columns by integer position

df.iloc[rows, columns]

Inputs

  • An integer
    df.iloc[0]
    
  • A list or array of integers
    df.iloc[[0, 1]]
    
  • A slice object with ints
    df.iloc[:3]
    df.iloc[1:6]
    
  • A boolean array
    df.iloc[[True, True, False]]
    
  • A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above). This is useful in method chains, when you do not have a reference to the calling object but would like to base your selection on some value.
    df.iloc[lambda x: x.index % 2 == 0]
    df.iloc[:, lambda df: [0, 2]]
    
    • The x passed to the lambda is the DataFrame being sliced. The first example selects the rows whose index label is even; the second selects the first and third columns.
  • A tuple of row and column indexes. The tuple elements consist of one of the above inputs, e.g. (0, 1).
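
The positional inputs above can be sketched on a frame with string row labels, which makes it clear that .iloc ignores labels entirely. The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the string row labels show that .iloc ignores labels.
df = pd.DataFrame(
    {
        "country": ["Italy", "France", "Spain", "Chile"],
        "points": [87, 55, 91, 30],
        "price": [20.0, 15.0, 35.0, 12.0],
    },
    index=["a", "b", "c", "d"],
)

first_row = df.iloc[0]            # Series for the row at position 0
first_two = df.iloc[[0, 1]]       # DataFrame with the rows at positions 0 and 1
rows_1_to_2 = df.iloc[1:3]        # positions 1 and 2; the stop is excluded
cell = df.iloc[0, 1]              # row position 0, column position 1
block = df.iloc[[0, 2], [0, 2]]   # rows 0 and 2, columns 0 and 2

# Callable on the column axis: receives the DataFrame, returns positions.
cols_by_callable = df.iloc[:, lambda d: [0, 2]]

# Contrast with .loc, whose label slices include the stop label.
by_label = df.loc["a":"c"]        # three rows: 'a', 'b', 'c'
by_position = df.iloc[0:2]        # two rows: positions 0 and 1
```

The `x.index % 2 == 0` callable shown earlier needs a numeric index, so it would not work on this string-labelled frame.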

Error

  • IndexError: if a requested indexer is out of bounds, except slice indexers which allow out of bounds indexing
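
The difference between out-of-bounds integers and out-of-bounds slices can be checked with a short sketch on a hypothetical three-row frame:

```python
import pandas as pd

# Hypothetical frame with three rows (positions 0, 1, 2).
df = pd.DataFrame({"points": [87, 55, 91]})

# A single out-of-bounds position raises IndexError.
try:
    df.iloc[10]
    out_of_bounds_raised = False
except IndexError:
    out_of_bounds_raised = True

# Slices are clipped to the available range instead of raising.
clipped = df.iloc[1:10]   # rows at positions 1 and 2
empty = df.iloc[5:10]     # no rows, but no error either
```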



