【Python】About the key argument when sorting


A memo about the key argument, which can be specified when sorting in Python.

Sort by key

The key argument of the sorted function takes a function that is called on each item and returns the value to sort by. Passing a lambda expression that picks out one element of each item gives you a new list sorted by that element.

before = [
    ['Chris', 90], ['Bob', 70], ['Alice', 80]
]

# Sort by score
after = sorted(before, key=lambda x: x[1])

print(f'before: {before}')
print(f'after: {after}')
# Output
before: [['Chris', 90], ['Bob', 70], ['Alice', 80]]
after: [['Bob', 70], ['Alice', 80], ['Chris', 90]]
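
The same sort can also be sketched with a named function in place of the lambda. This makes it easy to confirm that sorted returns a new list and leaves the original untouched, and that reverse=True flips the order. The helper name get_score and the variable names are illustrative assumptions, not part of the example above.

```python
# Illustrative sketch: get_score is a hypothetical helper name.
before = [['Chris', 90], ['Bob', 70], ['Alice', 80]]

def get_score(record):
    """Return the score (second element), used as the sort key."""
    return record[1]

# sorted() calls get_score once per item and orders items by the result.
ascending = sorted(before, key=get_score)
# reverse=True sorts by the same key in descending order.
descending = sorted(before, key=get_score, reverse=True)

print(f'ascending: {ascending}')
print(f'descending: {descending}')
print(f'before (unchanged): {before}')
```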

Sort by 2 keys

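You can also sort by two keys: items are sorted by the first key, and items whose first keys are equal are then sorted by the second key. To do this, have the lambda expression return a tuple () of the keys. Tuples are compared element by element from the left, so the second key is only consulted when the first keys are equal.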

before = [
    ['Bob', 70], ['Alice', 80], ['Chris', 90], ['Alice', 95], ['Chris', 69], ['Bob', 94]
]

# Sort by name, and if the names are the same, sort by score
after = sorted(before, key=lambda x: (x[0], x[1]))

print(f'before: {before}')
print(f'after: {after}')
# Output
before: [['Bob', 70], ['Alice', 80], ['Chris', 90], ['Alice', 95], ['Chris', 69], ['Bob', 94]]
after: [['Alice', 80], ['Alice', 95], ['Bob', 70], ['Bob', 94], ['Chris', 69], ['Chris', 90]]
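
When the two keys should run in different directions, reverse=True is not enough because it reverses the whole tuple. One common approach for numeric keys, sketched here with the hypothetical variable names records and mixed, is to negate the value that should be sorted in descending order.

```python
# Illustrative sketch: name ascending, score descending within the same name.
records = [
    ['Bob', 70], ['Alice', 80], ['Chris', 90], ['Alice', 95], ['Chris', 69], ['Bob', 94]
]

# Negating the score makes larger scores compare as "smaller",
# so they come first among records that share a name.
mixed = sorted(records, key=lambda x: (x[0], -x[1]))

print(f'mixed: {mixed}')
```

Negation only works for numeric keys. It does not apply to strings.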

Applying the above, sort the list ['20210514_2.csv', '20210512_1.csv', '20210516_1.csv', '20210514_1.csv'] by the following rules:

  1. Sort the file names in the list from oldest to newest date.
  2. For files with the same date, sort in ascending order of suffix (smallest first).

from datetime import datetime

before = [
    '20210514_2.csv', '20210512_1.csv', '20210516_1.csv', '20210514_1.csv'
]

def get_datetime(filename):
    date_str = filename.split('_')[0]
    return datetime.strptime(date_str, '%Y%m%d')

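def get_suffix(filename):
    # '20210514_2.csv' -> '20210514_2' -> '2'
    suffix = filename.split('.')[0].split('_')[1]
    # Convert to int so that suffixes compare numerically (2 before 10)
    return int(suffix)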

after = sorted(before, key=lambda x: (get_datetime(x), get_suffix(x)))

print(f'before: {before}')
print(f'after: {after}')
# Output
before: ['20210514_2.csv', '20210512_1.csv', '20210516_1.csv', '20210514_1.csv']
after: ['20210512_1.csv', '20210514_1.csv', '20210514_2.csv', '20210516_1.csv']
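
One detail worth checking when a suffix can have more than one digit: comparing suffixes as strings and comparing them as integers give different orders. The following is a minimal sketch under that assumption. The file names and the helper split_name are hypothetical. It compares the date part as a plain string, which is safe here only because YYYYMMDD is fixed-width and zero-padded.

```python
# Illustrative sketch: hypothetical file names with a two-digit suffix.
files = ['20210514_10.csv', '20210514_2.csv', '20210512_1.csv']

def split_name(filename):
    """Split '20210514_10.csv' into ('20210514', '10')."""
    stem = filename.split('.')[0]
    date_str, suffix = stem.split('_')
    return date_str, suffix

# Suffix compared as text: '10' sorts before '2' because '1' < '2'.
by_text = sorted(files, key=split_name)

# Suffix compared as a number: 2 sorts before 10.
by_number = sorted(files, key=lambda f: (split_name(f)[0], int(split_name(f)[1])))

print(f'by_text: {by_text}')
print(f'by_number: {by_number}')
```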