COVID-19/All-cause deaths/Scripts

From Wikiversity

The following Python scripts were used to produce the charts. They are small and straightforward.

plotHmd.py

import sys, csv, datetime, argparse

parser = argparse.ArgumentParser(description="Plots weekly all-cause death data from HMD.")
addarg = parser.add_argument
addarg("fileName", help="e.g. stmf.csv")
addarg("countryCode", help="Country code, e.g. USA")
addarg("figureFieldName", help="e.g. D0_14 (deaths for 0-14y), DTotal (total deaths)")
addarg("smoothingCount", help="e.g. 3", type=int, default=1, nargs="?")
addarg("--svg", help="Output to a SVG file using matplotlib.", action="store_true")
addarg("--show", help="Show the svg rather than saving it.", action="store_true")
args = parser.parse_args()

countryNames = {"AUS": "Australia", "AUT": "Austria", "BEL": "Belgium", "BGR": "Bulgaria", "CAN": "Canada",
  "CHE": "Switzerland", "CHL": "Chile", "CZE": "Czechia", "DEUTNP": "Germany", "DNK": "Denmark", "ESP": "Spain",
  "EST": "Estonia", "FIN": "Finland", "FRATNP": "France", "GBRTENW": "England and Wales",
  "GBR_NIR": "Northern Ireland", "GBR_SCO": "Scotland", "GRC": "Greece", "HRV": "Croatia",
  "HUN": "Hungary", "ISL": "Iceland", "ISR": "Israel", "ITA": "Italy", "KOR": "Republic of Korea",
  "LTU": "Lithuania", "LUX": "Luxembourg", "LVA": "Latvia", "NLD": "Netherlands",
  "NOR": "Norway", "NZL_NP": "New Zealand", "POL": "Poland", "PRT": "Portugal", "RUS": "Russia",
  "SVK": "Slovakia", "SVN": "Slovenia", "SWE": "Sweden", "TWN": "Taiwan", "USA": "U.S."}

data = []
xtickDates, xtickDatesStr = [], []
prevYear = 0
with open(args.fileName) as file1:
  file1.readline() # Two intro lines without data
  file1.readline()
  for line in csv.DictReader(file1):
    if line and line["CountryCode"] == args.countryCode and line["Sex"] == "b":
      deaths = int(float(line[args.figureFieldName]))
      date = line["Year"] + " " + line["Week"] + " 0"
      date = datetime.datetime.strptime(date, "%Y %U %w")
      data.append( (date, deaths) )
      if line["Year"] != prevYear:
        xtickDates.append(date)
        xtickDatesStr.append(str(date.year))
      prevYear = line["Year"]

data.pop() # Drop last two weeks for too big registration delay effect
data.pop()

def movingAverage(values, itemCount):
  average = 0
  averages = []
  idx = -1
  for val in values:
    idx += 1
    average += val - values[idx - itemCount] if idx >= itemCount else val
    if idx >= itemCount - 1:
      averages.append(average / float(itemCount))
    else:
      averages.append(None)
  return averages

values = [v for k, v in data]
if args.smoothingCount > 1:
  values = movingAverage(values, args.smoothingCount)

maxValue = max([float(v) for v in values if v is not None])
valueFormatString = "%.1f" if maxValue < 100 else "%.0f"

if args.svg:
  from matplotlib import pyplot as plt
  fig, biax = plt.subplots()
  figSize = fig.get_size_inches()
  fig.set_size_inches(figSize[0] * 4/3, figSize[1])  
  dates = [k for k, v in data]
  plt.xticks(xtickDates, xtickDatesStr, rotation=70)
  plt.plot(dates, values, linewidth=1)
  plt.grid(True)
  bottom, top = plt.ylim()
  plt.ylim(0, top)
  countryName = countryNames[args.countryCode]
  titlePart = ""
  if args.smoothingCount > 1: # Mention smoothing in the title only when it is applied
    titlePart = "\nMoving average week count: " + str(args.smoothingCount)
  if args.figureFieldName != "DTotal":
    titlePart += "\n" if titlePart == "" else "       "
    titlePart += "Figure field: " + str(args.figureFieldName)
  plt.title("Weekly all-cause deaths for " + countryName + " per Human Mortality Database" + titlePart)
  if args.show:
    plt.show()
  else:
    figureFieldAddition = "" if args.figureFieldName == "DTotal" else "_" + args.figureFieldName
    plt.savefig("AllCauseDeaths_" + args.countryCode + figureFieldAddition + ".svg")
else:
  # Output useful for the graphing plugin/extension for MediaWiki
  sys.stdout.write("|x = " + ", ".join([k.strftime("%Y-%m-%d") for k, v in data]) + "\n")
  sys.stdout.write("|y = " + ", ".join([valueFormatString % v if v is not None else "" for v in values]) + "\n")

Usage:

plotHmd.py stmf.csv USA DTotal --svg
plotHmd.py stmf.csv USA D0_14
plotHmd.py stmf.csv USA D0_14 3

To obtain stmf.csv, go to mpidr.shinyapps.io/stmortality and at the bottom of the left pane click on the icon to the right of "CSV".

Country codes: AUS (Australia), AUT (Austria), BEL (Belgium), BGR (Bulgaria), CAN (Canada), CHE (Switzerland), CHL (Chile), CZE (Czechia), DEUTNP (Germany), DNK (Denmark), ESP (Spain), EST (Estonia), FIN (Finland), FRATNP (France), GBRTENW (England and Wales), GBR_NIR (Northern Ireland), GBR_SCO (Scotland), GRC (Greece), HRV (Croatia), HUN (Hungary), ISL (Iceland), ISR (Israel), ITA (Italy), KOR (Republic of Korea), LTU (Lithuania), LUX (Luxembourg), LVA (Latvia), NLD (Netherlands), NOR (Norway), NZL_NP (New Zealand), POL (Poland), PRT (Portugal), RUS (Russia), SVK (Slovakia), SVN (Slovenia), SWE (Sweden), TWN (Taiwan), USA (U.S.). See also W:ISO 3166-1 alpha-3; note that some codes in stmf.csv carry a custom suffix, such as the codes for parts of the United Kingdom (GBRTENW, etc.) or FRATNP instead of FRA. See also HMD-countries-codes.pdf at mortality.org; however, that list contains several codes absent from stmf.csv, such as FRACNP and UKR.

Field codes: D0_14 (deaths for 0-14y), D15_64 (deaths for 15-64y), D65_74 (deaths for 65-74y), D75_84 (deaths for 75-84y), D85p (deaths for 85+y), DTotal (deaths total).

The script drops the last two data points to limit the worst effects of registration delay; for some countries, the last two weeks are clearly badly affected by it.
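As an illustration of the default (non-SVG) output and of the trimming described above, here is a minimal sketch. The helper name format_for_wiki_chart and the sample numbers are made up for this example; only the "|x = " / "|y = " layout and the dropping of the last two items follow the script.

```python
import datetime

def format_for_wiki_chart(data):
    """Return the '|x = ...' and '|y = ...' lines for (date, deaths) pairs."""
    trimmed = data[:-2]  # drop the last two weeks, as the script does
    x_line = "|x = " + ", ".join(d.strftime("%Y-%m-%d") for d, _ in trimmed)
    y_line = "|y = " + ", ".join(str(v) for _, v in trimmed)
    return x_line, y_line

# Made-up numbers; the last two weeks are deliberately too low,
# imitating the effect of registration delay.
sample = [
    (datetime.date(2021, 1, 3), 1000),
    (datetime.date(2021, 1, 10), 1020),
    (datetime.date(2021, 1, 17), 990),
    (datetime.date(2021, 1, 24), 700),
    (datetime.date(2021, 1, 31), 300),
]

for line in format_for_wiki_chart(sample):
    print(line)
# |x = 2021-01-03, 2021-01-10, 2021-01-17
# |y = 1000, 1020, 990
```

Without the trimming, the chart would end in a sharp drop that reflects late registration rather than fewer deaths.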

plotHmdPerYear.py

import sys, csv, datetime

fileName = sys.argv[1] # e.g. stmf.csv
countryCode = sys.argv[2] # e.g. USA
figureFieldName = sys.argv[3] # e.g. D0_14 (deaths for 0-14y), DTotal (total deaths)

data = []
with open(fileName) as file1: # The with block closes the file, as in plotHmd.py
  file1.readline() # Two intro lines without data
  file1.readline()
  for line in csv.DictReader(file1):
    if line and line["CountryCode"] == countryCode and line["Sex"] == "b":
      deaths = int(float(line[figureFieldName]))
      data.append( (int(line["Year"]), int(line["Week"]), deaths) )

data.pop() # Drop last two weeks for too big registration delay effect
data.pop()

years = sorted(list({year for year, week, deaths in data}))
maxYear = max(years)
maxYearWeeks = [week for year, week, deaths in data if year == maxYear]
maxWeek = maxYearWeeks[-1]

deathsUpToMaxWeek = []
for year in years:
  deathsUpToMaxWeek1 = 0
  for year1, week, deaths in data:
    if year1 == year and week <= maxWeek:
      deathsUpToMaxWeek1 += deaths
  deathsUpToMaxWeek.append(deathsUpToMaxWeek1)
       
yearsOut = ", ".join([str(year) for year in years])
deathsOut = ", ".join([str(deaths) for deaths in deathsUpToMaxWeek])
sys.stdout.write("Last week in %i: %i\n" % (maxYear, maxWeek))
sys.stdout.write("|x = " + yearsOut + "\n")
sys.stdout.write("|y = " + deathsOut + "\n")

Usage: similar to plotHmd.py.
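The idea behind this script is to compare years over the same span of weeks: each year is summed only up to the last week available in the latest year. A minimal sketch on made-up data follows; the function name deaths_up_to_last_week is hypothetical, and unlike the script it takes the largest week number of the latest year rather than the last one in file order, which is the same thing when the rows are sorted.

```python
from collections import defaultdict

def deaths_up_to_last_week(data):
    """Sum deaths per year over weeks 1..last week of the latest year.

    data is a list of (year, week, deaths) tuples.
    """
    latest_year = max(year for year, _, _ in data)
    last_week = max(week for year, week, _ in data if year == latest_year)
    totals = defaultdict(int)
    for year, week, deaths in data:
        if week <= last_week:
            totals[year] += deaths
    return last_week, dict(sorted(totals.items()))

# Made-up numbers: 2020 is only known up to week 2.
sample = [(2019, 1, 10), (2019, 2, 12), (2019, 3, 11),
          (2020, 1, 13), (2020, 2, 14)]

print(deaths_up_to_last_week(sample))
# (2, {2019: 22, 2020: 27})
```

Week 3 of 2019 is left out, so both years cover weeks 1 and 2 only.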

plotHmdPerSeason.py

import sys, csv, datetime

# Plot deaths per season: week 40 of year before to week x of the year

fileName = sys.argv[1] # e.g. stmf.csv
countryCode = sys.argv[2] # e.g. USA
figureFieldName = sys.argv[3] # e.g. D0_14 (deaths for 0-14y), DTotal (total deaths)
seasonStartWeek = 40

data = []
with open(fileName) as file1: # The with block closes the file, as in plotHmd.py
  file1.readline() # Two intro lines without data
  file1.readline()
  for line in csv.DictReader(file1):
    if line and line["CountryCode"] == countryCode and line["Sex"] == "b":
      deaths = int(float(line[figureFieldName]))
      data.append( (int(line["Year"]), int(line["Week"]), deaths) )

data.pop() # Drop last two weeks for too big registration delay effect
data.pop()

years = sorted(list({year for year, week, deaths in data}))
maxYear = max(years)
maxYearWeeks = [week for year, week, deaths in data if year == maxYear]
maxWeek = maxYearWeeks[-1]
if maxWeek >= seasonStartWeek:
  maxWeek = seasonStartWeek - 1

deathsInSeason = []
for year in years[1:]:
  deathsInSeason1 = 0
  for year1, week, deaths in data:
    if year1 == year and week <= maxWeek:
      deathsInSeason1 += deaths
    if year1 == (year - 1) and week >= seasonStartWeek:
      deathsInSeason1 += deaths      
  deathsInSeason.append(deathsInSeason1)
       
yearsOut = ", ".join([str(year) for year in years[1:]])
deathsOut = ", ".join([str(deaths) for deaths in deathsInSeason])
write = sys.stdout.write
write("Last week in %i: %i\n" % (maxYear, maxWeek))
write("All-cause deaths in weeks %i+ year before and weeks 1-%i of the year, year by year:\n" % (seasonStartWeek, maxWeek))
write("|x = " + yearsOut + "\n")
write("|y = " + deathsOut + "\n")

Usage: similar to plotHmd.py.
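A season here spans two calendar years: weeks 40 and later of the previous year plus the weeks up to a cut-off week of the given year. The sketch below shows that aggregation on made-up data. The function name and the explicit last_week parameter are assumptions for the example; the script derives the cut-off week from the data instead.

```python
def deaths_per_season(data, season_start_week=40, last_week=20):
    """Sum deaths per season, labelled by the year in which the season ends.

    data is a list of (year, week, deaths) tuples.
    """
    years = sorted({year for year, _, _ in data})
    result = {}
    for year in years[1:]:  # the first year has no preceding autumn in the data
        total = 0
        for y, week, deaths in data:
            if y == year and week <= last_week:
                total += deaths
            elif y == year - 1 and week >= season_start_week:
                total += deaths
        result[year] = total
    return result

# Made-up numbers.
sample = [(2019, 39, 5), (2019, 40, 7), (2019, 52, 8),
          (2020, 1, 9), (2020, 20, 6), (2020, 21, 4), (2020, 45, 10),
          (2021, 3, 11)]

print(deaths_per_season(sample))
# {2020: 30, 2021: 21}
```

Weeks between the cut-off and week 40 (here week 39 of 2019 and week 21 of 2020) belong to no season and are ignored.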

plotHmdExcessDeathPercPerYear.py

import sys, csv, argparse

parser = argparse.ArgumentParser(description="Plots all-cause excess death percentage from HMD.")
addarg = parser.add_argument
addarg("fileName", help="e.g. stmf.csv")
addarg("countryCode", help="Country code, e.g. USA")
addarg("figureFieldName", help="e.g. D0_14 (deaths for 0-14y), DTotal (total deaths)")
addarg("baseYearRangeLen", help="e.g. 5", type=int)
addarg("--svg", help="Output to a SVG file using matplotlib.", action="store_true")
addarg("--show", help="Show the svg rather than saving it.", action="store_true")
args = parser.parse_args()

countryNames = {"AUS": "Australia", "AUT": "Austria", "BEL": "Belgium", "BGR": "Bulgaria", "CAN": "Canada",
  "CHE": "Switzerland", "CHL": "Chile", "CZE": "Czechia", "DEUTNP": "Germany", "DNK": "Denmark", "ESP": "Spain",
  "EST": "Estonia", "FIN": "Finland", "FRATNP": "France", "GBRTENW": "England and Wales",
  "GBR_NIR": "Northern Ireland", "GBR_SCO": "Scotland", "GRC": "Greece", "HRV": "Croatia",
  "HUN": "Hungary", "ISL": "Iceland", "ISR": "Israel", "ITA": "Italy", "KOR": "Republic of Korea",
  "LTU": "Lithuania", "LUX": "Luxembourg", "LVA": "Latvia", "NLD": "Netherlands",
  "NOR": "Norway", "NZL_NP": "New Zealand", "POL": "Poland", "PRT": "Portugal", "RUS": "Russia",
  "SVK": "Slovakia", "SVN": "Slovenia", "SWE": "Sweden", "TWN": "Taiwan", "USA": "U.S."}

data = []
with open(args.fileName) as file1:
  file1.readline() # Two intro lines without data
  file1.readline()
  for line in csv.DictReader(file1):
    if line and line["CountryCode"] == args.countryCode and line["Sex"] == "b":
      deaths = int(float(line[args.figureFieldName]))
      data.append( (int(line["Year"]), int(line["Week"]), deaths) ) 

data.pop() # Drop last two weeks for too big registration delay effect
data.pop()

years = sorted(list({year for year, week, deaths in data if week == 1}))
years = years[:-1] # Drop the last year since it is usually incomplete
minYear = min(years)
maxYear = max(years)
calculableYears = [year for year in years if year >= minYear + args.baseYearRangeLen]

deathsPerYear = {}
for year, week, deaths in data:
  if year <= maxYear: 
    if not year in deathsPerYear:
      deathsPerYear[year] = 0
    deathsPerYear[year] += deaths

excessPercentPerYear = {}
for year in calculableYears:
  baseYearRangeMin = None
  for yearOffset in range(args.baseYearRangeLen, 0, -1):
    if baseYearRangeMin is None:
      baseYearRangeMin = deathsPerYear[year - yearOffset]
    else:
      baseYearRangeMin = min(baseYearRangeMin, deathsPerYear[year - yearOffset])
  excessPercentPerYear[year] = deathsPerYear[year] / float(baseYearRangeMin) - 1

if args.svg:
  from matplotlib import pyplot as plt
  import matplotlib.ticker as mtick
  fig, biax = plt.subplots()
  figSize = fig.get_size_inches()
  fig.set_size_inches(figSize[0] * 4/3, figSize[1])  
  plt.xticks(calculableYears, rotation=70)
  plt.grid(True, axis="y", zorder=0)
  plt.bar(calculableYears, [100 * excessPercentPerYear[year] for year in calculableYears], linewidth=1, zorder=3)
  biax.yaxis.set_major_formatter(mtick.PercentFormatter(decimals=0))
  countryName = countryNames[args.countryCode]
  plt.title("All-cause death excess percentage for " + countryName + " per Human Mortality Database" +
            "\nBaseline year count: " + str(args.baseYearRangeLen))
  if args.show:
    plt.show()
  else:
    plt.savefig("AllCauseDeathsExcessPerc_" + args.countryCode + ".svg")
else:
  yearsOut = ", ".join([str(year) for year in calculableYears])
  excessPercOut = ", ".join([("%.3f" % excessPercentPerYear[year]) for year in calculableYears])
  sys.stdout.write("|x = " + yearsOut + "\n")
  sys.stdout.write("|y = " + excessPercOut + "\n")

Usage (similar to plotHmd.py):

  • plotHmdExcessDeathPercPerYear.py stmf.csv BGR DTotal 5

Rationale for using the minimum of a range of years as the baseline: visual inspection of the weekly charts shows that significant year-specific variation occurs in the upward direction but not in the downward direction. Thus, the year with the least year-specific variation, that is, the year with the fewest deaths in the range, is taken as the baseline; if we took the average instead, earlier year-specific upward variation would be included in the baseline.
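To make the rationale concrete, the following sketch compares a minimum-based baseline with an average-based one on made-up yearly totals in which one baseline year (2017) has an upward spike. The function excess_fraction and its baseline parameter are hypothetical names for this example; the script itself implements only the minimum.

```python
def mean(values):
    return sum(values) / len(values)

def excess_fraction(deaths_per_year, year, base_len, baseline=min):
    """Excess deaths of `year` relative to a baseline over the preceding years.

    baseline is a function reducing the list of baseline-year totals to one
    number: min (as in the script) or mean (for comparison).
    """
    base_years = [deaths_per_year[year - offset] for offset in range(1, base_len + 1)]
    return deaths_per_year[year] / baseline(base_years) - 1

# Made-up totals; 2017 stands for a year with a bad seasonal wave.
deaths = {2015: 100, 2016: 100, 2017: 120, 2018: 100, 2019: 100, 2020: 115}

print("%.3f" % excess_fraction(deaths, 2020, 5))                 # 0.150
print("%.3f" % excess_fraction(deaths, 2020, 5, baseline=mean))  # 0.106
```

With the average, the 2017 spike raises the baseline from 100 to 104 and the computed excess for 2020 shrinks accordingly.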

plotHmdRatioAllCountries.py

import sys, csv, datetime, argparse

parser = argparse.ArgumentParser(description="For all countries: Calculate all-cause deaths in 2020-2022" +
                                 " divided by those in 2017-2019; or similar ranges for different year block length.")
addarg = parser.add_argument
addarg("fileName", help="e.g. stmf.csv")
addarg("yearCount", help="e.g. 2, 3 or 4", type=int, default=3, nargs="?")
addarg("--svg", help="Output to a SVG file using matplotlib.", action="store_true")
addarg("--show", help="Show the SVG rather than saving it.", action="store_true")
args = parser.parse_args()

figureFieldName = "DTotal"

countryNames = {"AUS": "Australia", "AUT": "Austria", "BEL": "Belgium", "BGR": "Bulgaria", "CAN": "Canada",
  "CHE": "Switzerland", "CHL": "Chile", "CZE": "Czechia", "DEUTNP": "Germany", "DNK": "Denmark", "ESP": "Spain",
  "EST": "Estonia", "FIN": "Finland", "FRATNP": "France", "GBRTENW": "England and Wales",
  "GBR_NIR": "Northern Ireland", "GBR_SCO": "Scotland", "GRC": "Greece", "HRV": "Croatia",
  "HUN": "Hungary", "ISL": "Iceland", "ISR": "Israel", "ITA": "Italy", "KOR": "Republic of Korea",
  "LTU": "Lithuania", "LUX": "Luxembourg", "LVA": "Latvia", "NLD": "Netherlands",
  "NOR": "Norway", "NZL_NP": "New Zealand", "POL": "Poland", "PRT": "Portugal", "RUS": "Russia",
  "SVK": "Slovakia", "SVN": "Slovenia", "SWE": "Sweden", "TWN": "Taiwan", "USA": "U.S."}
  
sourceData = []
countries = set()
yearsAndWeeks = []
with open(args.fileName) as file1:
  file1.readline() # Two intro lines without data
  file1.readline()
  for line in csv.DictReader(file1):
    if line and line["Sex"] == "b":
      deaths = int(float(line[figureFieldName]))
      date = line["Year"] + " " + line["Week"] + " 0"
      date = datetime.datetime.strptime(date, "%Y %U %w")
      sourceData.append( (line["CountryCode"] , date, deaths) )
      yearsAndWeeks.append( (line["CountryCode"], int(line["Year"]), int(line["Week"])) )
      countries.add(line["CountryCode"])
countries = sorted(countries)

countryRatio = []
for country in countries:
  countrySourceData = [(date, deaths) for (countryCode, date, deaths) in sourceData
                       if countryCode == country]

  # Require that the boundary years are 100% completely covered
  countryYearsAndWeeks = sorted([(year, week) for country1, year, week in yearsAndWeeks if country1 == country])  
  if (countryYearsAndWeeks[-1][0] < 2020 + args.yearCount - 1 or
      countryYearsAndWeeks[-1][0] == 2020 + args.yearCount - 1 and countryYearsAndWeeks[-1][1] < 53):
    print("Not enough years for " + countryNames[country] +
           "; max year and week: " + str(countryYearsAndWeeks[-1]))
    continue
  if (countryYearsAndWeeks[0][0] > 2019 - args.yearCount + 1 or
      countryYearsAndWeeks[0][0] == 2019 - args.yearCount + 1 and countryYearsAndWeeks[0][1] > 1):
    print("Not enough years for " + countryNames[country] +
           "; min year and week: " + str(countryYearsAndWeeks[0]))
    continue

  yearRange1Data = [(date, deaths) for (date, deaths) in countrySourceData if (2019 - args.yearCount < date.year <= 2019)]
  yearRange2Data = [(date, deaths) for (date, deaths) in countrySourceData if (2020 <= date.year < 2020 + args.yearCount)]

  yearRange1Deaths = sum([deaths for (date, deaths) in yearRange1Data])
  yearRange2Deaths = sum([deaths for (date, deaths) in yearRange2Data])
  countryRatio.append( (country, yearRange2Deaths/yearRange1Deaths) )
  
countryRatio.sort(key=lambda x: x[1], reverse=True)

if args.svg:
  from matplotlib import pyplot as plt
  import matplotlib.ticker as mtick
  fig, biax = plt.subplots()
  figSize = fig.get_size_inches()
  fig.set_size_inches(figSize[0] * 4/3, figSize[1] * 1.5)
  countryNamesToPlot = [countryNames[x[0]] for x in countryRatio]
  plt.grid(True, axis="x", zorder=0)
  plt.barh(countryNamesToPlot, [100 * x[1] - 100 for  x in countryRatio], linewidth=1, zorder=3)
  biax.xaxis.set_major_formatter(mtick.PercentFormatter(decimals=0))
  titlePart2 = "\nYear range 2020-%i / year range %i-2019" % (2020 + args.yearCount - 1, 2019 - args.yearCount + 1)
  titlePart3 = "\nData source: Human Mortality Database"
  plt.title("All-cause death ratio of Covid and pre-Covid year ranges" +
            titlePart2 + titlePart3)
  plt.tight_layout()
  if args.show:
    plt.show()
  else:
    plt.savefig("AllCauseDeathsHmdAllCountryRatios_YearCount_%i.svg" % args.yearCount)
else:
  raise Exception("Output for MediaWiki graphing plugin not implemented")

plotUsCdc.py

import sys, csv, datetime

fileName = sys.argv[1] # e.g. "Excess_Deaths_Associated_with_COVID-19.csv"
jurisdiction = sys.argv[2] # e.g. "New York City" or "Alabama"
data = []
file1 = open(fileName, encoding="utf-8-sig") # utf-8-sig drops the BOM if present (Python 3)
for line in csv.DictReader(file1):
  if line["State"] == jurisdiction and line["Type"] == "Unweighted":
    date =  datetime.datetime.strptime(line["Week Ending Date"], "%Y-%m-%d")
    deaths = line["Observed Number"].replace(",", "")
    data.append( (date, deaths) )
    
data.sort(key=lambda x: x[0])

datesOut = ", ".join([k.strftime("%Y-%m-%d") for k, v in data])
deathsOut = ", ".join([v for k, v in data])
sys.stdout.write("|x = " + datesOut + "\n")
sys.stdout.write("|y = " + deathsOut + "\n")

Usage:

plotUsCdc.py Excess_Deaths_Associated_with_COVID-19.csv "New York City"

To obtain Excess_Deaths_Associated_with_COVID-19.csv:

  • 1) Visit CDC[1].
  • 2) In section "Download Data:", click on "National and State Estimates of Excess Deaths".
  • 3) Save file "Excess_Deaths_Associated_with_COVID-19.csv", which contains data for all jurisdictions.
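The downloaded file may or may not start with a UTF-8 byte order mark (BOM). One way to read it either way in Python 3 is the utf-8-sig codec, sketched below on made-up rows. The helper read_rows and the sample values are assumptions for this example; the column names are the ones the script uses.

```python
import csv
import io

def read_rows(raw_bytes):
    """Parse CSV bytes, ignoring a leading UTF-8 BOM if there is one."""
    text = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8-sig", newline="")
    return list(csv.DictReader(text))

# Made-up sample in the layout the script expects.
with_bom = (b"\xef\xbb\xbfWeek Ending Date,State,Observed Number,Type\n"
            b'2020-01-04,Alabama,"1,057",Unweighted\n')
without_bom = with_bom[3:]  # same content minus the three BOM bytes

rows = read_rows(with_bom)
print(list(rows[0].keys())[0])                           # Week Ending Date
print(int(rows[0]["Observed Number"].replace(",", "")))  # 1057
print(read_rows(with_bom) == read_rows(without_bom))     # True
```

The thousands separator is stripped in the same way as in the script before the number is used.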

plotWmd.py

import sys, csv, datetime, argparse

parser = argparse.ArgumentParser(description='Plot Wmd for wiki.')
parser.add_argument("fileName") # e.g. world_mortality.csv from akarlinsky, github.com
parser.add_argument("countryCode") # e.g. USA
parser.add_argument("smoothingCount", nargs="?", type=int, default=1) # e.g. 3
parser.add_argument("--dlti", dest="dropLastTwoItems", action="store_true",
                    help="Drop last two items in the data to limit the registration delay effect")
parser.add_argument("--ep", dest="excessDeathRangeLen", type=int,
                    help="Excess death percentage: length of ref. year range")
args = parser.parse_args()

def fillDataFromWmdFile(fileName, countryCode, dropLastTwoItems):
  data = []
  with open(fileName) as file1:
    for line in csv.DictReader(file1):
      if line["iso3c"] == countryCode:
        if line["time_unit"] == "weekly":
          week = line["time"]
          date = line["year"] + " " + week + " 0"
          date = datetime.datetime.strptime(date, "%Y %U %w")
        elif line["time_unit"] == "monthly":
          month = line["time"]
          date = line["year"] + " " + month + " 1"
          date = datetime.datetime.strptime(date, "%Y %m %d")
        else:
          sys.stderr.write("Unexpected time unit. Aborting.\n")
          sys.exit(1)         

        deaths = int(float(line["deaths"]))
        data.append( (date, deaths) )
  if dropLastTwoItems:
    data.pop() # Drop last two weeks for too big registration delay effect
    data.pop()
  return data

def movingAverage(values, itemCount):
  average = 0
  averages = []
  idx = -1
  for val in values:
    idx += 1
    average += val - values[idx - itemCount] if idx >= itemCount else val
    if idx >= itemCount - 1:
      averages.append(average / float(itemCount))
    else:
      averages.append(None)
  return averages

def outputWeeklyOrMonthlyTimeSeriesForWikiChart(data, smoothingCount):
  values = [v for k, v in data]
  if smoothingCount > 1:
    values = movingAverage(values, smoothingCount)

  maxValue = max([float(v) for v in values if v is not None])
  valueFormatString = "%.1f" if maxValue < 100 else "%.0f"
  
  sys.stdout.write("|x = " + ", ".join([k.strftime("%Y-%m-%d") for k, v in data]) + "\n")
  sys.stdout.write("|y = " + ", ".join([valueFormatString % v if v is not None else "" for v in values]) + "\n")

def outputYearlyExcessDeathPercentForWikiChart(data, baseYearRangeLen): # data is a list of (date, death) pairs
  years = sorted(list({date.year for date, deaths in data}))
  years = years[:-1] # Drop last year as incomplete
  minYear = min(years)
  maxYear = max(years)
  calculableYears = [year for year in years if year >= minYear + baseYearRangeLen]

  deathsPerYear = {}
  for date, deaths in data:
    year = date.year
    if year <= maxYear: 
      if not year in deathsPerYear:
        deathsPerYear[year] = 0
      deathsPerYear[year] += deaths

  excessPercentPerYear = {}
  for year in calculableYears:
    baseYearRangeMin = None
    for yearOffset in range(baseYearRangeLen, 0, -1):
      if baseYearRangeMin is None:
        baseYearRangeMin = deathsPerYear[year - yearOffset]
      else:
        baseYearRangeMin = min(baseYearRangeMin, deathsPerYear[year - yearOffset])
    excessPercentPerYear[year] = deathsPerYear[year] / float(baseYearRangeMin) - 1

  yearsOut = ", ".join([str(year) for year in calculableYears])
  excessPercOut = ", ".join([("%.3f" % excessPercentPerYear[year]) for year in calculableYears])
  sys.stdout.write("|x = " + yearsOut + "\n")
  sys.stdout.write("|y = " + excessPercOut + "\n")

data = fillDataFromWmdFile(args.fileName, args.countryCode, args.dropLastTwoItems)
if args.excessDeathRangeLen is None:
  outputWeeklyOrMonthlyTimeSeriesForWikiChart(data, args.smoothingCount)
else:
  outputYearlyExcessDeathPercentForWikiChart(data, args.excessDeathRangeLen)

Usage:

  • plotWmd.py world_mortality.csv PER 3
  • plotWmd.py world_mortality.csv PER --ep 3
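The script accepts both weekly and monthly rows and turns each into a date for the x axis. The sketch below isolates that conversion with the same format strings as the script; the function name to_date is made up for this example. Note that %U counts weeks starting on Sunday, so the result is the script's approximation of the week's position in the year and not necessarily the week's official start date.

```python
import datetime

def to_date(year, time, time_unit):
    """Convert a (year, time, time_unit) row into a datetime."""
    if time_unit == "weekly":
        # Week number plus weekday 0 (Sunday)
        return datetime.datetime.strptime("%s %s 0" % (year, time), "%Y %U %w")
    if time_unit == "monthly":
        # Month number plus day 1
        return datetime.datetime.strptime("%s %s 1" % (year, time), "%Y %m %d")
    raise ValueError("Unexpected time unit: " + time_unit)

print(to_date("2020", "1", "weekly").date())   # 2020-01-05
print(to_date("2020", "3", "monthly").date())  # 2020-03-01
```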

Moving average via awk


You can calculate the 7-day moving average using awk on Windows:

echo 1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101 | awk -F, -vn=7 "{for(i=1;i<=NF; i++) {s+=i>n?$i-$(i-n):$i; if(i>=n){printf \"%.0f, \", s/n}else{printf \", \"}}}"

You can put the result into clipboard:

echo 1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101 | awk -F, -vn=7 "{for(i=1;i<=NF; i++) {s+=i>n?$i-$(i-n):$i; if(i>=n){printf \"%.0f, \", s/n}else{printf \", \"}}}" | clip

You can do the calculation on Linux:

echo 1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101 | awk -F, -vn=7 '{for(i=1;i<=NF; i++) {s+=i>n?$i-$(i-n):$i; if(i>=n){printf "%.0f, ", s/n}else{printf ", "}}}'

If you are on Linux or a modern Mac, you already have awk. For Windows, you can install awk from the ezwinports or GnuWin32 project.
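If awk is not at hand, the same trailing moving average can be computed in Python. This is a minimal sketch meant as a cross-check of the one-liners above, using the same input and window of 7; the function name moving_average_line is made up for this example. Apart from the trailing separator that the awk version prints, the output should be the same.

```python
def moving_average_line(values, n):
    """Trailing n-item moving average, formatted like the awk one-liner.

    Positions with fewer than n items so far are left empty.
    """
    parts = []
    running_sum = 0
    for i, val in enumerate(values):
        running_sum += val
        if i >= n:
            running_sum -= values[i - n]  # drop the item that left the window
        parts.append("%.0f" % (running_sum / n) if i >= n - 1 else "")
    return ", ".join(parts)

values = [1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101]
print(moving_average_line(values, 7))
# , , , , , , 10, 14, 23, 33, 47
```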