Line chart 1 – data from a row#

This template is made for you to customize.#

Sometimes we want to work with specific rows in a data set, especially when the data set appears to be in a list format. This type of data set is common when we are working on with objects that we can easily compare with each other, such as information between different countries, CO\(^2\) emissions of different food supplies etc. In this template we use a data set that can be found from here, showing country-specific energy consumption per capita from 1960 to 2015.

Go ahead and use this template if you want to make a line chart from a row of your data file (or you can use the example data set). You just have to replace the data set and variable names, and add some text. If you have any problems and you cannot find an answer from Google, please do contact us! We are happy to help you.

# Import libraries 

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Read csv file, which is downloaded to GitHub.
# Head-command shows the data set, check that everything looks good.

energy_per_capita = pd.read_csv('https://raw.githubusercontent.com/opendata-education/Harjoittelu/main/data/energy_use_per_person.csv')
energy_per_capita.head()
country 1960 1961 1962 1963 1964 1965 1966 1967 1968 ... 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
0 Angola NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 459 472 492 515 521 522 552 534 545 NaN
1 Albania NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 707 680 711 732 729 765 688 801 808 NaN
2 United Arab Emirates NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 8720 8130 8370 7570 7220 7190 7480 7600 7650 NaN
3 Argentina NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 1850 1860 1940 1870 1930 1950 1940 1970 2030 NaN
4 Armenia NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 865 973 1030 904 863 944 1030 1000 1020 NaN

5 rows × 57 columns

# With this command we can check the row of the wanted country, so we can use that information later.

energy_per_capita.loc[energy_per_capita['country'] == 'Finland']
country 1960 1961 1962 1963 1964 1965 1966 1967 1968 ... 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
51 Finland 2200 2250 2360 2480 2680 2890 3070 3130 3340 ... 7110 6970 6670 6270 6830 6540 6280 6120 6210 5920

1 rows × 57 columns

# Prepare the data for x-axis - collect the column headers into a list
# Use np.int_ or np.float_ commands. This changes the objects in the list to values
# It is working, if it removes these '' from around the values

# The number in square brackets in .values[1:] meand that we are choosing columns from nr. 1 till the last one
# First column is nr. 0, so this drops off the 'country' column.
# If you'd like to choose years 1960-1969, the command would be .values[1:11].

year = list(np.int_(energy_per_capita.columns.values[1:]))
print(year)
[1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015]
# Prepare the data for y-axis.
# Insert the number of the row you are interested in into the square brackets, for Finland it is 51.

finland = energy_per_capita.loc[51]
finland = list(np.int_(finland[1:]))
print(finland)
[2200, 2250, 2360, 2480, 2680, 2890, 3070, 3130, 3340, 3670, 3870, 3940, 4190, 4510, 4380, 4180, 4460, 4520, 4660, 4960, 5150, 4930, 4810, 4820, 4910, 5270, 5480, 5960, 5590, 5750, 5690, 5740, 5380, 5620, 5970, 5660, 6070, 6280, 6320, 6280, 6260, 6410, 6730, 7080, 7130, 6560, 7110, 6970, 6670, 6270, 6830, 6540, 6280, 6120, 6210, 5920]
# All the data has been prepared, so we can create the plot with plt.plot(x,y) command

plt.plot(year, finland)
plt.title('Title')
plt.xlabel('x-axis label')
plt.ylabel('y-axis label')
plt.show()
../_images/linechart1_7_0.png

If you want to add another line into the chart, here is the code:#

Let’s compare the energy consumption between Finns and Indians.

# Finding India from the data set.

energy_per_capita.loc[energy_per_capita['country'] == 'India']
country 1960 1961 1962 1963 1964 1965 1966 1967 1968 ... 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
72 India NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 466 485 502 545 562 578 599 606 637 NaN

1 rows × 57 columns

# Prepare the second data for y-axis, in this case both have the same x-axis.

india = energy_per_capita.loc[72]
india = list(np.float_(india[1:]))
print(india)
[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 267.0, 267.0, 269.0, 273.0, 276.0, 280.0, 282.0, 279.0, 286.0, 286.0, 294.0, 298.0, 301.0, 306.0, 315.0, 319.0, 324.0, 334.0, 343.0, 350.0, 357.0, 363.0, 365.0, 371.0, 385.0, 389.0, 397.0, 399.0, 415.0, 417.0, 416.0, 421.0, 424.0, 440.0, 450.0, 466.0, 485.0, 502.0, 545.0, 562.0, 578.0, 599.0, 606.0, 637.0, nan]
# Plotting both lines, adding colors and names for the lines.

plt.plot(year, finland, color='blue', label='Finland')
plt.plot(year, india, color='red', label='India')

plt.title('Title')
plt.xlabel('x-axis label')
plt.ylabel('y-axis label')

plt.legend() # Prints names of the lines
plt.show()
../_images/linechart1_11_0.png