Friday, October 12, 2007

Plotting Data



Data is almost useless without a proper view. After storing the temperature data, I need a way to plot it so I can draw some conclusions from it. Let's face the 'artistic' stage of the project.
Finding a Python plotting library

For Python I found two main options, both with the same problem: at least 80MB of installation packages.

  • Matplotlib: almost a Python standard, it mimics the MATLAB way of plotting. Easy to install in Debian (150MB of installation packages)
  • PyX: a nice alternative to Matplotlib, but harder to install and only supports PostScript and PDF output (80MB of installation packages in Debian because it depends on the TeX libraries)

After several days wondering whether I should clutter my Linux server with more than 80MB of packages just to be able to plot data, I finally decided that plotting was worth it. I installed Matplotlib because it was very easy to install and has lots of tutorials and documentation.

Installing Matplotlib


apt-get install python-matplotlib


Intraday plot

My first Matplotlib plot had to show the temperature values stored in the database for a single day, with temperature on the Y-axis and hours of the day on the X-axis. As soon as I started coding I realized that I had forgotten all my MATLAB knowledge and that the Matplotlib API wasn't too intuitive. By the end of the day, though, I had found that Matplotlib was at least very functional.



import MySQLdb
import datetime
import time
import os

import matplotlib
matplotlib.use('Agg')
from pylab import *
from matplotlib.dates import DateFormatter, HourLocator, MinuteLocator
from matplotlib.ticker import Locator, FormatStrFormatter

def plotTemperature(init, end, outputFile):
   """ init and end must be timestamps """

  # Get temperatures from db
  # (mysqlDB is a helper class that is not shown in this post)
  db = mysqlDB("database", "user", "password")
  query = ("select temperature,timestamp from temperatureLog "
           "where sensorId=1 and timestamp > %d and timestamp < %d" % (init, end))
  (cursor, data) = db.Query(query)

  # Convert timestamps to matplotlib dates
  temperatureValues = []
  timeValues = []
  for row in data:
    temperatureValues.append(row[0])
    timeValues.append(date2num(datetime.datetime.fromtimestamp(row[1])))

  # Stop if we don't have enough values
  if len(timeValues) < 2:
    return

  # Plot temperatures

  # Configure figure aspect ratio
  figure(1, figsize=[8,3])
  ax = subplot(111)

  # Plot temperature values Vs time values
  plot_date(timeValues, temperatureValues,ls='-',marker='.')

  # Configure axis info formatters and locators
  hourLocator = HourLocator()
  minuteLocator = MinuteLocator(arange(0,60,15))
  dateFormatter = DateFormatter("%H:%M")
  temperatureFormatter = FormatStrFormatter('%.1f C')

  ax.xaxis.set_major_locator(hourLocator)
  ax.xaxis.set_major_formatter(dateFormatter)
  ax.xaxis.set_minor_locator(minuteLocator)

  ax.yaxis.set_major_formatter(temperatureFormatter)

  # Enable grid just for Y-axis
  ax.yaxis.grid(True, linewidth=0.3)

  # Define plot ranges. Temperatures from 20°C to 30°C
  axis([min(timeValues),max(timeValues),20,30])

  # Rotate x labels
  xlabels = ax.get_xticklabels()
  setp(xlabels,'rotation', 45, fontsize=8)
  # Set y labels font 8
  ylabels = ax.get_yticklabels()
  setp(ylabels,fontsize=8)

  # Set title
  title("Temperatures %s" % (datetime.datetime.fromtimestamp(init).strftime('%d/%m/%y')))

  # Save plot to image
  savefig(outputFile, format='jpg', dpi=80)
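
Since plotTemperature expects init and end as Unix timestamps, a small helper that turns a calendar day into that pair is handy. The following is only a sketch and not part of the original script: day_bounds is a hypothetical name, and it assumes that "a day" means local midnight to the next local midnight on the machine running the script.

```python
import datetime
import time


def day_bounds(day):
    """Return (init, end) Unix timestamps covering one local calendar day.

    day_bounds is a hypothetical helper, not part of the original script.
    """
    # Local midnight at the start of the requested day and of the next day
    start = datetime.datetime.combine(day, datetime.time.min)
    next_start = start + datetime.timedelta(days=1)
    # time.mktime interprets the time tuple as local time
    init = int(time.mktime(start.timetuple()))
    end = int(time.mktime(next_start.timetuple()))
    return init, end


init, end = day_bounds(datetime.date(2007, 10, 12))
# With the function above defined and the database reachable:
# plotTemperature(init, end, "temperature.jpg")
```

Because the query uses strict comparisons (> and <), a sample stored exactly at midnight would not be included in the plot for either day.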

1 comment:

  1. Alberto, have you tried RRDtool?

    http://oss.oetiker.ch/rrdtool/
